git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Just a few small fixes. The only change to the core code is for a
minor race in ALSA OSS sequencer, and the rest are all device-specific
fixes (regression fixes and a usual quirk)"
* tag 'sound-6.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: usb-audio: Add quirk flag for HEM devices to enable native DSD playback
ALSA: usb-audio: Fix broken resume due to UAC3 power state
ALSA: seq: oss: Fix racy open/close of MIDI devices
ASoC: tegra: Fix Master Volume Control
ALSA: hda/realtek: Add a quirk for Compaq N14JP6
firmware: cs_dsp: Log correct region name in bin error messages
|
|
KPTI keeps around two PGDs: one for userspace and another for the
kernel. Among other things, set_pgd() contains infrastructure to
ensure that updates to the kernel PGD are reflected in the user PGD
as well.
One side-effect of this is that set_pgd() expects to be passed whole
pages. Unfortunately, init_trampoline_kaslr() passes in a single entry:
'trampoline_pgd_entry'.
When KPTI is on, set_pgd() will update 'trampoline_pgd_entry' (an
8-Byte globally stored [.bss] variable) and will then proceed to
replicate that value into the non-existent neighboring user page
(located +4k away), leading to the corruption of other global [.bss]
stored variables.
Fix it by directly assigning 'trampoline_pgd_entry' and avoiding
set_pgd().
[ dhansen: tweak subject and changelog ]
Fixes: 0925dda5962e ("x86/mm/KASLR: Use only one PUD entry for real mode trampoline")
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/all/20230614163859.924309-1-lee@kernel.org/g
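The corruption described in the set_pgd() fix above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the byte buffer stands in for .bss, the offsets are hypothetical, and set_pgd_model() merely imitates the "write the entry, then mirror it 4k further" behaviour described in the changelog; it is not the kernel's set_pgd().

```python
import struct

PAGE_SIZE = 4096

# Hypothetical layout: a single 8-byte entry in .bss, with an unrelated
# global variable that happens to live exactly one page further on.
TRAMPOLINE_OFF = 0x100
NEIGHBOUR_OFF = TRAMPOLINE_OFF + PAGE_SIZE
NEIGHBOUR_VALUE = 0x1122334455667788


def read_u64(mem, offset):
    return struct.unpack_from("<Q", mem, offset)[0]


def set_pgd_model(mem, offset, value):
    """Toy stand-in for set_pgd() with KPTI on: update the kernel entry,
    then replicate the value into the assumed user page +4k away."""
    struct.pack_into("<Q", mem, offset, value)
    struct.pack_into("<Q", mem, offset + PAGE_SIZE, value)


def direct_assign(mem, offset, value):
    """The fix: assign the single entry and touch nothing else."""
    struct.pack_into("<Q", mem, offset, value)


def run(write):
    bss = bytearray(2 * PAGE_SIZE)
    struct.pack_into("<Q", bss, NEIGHBOUR_OFF, NEIGHBOUR_VALUE)
    write(bss, TRAMPOLINE_OFF, 0xABCD)
    return read_u64(bss, TRAMPOLINE_OFF), read_u64(bss, NEIGHBOUR_OFF)
```

In this model run(set_pgd_model) clobbers the neighbouring variable, while run(direct_assign) leaves it intact.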
|
|
The tick period is aligned very early while the first clock_event_device is
registered. At that point the system runs in periodic mode and switches
later to one-shot mode if possible.
The next wake-up event is programmed based on the aligned value
(tick_next_period) but the delta value, that is used to program the
clock_event_device, is computed based on ktime_get().
With the subtracted offset, the device fires earlier than the exact time
frame. With a large enough offset the system programs the timer for the
next wake-up and the remaining time left is too small to make any boot
progress. The system hangs.
Move the alignment later to the setup of tick_sched timer. At this point
the system switches to oneshot mode and a high resolution clocksource is
available. At this point it is safe to align tick_next_period because
ktime_get() will now return accurate (not jiffies based) time.
[bigeasy: Patch description + testing].
Fixes: e9523a0d81899 ("tick/common: Align tick period with the HZ tick.")
Reported-by: Mathias Krause <minipli@grsecurity.net>
Reported-by: "Bhatnagar, Rishabh" <risbhat@amazon.com>
Suggested-by: Mathias Krause <minipli@grsecurity.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Tested-by: Mathias Krause <minipli@grsecurity.net>
Acked-by: SeongJae Park <sj@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/5a56290d-806e-b9a5-f37c-f21958b5a8c0@grsecurity.net
Link: https://lore.kernel.org/12c6f9a3-d087-b824-0d05-0d18c9bc1bf3@amazon.com
Link: https://lore.kernel.org/r/20230615091830.RxMV2xf_@linutronix.de
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU fix from Paul McKenney:
"This fixes a spinlock-initialization regression in SRCU that causes
the SRCU notifier to fail.
The fix simply adds the initialization, but introduces a #ifdef
because there is no spinlock to initialize for the Tiny SRCU used in
!SMP builds.
Yes, it would be nice to abstract this somehow in order to hide it in
SRCU, but I still don't see a good way of doing this"
* tag 'urgent-rcu.2023.06.11a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
notifier: Initialize new struct srcu_usage field
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fix from Palmer Dabbelt:
- A documentation patch describing how we use patchwork
* tag 'riscv-for-linus-6.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
Documentation: RISC-V: patch-acceptance: mention patchwork's role
|
|
Drop struct acpi_thermal_flags which is not really used (only one
flag in it is ever set, but it is never read) and call
acpi_execute_simple_method() directly to evaluate _SCP instead of
using acpi_thermal_set_cooling_mode(), which has no callers after
that change, so drop it.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
|
|
Drop struct acpi_thermal_state which is not really used.
No functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
|
|
Because the only drivers that cared about button fixed events take care
of those events by themselves now, eliminate the code related to them
from acpi_device_install_notify_handler() and
acpi_device_remove_notify_handler().
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Rework the ACPI tiny-power-button driver to install a notify handler or
a fixed event handler for the device it binds to by itself and drop its
notify callback.
This will allow acpi_device_install_notify_handler() and
acpi_device_remove_notify_handler() to be simplified going forward.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Since the lid handling in acpi_button_notify() is special, introduce
acpi_lid_notify() specifically for handling lid notifications.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Rework the ACPI button driver to install notify handlers or fixed
event handlers for the devices it binds to by itself, reduce the
indentation level in its notify handler routine and drop its
notify callback.
This will allow acpi_device_install_notify_handler() and
acpi_device_remove_notify_handler() to be simplified going forward
and it will allow the driver to use different notify handlers for the
lid and for the power and sleep buttons.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Michal Wilczynski <michal.wilczynski@intel.com>
|
|
Ensure there is no path where we might attempt to save SME state after we
flush a task by updating the SVCR register state as well as updating our
in memory state. I haven't seen a specific case where this is happening or
seen a path where it might happen but for the cost of a single low overhead
instruction it seems sensible to close the potential gap.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230607-arm64-flush-svcr-v2-1-827306001841@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Commit f38d1a6d0025 ("PM: domains: Allocate governor data dynamically
based on a genpd governor") started to use the in-parameters in
genpd_add_device(), without first doing a verification of them.
This isn't really a big problem, as most callers do a verification already.
Therefore, let's drop the verification from genpd_add_device() and make
sure all the callers take care of it instead.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes: f38d1a6d0025 ("PM: domains: Allocate governor data dynamically based on a genpd governor")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The amd-pstate passive mode driver name is hyphenated. To make the
amd-pstate active mode driver name consistent with it, rename
"amd_pstate_epp" to "amd-pstate-epp".
Fixes: ffa5096a7c33 ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
Cc: All applicable <stable@vger.kernel.org>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Currently amd_pstate sets the CPPC enable bit in MSR_AMD_CPPC_ENABLE only
for the CPU where module_init happened. But MSR_AMD_CPPC_ENABLE is
per-socket. This causes the CPPC enable bit to be set for only one socket
on servers with more than one physical package. To fix this, write
MSR_AMD_CPPC_ENABLE per-socket.
Also, handle duplicate calls for cppc_enable, because it's called from
per-policy/per-core callbacks and can result in duplicate MSR writes.
Before the fix:
  amd@amd:~$ sudo rdmsr -a 0xc00102b1 | uniq --count
      192 0
      192 1
After the fix:
  amd@amd:~$ sudo rdmsr -a 0xc00102b1 | uniq --count
      384 1
Suggested-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
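The before/after counts in the amd-pstate change above can be reproduced with a toy model. It is a sketch only: the two-socket, 192-CPUs-per-socket topology with contiguous CPU numbering is an assumption chosen to match the totals in the changelog, and the list `msr` simply stands in for one MSR_AMD_CPPC_ENABLE value per socket.

```python
from collections import Counter

SOCKETS = 2             # assumed topology, chosen to match the 384 CPUs above
CPUS_PER_SOCKET = 192
TOTAL_CPUS = SOCKETS * CPUS_PER_SOCKET


def socket_of(cpu):
    # Assumption: CPUs are numbered contiguously per socket.
    return cpu // CPUS_PER_SOCKET


def enable_on_init_cpu_only(init_cpu=0):
    """Old behaviour: only the socket of the CPU running module_init is enabled."""
    msr = [0] * SOCKETS
    msr[socket_of(init_cpu)] = 1
    return msr


def enable_per_socket():
    """Fixed behaviour: called per CPU, but each socket is written once."""
    msr = [0] * SOCKETS
    writes = 0
    for cpu in range(TOTAL_CPUS):
        s = socket_of(cpu)
        if msr[s]:          # duplicate call for an already enabled socket
            continue
        msr[s] = 1
        writes += 1
    return msr, writes


def rdmsr_all(msr):
    """Count the value each CPU would read, similar to `rdmsr -a | uniq --count`."""
    return Counter(msr[socket_of(cpu)] for cpu in range(TOTAL_CPUS))
```

Here rdmsr_all(enable_on_init_cpu_only()) yields 192 zeros and 192 ones, whereas the per-socket variant yields 384 ones with only two writes.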
|
|
In a typical VM guest, the mwait instruction is not available, leaving
only the 'hlt' instruction (which causes a VMEXIT to the host).
So for this common case, intel_idle will detect the lack of mwait, and
fail to initialize (after which another idle method would step in which
will just use hlt always).
Other (non-common) cases exist; the table below shows the before/after
for these:
+------------+--------------------------+-------------------------+
| Hypervisor | Idle method before patch | Idle method after patch |
| exposes | | |
+============+==========================+=========================+
| nothing | default_idle fallback | intel_idle VM table |
| (common) | (straight "hlt") | |
+------------+--------------------------+-------------------------+
| mwait | intel_idle mwait table | intel_idle mwait table |
+------------+--------------------------+-------------------------+
| ACPI | ACPI C1 state ("hlt") | intel_idle VM table |
+------------+--------------------------+-------------------------+
This is only applicable to CPUs known by intel_idle. For the bare metal
case, unknown CPU models will use the ACPI tables (when available) to
get estimates for latency and break even point for longer idle states.
In guests, the common case is that ACPI tables are not available, but
even when they are available, they can't and don't provide the latency
information for the longer (mwait based) states. For this scenario
(unknown CPU model), the default_idle mode (no ACPI) or ACPI C1 (ACPI
available) will be used.
By providing capability to do this with the intel_idle driver, we can
do better than the fallback or ACPI table methods. While this current
change only gets us to the existing behavior, later patches in this
series will add new capabilities such as optimized TLB flushing.
In order to do this, a simplified version of the initialization
function for VM guests is created, and this will be called if the CPU
is recognized, but mwait is not supported, and we're in a VM guest.
One thing to note is that the max latency (and break even) of this C1
state is higher than the typical bare metal C1 state, because hlt causes
a vmexit, and the cost of vmexit + hypervisor overhead + vmenter is
typically on the order of up to 5 microseconds, even if the hypervisor
does not actually go into a hardware power saving state.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
[ rjw: Dropped redundant checks from should_verify_mwait() ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
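The before/after table in the intel_idle change above can be restated as a small lookup. This is a sketch that only encodes the table for a VM guest on a CPU model known to intel_idle; the function name and the string labels are illustrative and do not correspond to driver symbols.

```python
def idle_method(hypervisor_exposes, patched):
    """Return the idle method for a VM guest on a CPU known to intel_idle.

    hypervisor_exposes: "nothing", "mwait" or "acpi" (rows of the table).
    patched: whether the intel_idle VM-guest change is applied.
    """
    if hypervisor_exposes == "mwait":
        # Unchanged by the patch: mwait is available, use the mwait table.
        return "intel_idle mwait table"
    if hypervisor_exposes not in ("nothing", "acpi"):
        raise ValueError(f"unknown case: {hypervisor_exposes!r}")
    if patched:
        return "intel_idle VM table"
    return {
        "nothing": "default_idle fallback (hlt)",
        "acpi": "ACPI C1 state (hlt)",
    }[hypervisor_exposes]
```

For the common case, idle_method("nothing", patched=False) gives the default_idle fallback, and the same call with patched=True gives the intel_idle VM table.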
|
|
After grabbing q->sysfs_lock, q->elevator may become NULL because of
elevator switch.
Fix the NULL dereference on q->elevator by checking it with lock.
Reported-by: Guangwu Zhang <guazhang@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230616132354.415109-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Now that the direct I/O helpers have switched to use
iov_iter_extract_pages, these helpers are unused.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20230614140341.521331-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Now that all block direct I/O helpers use page pinning, this flag is
unused.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20230614140341.521331-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Check for -EFAULT instead of wrapping the check in a ret < 0 block.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20230614140341.521331-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
copy_splice_read calls into ->read_iter to read the data, which already
calls file_accessed.
Fixes: 33b3b041543e ("splice: Add a func to do a splice from an O_DIRECT file without ITER_PIPE")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20230614140341.521331-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
for-6.5/block
Pull NVMe updates from Keith:
"nvme updates for Linux 6.5
- Various cleanups all around (Irvin, Chaitanya, Christophe)
- Better struct packing (Christophe JAILLET)
- Reduce controller error logs for optional commands (Keith)
- Support for >=64KiB block sizes (Daniel Gomez)
- Fabrics fixes and code organization (Max, Chaitanya, Daniel Wagner)"
* tag 'nvme-6.5-2023-06-16' of git://git.infradead.org/nvme: (27 commits)
nvme: forward port sysfs delete fix
nvme: skip optional id ctrl csi if it failed
nvme-core: use nvme_ns_head_multipath instead of ns->head->disk
nvmet-fcloop: Do not wait on completion when unregister fails
nvme-fabrics: open code __nvmf_host_find()
nvme-fabrics: error out to unlock the mutex
nvme: Increase block size variable size to 32-bit
nvme-fcloop: no need to return from void function
nvmet-auth: remove unnecessary break after goto
nvmet-auth: remove some dead code
nvme-core: remove redundant check from nvme_init_ns_head
nvme: move sysfs code to a dedicated sysfs.c file
nvme-fabrics: prevent overriding of existing host
nvme-fabrics: check hostid using uuid_equal
nvme-fabrics: unify common code in admin and io queue connect
nvmet: reorder fields in 'struct nvmefc_fcp_req'
nvmet: reorder fields in 'struct nvme_dhchap_queue_context'
nvmet: reorder fields in 'struct nvmf_ctrl_options'
nvme: reorder fields in 'struct nvme_ctrl'
nvmet: reorder fields in 'struct nvmet_sq'
...
|
|
Commits ffb1b4a41016 ("x86/unwind/orc: Add 'signal' field to ORC
metadata") and fb799447ae29 ("x86,objtool: Split UNWIND_HINT_EMPTY in
two") changed the ORC format. Although ORC is internal to the kernel,
it's the only way for external tools to get reliable kernel stack traces
on x86-64. In particular, the drgn debugger [1] uses ORC for stack
unwinding, and these format changes broke it [2]. As the drgn
maintainer, I don't care how often or how much the kernel changes the
ORC format as long as I have a way to detect the change.
It suffices to store a version identifier in the vmlinux and kernel
module ELF files (to use when parsing ORC sections from ELF), and in
kernel memory (to use when parsing ORC from a core dump+symbol table).
Rather than hard-coding a version number that needs to be manually
bumped, Peterz suggested hashing the definitions from orc_types.h. If
there is a format change that isn't caught by this, the hashing script
can be updated.
This patch adds an .orc_header allocated ELF section containing the
20-byte hash to vmlinux and kernel modules, along with the corresponding
__start_orc_header and __stop_orc_header symbols in vmlinux.
1: https://github.com/osandov/drgn
2: https://github.com/osandov/drgn/issues/303
Fixes: ffb1b4a41016 ("x86/unwind/orc: Add 'signal' field to ORC metadata")
Fixes: fb799447ae29 ("x86,objtool: Split UNWIND_HINT_EMPTY in two")
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lkml.kernel.org/r/aef9c8dc43915b886a8c48509a12ec1b006ca1ca.1686690801.git.osandov@osandov.com
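The idea behind the ORC header above, hashing the format definitions so that a consumer can detect a format change, can be sketched as follows. Several things here are assumptions: SHA-1 is used because it produces a 20-byte digest matching the size mentioned in the changelog, and the two definition strings are made-up stand-ins, not the contents of orc_types.h.

```python
import hashlib


def orc_hash(definitions: str) -> bytes:
    """Hash a textual dump of the ORC definitions (SHA-1 is an assumption)."""
    return hashlib.sha1(definitions.encode("utf-8")).digest()


# Hypothetical "before" and "after" definitions, for illustration only.
OLD_DEFS = "struct orc_entry { s16 sp_offset; s16 bp_offset; unsigned sp_reg:4; };"
NEW_DEFS = OLD_DEFS.replace("};", "unsigned signal:1; };")


def tool_supports(kernel_hash: bytes, known_hashes: set) -> bool:
    """A consumer such as a debugger would compare the hash it finds in
    .orc_header against the hashes of the formats it knows how to parse."""
    return kernel_hash in known_hashes
```

A tool that only knows orc_hash(OLD_DEFS) would reject a kernel built with NEW_DEFS instead of misparsing its ORC data.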
|
|
We had a late fix that modified nvme_sysfs_delete() after the staging
branch for the next merge window relocated the function to a new file.
Port commit 2eb94dd56a4a4 ("nvme: do not let the user delete a ctrl
before a complete") to the latest to avoid a potentially confusing merge
conflict.
Cc: Maurizio Lombardi <mlombard@redhat.com>
Cc: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
kthread_park and wait_woken have a similar race that
kthread_stop and wait_woken used to have before it was fixed in
commit cb6538e740d7 ("sched/wait: Fix a kthread race with
wait_woken()"). Extend that fix to also cover kthread_park.
[jstultz: Made changes suggested by Peter to optimize
memory loads]
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Link: https://lore.kernel.org/r/20230602212350.535358-1-jstultz@google.com
|
|
All callers of set_sched_topology() are within __init section. Mark
it __init too.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Link: https://lore.kernel.org/r/20230603073645.1173332-1-linmiaohe@huawei.com
|
|
cppcheck reports
kernel/sched/fair.c:7436:17: style: Local variable 'cpu_util' shadows outer function [shadowFunction]
unsigned long cpu_util;
^
Clean this up by renaming the variable to eff_util
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lore.kernel.org/r/20230611122535.183654-1-trix@redhat.com
|
|
When counting a FRONTEND event, the MSR_PEBS_FRONTEND is not correctly
set on GNR and MTL p-core.
The umask value for the FRONTEND events is changed on GNR and MTL. The
new umask is missing in the extra_regs[] table.
Add a dedicated intel_gnr_extra_regs[] for GNR and MTL p-core.
Fixes: bc4000fdb009 ("perf/x86/intel: Add Granite Rapids")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20230615173242.3726364-1-kan.liang@linux.intel.com
|
|
Reiji reports that the arm64 implementation of arch_perf_update_userpage()
is now ignored and replaced by the dummy stub in core code.
This seems to happen since the PMUv3 driver was moved to drivers/perf.
As it turns out, dropping the __weak attribute from the *prototype*
of the function solves the problem. You're right, this doesn't seem
to make much sense. And yet... It appears that both symbols get
flagged as weak, and that the first one to appear in the link order
wins:
$ nm drivers/perf/arm_pmuv3.o|grep arch_perf_update_userpage
0000000000001db0 W arch_perf_update_userpage
Dropping the attribute from the prototype restores the expected
behaviour, and arm64 is able to enjoy arch_perf_update_userpage()
again.
Fixes: 7755cec63ade ("arm64: perf: Move PMUv3 driver to drivers/perf")
Fixes: f1ec3a517b43 ("kernel/events: Add a missing prototype for arch_perf_update_userpage()")
Reported-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Reiji Watanabe <reijiw@google.com>
Link: https://lkml.kernel.org/r/20230616114831.3186980-1-maz@kernel.org
|
|
The ${atomic}_dec_if_positive() ops are unlike all the other conditional
atomic ops. Rather than returning a boolean success value, these return
the value that the atomic variable would be updated to, even when no
update is performed.
We missed this when adding kerneldoc comments, and the documentation for
${atomic}_dec_if_positive() erroneously states:
| Return: @true if @v was updated, @false otherwise.
Ideally we'd clean this up by aligning ${atomic}_dec_if_positive() with
the usual atomic op conventions: with ${atomic}_fetch_dec_if_positive()
for those who care about the value of the variable, and
${atomic}_dec_if_positive() returning a boolean success value.
In the meantime, align the documentation with the current reality.
Fixes: ad8110706f381170 ("locking/atomic: scripts: generate kerneldoc comments")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20230615132734.1119765-1-mark.rutland@arm.com
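The return convention described in the ${atomic}_dec_if_positive() change above can be shown with a single-threaded model. It is a sketch only: the class is hypothetical and makes no attempt at atomicity; it just mirrors the documented semantics (return the decremented value, store it only when it is not negative).

```python
class AtomicModel:
    """Single-threaded stand-in for an atomic variable (no real atomicity)."""

    def __init__(self, value):
        self.value = value

    def dec_if_positive(self):
        # Compute the value the variable *would* be updated to ...
        dec = self.value - 1
        # ... but only store it when the result is not negative.
        if dec >= 0:
            self.value = dec
        # The would-be value is returned either way, not a boolean.
        return dec
```

With a starting value of 3 the call returns 2 and stores 2; with a starting value of 0 it returns -1 and leaves the variable at 0, so callers test for a negative return rather than for false.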
|
|
Move allocation code down to avoid memory leak.
Fixes: 29f54745f245 ("iommu/amd: Add missing domain type checks")
Signed-off-by: Su Hui <suhui@nfschina.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20230608021933.856045-1-suhui@nfschina.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The Arm documentation has moved to Documentation/arch/arm; update
one devicetree reference to match.
Cc: Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org>
Cc: devicetree@vger.kernel.org
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
The Arm documentation has moved to Documentation/arch/arm; update the
last remaining references to match.
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: Samuel Holland <samuel@sholland.org>
Cc: Thierry Reding <thierry.reding@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> # for pwm
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
Make corrections to punctuation and grammar.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Andrew Jones <drjones@redhat.com>
Cc: Christoffer Dall <cdall@linaro.org>
Cc: kvm@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230612030810.23376-5-rdunlap@infradead.org
|
|
Correct the path of a header file.
Change "guest to ... guest" to "guest to ... host" in one place.
Hyphenate "32-bit" systems.
Add a comma at one parenthetical phrase.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: kvm@vger.kernel.org
Cc: Alexander Graf <agraf@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230612030810.23376-4-rdunlap@infradead.org
|
|
Correct grammar and punctuation.
Use "read-only" for consistency.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230612030810.23376-3-rdunlap@infradead.org
|
|
Module parameters are in sysfs, not debugfs, so change that.
Remove superfluous "that" following "Note:".
Hyphenate "system-wide" values.
Hyphenate "trade-off".
Don't treat "denial of service" as a verb.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: kvm@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230612030810.23376-2-rdunlap@infradead.org
|
|
Module parameters are located in sysfs, not debugfs, so correct the
statement.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20230610054302.6223-1-rdunlap@infradead.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
This is based on an earlier blog post at people.kernel.org.
It describes the concepts about page tables that were hardest
for me to grasp when dealing with them for the first time,
such as the prevalent three-letter acronyms pfn, pgd, p4d,
pud, pmd and pte.
I don't know if this is what people want, but it's what I would
have wanted. The wording, introduction, choice of initial subjects
and choice of style are mine.
At one point I discussed bringing this into the kernel
documentation with Mike Rapoport, so here is a small proposal.
The current form is augmented in response to feedback from
Mike Rapoport, Matthew Wilcox, Jonathan Cameron, Kuan-Ying Lee,
Randy Dunlap and Bagas Sanjaya.
Cc: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Mike Rapoport <rppt@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://people.kernel.org/linusw/arm32-page-tables
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230614072548.996940-1-linus.walleij@linaro.org
|
|
The hierarchy of PCH PIC, PCH PCI MSI and EIOINTC is as follows:
        PCH PIC ------->|
                        |---->EIOINTC
        PCH PCI MSI --->|
so the irq_data list of irq_desc for IRQs on PCH PIC and PCH PCI MSI
is like this:
  irq_desc->irq_data(domain: PCH PIC)->parent_data(domain: EIOINTC)
  irq_desc->irq_data(domain: PCH PCI MSI)->parent_data(domain: EIOINTC)
In eiointc_resume(), the irq_data passed into eiointc_set_irq_affinity()
should be matched to EIOINTC domain instead of PCH PIC or PCH PCI MSI
domain, so fix it.
Fixes: a90335c2dfb4 ("irqchip/loongson-eiointc: Add suspend/resume support")
Reported-by: yangqiming <yangqiming@loongson.cn>
Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230614115936.5950-6-lvjianmin@loongson.cn
|
|
LIOINTC doesn't require specific logic to work with wakeup IRQs,
and no irq_set_wake callback is needed. To allow registered IRQs
from LIOINTC to be used as a wakeup-source, and ensure irq_set_irq_wake()
works well, the flag IRQCHIP_SKIP_SET_WAKE should be added.
Reviewed-by: Huacai Chen <chenhuacai@kernel.org>
Signed-off-by: Yinbo Zhu <zhuyinbo@loongson.cn>
Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230614115936.5950-5-lvjianmin@loongson.cn
|
|
For the INT_POLARITY register of Loongson-2K series IRQ
controller, '0' indicates high level or rising edge triggered,
'1' indicates low level or falling edge triggered, and we
can find out the information from the Loongson 2K1000LA User
Manual v1.0, Table 9-2, Section 9.3 (中断寄存器描述 / Description
of the Interrupt Registers).
For the Loongson-3 CPU series, setting the INT_POLARITY register is not
supported and writing it has no effect.
So the trigger polarity setting should be fixed for the Loongson-2K CPU
series.
Fixes: 17343d0b4039 ("irqchip/loongson-liointc: Support to set IRQ type for ACPI path")
Cc: stable@vger.kernel.org
Reviewed-by: Huacai Chen <chenhuacai@kernel.org>
Co-developed-by: Chong Qiao <qiaochong@loongson.cn>
Signed-off-by: Chong Qiao <qiaochong@loongson.cn>
Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230614115936.5950-4-lvjianmin@loongson.cn
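The INT_POLARITY encoding described in the Loongson-2K change above can be written down as a small mapping. This is a sketch: the trigger names and helper functions are illustrative and are not the driver's identifiers; only the 0/1 meaning comes from the changelog.

```python
# From the changelog: 0 = high level or rising edge, 1 = low level or falling edge.
POLARITY_BIT = {
    "level-high": 0,
    "edge-rising": 0,
    "level-low": 1,
    "edge-falling": 1,
}


def int_polarity_bit(trigger):
    """Return the INT_POLARITY bit value for a (hypothetical) trigger name."""
    return POLARITY_BIT[trigger]


def apply_polarity(reg, hwirq, trigger):
    """Return the register value with the bit for hwirq set or cleared."""
    bit = 1 << hwirq
    if int_polarity_bit(trigger):
        return reg | bit
    return reg & ~bit
```

For example, configuring hwirq 3 as "level-low" sets bit 3, and reconfiguring it as "edge-rising" clears that bit again.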
|
|
In the DeviceTree path, when ht_vec_base is not zero, the hwirq of PCH PIC
will be assigned incorrectly, because pch_pic_domain_translate() adds
ht_vec_base to hwirq, but ht_vec_base is not subtracted from hwirq again
when calling irq_domain_set_info().
The ht_vec_base is designed for the parent irq chip/domain of the PCH PIC.
It does not seem proper to deal with this in callbacks of the PCH PIC
domain, so let's put this back like the initial commit ef8c01eb64ca
("irqchip: Add Loongson PCH PIC controller").
Fixes: bcdd75c596c8 ("irqchip/loongson-pch-pic: Add ACPI init support")
Cc: stable@vger.kernel.org
Reviewed-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Liu Peibao <liupeibao@loongson.cn>
Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230614115936.5950-3-lvjianmin@loongson.cn
|
|
In an ACPI-based dual-bridge system, IRQ of each bridge's
PCH PIC sent to CPU is always a zero-based number, which
means that the IRQ on PCH PIC of each bridge is mapped into
vector range from 0 to 63 of upstream irqchip(e.g. EIOINTC).
    EIOINTC N: [0 ... 63 | 64 ... 255]
                --------   ----------
                    ^          ^
                    |          |
                PCH PIC N      |
                           PCH MSI N
For example, the IRQ vector number of the SATA controller on
PCH PIC of each bridge is 16, which is sent to the upstream
irqchip of EIOINTC when an interrupt occurs, which will set
bit 16 of EIOINTC. Since hwirq of 16 on EIOINTC has been
mapped to an irq_desc for the SATA controller during hierarchy
irq allocation, the related mapped IRQ will be found through
irq_resolve_mapping() in the IRQ domain of EIOINTC.
So, the IRQ number set in HT vector register should be fixed
to be a zero-based number.
Cc: stable@vger.kernel.org
Reviewed-by: Huacai Chen <chenhuacai@loongson.cn>
Co-developed-by: liuyun <liuyun@loongson.cn>
Signed-off-by: liuyun <liuyun@loongson.cn>
Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230614115936.5950-2-lvjianmin@loongson.cn
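The vector split shown in the diagram above (0 to 63 for the PCH PIC, 64 to 255 for PCH MSI on each EIOINTC) can be expressed as a simple range check. This is a sketch only; the function name is hypothetical and the ranges are taken directly from the changelog.

```python
PCH_PIC_VECTORS = range(0, 64)     # vectors 0..63
PCH_MSI_VECTORS = range(64, 256)   # vectors 64..255


def eiointc_source(vector):
    """Say which downstream controller a zero-based EIOINTC vector belongs to."""
    if vector in PCH_PIC_VECTORS:
        return "PCH PIC"
    if vector in PCH_MSI_VECTORS:
        return "PCH MSI"
    raise ValueError(f"vector {vector} is outside 0..255")
```

The SATA example from the changelog, vector 16, falls into the PCH PIC range on whichever bridge raises it, which is why the number programmed into the HT vector register has to be zero-based.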
|
|
A new function is added to the HiSilicon uncore UC PMU.
The UC PMU supports filtering statistical information based on
the specified tx request uring channel. Users configure it
through the "uring_channel" parameter.
Document this to provide guidance on how to use it.
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Reviewed-by: Jonathan Cameron <Jonthan.Cameron@huawei.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20230615125926.29832-4-hejunhao3@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
On HiSilicon Hip09 platform, there are 4 UC (unified cache) modules
on each chip CCL (CPU Cluster). UC is a cache that provides
coherence between NUMA and UMA domains. It is located between L2
and Memory System. Many PMU events are supported. Let's support
the UC PMU driver using the HiSilicon uncore PMU framework.
* rd_req_en : rd_req_en is the abbreviation of read request tracetag
enable and allows user to count only read operations. Details are listed
in the hisi-pmu document at Documentation/admin-guide/perf/hisi-pmu.rst
* srcid_en & srcid: Allows users to filter statistical information based
on specific CPU/ICL by srcid.
srcid_en depends on rd_req_en being enabled.
* uring_channel: Allows users to filter statistical information based on
the specified tx request uring channel.
uring_channel is supported only for events 0x47 through 0x59.
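The constraints in the list above can be sketched as a small user-space
check. This is only an illustration, not the driver's code: the struct
layout, the uring_channel_set field and the uc_filter_valid() helper are
hypothetical, and whether the real driver rejects or silently ignores an
invalid combination is not stated here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Event range for which uring_channel filtering is supported (from the text). */
#define UC_URING_EVENT_MIN 0x47
#define UC_URING_EVENT_MAX 0x59

/* Hypothetical container for the filter options named in the list above. */
struct uc_filter {
	uint32_t event;          /* event code being counted */
	bool rd_req_en;          /* count only read operations */
	bool srcid_en;           /* filter by srcid (CPU/ICL) */
	uint32_t srcid;
	bool uring_channel_set;  /* assumption: user asked for channel filtering */
	uint32_t uring_channel;
};

/* Return true if the combination respects the two rules stated in the text. */
static bool uc_filter_valid(const struct uc_filter *f)
{
	/* srcid_en depends on rd_req_en being enabled. */
	if (f->srcid_en && !f->rd_req_en)
		return false;

	/* uring_channel is supported only for events 0x47 through 0x59. */
	if (f->uring_channel_set &&
	    (f->event < UC_URING_EVENT_MIN || f->event > UC_URING_EVENT_MAX))
		return false;

	return true;
}
```

For example, a filter with event 0x47 and uring_channel set passes the
check, while srcid_en without rd_req_en does not.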
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20230615125926.29832-3-hejunhao3@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Compared to the original PA device, H60PA offers higher bandwidth.
The H60PA is a new device, and we use the HID to differentiate the two.
The events supported by PAv3 and PAv2 are different. The PAv3 PMU
removed some events which are supported by the PAv2 PMU. The older PA
PMU driver will probe v3 as v2. Therefore PA events displayed by
"perf list" cannot work properly. We add the HISI0275 HID for the PAv3
PMU to distinguish them.
Except for the overflow interrupt register, each H60PA PMU functions
the same as the original PA PMU module.
It has 8 programmable counters and each counter is free-running.
An interrupt is supported to handle 64-bit counter overflow.
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20230615125926.29832-2-hejunhao3@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
* irq/lpi-resend:
: .
: Patch series from James Gowans, working around an issue with
: GICv3 LPIs that can fire concurrently on multiple CPUs.
: .
irqchip/gic-v3-its: Enable RESEND_WHEN_IN_PROGRESS for LPIs
genirq: Allow fasteoi handler to resend interrupts on concurrent handling
genirq: Expand doc for PENDING and REPLAY flags
genirq: Use BIT() for the IRQD_* state flags
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
GICv3 LPIs are impacted by an architectural design issue: they do not
have a global active state and as such a given LPI can be delivered to
a new CPU after an affinity change while the previous instance of the
same LPI handler has not yet completed on the original CPU.
If LPIs had an active state, this second LPI would not be delivered
until the first CPU deactivated the initial LPI, just like SPIs.
To solve this issue, use the newly introduced IRQD_RESEND_WHEN_IN_PROGRESS
flag, ensuring that we do not lose an LPI being delivered during that window
by getting the GIC to resend it.
This workaround gets enabled for all LPIs, including the VPE doorbells.
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: James Gowans <jgowans@amazon.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <maz@kernel.org>
Cc: KarimAllah Raslan <karahmed@amazon.com>
Cc: Yipeng Zou <zouyipeng@huawei.com>
Cc: Zhang Jianhua <chris.zjh@huawei.com>
[maz: massaged commit message]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230608120021.3273400-4-jgowans@amazon.com
|
|
There is a class of interrupt controllers out there that, once they
have signalled a given interrupt number, will still signal incoming
instances of the *same* interrupt despite the original interrupt
not having been EOIed yet.
As long as the new interrupt reaches the *same* CPU, nothing bad
happens, as that CPU still has its interrupts globally disabled,
and we will only take the new interrupt once the interrupt has
been EOIed.
However, things become more "interesting" if an affinity change comes
in while the interrupt is being handled. More specifically, while
the per-irq lock is being dropped. This results in the affinity change
taking place immediately. At this point, there is nothing that prevents
the interrupt from firing on the new target CPU. We end up with the
interrupt running concurrently on two CPUs, which isn't a good thing.
And that's where things become worse: the new CPU notices that the
interrupt handling is in progress (irq_may_run() returns false), and
*drops the interrupt on the floor*.
The whole race looks like this:
CPU 0                        | CPU 1
-----------------------------|-----------------------------
interrupt start              |
handle_fasteoi_irq           | set_affinity(CPU 1)
handler                      |
...                          | interrupt start
...                          | handle_fasteoi_irq -> early out
handle_fasteoi_irq return    | interrupt end
interrupt end                |
If the interrupt was an edge, too bad. The interrupt is lost, and
the system will eventually die one way or another. Not great.
A way to avoid this situation is to detect this problem at the point
we handle the interrupt on the new target. Instead of dropping the
interrupt, use the resend mechanism to force it to be replayed.
Also, in order to limit the impact of this workaround to the pathetic
architectures that require it, gate it behind a new irq flag aptly
named IRQD_RESEND_WHEN_IN_PROGRESS.
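The race and the workaround described above can be modelled in user
space as follows. This is a simplified sketch, not the kernel
implementation: the model_* names and the dropped/resent counters are
hypothetical, and the real code works on struct irq_desc under the
per-irq lock.

```c
#include <stdbool.h>

/* Hypothetical state bits; they mirror the concepts in the text only. */
enum {
	MODEL_INPROGRESS              = 1u << 0, /* handler currently running */
	MODEL_PENDING                 = 1u << 1, /* replay requested */
	MODEL_RESEND_WHEN_IN_PROGRESS = 1u << 2, /* workaround enabled */
};

struct model_irq {
	unsigned int state;
	unsigned int handled; /* completed handler runs */
	unsigned int dropped; /* interrupts lost on the early-out path */
	unsigned int resent;  /* interrupts replayed via the resend path */
};

/* Interrupt entry on some CPU: true if the handler may run now. */
static bool model_irq_enter(struct model_irq *irq)
{
	if (irq->state & MODEL_INPROGRESS) {
		/* Early out: another CPU is still handling this interrupt. */
		if (irq->state & MODEL_RESEND_WHEN_IN_PROGRESS)
			irq->state |= MODEL_PENDING; /* remember it for replay */
		else
			irq->dropped++;              /* dropped on the floor */
		return false;
	}
	irq->state |= MODEL_INPROGRESS;
	return true;
}

/* Interrupt exit on the CPU that ran the handler. */
static void model_irq_exit(struct model_irq *irq)
{
	irq->handled++;
	irq->state &= ~MODEL_INPROGRESS;
	if (irq->state & MODEL_PENDING) {
		irq->state &= ~MODEL_PENDING;
		irq->resent++; /* stands in for triggering the resend mechanism */
	}
}

/* Replay the sequence from the diagram: CPU 1 fires while CPU 0 handles. */
static void model_race(struct model_irq *irq)
{
	model_irq_enter(irq); /* CPU 0: interrupt start */
	model_irq_enter(irq); /* CPU 1: interrupt start -> early out */
	model_irq_exit(irq);  /* CPU 0: interrupt end */
}
```

Running model_race() on a zero-initialised struct records one dropped
interrupt; with MODEL_RESEND_WHEN_IN_PROGRESS set in state, the same
sequence records one resend and no drop.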
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: James Gowans <jgowans@amazon.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <maz@kernel.org>
Cc: KarimAllah Raslan <karahmed@amazon.com>
Cc: Yipeng Zou <zouyipeng@huawei.com>
Cc: Zhang Jianhua <chris.zjh@huawei.com>
[maz: reworded commit message]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230608120021.3273400-3-jgowans@amazon.com
|