Age | Commit message (Collapse) | Author |
|
* kvm-arm64/nv-trap-forwarding: (30 commits)
: .
: This implements the so called "trap forwarding" infrastructure, which
: gets used when we take a trap from an L2 guest and that the L1 guest
: wants to see the trap for itself.
: .
KVM: arm64: nv: Add trap description for SPSR_EL2 and ELR_EL2
KVM: arm64: nv: Select XARRAY_MULTI to fix build error
KVM: arm64: nv: Add support for HCRX_EL2
KVM: arm64: Move HCRX_EL2 switch to load/put on VHE systems
KVM: arm64: nv: Expose FGT to nested guests
KVM: arm64: nv: Add switching support for HFGxTR/HDFGxTR
KVM: arm64: nv: Expand ERET trap forwarding to handle FGT
KVM: arm64: nv: Add SVC trap forwarding
KVM: arm64: nv: Add trap forwarding for HDFGxTR_EL2
KVM: arm64: nv: Add trap forwarding for HFGITR_EL2
KVM: arm64: nv: Add trap forwarding for HFGxTR_EL2
KVM: arm64: nv: Add fine grained trap forwarding infrastructure
KVM: arm64: nv: Add trap forwarding for CNTHCTL_EL2
KVM: arm64: nv: Add trap forwarding for MDCR_EL2
KVM: arm64: nv: Expose FEAT_EVT to nested guests
KVM: arm64: nv: Add trap forwarding for HCR_EL2
KVM: arm64: nv: Add trap forwarding infrastructure
KVM: arm64: Restructure FGT register switching
KVM: arm64: nv: Add FGT registers
KVM: arm64: Add missing HCR_EL2 trap bits
...
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Having carved a hole for SP_EL1, we are now missing the entries
for SPSR_EL2 and ELR_EL2. Add them back.
Reported-by: Miguel Luis <miguel.luis@oracle.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
populate_nv_trap_config() uses xa_store_range(), which is only built
when XARRAY_MULTI is set, so select that symbol to prevent the build error.
aarch64-linux-ld: arch/arm64/kvm/emulate-nested.o: in function `populate_nv_trap_config':
emulate-nested.c:(.init.text+0x17c): undefined reference to `xa_store_range'
Fixes: e58ec47bf68d ("KVM: arm64: nv: Add trap forwarding infrastructure")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: kvmarm@lists.linux.dev
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230816210949.17117-1-rdunlap@infradead.org
|
|
HCRX_EL2 has an interesting effect on HFGITR_EL2, as it conditions
the traps of TLBI*nXS.
Expand the FGT support to add a new Fine Grained Filter that will
get checked when the instruction gets trapped, allowing the shadow
register to override the trap as needed.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-29-maz@kernel.org
|
|
Although the nVHE behaviour requires HCRX_EL2 to be switched
on each switch between host and guest, there is nothing in
this register that would affect a VHE host.
It is thus possible to save/restore this register on load/put
on VHE systems, avoiding unnecessary sysreg access on the hot
path. Additionally, it avoids unnecessary traps when running
with NV.
To achieve this, simply move the read/writes to the *_common()
helpers, which are called on load/put on VHE, and more eagerly
on nVHE.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-28-maz@kernel.org
|
|
Now that we have FGT support, expose the feature to NV guests.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-27-maz@kernel.org
|
|
Now that we can evaluate the FGT registers, allow them to be merged
with the hypervisor's own configuration (in the case of HFG{RW}TR_EL2)
or simply set for HFGITR_EL2, HDGFRTR_EL2 and HDFGWTR_EL2.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-26-maz@kernel.org
|
|
We already handle ERET being trapped from a L1 guest in hyp context.
However, with FGT, we can also have ERET being trapped from L2, and
this needs to be reinjected into L1.
Add the required exception routing.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-25-maz@kernel.org
|
|
HFGITR_EL2 allows the trap of SVC instructions to EL2. Allow these
traps to be forwarded. Take this opportunity to deny any 32bit activity
when NV is enabled.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-24-maz@kernel.org
|
|
... and finally, the Debug version of FGT, with its *enormous*
list of trapped registers.
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-23-maz@kernel.org
|
|
Similarly, implement the trap forwarding for instructions affected
by HFGITR_EL2.
Note that the TLBI*nXS instructions should be affected by HCRX_EL2,
which will be dealt with down the line. Also, ERET* and SVC traps
are handled separately.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-22-maz@kernel.org
|
|
Implement the trap forwarding for traps described by HFGxTR_EL2,
reusing the Fine Grained Traps infrastructure previously implemented.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-21-maz@kernel.org
|
|
Fine Grained Traps are fun. Not.
Implement the fine grained trap forwarding, reusing the Coarse Grained
Traps infrastructure previously implemented.
Each sysreg/instruction inserted in the xarray gets a FGT group
(vaguely equivalent to a register number), a bit number in that register,
and a polarity.
It is then pretty easy to check the FGT state at handling time, just
like we do for the coarse version (it is just faster).
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-20-maz@kernel.org
|
|
Describe the CNTHCTL_EL2 register, and associate it with all the sysregs
it allows to trap.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-19-maz@kernel.org
|
|
Describe the MDCR_EL2 register, and associate it with all the sysregs
it allows to trap.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-18-maz@kernel.org
|
|
Now that we properly implement FEAT_EVT (as we correctly forward
traps), expose it to guests.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-17-maz@kernel.org
|
|
Describe the HCR_EL2 register, and associate it with all the sysregs
it allows to trap.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-16-maz@kernel.org
|
|
A significant part of what a NV hypervisor needs to do is to decide
whether a trap from a L2+ guest has to be forwarded to a L1 guest
or handled locally. This is done by checking for the trap bits that
the guest hypervisor has set and acting accordingly, as described by
the architecture.
A previous approach was to sprinkle a bunch of checks in all the
system register accessors, but this is pretty error prone and doesn't
help getting an overview of what is happening.
Instead, implement a set of global tables that describe a trap bit,
combinations of trap bits, behaviours on trap, and what bits must
be evaluated on a system register trap.
Although this is painful to describe, this allows to specify each
and every control bit in a static manner. To make it efficient,
the table is inserted in an xarray that is global to the system,
and checked each time we trap a system register while running
a L2 guest.
Add the basic infrastructure for now, while additional patches will
implement configuration registers.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-15-maz@kernel.org
|
|
As we're about to majorly extend the handling of FGT registers,
restructure the code to actually save/restore the registers
as required. This is made easy thanks to the previous addition
of the EL2 registers, allowing us to use the host context for
this purpose.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-14-maz@kernel.org
|
|
Add the 5 registers covering FEAT_FGT. The AMU-related registers
are currently left out as we don't have a plan for them. Yet.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-13-maz@kernel.org
|
|
We're still missing a handfull of HCR_EL2 trap bits. Add them.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-12-maz@kernel.org
|
|
As we blindly reset some HFGxTR_EL2 bits to 0, we also randomly trap
unsuspecting sysregs that have their trap bits with a negative
polarity.
ACCDATA_EL1 is one such register that can be accessed by the guest,
causing a splat on the host as we don't have a proper handler for
it.
Adding such handler addresses the issue, though there are a number
of other registers missing as the current architecture documentation
doesn't describe them yet.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-11-maz@kernel.org
|
|
In order to allow us to have shared code for managing fine grained traps
for KVM guests add it as a detected feature rather than relying on it
being a dependency of other features.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
[maz: converted to ARM64_CPUID_FIELDS()]
Link: https://lore.kernel.org/r/20230301-kvm-arm64-fgt-v4-1-1bf8d235ac1f@kernel.org
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-10-maz@kernel.org
|
|
As we're about to implement full support for FEAT_FGT, add the
full HDFGRTR_EL2 and HDFGWTR_EL2 layouts.
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-9-maz@kernel.org
|
|
HFGITR_EL2 traps a bunch of instructions for which we don't have
encodings yet. Add them.
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-8-maz@kernel.org
|
|
The HDFGxTR_EL2 registers trap a (huge) set of debug and trace
related registers. Add their encodings (and only that, because
we really don't care about what these registers actually do at
this stage).
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-7-maz@kernel.org
|
|
Add the encodings for the AT operation that are usable from NS.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-6-maz@kernel.org
|
|
Add all the TLBI encodings that are usable from Non-Secure.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-5-maz@kernel.org
|
|
Add the missing DC *VA encodings.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-4-maz@kernel.org
|
|
We only describe a few of the ERX*_EL1 registers. Add the missing
ones (ERXPFGF_EL1, ERXPFGCTL_EL1, ERXPFGCDN_EL1, ERXMISC2_EL1 and
ERXMISC3_EL1).
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-3-maz@kernel.org
|
|
Add the missing VA-based CMOs encodings.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-2-maz@kernel.org
|
|
* kvm-arm64/6.6/generic-vcpu:
: .
: Cleanup the obsolete vcpu target abstraction, courtesy of Oliver.
: From the cover letter:
:
: "kvm_vcpu_init::target is quite useless at this point. We don't do any
: uarch-specific emulation in the first place, and require userspace
: select the 'generic' vCPU target on all but a few implementations.
:
: Small series to (1) clean up usage of the target value in the kernel and
: (2) switch to the 'generic' target on implementations that previously
: had their own target values. The implementation-specific values are
: still tolerated, though, to avoid UAPI breakage."
: .
KVM: arm64: Always return generic v8 as the preferred target
KVM: arm64: Replace vCPU target with a configuration flag
KVM: arm64: Remove pointless check for changed init target
KVM: arm64: Delete pointless switch statement in kvm_reset_vcpu()
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Swapping the ring buffer for snapshotting (for things like irqsoff)
can crash if the ring buffer is being resized. Disable swapping when
this happens. The missed swap will be reported to the tracer
- Report error if the histogram fails to be created due to an error in
adding a histogram variable, in event_hist_trigger_parse()
- Remove unused declaration of tracing_map_set_field_descr()
* tag 'trace-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/histograms: Return an error if we fail to add histogram to hist_vars list
ring-buffer: Do not swap cpu_buffer during resize process
tracing: Remove unused extern declaration tracing_map_set_field_descr()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Fix stale help text in gconfig
- Support *.S files in compile_commands.json
- Flatten KBUILD_CFLAGS
- Fix external module builds with Rust so that temporary files are
created in the modules directories instead of the kernel tree
* tag 'kbuild-fixes-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: rust: avoid creating temporary files
kbuild: flatten KBUILD_CFLAGS
gen_compile_commands: add assembly files to compilation database
kconfig: gconfig: correct program name in help text
kconfig: gconfig: drop the Show Debug Info help text
|
|
`rustc` outputs by default the temporary files (i.e. the ones saved
by `-Csave-temps`, such as `*.rcgu*` files) in the current working
directory when `-o` and `--out-dir` are not given (even if
`--emit=x=path` is given, i.e. it does not use those for temporaries).
Since out-of-tree modules are compiled from the `linux` tree,
`rustc` then tries to create them there, which may not be accessible.
Thus pass `--out-dir` explicitly, even if it is just for the temporary
files.
Similarly, do so for Rust host programs too.
Reported-by: Raphael Nestler <raphael.nestler@gmail.com>
Closes: https://github.com/Rust-for-Linux/linux/issues/1015
Reported-by: Andrea Righi <andrea.righi@canonical.com>
Tested-by: Raphael Nestler <raphael.nestler@gmail.com> # non-hostprogs
Tested-by: Andrea Righi <andrea.righi@canonical.com> # non-hostprogs
Fixes: 295d8398c67e ("kbuild: specify output names separately for each emission type from rustc")
Cc: stable@vger.kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Tested-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Avoid pKVM finalization if KVM initialization fails
- Add missing BTI instructions in the hypervisor, fixing an early
boot failure on BTI systems
- Handle MMU notifiers correctly for non hugepage-aligned memslots
- Work around a bug in the architecture where hypervisor timer
controls have UNKNOWN behavior under nested virt
- Disable preemption in kvm_arch_hardware_enable(), fixing a kernel
BUG in cpu hotplug resulting from per-CPU accessor sanity checking
- Make WFI emulation on GICv4 systems robust w.r.t. preemption,
consistently requesting a doorbell interrupt on vcpu_put()
- Uphold RES0 sysreg behavior when emulating older PMU versions
- Avoid macro expansion when initializing PMU register names,
ensuring the tracepoints pretty-print the sysreg
s390:
- Two fixes for asynchronous destroy
x86 fixes will come early next week"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: s390: pv: fix index value of replaced ASCE
KVM: s390: pv: simplify shutdown and fix race
KVM: arm64: Fix the name of sys_reg_desc related to PMU
KVM: arm64: Correctly handle RES0 bits PMEVTYPER<n>_EL0.evtCount
KVM: arm64: vgic-v4: Make the doorbell request robust w.r.t preemption
KVM: arm64: Add missing BTI instructions
KVM: arm64: Correctly handle page aging notifiers for unaligned memslot
KVM: arm64: Disable preemption in kvm_arch_hardware_enable()
KVM: arm64: Handle kvm_arm_init failure correctly in finalize_pkvm
KVM: arm64: timers: Use CNTHCTL_EL2 when setting non-CNTKCTL_EL1 bits
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
"Bug and regression fixes for 6.5-rc3 for ext4's mballoc and jbd2's
checkpoint code"
* tag 'ext4_for_linus-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: fix rbtree traversal bug in ext4_mb_use_preallocated
ext4: fix off by one issue in ext4_mb_choose_next_group_best_avail()
ext4: correct inline offset when handling xattrs in inode body
jbd2: remove __journal_try_to_free_buffer()
jbd2: fix a race when checking checkpoint buffer busy
jbd2: Fix wrongly judgement for buffer head removing while doing checkpoint
jbd2: remove journal_clean_one_cp_list()
jbd2: remove t_checkpoint_io_list
jbd2: recheck chechpointing non-dirty buffer
|
|
git://git.samba.org/sfrench/cifs-2.6
Pull smb client fix from Steve French:
"Add minor debugging improvement.
The change improves ability to read a network trace to debug problems
on encrypted connections which are very common (e.g. using wireshark
or tcpdump).
That works today with tools like 'smbinfo keys /mnt/file' but requires
passing in a filename on the mount (see e.g. [1]), but it often makes
more sense to just pass in the mount point path (ie a directory not a
filename).
So this fix was needed to debug some types of problems (an obvious
example is on an encrypted connection failing operations on an empty
share or with no files in the root of the directory) - so you can
simply pass in the 'smbinfo keys <mntpoint>' and get the information
that wireshark needs"
Link: https://wiki.samba.org/index.php/Wireshark_Decryption [1]
* tag '6.5-rc2-smb3-client-fixes-ver2' of git://git.samba.org/sfrench/cifs-2.6:
cifs: update internal module version number for cifs.ko
cifs: allow dumping keys for directories too
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
Two fixes for asynchronous destroy
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.5, part #1
- Avoid pKVM finalization if KVM initialization fails
- Add missing BTI instructions in the hypervisor, fixing an early boot
failure on BTI systems
- Handle MMU notifiers correctly for non hugepage-aligned memslots
- Work around a bug in the architecture where hypervisor timer controls
have UNKNOWN behavior under nested virt.
- Disable preemption in kvm_arch_hardware_enable(), fixing a kernel BUG
in cpu hotplug resulting from per-CPU accessor sanity checking.
- Make WFI emulation on GICv4 systems robust w.r.t. preemption,
consistently requesting a doorbell interrupt on vcpu_put()
- Uphold RES0 sysreg behavior when emulating older PMU versions
- Avoid macro expansion when initializing PMU register names, ensuring
the tracepoints pretty-print the sysreg.
|
|
list
Commit 6018b585e8c6 ("tracing/histograms: Add histograms to hist_vars if
they have referenced variables") added a check to fail histogram creation
if save_hist_vars() failed to add histogram to hist_vars list. But the
commit failed to set ret to failed return code before jumping to
unregister histogram, fix it.
Link: https://lore.kernel.org/linux-trace-kernel/20230714203341.51396-1-mkhalfella@purestorage.com
Cc: stable@vger.kernel.org
Fixes: 6018b585e8c6 ("tracing/histograms: Add histograms to hist_vars if they have referenced variables")
Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
When ring_buffer_swap_cpu was called during resize process,
the cpu buffer was swapped in the middle, resulting in incorrect state.
Continuing to run in the wrong state will result in oops.
This issue can be easily reproduced using the following two scripts:
/tmp # cat test1.sh
//#! /bin/sh
for i in `seq 0 100000`
do
echo 2000 > /sys/kernel/debug/tracing/buffer_size_kb
sleep 0.5
echo 5000 > /sys/kernel/debug/tracing/buffer_size_kb
sleep 0.5
done
/tmp # cat test2.sh
//#! /bin/sh
for i in `seq 0 100000`
do
echo irqsoff > /sys/kernel/debug/tracing/current_tracer
sleep 1
echo nop > /sys/kernel/debug/tracing/current_tracer
sleep 1
done
/tmp # ./test1.sh &
/tmp # ./test2.sh &
A typical oops log is as follows, sometimes with other different oops logs.
[ 231.711293] WARNING: CPU: 0 PID: 9 at kernel/trace/ring_buffer.c:2026 rb_update_pages+0x378/0x3f8
[ 231.713375] Modules linked in:
[ 231.714735] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G W 6.5.0-rc1-00276-g20edcec23f92 #15
[ 231.716750] Hardware name: linux,dummy-virt (DT)
[ 231.718152] Workqueue: events update_pages_handler
[ 231.719714] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 231.721171] pc : rb_update_pages+0x378/0x3f8
[ 231.722212] lr : rb_update_pages+0x25c/0x3f8
[ 231.723248] sp : ffff800082b9bd50
[ 231.724169] x29: ffff800082b9bd50 x28: ffff8000825f7000 x27: 0000000000000000
[ 231.726102] x26: 0000000000000001 x25: fffffffffffff010 x24: 0000000000000ff0
[ 231.728122] x23: ffff0000c3a0b600 x22: ffff0000c3a0b5c0 x21: fffffffffffffe0a
[ 231.730203] x20: ffff0000c3a0b600 x19: ffff0000c0102400 x18: 0000000000000000
[ 231.732329] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffe7aa8510
[ 231.734212] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000002
[ 231.736291] x11: ffff8000826998a8 x10: ffff800082b9baf0 x9 : ffff800081137558
[ 231.738195] x8 : fffffc00030e82c8 x7 : 0000000000000000 x6 : 0000000000000001
[ 231.740192] x5 : ffff0000ffbafe00 x4 : 0000000000000000 x3 : 0000000000000000
[ 231.742118] x2 : 00000000000006aa x1 : 0000000000000001 x0 : ffff0000c0007208
[ 231.744196] Call trace:
[ 231.744892] rb_update_pages+0x378/0x3f8
[ 231.745893] update_pages_handler+0x1c/0x38
[ 231.746893] process_one_work+0x1f0/0x468
[ 231.747852] worker_thread+0x54/0x410
[ 231.748737] kthread+0x124/0x138
[ 231.749549] ret_from_fork+0x10/0x20
[ 231.750434] ---[ end trace 0000000000000000 ]---
[ 233.720486] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 233.721696] Mem abort info:
[ 233.721935] ESR = 0x0000000096000004
[ 233.722283] EC = 0x25: DABT (current EL), IL = 32 bits
[ 233.722596] SET = 0, FnV = 0
[ 233.722805] EA = 0, S1PTW = 0
[ 233.723026] FSC = 0x04: level 0 translation fault
[ 233.723458] Data abort info:
[ 233.723734] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 233.724176] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 233.724589] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 233.725075] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000104943000
[ 233.725592] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
[ 233.726231] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[ 233.726720] Modules linked in:
[ 233.727007] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G W 6.5.0-rc1-00276-g20edcec23f92 #15
[ 233.727777] Hardware name: linux,dummy-virt (DT)
[ 233.728225] Workqueue: events update_pages_handler
[ 233.728655] pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 233.729054] pc : rb_update_pages+0x1a8/0x3f8
[ 233.729334] lr : rb_update_pages+0x154/0x3f8
[ 233.729592] sp : ffff800082b9bd50
[ 233.729792] x29: ffff800082b9bd50 x28: ffff8000825f7000 x27: 0000000000000000
[ 233.730220] x26: 0000000000000000 x25: ffff800082a8b840 x24: ffff0000c0102418
[ 233.730653] x23: 0000000000000000 x22: fffffc000304c880 x21: 0000000000000003
[ 233.731105] x20: 00000000000001f4 x19: ffff0000c0102400 x18: ffff800082fcbc58
[ 233.731727] x17: 0000000000000000 x16: 0000000000000001 x15: 0000000000000001
[ 233.732282] x14: ffff8000825fe0c8 x13: 0000000000000001 x12: 0000000000000000
[ 233.732709] x11: ffff8000826998a8 x10: 0000000000000ae0 x9 : ffff8000801b760c
[ 233.733148] x8 : fefefefefefefeff x7 : 0000000000000018 x6 : ffff0000c03298c0
[ 233.733553] x5 : 0000000000000002 x4 : 0000000000000000 x3 : 0000000000000000
[ 233.733972] x2 : ffff0000c3a0b600 x1 : 0000000000000000 x0 : 0000000000000000
[ 233.734418] Call trace:
[ 233.734593] rb_update_pages+0x1a8/0x3f8
[ 233.734853] update_pages_handler+0x1c/0x38
[ 233.735148] process_one_work+0x1f0/0x468
[ 233.735525] worker_thread+0x54/0x410
[ 233.735852] kthread+0x124/0x138
[ 233.736064] ret_from_fork+0x10/0x20
[ 233.736387] Code: 92400000 910006b5 aa000021 aa0303f7 (f9400060)
[ 233.736959] ---[ end trace 0000000000000000 ]---
After analysis, the seq of the error is as follows [1-5]:
int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
int cpu_id)
{
for_each_buffer_cpu(buffer, cpu) {
cpu_buffer = buffer->buffers[cpu];
//1. get cpu_buffer, aka cpu_buffer(A)
...
...
schedule_work_on(cpu,
&cpu_buffer->update_pages_work);
//2. 'update_pages_work' is queue on 'cpu', cpu_buffer(A) is passed to
// update_pages_handler, do the update process, set 'update_done' in
// complete(&cpu_buffer->update_done) and to wakeup resize process.
//---->
//3. Just at this moment, ring_buffer_swap_cpu is triggered,
//cpu_buffer(A) be swaped to cpu_buffer(B), the max_buffer.
//ring_buffer_swap_cpu is called as the 'Call trace' below.
Call trace:
dump_backtrace+0x0/0x2f8
show_stack+0x18/0x28
dump_stack+0x12c/0x188
ring_buffer_swap_cpu+0x2f8/0x328
update_max_tr_single+0x180/0x210
check_critical_timing+0x2b4/0x2c8
tracer_hardirqs_on+0x1c0/0x200
trace_hardirqs_on+0xec/0x378
el0_svc_common+0x64/0x260
do_el0_svc+0x90/0xf8
el0_svc+0x20/0x30
el0_sync_handler+0xb0/0xb8
el0_sync+0x180/0x1c0
//<----
/* wait for all the updates to complete */
for_each_buffer_cpu(buffer, cpu) {
cpu_buffer = buffer->buffers[cpu];
//4. get cpu_buffer, cpu_buffer(B) is used in the following process,
//the state of cpu_buffer(A) and cpu_buffer(B) is totally wrong.
//for example, cpu_buffer(A)->update_done will leave be set 1, and will
//not 'wait_for_completion' at the next resize round.
if (!cpu_buffer->nr_pages_to_update)
continue;
if (cpu_online(cpu))
wait_for_completion(&cpu_buffer->update_done);
cpu_buffer->nr_pages_to_update = 0;
}
...
}
//5. the state of cpu_buffer(A) and cpu_buffer(B) is totally wrong,
//Continuing to run in the wrong state, then oops occurs.
Link: https://lore.kernel.org/linux-trace-kernel/202307191558478409990@zte.com.cn
Signed-off-by: Chen Lin <chen.lin5@zte.com.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Since commit 08d43a5fa063 ("tracing: Add lock-free tracing_map"),
this is never used, so can be removed.
Link: https://lore.kernel.org/linux-trace-kernel/20230722032123.24664-1-yuehaibing@huawei.com
Cc: <mhiramat@kernel.org>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Make it slightly easier to see which compiler options are added and
removed (and not worry about column limit too!).
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Like C source files, tooling can find it useful to have the assembly
source file compilation recorded.
The .S extension appears to used across all architectures.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Reviewed-by: Fangrui Song <maskray@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
During allocations, while looking for preallocations(PA) in the per
inode rbtree, we can't do a direct traversal of the tree because
ext4_mb_discard_group_preallocation() can paralelly mark the pa deleted
and that can cause direct traversal to skip some entries. This was
leading to a BUG_ON() being hit [1] when we missed a PA that could satisfy
our request and ultimately tried to create a new PA that would overlap
with the missed one.
To makes sure we handle that case while still keeping the performance of
the rbtree, we make use of the fact that the only pa that could possibly
overlap the original goal start is the one that satisfies the below
conditions:
1. It must have it's logical start immediately to the left of
(ie less than) original logical start.
2. It must not be deleted
To find this pa we use the following traversal method:
1. Descend into the rbtree normally to find the immediate neighboring
PA. Here we keep descending irrespective of if the PA is deleted or if
it overlaps with our request etc. The goal is to find an immediately
adjacent PA.
2. If the found PA is on right of original goal, use rb_prev() to find
the left adjacent PA.
3. Check if this PA is deleted and keep moving left with rb_prev() until
a non deleted PA is found.
4. This is the PA we are looking for. Now we can check if it can satisfy
the original request and proceed accordingly.
This approach also takes care of having deleted PAs in the tree.
(While we are at it, also fix a possible overflow bug in calculating the
end of a PA)
[1] https://lore.kernel.org/linux-ext4/CA+G9fYv2FRpLqBZf34ZinR8bU2_ZRAUOjKAD3+tKRFaEQHtt8Q@mail.gmail.com/
Cc: stable@kernel.org # 6.4
Fixes: 3872778664e3 ("ext4: Use rbtrees to manage PAs instead of inode i_prealloc_list")
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reviewed-by: Ritesh Harjani (IBM) ritesh.list@gmail.com
Tested-by: Ritesh Harjani (IBM) ritesh.list@gmail.com
Link: https://lore.kernel.org/r/edd2efda6a83e6343c5ace9deea44813e71dbe20.1690045963.git.ojaswin@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
In ext4_mb_choose_next_group_best_avail(), we want the start order to be
1 less than goal length and the min_order to be, at max, 1 more than the
original length. This commit fixes an off by one issue that arose due to
the fact that 1 << fls(n) > (n).
After all the processing:
order = 1 order below goal len
min_order = maximum of the three:-
- order - trim_order
- 1 order below B2C(s_stripe)
- 1 order above original len
Cc: stable@kernel.org
Fixes: 33122aa930 ("ext4: Add allocation criteria 1.5 (CR1_5)")
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Link: https://lore.kernel.org/r/20230609103403.112807-1-ojaswin@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
When run on a file system where the inline_data feature has been
enabled, xfstests generic/269, generic/270, and generic/476 cause ext4
to emit error messages indicating that inline directory entries are
corrupted. This occurs because the inline offset used to locate
inline directory entries in the inode body is not updated when an
xattr in that shared region is deleted and the region is shifted in
memory to recover the space it occupied. If the deleted xattr precedes
the system.data attribute, which points to the inline directory entries,
that attribute will be moved further up in the region. The inline
offset continues to point to whatever is located in system.data's former
location, with unfortunate effects when used to access directory entries
or (presumably) inline data in the inode body.
Cc: stable@kernel.org
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Link: https://lore.kernel.org/r/20230522181520.1570360-1-enwlinux@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Reinstate support for little endian ELFv1 binaries, which it turns
out still exist in the wild.
- Revert a change which used asm goto for WARN_ON/__WARN_FLAGS, as it
lead to dead code generation and seemed to trigger compiler bugs in
some edge cases.
- Fix a deadlock in the pseries VAS code, between live migration and
the driver's mmap handler.
- Disable KCOV instrumentation in the powerpc KASAN code.
Thanks to Andrew Donnellan, Benjamin Gray, Christophe Leroy, Haren
Myneni, Russell Currey, and Uwe Kleine-König.
* tag 'powerpc-6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
Revert "powerpc/64s: Remove support for ELFv1 little endian userspace"
powerpc/kasan: Disable KCOV in KASAN code
powerpc/512x: lpbfifo: Convert to platform remove callback returning void
powerpc/crypto: Add gitignore for generated P10 AES/GCM .S files
Revert "powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto"
powerpc/pseries/vas: Hold mmap_mutex after mmap lock during window close
|