Age | Commit message (Collapse) | Author |
|
Building kvm module out-of-source with,
make -C $SRC O=$BIN M=arch/x86/kvm
fails to find "irq.h" as the include dir passed to cflags-y does not
prefix the source dir. Fix this by prefixing $(srctree) to the include
dir path.
Signed-off-by: Siddharth Chandrasekaran <sidcha@amazon.de>
Message-Id: <20210324124347.18336-1-sidcha@amazon.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
X86_FEATURE_PERFCTR_CORE
MSR_F15H_PERF_CTL0-5, MSR_F15H_PERF_CTR0-5 MSRs are only available when
X86_FEATURE_PERFCTR_CORE CPUID bit was exposed to the guest. KVM, however,
allows these MSRs unconditionally because kvm_pmu_is_valid_msr() ->
amd_msr_idx_to_pmc() check always passes and because kvm_pmu_set_msr() ->
amd_pmu_set_msr() doesn't fail.
In case of a counter (CTRn), no big harm is done as we only increase
internal PMC's value but in case of an eventsel (CTLn), we go deep into
perf internals with a non-existing counter.
Note, kvm_get_msr_common() just returns '0' when these MSRs don't exist
and this also seems to contradict architectural behavior which is #GP
(I did check one old Opteron host) but changing this status quo is a bit
scarier.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210323084515.1346540-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
kvm_write_tsc() was renamed and made static since commit 0c899c25d754
("KVM: x86: do not attempt TSC synchronization on guest writes"). Remove
its unused declaration.
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Message-Id: <20210326070334.12310-1-dongli.zhang@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
kvm_msr_ignored_check function never uses vcpu argument. Clean up the
function and invokers.
Signed-off-by: Haiwei Li <lihaiwei@tencent.com>
Message-Id: <20210313051032.4171-1-lihaiwei.kernel@gmail.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 5.12, take #3
- Fix GICv3 MMIO compatibility probing
- Prevent guests from using the ARMv8.4 self-hosted tracing extension
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fix from Thomas Bogendoerfer:
- Fix compile error with option MIPS_ELF_APPENDED_DTB
* tag 'mips-fixes_5.12_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: kernel: setup.c: fix compilation error
|
|
With ath79_defconfig enabling CONFIG_MIPS_ELF_APPENDED_DTB gives a
compilation error. This patch fixes it.
Build log:
...
CC kernel/locking/percpu-rwsem.o
../arch/mips/kernel/setup.c:46:39: error: conflicting types for
'__appended_dtb'
const char __section(".appended_dtb") __appended_dtb[0x100000];
^~~~~~~~~~~~~~
In file included from ../arch/mips/kernel/setup.c:34:
../arch/mips/include/asm/bootinfo.h:118:13: note: previous declaration
of '__appended_dtb' was here
extern char __appended_dtb[];
^~~~~~~~~~~~~~
CC fs/attr.o
make[4]: *** [../scripts/Makefile.build:271: arch/mips/kernel/setup.o]
Error 1
...
Root cause seems to be:
Fixes: b83ba0b9df56 ("MIPS: of: Introduce helper function to get DTB")
Signed-off-by: Mauri Sandberg <sandberg@mailfence.com>
Reviewed-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: trivial@kernel.org
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
|
|
Setting the vmmc supplies is crucial since otherwise the supplying
regulators get disabled and the SD interfaces are no longer powered
which leads to system failures if the system is booted from that SD
interface.
Fixes: 1e44d3f880d5 ("ARM i.MX6Q: dts: Enable I2C1 with EEPROM and PMIC on Phytec phyFLEX-i.MX6 Ouad module")
Signed-off-by: Stefan Riedmueller <s.riedmueller@phytec.de>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
To prevent another incidental removal of the IRQ2 ignore logic in the
IO/APIC code going unnoticed add a sanity check. Add some commentry at the
other place which ignores IRQ2 while at it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210318192819.795280387@linutronix.de
|
|
Pull xtensa fixes from Max Filippov:
- fix build with separate exception vectors when they are placed too
far from the rest of the kernel
- fix uaccess-related livelock in do_page_fault.
* tag 'xtensa-20210329' of git://github.com/jcmvbkbc/linux-xtensa:
xtensa: fix uaccess-related livelock in do_page_fault
xtensa: move coprocessor_flush to the .text section
|
|
If a uaccess (e.g. get_user()) triggers a fault and there's a
fault signal pending, the handler will return to the uaccess without
having performed a uaccess fault fixup, and so the CPU will immediately
execute the uaccess instruction again, whereupon it will livelock
bouncing between that instruction and the fault handler.
https://lore.kernel.org/lkml/20210121123140.GD48431@C02TD0UTHF1T.local/
Cc: stable@vger.kernel.org
Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
|
|
The following problem has been reported by George Kennedy:
Since commit 7fef431be9c9 ("mm/page_alloc: place pages to tail
in __free_pages_core()") the following use after free occurs
intermittently when ACPI tables are accessed.
BUG: KASAN: use-after-free in ibft_init+0x134/0xc49
Read of size 4 at addr ffff8880be453004 by task swapper/0/1
CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.12.0-rc1-7a7fd0d #1
Call Trace:
dump_stack+0xf6/0x158
print_address_description.constprop.9+0x41/0x60
kasan_report.cold.14+0x7b/0xd4
__asan_report_load_n_noabort+0xf/0x20
ibft_init+0x134/0xc49
do_one_initcall+0xc4/0x3e0
kernel_init_freeable+0x5af/0x66b
kernel_init+0x16/0x1d0
ret_from_fork+0x22/0x30
ACPI tables mapped via kmap() do not have their mapped pages
reserved and the pages can be "stolen" by the buddy allocator.
Apparently, on the affected system, the ACPI table in question is
not located in "reserved" memory, like ACPI NVS or ACPI Data, that
will not be used by the buddy allocator, so the memory occupied by
that table has to be explicitly reserved to prevent the buddy
allocator from using it.
In order to address this problem, rearrange the initialization of the
ACPI tables on x86 to locate the initial tables earlier and reserve
the memory occupied by them.
The other architectures using ACPI should not be affected by this
change.
Link: https://lore.kernel.org/linux-acpi/1614802160-29362-1-git-send-email-george.kennedy@oracle.com/
Reported-by: George Kennedy <george.kennedy@oracle.com>
Tested-by: George Kennedy <george.kennedy@oracle.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: 5.10+ <stable@vger.kernel.org> # 5.10+
|
|
Before this patch, someone who wants to use VMAP_STACK when
KASAN_GENERIC enabled must explicitly select KASAN_VMALLOC.
>From Will's suggestion [1]:
> I would _really_ like to move to VMAP stack unconditionally, and
> that would effectively force KASAN_VMALLOC to be set if KASAN is in use
Because VMAP_STACK now depends on either HW_TAGS or KASAN_VMALLOC if
KASAN enabled, in order to make VMAP_STACK selected unconditionally,
we bind KANSAN_GENERIC and KASAN_VMALLOC together.
Note that SW_TAGS supports neither VMAP_STACK nor KASAN_VMALLOC now,
so this is the first step to make VMAP_STACK selected unconditionally.
Bind KANSAN_GENERIC and KASAN_VMALLOC together is supposed to cost more
memory at runtime, thus the alternative is using SW_TAGS KASAN instead.
[1]: https://lore.kernel.org/lkml/20210204150100.GE20815@willie-the-truck/
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Link: https://lore.kernel.org/r/20210324040522.15548-6-lecopzer.chen@mediatek.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
After KASAN_VMALLOC works in arm64, we can randomize module region
into vmalloc area now.
Test:
VMALLOC area ffffffc010000000 fffffffdf0000000
before the patch:
module_alloc_base/end ffffffc008b80000 ffffffc010000000
after the patch:
module_alloc_base/end ffffffdcf4bed000 ffffffc010000000
And the function that insmod some modules is fine.
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Link: https://lore.kernel.org/r/20210324040522.15548-5-lecopzer.chen@mediatek.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
We can backed shadow memory in vmalloc area after vmalloc area
isn't populated at kasan_init(), thus make KASAN_VMALLOC selectable.
Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Acked-by: Andrey Konovalov <andreyknvl@gmail.com>
Tested-by: Andrey Konovalov <andreyknvl@gmail.com>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20210324040522.15548-4-lecopzer.chen@mediatek.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Arm64 provides defined macro for KERNEL_START and KERNEL_END,
thus replace them by the abstration instead of using _text and _end.
Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Acked-by: Andrey Konovalov <andreyknvl@gmail.com>
Tested-by: Andrey Konovalov <andreyknvl@gmail.com>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20210324040522.15548-3-lecopzer.chen@mediatek.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Linux support KAsan for VMALLOC since commit 3c5c3cfb9ef4da9
("kasan: support backing vmalloc space with real shadow memory")
Like how the MODULES_VADDR does now, just not to early populate
the VMALLOC_START between VMALLOC_END.
Before:
MODULE_VADDR: no mapping, no zero shadow at init
VMALLOC_VADDR: backed with zero shadow at init
After:
MODULE_VADDR: no mapping, no zero shadow at init
VMALLOC_VADDR: no mapping, no zero shadow at init
Thus the mapping will get allocated on demand by the core function
of KASAN_VMALLOC.
----------- vmalloc_shadow_start
| |
| |
| | <= non-mapping
| |
| |
|-----------|
|///////////|<- kimage shadow with page table mapping.
|-----------|
| |
| | <= non-mapping
| |
------------- vmalloc_shadow_end
|00000000000|
|00000000000| <= Zero shadow
|00000000000|
------------- KASAN_SHADOW_END
Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Acked-by: Andrey Konovalov <andreyknvl@gmail.com>
Tested-by: Andrey Konovalov <andreyknvl@gmail.com>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20210324040522.15548-2-lecopzer.chen@mediatek.com
[catalin.marinas@arm.com: add a build check on VMALLOC_START != MODULES_END]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
In commit eb631bb5bf5b
("arm64: Support arch_irq_work_raise() via self IPIs") a new
function "arch_irq_work_raise" was added without a prototype.
In commit d914d4d49745
("arm64: Implement panic_smp_self_stop()") a new
function "panic_smp_self_stop" was added without a prototype.
We get the following warnings on W=1:
arch/arm64/kernel/smp.c:842:6: warning: no previous prototype
for ‘arch_irq_work_raise’ [-Wmissing-prototypes]
arch/arm64/kernel/smp.c:862:6: warning: no previous prototype
for ‘panic_smp_self_stop’ [-Wmissing-prototypes]
Fix the warnings by:
1. Adding the prototype for 'arch_irq_work_raise' in irq_work.h
2. Adding the prototype for 'panic_smp_self_stop' in smp.h
Signed-off-by: Chen Lifu <chenlifu@huawei.com>
Link: https://lore.kernel.org/r/20210329034343.183974-1-chenlifu@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Fix address of the pad control register
(IOMUXC_SW_PAD_CTL_PAD_SD1_DATA0) for SD1_DATA0_GPIO2_IO2. This seems
to be a typo but it leads to an exception when pinctrl is applied due to
wrong memory address access.
Signed-off-by: Oliver Stäbler <oliver.staebler@bytesatwork.ch>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Fixes: c1c9d41319c3 ("dt-bindings: imx: Add pinctrl binding doc for imx8mm")
Fixes: 748f908cc882 ("arm64: add basic DTS for i.MX8MQ")
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
Bus locks degrade performance for the whole system, not just for the CPU
that requested the bus lock. Two CPU features "#AC for split lock" and
"#DB for bus lock" provide hooks so that the operating system may choose
one of several mitigation strategies.
#AC for split lock is already implemented. Add code to use the #DB for
bus lock feature to cover additional situations with new options to
mitigate.
split_lock_detect=
#AC for split lock #DB for bus lock
off Do nothing Do nothing
warn Kernel OOPs Warn once per task and
Warn once per task and and continues to run.
disable future checking
When both features are
supported, warn in #AC
fatal Kernel OOPs Send SIGBUS to user.
Send SIGBUS to user
When both features are
supported, fatal in #AC
ratelimit:N Do nothing Limit bus lock rate to
N per second in the
current non-root user.
Default option is "warn".
Hardware only generates #DB for bus lock detect when CPL>0 to avoid
nested #DB from multiple bus locks while the first #DB is being handled.
So no need to handle #DB for bus lock detected in the kernel.
#DB for bus lock is enabled by bus lock detection bit 2 in DEBUGCTL MSR
while #AC for split lock is enabled by split lock detection bit 29 in
TEST_CTRL MSR.
Both breakpoint and bus lock in the same instruction can trigger one #DB.
The bus lock is handled before the breakpoint in the #DB handler.
Delivery of #DB for bus lock in userspace clears DR6[11], which is set by
the #DB handler right after reading DR6.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20210322135325.682257-3-fenghua.yu@intel.com
|
|
A bus lock is acquired through either a split locked access to writeback
(WB) memory or any locked access to non-WB memory. This is typically >1000
cycles slower than an atomic operation within a cache line. It also
disrupts performance on other cores.
Some CPUs have the ability to notify the kernel by a #DB trap after a user
instruction acquires a bus lock and is executed. This allows the kernel to
enforce user application throttling or mitigation. Both breakpoint and bus
lock can trigger the #DB trap in the same instruction and the ordering of
handling them is the kernel #DB handler's choice.
The CPU feature flag to be shown in /proc/cpuinfo will be "bus_lock_detect".
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20210322135325.682257-2-fenghua.yu@intel.com
|
|
cpu_current_top_of_stack is currently stored in TSS.sp1. TSS is exposed
through the cpu_entry_area which is visible with user CR3 when PTI is
enabled and active.
This makes it a coveted fruit for attackers. An attacker can fetch the
kernel stack top from it and continue next steps of actions based on the
kernel stack.
But it is actualy not necessary to be stored in the TSS. It is only
accessed after the entry code switched to kernel CR3 and kernel GS_BASE
which means it can be in any regular percpu variable.
The reason why it is in TSS is historical (pre PTI) because TSS is also
used as scratch space in SYSCALL_64 and therefore cache hot.
A syscall also needs the per CPU variable current_task and eventually
__preempt_count, so placing cpu_current_top_of_stack next to them makes it
likely that they end up in the same cache line which should avoid
performance regressions. This is not enforced as the compiler is free to
place these variables, so these entry relevant variables should move into
a data structure to make this enforceable.
The seccomp_benchmark doesn't show any performance loss in the "getpid
native" test result. Actually, the result changes from 93ns before to 92ns
with this change when KPTI is disabled. The test is very stable and
although the test doesn't show a higher degree of precision it gives enough
confidence that moving cpu_current_top_of_stack does not cause a
regression.
[ tglx: Removed unneeded export. Massaged changelog ]
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210125173444.22696-2-jiangshanlai@gmail.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Two fixes:
- Fix build failure on Ubuntu with new GCC packages that turn
on -fcf-protection
- Fix SME memory encryption PTE encoding bug - AFAICT the code
worked on 4K page sizes (level 1) but had the wrong shift at
higher page level orders (level 2 and higher)"
* tag 'x86-urgent-2021-03-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/build: Turn off -fcf-protection for realmode targets
x86/mem_encrypt: Correct physical address calculation in __set_clr_pte_enc()
|
|
When the TSC frequency is known because it is retrieved from the
hypervisor, skip TSC refined calibration by setting X86_FEATURE_TSC_KNOWN_FREQ.
Signed-off-by: Alexey Makhalov <amakhalov@vmware.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210105004752.131069-1-amakhalov@vmware.com
|
|
In __cpu_setup we conditionally manipulate the TCR_EL1 value in x10
after previously using x10 as a scratch register for unrelated temporary
variables.
To make this a bit clearer, let's move the TCR_EL1 value into a named
register `tcr`. To simplify the register allocation, this is placed in
the highest available caller-saved scratch register, tcr.
Following the example of `mair`, we initialise the register with the
default value prior to any feature discovery, and write it to MAIR_EL1
after all feature discovery is complete, which allows us to simplify the
featuere discovery code.
The existing `mte_tcr` register is no longer needed, and is replaced by
the use of x10 as a temporary, matching the rest of the MTE feature
discovery assembly in __cpu_setup. As x20 is no longer used, the
function is now AAPCS compliant, as we've generally aimed for in our
assembly functions.
There should be no functional change as as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210326180137.43119-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
In __cpu_setup we conditionally manipulate the MAIR_EL1 value in x5
before later reusing x5 as a scratch register for unrelated temporary
variables.
To make this a bit clearer, let's move the MAIR_EL1 value into a named
register `mair`. To simplify the register allocation, this is placed in
the highest available caller-saved scratch register, x17. As it is no
longer clobbered by other usage, we can write the value to MAIR_EL1 at
the end of the function as we do for TCR_EL1 rather than part-way though
feature discovery.
There should be no functional change as as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210326180137.43119-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Currently start_backtrace() is a static inline function in the header.
Since it really shouldn't be sufficiently performance critical that we
actually need to have it inlined move it into a C file, this will save
anyone else scratching their head about why it is defined in the header.
As far as I can see it's only there because it was factored out of the
various callers.
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210319174022.33051-1-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
SGX driver can accurately track how enclave pages are used. This
enables SECS to be specifically targeted and EREMOVE'd only after all
child pages have been EREMOVE'd. This ensures that SGX driver will
never encounter SGX_CHILD_PRESENT in normal operation.
Virtual EPC is different. The host does not track how EPC pages are
used by the guest, so it cannot guarantee EREMOVE success. It might,
for instance, encounter a SECS with a non-zero child count.
Add a definition of SGX_CHILD_PRESENT. It will be used exclusively by
the SGX virtualization driver to handle recoverable EREMOVE errors when
saniziting EPC pages after they are freed.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lkml.kernel.org/r/050b198e882afde7e6eba8e6a0d4da39161dbb5a.1616136308.git.kai.huang@intel.com
|
|
EREMOVE takes a page and removes any association between that page and
an enclave. It must be run on a page before it can be added into another
enclave. Currently, EREMOVE is run as part of pages being freed into the
SGX page allocator. It is not expected to fail, as it would indicate a
use-after-free of EPC pages. Rather than add the page back to the pool
of available EPC pages, the kernel intentionally leaks the page to avoid
additional errors in the future.
However, KVM does not track how guest pages are used, which means that
SGX virtualization use of EREMOVE might fail. Specifically, it is
legitimate that EREMOVE returns SGX_CHILD_PRESENT for EPC assigned to
KVM guest, because KVM/kernel doesn't track SECS pages.
To allow SGX/KVM to introduce a more permissive EREMOVE helper and
to let the SGX virtualization code use the allocator directly, break
out the EREMOVE call from the SGX page allocator. Rename the original
sgx_free_epc_page() to sgx_encl_free_epc_page(), indicating that
it is used to free an EPC page assigned to a host enclave. Replace
sgx_free_epc_page() with sgx_encl_free_epc_page() in all call sites so
there's no functional change.
At the same time, improve the error message when EREMOVE fails, and
add documentation to explain to the user what that failure means and
to suggest to the user what to do when this bug happens in the case it
happens.
[ bp: Massage commit message, fix typos and sanitize text, simplify. ]
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lkml.kernel.org/r/20210325093057.122834-1-kai.huang@intel.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"Too many fixes have accumulated in the soc tree, so this is a fairly
large set. As usual, most of the fixes are for devicetree files, but
there are also notable code changes for imx and omap regressions as
well as some maintainer file updates.
imx:
- Fix an Ethernet issue on imx6ul-14x14-evk board that is caused by
independent PHY reset.
- Add missing `dma-coherent` property for LayerScape device trees to
fix a kernel BUG report.
- Use IRQCHIP_DECLARE for AVIC driver to fix a boot issue on i.MX25
with fw_devlink=on.
- Add missing I2C pinctrl entry for imx8mp-phyboard-pollux-rdk board
to fix the broken I2C GPIO recovery support.
- Add `fsl,use-minimum-ecc` property for imx6ull-myir-mys-6ulx-eval
device tree to fix UBI filesystem mount failure.
at91:
- wrong phy address that blocks Ethernet use on boards with sama5d27
SoM1
- restrictive pin possibilities for sam9x60
omap:
- Fix ocp interconnect bus access error reporting for omap_l3_noc by
setting IRQF_NO_THREAD
- Fix changed mmc slot order regression by adding mmc aliases for
am335x
- Fix dra7 reboot regression caused by invalid pcie reset map
- Fix smartreflex init regression caused by dropped legacy data
- Fix ti-sysc driver warning on unbind if reset is not deasserted
- Fix flakey reset deassert for dra7 iva
stm32:
- MAINTAINER file updates
broadcom:
- brcmstb SoC ID build fix
- MAINTAINER file updates"
* tag 'soc-fixes-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
MAINTAINERS: Add Alain Volmat as STM32 I2C/SMBUS maintainer
MAINTAINERS: Remove Vincent Abriou for STM/STI DRM drivers.
MAINTAINERS: Update some st.com email addresses to foss.st.com
ARM: dts: imx6ull: fix ubi filesystem mount failed
ARM: imx6ul-14x14-evk: Do not reset the Ethernet PHYs independently
arm64: dts: imx8mp-phyboard-pollux-rdk: Add missing pinctrl entry
arm64: dts: ls1012a: mark crypto engine dma coherent
arm64: dts: ls1043a: mark crypto engine dma coherent
arm64: dts: ls1046a: mark crypto engine dma coherent
ARM: imx: avic: Convert to using IRQCHIP_DECLARE
ARM: dts: at91: sam9x60: fix mux-mask to match product's datasheet
ARM: dts: at91: sam9x60: fix mux-mask for PA7 so it can be set to A, B and C
ARM: dts: at91-sama5d27_som1: fix phy address to 7
soc: ti: omap-prm: Fix occasional abort on reset deassert for dra7 iva
bus: ti-sysc: Fix warning on unbind if reset is not deasserted
ARM: OMAP2+: Fix smartreflex init regression after dropping legacy data
soc: ti: omap-prm: Fix reboot issue with invalid pcie reset map for dra7
MAINTAINERS: rectify BROADCOM PMB (POWER MANAGEMENT BUS) DRIVER
ARM: dts: am33xx: add aliases for mmc interfaces
bus: omap_l3_noc: mark l3 irqs as IRQF_NO_THREAD
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"This contains a small series with a more elegant fix of a problem
which was originally fixed in rc2"
* tag 'for-linus-5.12b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
Revert "xen: fix p2m size in dom0 for disabled memory hotplug case"
xen/x86: make XEN_BALLOON_MEMORY_HOTPLUG_LIMIT depend on MEMORY_HOTPLUG
|
|
H_PROTECT expects the flag value to include flags:
AVPN, pp0, pp1, pp2, key0-key4, Noexec, CMO Option flags
This patch updates hpte_updatepp() to fetch the storage key value from
the linux page table and use the same in H_PROTECT hcall.
native_hpte_updatepp() is not updated because the kernel doesn't clear
the existing storage key value there. The kernel also doesn't use
hpte_updatepp() callback for updating storage keys.
This fixes the below kernel crash observed with KUAP enabled.
BUG: Unable to handle kernel data access on write at 0xc009fffffc440000
Faulting instruction address: 0xc0000000000b7030
Key fault AMR: 0xfcffffffffffffff IAMR: 0xc0000077bc498100
Found HPTE: v = 0x40070adbb6fffc05 r = 0x1ffffffffff1194
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
...
CFAR: c000000000010100 DAR: c009fffffc440000 DSISR: 02200000 IRQMASK: 0
...
NIP memset+0x68/0x104
LR pcpu_alloc+0x54c/0xb50
Call Trace:
pcpu_alloc+0x55c/0xb50 (unreliable)
blk_stat_alloc_callback+0x94/0x150
blk_mq_init_allocated_queue+0x64/0x560
blk_mq_init_queue+0x54/0xb0
scsi_mq_alloc_queue+0x30/0xa0
scsi_alloc_sdev+0x1cc/0x300
scsi_probe_and_add_lun+0xb50/0x1020
__scsi_scan_target+0x17c/0x790
scsi_scan_channel+0x90/0xe0
scsi_scan_host_selected+0x148/0x1f0
do_scan_async+0x2c/0x2a0
async_run_entry_fn+0x78/0x220
process_one_work+0x264/0x540
worker_thread+0xa8/0x600
kthread+0x190/0x1a0
ret_from_kernel_thread+0x5c/0x6c
With KUAP enabled the kernel uses storage key 3 for all its
translations. But as shown by the debug print, in this specific case we
have the hash page table entry created with key value 0.
Found HPTE: v = 0x40070adbb6fffc05 r = 0x1ffffffffff1194
and DSISR indicates a key fault.
This can happen due to parallel fault on the same EA by different CPUs:
CPU 0 CPU 1
fault on X
H_PAGE_BUSY set
fault on X
finish fault handling and
clear H_PAGE_BUSY
check for H_PAGE_BUSY
continue with fault handling.
This implies CPU1 will end up calling hpte_updatepp for address X and
the kernel updated the hash pte entry with key 0
Fixes: d94b827e89dc ("powerpc/book3s64/kuap: Use Key 3 for kernel mapping with hash translation")
Reported-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Debugged-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210326070755.304625-1-aneesh.kumar@linux.ibm.com
|
|
When cross compiling x86 on an ARM machine with clang, there are several
errors along the lines of:
arch/x86/include/asm/string_64.h:27:10: error: invalid output constraint '=&c' in asm
This happens because the compressed boot Makefile reassigns KBUILD_CFLAGS
and drops the clang flags that set the target architecture ('--target=')
and the path to the GNU cross tools ('--prefix='), meaning that the host
architecture is targeted.
These flags are available as $(CLANG_FLAGS) from the main Makefile so
add them to the compressed boot folder's KBUILD_CFLAGS so that cross
compiling works as expected.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lkml.kernel.org/r/20210326000435.4785-3-nathan@kernel.org
|
|
When cross-compiling with Clang, the `$(CLANG_FLAGS)' variable
contains additional flags needed to build C and assembly sources
for the target platform. Normally this variable is automatically
included in `$(KBUILD_CFLAGS)' via the top-level Makefile.
The x86 real-mode makefile builds `$(REALMODE_CFLAGS)' from a
plain assignment and therefore drops the Clang flags. This causes
Clang to not recognize x86-specific assembler directives:
arch/x86/realmode/rm/header.S:36:1: error: unknown directive
.type real_mode_header STT_OBJECT ; .size real_mode_header, .-real_mode_header
^
Explicit propagation of `$(CLANG_FLAGS)' to `$(REALMODE_CFLAGS)',
which is inherited by real-mode make rules, fixes cross-compilation
with Clang for x86 targets.
Relevant flags:
* `--target' sets the target architecture when cross-compiling. This
flag must be set for both compilation and assembly (`KBUILD_AFLAGS')
to support architecture-specific assembler directives.
* `-no-integrated-as' tells clang to assemble with GNU Assembler
instead of its built-in LLVM assembler. This flag is set by default
unless `LLVM_IAS=1' is set, because the LLVM assembler can't yet
parse certain GNU extensions.
Signed-off-by: John Millikin <john@john-millikin.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Link: https://lkml.kernel.org/r/20210326000435.4785-2-nathan@kernel.org
|
|
Enhanced Privileged Access Never (EPAN) allows Privileged Access Never
to be used with Execute-only mappings.
Absence of such support was a reason for 24cecc377463 ("arm64: Revert
support for execute-only user mappings"). Thus now it can be revisited
and re-enabled.
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210312173811.58284-2-vladimir.murzin@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Li Wang reported that clock_gettime(CLOCK_MONOTONIC_RAW, ...) returns
incorrect values when time is provided via vdso instead of system call:
vdso_ts_nsec = 4484351380985507, vdso_ts.tv_sec = 4484351, vdso_ts.tv_nsec = 380985507
sys_ts_nsec = 1446923235377, sys_ts.tv_sec = 1446, sys_ts.tv_nsec = 923235377
Within the s390 specific vdso function __arch_get_hw_counter() reads
tod clock steering values from the arch_data member of the passed in
vdso_data structure.
Problem is that only for the CS_HRES_COARSE vdso_data arch_data is
initialized and gets updated. The CS_RAW specific vdso_data does not
contain any valid tod_clock_steering information, which explains the
different values.
Fix this by initializing and updating all vdso_datas.
Reported-by: Li Wang <liwang@redhat.com>
Tested-by: Li Wang <liwang@redhat.com>
Fixes: 1ba2d6c0fd4e ("s390/vdso: simplify __arch_get_hw_counter()")
Link: https://lore.kernel.org/linux-s390/YFnxr1ZlMIOIqjfq@osiris
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
The s390 specific vdso function __arch_get_hw_counter() is supposed to
consider tod clock steering.
If a tod clock steering event happens and the tod clock is set to a
new value __arch_get_hw_counter() will not return the real tod clock
value but slowly drift it from the old delta until the returned value
finally matches the real tod clock value again.
Unfortunately the type of tod_steering_delta unsigned while it is
supposed to be signed. It depends on if tod_steering_delta is negative
or positive in which direction the vdso code drifts the clock value.
Worst case is now that instead of drifting the clock slowly it will
jump into the opposite direction by a factor of two.
Fix this by simply making tod_steering_delta signed.
Fixes: 4bff8cb54502 ("s390: convert to GENERIC_VDSO")
Cc: <stable@vger.kernel.org> # 5.10
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
When converting the vdso assembler code to C it was forgotten to
actually copy the tod_steering_delta value to vdso_data page.
Which in turn means that tod clock steering will not work correctly.
Fix this by simply copying the value whenever it is updated.
Fixes: 4bff8cb54502 ("s390: convert to GENERIC_VDSO")
Cc: <stable@vger.kernel.org> # 5.10
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Merge misc fixes from Andrew Morton:
"14 patches.
Subsystems affected by this patch series: mm (hugetlb, kasan, gup,
selftests, z3fold, kfence, memblock, and highmem), squashfs, ia64,
gcov, and mailmap"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mailmap: update Andrey Konovalov's email address
mm/highmem: fix CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP
mm: memblock: fix section mismatch warning again
kfence: make compatible with kmemleak
gcov: fix clang-11+ support
ia64: fix format strings for err_inject
ia64: mca: allocate early mca with GFP_ATOMIC
squashfs: fix xattr id and id lookup sanity checks
squashfs: fix inode lookup sanity checks
z3fold: prevent reclaim/free race for headless pages
selftests/vm: fix out-of-tree build
mm/mmu_notifiers: ensure range_end() is paired with range_start()
kasan: fix per-page tags for non-page_alloc pages
hugetlb_cgroup: fix imbalanced css_get and css_put pair for shared mappings
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"Minor fixes all over, ranging from typos to tests to errata
workarounds:
- Fix possible memory hotplug failure with KASLR
- Fix FFR value in SVE kselftest
- Fix backtraces reported in /proc/$pid/stack
- Disable broken CnP implementation on NVIDIA Carmel
- Typo fixes and ACPI documentation clarification
- Fix some W=1 warnings"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: kernel: disable CNP on Carmel
arm64/process.c: fix Wmissing-prototypes build warnings
kselftest/arm64: sve: Do not use non-canonical FFR register value
arm64: mm: correct the inside linear map range during hotplug check
arm64: kdump: update ppos when reading elfcorehdr
arm64: cpuinfo: Fix a typo
Documentation: arm64/acpi : clarify arm64 support of IBFT
arm64: stacktrace: don't trace arch_stack_walk()
arm64: csum: cast to the proper type
|
|
The spec_bar() macro was introduced in
commit bd4fb6d270bc ("arm64: Add support for SB barrier and patch in over DSB; ISB sequences")
as a way for C to insert a speculation barrier and was then
used in one single place: set_fs().
Later on
commit 3d2403fd10a1 ("arm64: uaccess: remove set_fs()")
deleted set_fs() altogether and as noted in the commit
on the new path the regular sb() assembly macro will
be used.
Delete the remnant.
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210325141304.1607595-1-linus.walleij@linaro.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Add SGX1 and SGX2 feature flags, via CPUID.0x12.0x0.EAX, as scattered
features, since adding a new leaf for only two bits would be wasteful.
As part of virtualizing SGX, KVM will expose the SGX CPUID leafs to its
guest, and to do so correctly needs to query hardware and kernel support
for SGX1 and SGX2.
Suppress both SGX1 and SGX2 from /proc/cpuinfo. SGX1 basically means
SGX, and for SGX2 there is no concrete use case of using it in
/proc/cpuinfo.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lkml.kernel.org/r/d787827dbfca6b3210ac3e432e3ac1202727e786.1616136308.git.kai.huang@intel.com
|
|
Move SGX_LC feature bit to CPUID dependency table to make clearing all
SGX feature bits easier. Also remove clear_sgx_caps() since it is just
a wrapper of setup_clear_cpu_cap(X86_FEATURE_SGX) now.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lkml.kernel.org/r/5d4220fd0a39f52af024d3fa166231c1d498dd10.1616136308.git.kai.huang@intel.com
|
|
Fix warning with %lx / u64 mismatch:
arch/ia64/kernel/err_inject.c: In function 'show_resources':
arch/ia64/kernel/err_inject.c:62:22: warning:
format '%lx' expects argument of type 'long unsigned int',
but argument 3 has type 'u64' {aka 'long long unsigned int'}
62 | return sprintf(buf, "%lx", name[cpu]); \
| ^~~~~~~
Link: https://lkml.kernel.org/r/20210313104312.1548232-1-slyfox@gentoo.org
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The sleep warning happens at early boot right at secondary CPU
activation bootup:
smp: Bringing up secondary CPUs ...
BUG: sleeping function called from invalid context at mm/page_alloc.c:4942
in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.12.0-rc2-00007-g79e228d0b611-dirty #99
..
Call Trace:
show_stack+0x90/0xc0
dump_stack+0x150/0x1c0
___might_sleep+0x1c0/0x2a0
__might_sleep+0xa0/0x160
__alloc_pages_nodemask+0x1a0/0x600
alloc_page_interleave+0x30/0x1c0
alloc_pages_current+0x2c0/0x340
__get_free_pages+0x30/0xa0
ia64_mca_cpu_init+0x2d0/0x3a0
cpu_init+0x8b0/0x1440
start_secondary+0x60/0x700
start_ap+0x750/0x780
Fixed BSP b0 value from CPU 1
As I understand interrupts are not enabled yet and system has a lot of
memory. There is little chance to sleep and switch to GFP_ATOMIC should
be a no-op.
Link: https://lkml.kernel.org/r/20210315085045.204414-1-slyfox@gentoo.org
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Building kernel/sys_ni.c with W=1 emits tons of -Wmissing-prototypes warnings:
$ make W=1 kernel/sys_ni.o
[ snip ]
CC kernel/sys_ni.o
./arch/x86/include/asm/syscall_wrapper.h:83:14: warning: no previous prototype for '__ia32_sys_io_setup' [-Wmissing-prototypes]
...
The problem is in __COND_SYSCALL(), the __SYS_STUB0() and __SYS_STUBx() macros
defined a few lines above already have forward declarations.
Let's do likewise for __COND_SYSCALL() to fix the warnings.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Mickaël Salaün <mic@linux.microsoft.com>
Link: https://lore.kernel.org/r/20210301131533.64671-2-masahiroy@kernel.org
|
|
An endpoint is not a device and it is recommended to use clocks property
in device node. RT5658 Codec binding already specifies the usage of
clocks property. Thus move the clocks from endpoint to device node.
Fixes: 5b4f6323096a ("arm64: tegra: Audio graph sound card for Jetson AGX Xavier")
Suggested-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sameer Pujar <spujar@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
We haven't needed the test_irqs_unmasked macro since commit:
105fc3352077bba5 ("arm64: entry: move el1 irq/nmi logic to C")
... and as we convert more of the entry logic to C it is decreasingly
likely we'll need it in future, so let's remove the unused macro.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210323181201.18889-1-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Since commit 30fdfb929e82 ("PCI: Add a call to pci_assign_irq() in
pci_device_probe()"), the PCI code will call the IRQ mapping function
whenever a PCI driver is probed. If these are marked as __init, this
causes an oops if a PCI driver is loaded or bound after the kernel has
initialised.
Fixes: 30fdfb929e82 ("PCI: Add a call to pci_assign_irq() in pci_device_probe()")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
for_each_mem_range() uses a loop variable, yet looking into code it is
not just iteration counter but more complex entity which encodes
information about memblock. Thus condition i == 0 looks fragile.
Indeed, it broke boot of R-class platforms since it never took i == 0
path (due to i was set to 1). Fix that with restoring original flag
check.
Fixes: b10d6bca8720 ("arch, drivers: replace for_each_membock() with for_each_mem_range()")
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|