summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-04-08arm64: dts: qcom: msm8916: Add GICv2 hypervisor registers/interruptStephan Gerhold
The ARM Cortex-A53 CPU cores and QGIC2 interrupt controller (an implementation of the ARM GIC 2.0 specification) used in MSM8916 support virtualization, e.g. for KVM on Linux. However, so far it was not possible to make use of this functionality, because Qualcomm's proprietary "hyp" firmware blocks the EL2 mode of the CPU and only allows booting Linux in EL1. However, on devices without (firmware) secure boot there is no need to rely on all of Qualcomm's firmware. The "hyp" firmware on MSM8916 seems simple enough that it can be replaced with an open-source alternative created only based on trial and error - with some similar EL2/EL1 initialization code adapted from Linux and U-Boot. qhypstub [1] is such an open-source firmware for MSM8916 that can be used as drop-in replacement for Qualcomm's "hyp" firmware. It does not implement any hypervisor functionality. Instead, it allows booting Linux/KVM (or other hypervisors) in EL2. With Linux booting in EL2, KVM seems to be working just fine on MSM8916. However, so far it is not possible to make use of the virtualization features in the GICv2. To use KVM's VGICv2 code, the QGIC2 device tree node needs additional resources (according to binding documentation): - The CPU interface region (second reg) must be at least 8 KiB large to access the GICC_DIR register (mapped at 0x1000 offset) - Virtual control/CPU interface register base and size - Hypervisor maintenance interrupt Fortunately, the public APQ8016E TRM [2] provides the required information: - The CPU interface region (at 0x0B002000) actually has a size of 8 KiB - Virtual control/CPU interface register is at 0x0B001000/0x0B004000 - Hypervisor maintenance interrupt is "PPI #0" Note: This is a bit strange since almost all other ARM SoCs use GIC_PPI 9 for this. However, I have verified that this is indeed the interrupt that fires when bits are set in GICH_HCR. Add the additional resources to the QGIC2 device tree node in msm8916.dtsi. There is no functional difference when Linux is started in EL1 since the additional resources are ignored in that case. With these changes (and qhypstub), KVM seems to be fully working on the DragonBoard 410c (apq8016-sbc) and BQ Aquaris X5 (longcheer-l8910). [1]: https://github.com/msm8916-mainline/qhypstub [2]: https://developer.qualcomm.com/download/sd410/snapdragon-410e-technical-reference-manual.pdf Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Link: https://lore.kernel.org/r/20210407163648.4708-1-stephan@gerhold.net Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2021-04-08nl80211: fix beacon head validationJohannes Berg
If the beacon head attribute (NL80211_ATTR_BEACON_HEAD) is too short to even contain the frame control field, we access uninitialized data beyond the buffer. Fix this by checking the minimal required size first. We used to do this until S1G support was added, where the fixed data portion has a different size. Reported-and-tested-by: syzbot+72b99dcf4607e8c770f3@syzkaller.appspotmail.com Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Fixes: 1d47f1198d58 ("nl80211: correctly validate S1G beacon head") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20210408154518.d9b06d39b4ee.Iff908997b2a4067e8d456b3cb96cab9771d252b8@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-04-08dt-bindings: timer: nuvoton,npcm7xx: Add wpcm450-timerJonathan Neuschäfer
Add a compatible string for WPCM450, which has essentially the same timer controller. Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210320181610.680870-6-j.neuschaefer@gmx.net
2021-04-08clocksource/drivers/arm_arch_timer: Add __ro_after_init and __initJisheng Zhang
Some functions are not needed after booting, so mark them as __init to move them to the .init section. Some global variables are never modified after init, so can be __ro_after_init. Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210330140444.4fb2a7cb@xhacker.debian
2021-04-08clocksource/drivers/timer-ti-dm: Handle dra7 timer wrap errata i940Tony Lindgren
There is a timer wrap issue on dra7 for the ARM architected timer. In a typical clock configuration the timer fails to wrap after 388 days. To work around the issue, we need to use timer-ti-dm percpu timers instead. Let's configure dmtimer3 and 4 as percpu timers by default, and warn about the issue if the dtb is not configured properly. Let's do this as a single patch so it can be backported to v5.8 and later kernels easily. Note that this patch depends on earlier timer-ti-dm systimer posted mode fixes, and a preparatory clockevent patch "clocksource/drivers/timer-ti-dm: Prepare to handle dra7 timer wrap issue". For more information, please see the errata for "AM572x Sitara Processors Silicon Revisions 1.1, 2.0": https://www.ti.com/lit/er/sprz429m/sprz429m.pdf The concept is based on earlier reference patches done by Tero Kristo and Keerthy. Cc: Keerthy <j-keerthy@ti.com> Cc: Tero Kristo <kristo@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210323074326.28302-3-tony@atomide.com
2021-04-08ACPI: dock: fix some coding style issuesXiaofei Tan
Fix some coding style issues reported by checkpatch.pl, including following types: WARNING: Missing a blank line after declarations ERROR: spaces required around that ':' WARNING: Statements should start on a tabstop Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-08ACPI: sysfs: fix some coding style issuesXiaofei Tan
Fix some coding style issues reported by checkpatch.pl, including following types: WARNING: Missing a blank line after declarations WARNING: Block comments should align the * on each line ERROR: open brace '{' following function definitions go on the next line Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-08ACPI: PM: add a missed blank line after declarationsXiaofei Tan
Add a missed blank line after declarations, reported by checkpatch.pl. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-08ACPI: custom_method: fix a coding style issueXiaofei Tan
Fix the following coding style issue reported by checkpatch.pl ERROR: "foo * bar" should be "foo *bar" FILE: drivers/acpi/custom_method.c:22: +static ssize_t cm_write(struct file *file, const char __user * user_buf, Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-08ACPI: CPPC: fix some coding style issuesXiaofei Tan
Fix some coding style issues reported by checkpatch.pl, including the following types: WARNING: Missing a blank line after declarations WARNING: unnecessary whitespace before a quoted newline ERROR: spaces required around that '>=' ERROR: switch and case should be at the same indent Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-08ACPI: button: fix some coding style issuesXiaofei Tan
Fix some coding style issues reported by checkpatch.pl, including the following types: WARNING: Block comments use * on subsequent lines WARNING: Block comments use a trailing */ on a separate line ERROR: code indent should use tabs where possible Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-08ACPI: battery: fix some coding style issuesXiaofei Tan
Fix some coding style issues reported by checkpatch.pl, including the following types: WARNING: Block comments use * on subsequent lines WARNING: Block comments use a trailing */ on a separate line ERROR: code indent should use tabs where possible WARNING: Missing a blank line after declarations ERROR: spaces required around that '?' (ctx:WxV) WARNING: Block comments should align the * on each line Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-08ACPI: acpi_pad: add a missed blank line after declarationsXiaofei Tan
Add a missed blank line after declarations, reported by checkpatch.pl. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-08ACPI: LPSS: add a missed blank line after declarationsXiaofei Tan
Add a missed blank line after declarations, reported by checkpatch.pl. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-08ACPI: ipmi: remove useless return statement for void functionXiaofei Tan
Remove useless return statement for void function, reported by checkpatch.pl. WARNING: void function return statements are not generally useful FILE: drivers/acpi/acpi_ipmi.c:482: + return; +} Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-08ACPI: processor: fix some coding style issuesXiaofei Tan
Fix some coding style issues reported by checkpatch.pl, including the following types: ERROR: code indent should use tabs where possible WARNING: Block comments use a trailing */ on a separate line WARNING: Missing a blank line after declarations WARNING: labels should not be indented Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-08ACPI: APD: fix a block comment align issueXiaofei Tan
Fix the following coding style issue reported by checkpatch.pl: WARNING: Block comments should align the * on each line +/** +* Create platform device during acpi scan attach handle. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-08bpf, x86: Validate computation of branch displacements for x86-32Piotr Krysiuk
The branch displacement logic in the BPF JIT compilers for x86 assumes that, for any generated branch instruction, the distance cannot increase between optimization passes. But this assumption can be violated due to how the distances are computed. Specifically, whenever a backward branch is processed in do_jit(), the distance is computed by subtracting the positions in the machine code from different optimization passes. This is because part of addrs[] is already updated for the current optimization pass, before the branch instruction is visited. And so the optimizer can expand blocks of machine code in some cases. This can confuse the optimizer logic, where it assumes that a fixed point has been reached for all machine code blocks once the total program size stops changing. And then the JIT compiler can output abnormal machine code containing incorrect branch displacements. To mitigate this issue, we assert that a fixed point is reached while populating the output image. This rejects any problematic programs. The issue affects both x86-32 and x86-64. We mitigate separately to ease backporting. Signed-off-by: Piotr Krysiuk <piotras@gmail.com> Reviewed-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2021-04-08bpf, x86: Validate computation of branch displacements for x86-64Piotr Krysiuk
The branch displacement logic in the BPF JIT compilers for x86 assumes that, for any generated branch instruction, the distance cannot increase between optimization passes. But this assumption can be violated due to how the distances are computed. Specifically, whenever a backward branch is processed in do_jit(), the distance is computed by subtracting the positions in the machine code from different optimization passes. This is because part of addrs[] is already updated for the current optimization pass, before the branch instruction is visited. And so the optimizer can expand blocks of machine code in some cases. This can confuse the optimizer logic, where it assumes that a fixed point has been reached for all machine code blocks once the total program size stops changing. And then the JIT compiler can output abnormal machine code containing incorrect branch displacements. To mitigate this issue, we assert that a fixed point is reached while populating the output image. This rejects any problematic programs. The issue affects both x86-32 and x86-64. We mitigate separately to ease backporting. Signed-off-by: Piotr Krysiuk <piotras@gmail.com> Reviewed-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2021-04-08spi: fsl: add missing iounmap() on error in of_fsl_spi_probe()Yang Yingliang
Add the missing iounmap() before return from of_fsl_spi_probe() in the error handling case. Fixes: 0f0581b24bd0 ("spi: fsl: Convert to use CS GPIO descriptors") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20210401140350.1677925-1-yangyingliang@huawei.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-04-08ARM/spi: spear: Drop PL022 num_chipselectLinus Walleij
A previous refactoring moved the chip select number handling to the SPI core and we missed a leftover platform data user in the ST spear platform. The spear is not using this chipselect or PL022 for anything and should be using device tree like the rest of the platform so just delete the offending platform data. Cc: Viresh Kumar <vireshk@kernel.org> Cc: Shiraz Hashim <shiraz.linux.kernel@gmail.com> Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20210408075045.3435046-1-linus.walleij@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2021-04-08platform/surface: aggregator: move to use request_irq by IRQF_NO_AUTOEN flagTian Tao
disable_irq() after request_irq() still has a time gap in which interrupts can come. request_irq() with IRQF_NO_AUTOEN flag will disable IRQ auto-enable because of requesting. this patch is made base on "add IRQF_NO_AUTOEN for request_irq" which is being merged: https://lore.kernel.org/patchwork/patch/1388765/ Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Reviewed-by: Maximilian Luz <luzmaximilian@gmail.com> Link: https://lore.kernel.org/r/1617778852-26492-1-git-send-email-tiantao6@hisilicon.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-04-08platform/mellanox: mlxreg-hotplug: move to use request_irq by IRQF_NO_AUTOEN ↵Tian Tao
flag disable_irq() after request_irq() still has a time gap in which interrupts can come. request_irq() with IRQF_NO_AUTOEN flag will disable IRQ auto-enable because of requesting. this patch is made base on "add IRQF_NO_AUTOEN for request_irq" which is being merged: https://lore.kernel.org/patchwork/patch/1388765/ Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Acked-by: Vadim Pasternak <vadimp@nvidia.com> Link: https://lore.kernel.org/r/1617785983-28878-1-git-send-email-tiantao6@hisilicon.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-04-08Merge tag 'irq-no-autoen-2021-03-25' into review-hansHans de Goede
Tag for the input subsystem to pick up
2021-04-08ACPI: AC: fix some coding style issuesXiaofei Tan
Fix some coding style issues reported by checkpatch.pl, including the following types: ERROR: "foo * bar" should be "foo *bar" ERROR: code indent should use tabs where possible WARNING: Block comments use a trailing */ on a separate line WARNING: braces {} are not necessary for single statement blocks WARNING: void function return statements are not generally useful WARNING: CVS style keyword markers, these will _not_ be updated Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-08clocksource/drivers/timer-ti-dm: Prepare to handle dra7 timer wrap issueTony Lindgren
There is a timer wrap issue on dra7 for the ARM architected timer. In a typical clock configuration the timer fails to wrap after 388 days. To work around the issue, we need to use timer-ti-dm timers instead. Let's prepare for adding support for percpu timers by adding a common dmtimer_clkevt_init_common() and call it from dmtimer_clockevent_init(). This patch makes no intentional functional changes. Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210323074326.28302-2-tony@atomide.com
2021-04-08spi: Fix use-after-free with devm_spi_alloc_*William A. Kennington III
We can't rely on the contents of the devres list during spi_unregister_controller(), as the list is already torn down at the time we perform devres_find() for devm_spi_release_controller. This causes devices registered with devm_spi_alloc_{master,slave}() to be mistakenly identified as legacy, non-devm managed devices and have their reference counters decremented below 0. ------------[ cut here ]------------ WARNING: CPU: 1 PID: 660 at lib/refcount.c:28 refcount_warn_saturate+0x108/0x174 [<b0396f04>] (refcount_warn_saturate) from [<b03c56a4>] (kobject_put+0x90/0x98) [<b03c5614>] (kobject_put) from [<b0447b4c>] (put_device+0x20/0x24) r4:b6700140 [<b0447b2c>] (put_device) from [<b07515e8>] (devm_spi_release_controller+0x3c/0x40) [<b07515ac>] (devm_spi_release_controller) from [<b045343c>] (release_nodes+0x84/0xc4) r5:b6700180 r4:b6700100 [<b04533b8>] (release_nodes) from [<b0454160>] (devres_release_all+0x5c/0x60) r8:b1638c54 r7:b117ad94 r6:b1638c10 r5:b117ad94 r4:b163dc10 [<b0454104>] (devres_release_all) from [<b044e41c>] (__device_release_driver+0x144/0x1ec) r5:b117ad94 r4:b163dc10 [<b044e2d8>] (__device_release_driver) from [<b044f70c>] (device_driver_detach+0x84/0xa0) r9:00000000 r8:00000000 r7:b117ad94 r6:b163dc54 r5:b1638c10 r4:b163dc10 [<b044f688>] (device_driver_detach) from [<b044d274>] (unbind_store+0xe4/0xf8) Instead, determine the devm allocation state as a flag on the controller which is guaranteed to be stable during cleanup. Fixes: 5e844cc37a5c ("spi: Introduce device-managed SPI controller allocation") Signed-off-by: William A. Kennington III <wak@google.com> Link: https://lore.kernel.org/r/20210407095527.2771582-1-wak@google.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-04-08spi: spi-zynqmp-gqspi: Fix runtime PM imbalance in zynqmp_qspi_probeDinghao Liu
When platform_get_irq() fails, a pairing PM usage counter increment is needed to keep the counter balanced. It's the same for the following error paths. Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Link: https://lore.kernel.org/r/20210408092559.3824-1-dinghao.liu@zju.edu.cn Signed-off-by: Mark Brown <broonie@kernel.org>
2021-04-08bus: mhi: pci_generic: Constify mhi_controller_config struct definitionsManivannan Sadhasivam
"mhi_controller_config" struct is not modified inside "mhi_pci_dev_info" struct. So constify the instances. Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2021-04-08bus: mhi: pci_generic: Introduce Foxconn T99W175 supportJarvis Jiang
Add support for T99W175 modems, this modem series is based on SDX55 qcom chip. The modem is mainly based on MBIM protocol for both the data and control path. This patch adds support for below modems: - T99W175(based on sdx55), Both for eSIM and Non-eSIM - DW5930e(based on sdx55), With eSIM, It's also T99W175 - DW5930e(based on sdx55), Non-eSIM, It's also T99W175 This patch was tested with Ubuntu 20.04 X86_64 PC as host Signed-off-by: Jarvis Jiang <jarvis.w.jiang@gmail.com> Reviewed-by: Loic Poulain <loic.poulain@linaro.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20210408095524.3559-1-jarvis.w.jiang@gmail.com Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2021-04-08drm/vc4: crtc: Reduce PV fifo threshold on hvs4Dom Cobley
Experimentally have found PV on hvs4 reports fifo full error with expected settings and does not with one less This appears as: [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:82:crtc-3] flip_done timed out with bit 10 of PV_STAT set "HVS driving pixels when the PV FIFO is full" Fixes: c8b75bca92cb ("drm/vc4: Add KMS support for Raspberry Pi.") Signed-off-by: Dom Cobley <popcornmix@gmail.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20210318161328.1471556-3-maxime@cerno.tech
2021-04-08drm/vc4: plane: Remove redundant assignmentMaxime Ripard
The vc4_plane_atomic_async_update function assigns twice in a row the src_h field in the drm_plane_state structure to the same value. Remove the second one. Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20210318161328.1471556-2-maxime@cerno.tech
2021-04-08nl80211: fix potential leak of ACL paramsJohannes Berg
In case nl80211_parse_unsol_bcast_probe_resp() results in an error, need to "goto out" instead of just returning to free possibly allocated data. Fixes: 7443dcd1f171 ("nl80211: Unsolicited broadcast probe response support") Link: https://lore.kernel.org/r/20210408142833.d8bc2e2e454a.If290b1ba85789726a671ff0b237726d4851b5b0f@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-04-08cfg80211: check S1G beacon compat element lengthJohannes Berg
We need to check the length of this element so that we don't access data beyond its end. Fix that. Fixes: 9eaffe5078ca ("cfg80211: convert S1G beacon to scan results") Link: https://lore.kernel.org/r/20210408142826.f6f4525012de.I9fdeff0afdc683a6024e5ea49d2daa3cd2459d11@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-04-08x86/cpu: Resort and comment Intel modelsPeter Zijlstra
The INTEL_FAM6 list has become a mess again. Try and bring some sanity back into it. Where previously we had one microarch per year and a number of SKUs within that, this no longer seems to be the case. We now get different uarch names that share a 'core' design. Add the core name starting at skylake and reorder to keep the cores in chronological order. Furthermore, Intel marketed the names {Amber, Coffee, Whiskey} Lake, but those are in fact steppings of Kaby Lake, add comments for them. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/YE+HhS8i0gshHD3W@hirez.programming.kicks-ass.net
2021-04-08arm64: entry: Enable random_kstack_offset supportKees Cook
Allow for a randomized stack offset on a per-syscall basis, with roughly 5 bits of entropy. (And include AAPCS rationale AAPCS thanks to Mark Rutland.) In order to avoid unconditional stack canaries on syscall entry (due to the use of alloca()), also disable stack protector to avoid triggering needless checks and slowing down the entry path. As there is no general way to control stack protector coverage with a function attribute[1], this must be disabled at the compilation unit level. This isn't a problem here, though, since stack protector was not triggered before: examining the resulting syscall.o, there are no changes in canary coverage (none before, none now). [1] a working __attribute__((no_stack_protector)) has been added to GCC and Clang but has not been released in any version yet: https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=346b302d09c1e6db56d9fe69048acb32fbb97845 https://reviews.llvm.org/rG4fbf84c1732fca596ad1d6e96015e19760eb8a9b Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210401232347.2791257-6-keescook@chromium.org
2021-04-08lkdtm: Add REPORT_STACK for checking stack offsetsKees Cook
For validating the stack offset behavior, report the offset from a given process's first seen stack address. Add s script to calculate the results to the LKDTM kselftests. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210401232347.2791257-7-keescook@chromium.org
2021-04-08x86/entry: Enable random_kstack_offset supportKees Cook
Allow for a randomized stack offset on a per-syscall basis, with roughly 5-6 bits of entropy, depending on compiler and word size. Since the method of offsetting uses macros, this cannot live in the common entry code (the stack offset needs to be retained for the life of the syscall, which means it needs to happen at the actual entry point). Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210401232347.2791257-5-keescook@chromium.org
2021-04-08stack: Optionally randomize kernel stack offset each syscallKees Cook
This provides the ability for architectures to enable kernel stack base address offset randomization. This feature is controlled by the boot param "randomize_kstack_offset=on/off", with its default value set by CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT. This feature is based on the original idea from the last public release of PaX's RANDKSTACK feature: https://pax.grsecurity.net/docs/randkstack.txt All the credit for the original idea goes to the PaX team. Note that the design and implementation of this upstream randomize_kstack_offset feature differs greatly from the RANDKSTACK feature (see below). Reasoning for the feature: This feature aims to make harder the various stack-based attacks that rely on deterministic stack structure. We have had many such attacks in past (just to name few): https://jon.oberheide.org/files/infiltrate12-thestackisback.pdf https://jon.oberheide.org/files/stackjacking-infiltrate11.pdf https://googleprojectzero.blogspot.com/2016/06/exploiting-recursion-in-linux-kernel_20.html As Linux kernel stack protections have been constantly improving (vmap-based stack allocation with guard pages, removal of thread_info, STACKLEAK), attackers have had to find new ways for their exploits to work. They have done so, continuing to rely on the kernel's stack determinism, in situations where VMAP_STACK and THREAD_INFO_IN_TASK_STRUCT were not relevant. For example, the following recent attacks would have been hampered if the stack offset was non-deterministic between syscalls: https://repositorio-aberto.up.pt/bitstream/10216/125357/2/374717.pdf (page 70: targeting the pt_regs copy with linear stack overflow) https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html (leaked stack address from one syscall as a target during next syscall) The main idea is that since the stack offset is randomized on each system call, it is harder for an attack to reliably land in any particular place on the thread stack, even with address exposures, as the stack base will change on the next syscall. Also, since randomization is performed after placing pt_regs, the ptrace-based approach[1] to discover the randomized offset during a long-running syscall should not be possible. Design description: During most of the kernel's execution, it runs on the "thread stack", which is pretty deterministic in its structure: it is fixed in size, and on every entry from userspace to kernel on a syscall the thread stack starts construction from an address fetched from the per-cpu cpu_current_top_of_stack variable. The first element to be pushed to the thread stack is the pt_regs struct that stores all required CPU registers and syscall parameters. Finally the specific syscall function is called, with the stack being used as the kernel executes the resulting request. The goal of randomize_kstack_offset feature is to add a random offset after the pt_regs has been pushed to the stack and before the rest of the thread stack is used during the syscall processing, and to change it every time a process issues a syscall. The source of randomness is currently architecture-defined (but x86 is using the low byte of rdtsc()). Future improvements for different entropy sources is possible, but out of scope for this patch. Further more, to add more unpredictability, new offsets are chosen at the end of syscalls (the timing of which should be less easy to measure from userspace than at syscall entry time), and stored in a per-CPU variable, so that the life of the value does not stay explicitly tied to a single task. As suggested by Andy Lutomirski, the offset is added using alloca() and an empty asm() statement with an output constraint, since it avoids changes to assembly syscall entry code, to the unwinder, and provides correct stack alignment as defined by the compiler. In order to make this available by default with zero performance impact for those that don't want it, it is boot-time selectable with static branches. This way, if the overhead is not wanted, it can just be left turned off with no performance impact. The generated assembly for x86_64 with GCC looks like this: ... ffffffff81003977: 65 8b 05 02 ea 00 7f mov %gs:0x7f00ea02(%rip),%eax # 12380 <kstack_offset> ffffffff8100397e: 25 ff 03 00 00 and $0x3ff,%eax ffffffff81003983: 48 83 c0 0f add $0xf,%rax ffffffff81003987: 25 f8 07 00 00 and $0x7f8,%eax ffffffff8100398c: 48 29 c4 sub %rax,%rsp ffffffff8100398f: 48 8d 44 24 0f lea 0xf(%rsp),%rax ffffffff81003994: 48 83 e0 f0 and $0xfffffffffffffff0,%rax ... As a result of the above stack alignment, this patch introduces about 5 bits of randomness after pt_regs is spilled to the thread stack on x86_64, and 6 bits on x86_32 (since its has 1 fewer bit required for stack alignment). The amount of entropy could be adjusted based on how much of the stack space we wish to trade for security. My measure of syscall performance overhead (on x86_64): lmbench: /usr/lib/lmbench/bin/x86_64-linux-gnu/lat_syscall -N 10000 null randomize_kstack_offset=y Simple syscall: 0.7082 microseconds randomize_kstack_offset=n Simple syscall: 0.7016 microseconds So, roughly 0.9% overhead growth for a no-op syscall, which is very manageable. And for people that don't want this, it's off by default. There are two gotchas with using the alloca() trick. First, compilers that have Stack Clash protection (-fstack-clash-protection) enabled by default (e.g. Ubuntu[3]) add pagesize stack probes to any dynamic stack allocations. While the randomization offset is always less than a page, the resulting assembly would still contain (unreachable!) probing routines, bloating the resulting assembly. To avoid this, -fno-stack-clash-protection is unconditionally added to the kernel Makefile since this is the only dynamic stack allocation in the kernel (now that VLAs have been removed) and it is provably safe from Stack Clash style attacks. The second gotcha with alloca() is a negative interaction with -fstack-protector*, in that it sees the alloca() as an array allocation, which triggers the unconditional addition of the stack canary function pre/post-amble which slows down syscalls regardless of the static branch. In order to avoid adding this unneeded check and its associated performance impact, architectures need to carefully remove uses of -fstack-protector-strong (or -fstack-protector) in the compilation units that use the add_random_kstack() macro and to audit the resulting stack mitigation coverage (to make sure no desired coverage disappears). No change is visible for this on x86 because the stack protector is already unconditionally disabled for the compilation unit, but the change is required on arm64. There is, unfortunately, no attribute that can be used to disable stack protector for specific functions. Comparison to PaX RANDKSTACK feature: The RANDKSTACK feature randomizes the location of the stack start (cpu_current_top_of_stack), i.e. including the location of pt_regs structure itself on the stack. Initially this patch followed the same approach, but during the recent discussions[2], it has been determined to be of a little value since, if ptrace functionality is available for an attacker, they can use PTRACE_PEEKUSR/PTRACE_POKEUSR to read/write different offsets in the pt_regs struct, observe the cache behavior of the pt_regs accesses, and figure out the random stack offset. Another difference is that the random offset is stored in a per-cpu variable, rather than having it be per-thread. As a result, these implementations differ a fair bit in their implementation details and results, though obviously the intent is similar. [1] https://lore.kernel.org/kernel-hardening/2236FBA76BA1254E88B949DDB74E612BA4BC57C1@IRSMSX102.ger.corp.intel.com/ [2] https://lore.kernel.org/kernel-hardening/20190329081358.30497-1-elena.reshetova@intel.com/ [3] https://lists.ubuntu.com/archives/ubuntu-devel/2019-June/040741.html Co-developed-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210401232347.2791257-4-keescook@chromium.org
2021-04-08init_on_alloc: Optimize static branchesKees Cook
The state of CONFIG_INIT_ON_ALLOC_DEFAULT_ON (and ...ON_FREE...) did not change the assembly ordering of the static branches: they were always out of line. Use the new jump_label macros to check the CONFIG settings to default to the "expected" state, which slightly optimizes the resulting assembly code. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexander Potapenko <glider@google.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Link: https://lore.kernel.org/r/20210401232347.2791257-3-keescook@chromium.org
2021-04-08jump_label: Provide CONFIG-driven build state defaultsKees Cook
As shown in the comment in jump_label.h, choosing the initial state of static branches changes the assembly layout. If the condition is expected to be likely it's inline, and if unlikely it is out of line via a jump. A few places in the kernel use (or could be using) a CONFIG to choose the default state, which would give a small performance benefit to their compile-time declared default. Provide the infrastructure to do this. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210401232347.2791257-2-keescook@chromium.org
2021-04-08KVM: x86/mmu: preserve pending TLB flush across calls to kvm_tdp_mmu_zap_spPaolo Bonzini
Right now, if a call to kvm_tdp_mmu_zap_sp returns false, the caller will skip the TLB flush, which is wrong. There are two ways to fix it: - since kvm_tdp_mmu_zap_sp will not yield and therefore will not flush the TLB itself, we could change the call to kvm_tdp_mmu_zap_sp to use "flush |= ..." - or we can chain the flush argument through kvm_tdp_mmu_zap_sp down to __kvm_tdp_mmu_zap_gfn_range. Note that kvm_tdp_mmu_zap_sp will neither yield nor flush, so flush would never go from true to false. This patch does the former to simplify application to stable kernels, and to make it further clearer that kvm_tdp_mmu_zap_sp will not flush. Cc: seanjc@google.com Fixes: 048f49809c526 ("KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping") Cc: <stable@vger.kernel.org> # 5.10.x: 048f49809c: KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping Cc: <stable@vger.kernel.org> # 5.10.x: 33a3164161: KVM: x86/mmu: Don't allow TDP MMU to yield when recovering NX pages Cc: <stable@vger.kernel.org> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-08clocksource/drivers/dw_apb_timer_of: Add handling for potential memory leakDinh Nguyen
Add calls to disable the clock and unmap the timer base address in case of any failures. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210322121844.2271041-1-dinguyen@kernel.org
2021-04-08clocksource/drivers/npcm: Add support for WPCM450Jonathan Neuschäfer
Add a compatible string for WPCM450, which has essentially the same timer controller. Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210320181610.680870-11-j.neuschaefer@gmx.net
2021-04-08clocksource/drivers/sh_cmt: Don't use CMTOUT_IE with R-Car Gen2/3Wolfram Sang
CMTOUT_IE is only supported for older SoCs. Newer SoCs shall not set this bit. So, add a version check. Reported-by: Phong Hoang <phong.hoang.wz@renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210309094448.31823-1-wsa+renesas@sang-engineering.com
2021-04-08clocksource/drivers/pistachio: Fix trivial typoDrew Fustini
Fix trivial typo, rename local variable from 'overflw' to 'overflow' in pistachio_clocksource_read_cycles(). Reported-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Drew Fustini <drew@beagleboard.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210305090315.384547-1-drew@beagleboard.org
2021-04-08clocksource/drivers/ingenic_ost: Fix return value check in ingenic_ost_probe()Wei Yongjun
In case of error, the function device_node_to_regmap() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR(). Fixes: ca7b72b5a5f2 ("clocksource: Add driver for the Ingenic JZ47xx OST") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210308123031.2285083-1-weiyongjun1@huawei.com
2021-04-08clocksource/drivers/timer-ti-dm: Add missing set_state_oneshot_stoppedTony Lindgren
To avoid spurious timer interrupts when KTIME_MAX is used, we need to configure set_state_oneshot_stopped(). Although implementing this is optional, it still affects things like power management for the extra timer interrupt. For more information, please see commit 8fff52fd5093 ("clockevents: Introduce CLOCK_EVT_STATE_ONESHOT_STOPPED state") and commit cf8c5009ee37 ("clockevents/drivers/arm_arch_timer: Implement ->set_state_oneshot_stopped()"). Fixes: 52762fbd1c47 ("clocksource/drivers/timer-ti-dm: Add clockevent and clocksource support") Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210304072135.52712-4-tony@atomide.com
2021-04-08clocksource/drivers/timer-ti-dm: Fix posted mode status check orderTony Lindgren
When the timer is configured in posted mode, we need to check the write- posted status register (TWPS) before writing to the register. We now check TWPS after the write starting with commit 52762fbd1c47 ("clocksource/drivers/timer-ti-dm: Add clockevent and clocksource support"). For example, in the TRM for am571x the following is documented in chapter "22.2.4.13.1.1 Write Posting Synchronization Mode": "For each register, a status bit is provided in the timer write-posted status (TWPS) register. In this mode, it is mandatory that software check this status bit before any write access. If a write is attempted to a register with a previous access pending, the previous access is discarded without notice." The regression happened when I updated the code to use standard read/write accessors for the driver instead of using __omap_dm_timer_load_start(). We have__omap_dm_timer_load_start() check the TWPS status correctly using __omap_dm_timer_write(). Fixes: 52762fbd1c47 ("clocksource/drivers/timer-ti-dm: Add clockevent and clocksource support") Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210304072135.52712-2-tony@atomide.com
2021-04-08dt-bindings: timer: renesas,cmt: Document R8A77961Niklas Söderlund
Add missing bindings for M3-W+. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210211143344.352588-1-niklas.soderlund+renesas@ragnatech.se