Age | Commit message (Collapse) | Author |
|
Subtests are added to exercise the patched code which handles
- LLONG_MIN/-1
- INT_MIN/-1
- LLONG_MIN%-1
- INT_MIN%-1
where -1 could be an immediate or in a register.
Without the previous patch, all these cases will crash the kernel on
x86_64 platform.
Additional tests are added to use small values (e.g. -5/-1, 5%-1, etc.)
in order to exercise the additional logic with patched insns.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240913150332.1188102-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Zac Ecob reported a problem where a bpf program may cause kernel crash due
to the following error:
Oops: divide error: 0000 [#1] PREEMPT SMP KASAN PTI
The failure is due to the below signed divide:
LLONG_MIN/-1 where LLONG_MIN equals to -9,223,372,036,854,775,808.
LLONG_MIN/-1 is supposed to give a positive number 9,223,372,036,854,775,808,
but it is impossible since for 64-bit system, the maximum positive
number is 9,223,372,036,854,775,807. On x86_64, LLONG_MIN/-1 will
cause a kernel exception. On arm64, the result for LLONG_MIN/-1 is
LLONG_MIN.
Further investigation found all the following sdiv/smod cases may trigger
an exception when bpf program is running on x86_64 platform:
- LLONG_MIN/-1 for 64bit operation
- INT_MIN/-1 for 32bit operation
- LLONG_MIN%-1 for 64bit operation
- INT_MIN%-1 for 32bit operation
where -1 can be an immediate or in a register.
On arm64, there are no exceptions:
- LLONG_MIN/-1 = LLONG_MIN
- INT_MIN/-1 = INT_MIN
- LLONG_MIN%-1 = 0
- INT_MIN%-1 = 0
where -1 can be an immediate or in a register.
Insn patching is needed to handle the above cases and the patched codes
produced results aligned with above arm64 result. The below are pseudo
codes to handle sdiv/smod exceptions including both divisor -1 and divisor 0
and the divisor is stored in a register.
sdiv:
tmp = rX
tmp += 1 /* [-1, 0] -> [0, 1]
if tmp >(unsigned) 1 goto L2
if tmp == 0 goto L1
rY = 0
L1:
rY = -rY;
goto L3
L2:
rY /= rX
L3:
smod:
tmp = rX
tmp += 1 /* [-1, 0] -> [0, 1]
if tmp >(unsigned) 1 goto L1
if tmp == 1 (is64 ? goto L2 : goto L3)
rY = 0;
goto L2
L1:
rY %= rX
L2:
goto L4 // only when !is64
L3:
wY = wY // only when !is64
L4:
[1] https://lore.kernel.org/bpf/tPJLTEh7S_DxFEqAI2Ji5MBSoZVg7_G-Py2iaZpAaWtM961fFTWtsnlzwvTbzBzaUzwQAoNATXKUlt0LZOFgnDcIyKCswAnAGdUF3LBrhGQ=@protonmail.com/
Reported-by: Zac Ecob <zacecob@protonmail.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240913150326.1187788-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
"nvidia,tegra186-ccplex-cluster" is also documented in
arm/tegra/nvidia,tegra-ccplex-cluster.yaml. As it covers Tegra234 as
well, drop nvidia,tegra186-ccplex-cluster.yaml.
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240910234422.1042486-1-robh@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
The compatible strings for mt6795 clocks are also documented in other
schemas:
"mediatek,mt6795-apmixedsys" in clock/mediatek,apmixedsys.yaml
"mediatek,mt6795-topckgen" in clock/mediatek,topckgen.yaml
"mediatek,mt6795-pericfg" in clock/mediatek,pericfg.yaml
"mediatek,mt6795-infracfg" in clock/mediatek,infracfg.yaml
The only difference is #reset-cells is not allowed in some of these,
but that aligns with actual users in .dts files.
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240910234238.1028422-1-robh@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
Resources definition can become simpler and more organised by using the
dedicated helpers.
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Vasileios Amoiridis <vassilisamir@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20240912221605.27089-3-vassilisamir@gmail.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
Convert irqd_get_trigger_type(irq_get_irq_data(irq)) cases to the more
simple irq_get_trigger_type(irq).
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Vasileios Amoiridis <vassilisamir@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20240912221605.27089-2-vassilisamir@gmail.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
"make dtbs_check":
arch/arm64/boot/dts/renesas/r8a77951-salvator-xs.dtb: clock-generator@6a: 'idt,shutdown' is a required property
From schema: Documentation/devicetree/bindings/clock/idt,versaclock5.yaml
arch/arm64/boot/dts/renesas/r8a77951-salvator-xs.dtb: clock-generator@6a: 'idt,output-enable-active' is a required property
From schema: Documentation/devicetree/bindings/clock/idt,versaclock5.yaml
Versaclock 5 clock generators can have their configuration stored in
One-Time Programmable (OTP) memory. Hence there is no need to specify
DT properties for manual configuration if the OTP has been programmed
before. Likewise, the Linux driver does not touch the SD/OE bits if the
corresponding properties are not specified, cfr. commit d83e561d43bc71e5
("clk: vc5: Add properties for configuring SD/OE behavior").
Reflect this in the bindings by making the "idt,shutdown" and
"idt,output-enable-active" properties not required, just like the
various "idt,*" properties in the per-output child nodes.
Fixes: 275e4e2dc0411508 ("dt-bindings: clk: vc5: Add properties for configuring the SD/OE pin")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Link: https://lore.kernel.org/r/68037ad181991fe0b792f6d003e3e9e538d5ffd7.1673452118.git.geert+renesas@glider.be
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
The description of the function now explicitly states that it's
an *exact* match for the given string (i.e. not a submatch). It also
better states all the possible return values.
Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com>
Link: https://lore.kernel.org/r/20240911204938.9172-1-mikisabate@gmail.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
__free() provides a scoped of_node_put() functionality to put the
device_node automatically, and we don't need to call of_node_put()
directly. Let's simplify the code a bit with the use of __free().
Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
Link: https://lore.kernel.org/r/20240830020626.115933-4-zhangzekun11@huawei.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
Add a compatible for the SA8255p platform's KPSS watchdog.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Nikunj Kela <quic_nkela@quicinc.com>
Link: https://lore.kernel.org/r/20240910165926.2408630-1-quic_nkela@quicinc.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
Add an entry to fsl,imx8qm-irqsteer.
This fixes the following dt-schema warning:
failed to match any schema with compatible: ['fsl,imx8qm-irqsteer', 'fsl,imx-irqsteer']
Signed-off-by: Fabio Estevam <festevam@denx.de>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240701204106.160128-1-festevam@gmail.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
Extended SPI and extended PPI interrupts are in the range [0-1023] and
[0-127] respectively, supported by GICv3.1.
Qualcomm SA8255p platform uses extended SPI for SCMI 'a2p' doorbells.
Signed-off-by: Nikunj Kela <quic_nkela@quicinc.com>
Link: https://lore.kernel.org/r/20240910162637.2382656-1-quic_nkela@quicinc.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
LPC32XX SoCs use pl080 dma controller which have few request signals
multiplexed between peripherals. This binding describes how devices can
use the multiplexed request signals.
Signed-off-by: Piotr Wojtaszczyk <piotr.wojtaszczyk@timesys.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240627150046.258795-3-piotr.wojtaszczyk@timesys.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
"maxim,max1237" is already documented in iio/adc/maxim,max1238.yaml, so
drop it from trivial-devices.yaml.
Link: https://lore.kernel.org/r/20240903-dt-trivial-devices-v1-4-ad684c754b9c@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
Drop LM75 compatible devices which are already documented in lm75.yaml.
Link: https://lore.kernel.org/r/20240903-dt-trivial-devices-v1-3-ad684c754b9c@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
The correct vendor prefix for Analog Devices is "adi", not "ad". Both
forms are in use. Add the "adi,ad7414" version and deprecate the
"ad,ad7414" version.
Keep them together even though it breaks strict alphabetical ordering.
Link: https://lore.kernel.org/r/20240903-dt-trivial-devices-v1-2-ad684c754b9c@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
"at,24c08" does not have a correct vendor prefix. The correct compatible
string would be "atmel,24c08" which is already documented in at24.yaml.
It is also unused anywhere, so just drop it.
"st,24c256" is already documented in at24.yaml, so drop it as well.
Link: https://lore.kernel.org/r/20240903-dt-trivial-devices-v1-1-ad684c754b9c@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
commit 53ed3233e6b5 ("dt-bindings: input: qcom,pm8921-keypad: convert to
YAML format") resulted in a renaming of the output .txt file from
qcom,pm8xxx-keypad.txt to qcom,pm8921-keypad.yaml.
This patch makes a corresponding update to the link to that .txt file
in wakeup-source.txt.
Flagged by make htmldocs:
Warning: Documentation/devicetree/bindings/power/wakeup-source.txt references a file that doesn't exist: Documentation/devicetree/bindings/input/qcom,pm8xxx-keypad.txt
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240908-keypad-wakeup-ref-v1-1-762e4641468a@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
Add compatible for pdc interrupt controller representing support on
SA8255p.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Nikunj Kela <quic_nkela@quicinc.com>
Link: https://lore.kernel.org/r/20240905191510.3775179-1-quic_nkela@quicinc.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
Corrected several typos in Documentation/devicetree/bindings files.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reviewed-by: Matti Vaittinen <mazziesaccount@gmail.com>
Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com>
Link: https://lore.kernel.org/r/20240905151943.2792056-1-eleanor15x@gmail.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
The members "start" and "end" of struct resource are of type
"resource_size_t" which can be 32bit wide.
Values read from OF however are always 64bit wide.
Refactor the diff overflow checks into a helper function.
Also extend the checks to validate each calculation step.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://lore.kernel.org/r/20240906-of-address-overflow-v1-1-19567aaa61da@linutronix.de
[robh: Fix to not return error on 0 sized resource]
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
Provide the s390 specific vdso getrandom() architecture backend.
_vdso_rng_data required data is placed within the _vdso_data vvar page,
by using a hardcoded offset larger than vdso_data.
As required the chacha20 implementation does not write to the stack.
The implementation follows more or less the arm64 implementations and
makes use of vector instructions. It has a fallback to the getrandom()
system call for machines where the vector facility is not installed.
The check if the vector facility is installed, as well as an
optimization for machines with the vector-enhancements facility 2, is
implemented with alternatives, avoiding runtime checks.
Note that __kernel_getrandom() is implemented without the vdso user
wrapper which would setup a stack frame for odd cases (aka very old
glibc variants) where the caller has not done that. All callers of
__kernel_getrandom() are required to setup a stack frame, like the C ABI
requires it.
The vdso testcases vdso_test_getrandom and vdso_test_chacha pass.
Benchmark on a z16:
$ ./vdso_test_getrandom bench-single
vdso: 25000000 times in 0.493703559 seconds
syscall: 25000000 times in 6.584025337 seconds
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A few last minute fixes for v6.11, they're all individually
unremarkable and only last minute due to when they came in"
* tag 'spi-fix-v6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: nxp-fspi: fix the KASAN report out-of-bounds bug
spi: geni-qcom: Fix incorrect free_irq() sequence
spi: geni-qcom: Undo runtime PM changes at driver exit time
|
|
When CONFIG_TRACEPOINTS=y but CONFIG_PAGE_POOL=n, we end up with this
build failure that is reported by the 0-day bot:
ld: vmlinux.o: in function `mp_dmabuf_devmem_alloc_netmems':
>> (.text+0xc37286): undefined reference to `__tracepoint_page_pool_state_hold'
>> ld: (.text+0xc3729a): undefined reference to `__SCT__tp_func_page_pool_state_hold'
>> ld: vmlinux.o:(__jump_table+0x10c48): undefined reference to `__tracepoint_page_pool_state_hold'
>> ld: vmlinux.o:(.static_call_sites+0xb824): undefined reference to `__SCK__tp_func_page_pool_state_hold'
The root cause is that in this configuration, traces are enabled but the
page_pool specific trace_page_pool_state_hold is not registered.
There is no reason to build the dmabuf memory provider when
CONFIG_PAGE_POOL is not present, as it's really a provider to the
page_pool.
In fact the whole NET_DEVMEM is RX path-only at the moment, so we can
make the entire config dependent on the PAGE_POOL.
Note that this may need to be revisited after/while devmem TX is
added, as devmem TX likely does not need CONFIG_PAGE_POOL. For now this
build fix is sufficient.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202409131239.ysHQh4Tv-lkp@intel.com/
Signed-off-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Simon Horman <horms@kernel.org> # build-tested
Link: https://patch.msgid.link/20240913060746.2574191-1-almasrymina@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In the function soc_tplg_dai_config, the logical jump
of 'goto err' is redundant, so remove it.
Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com>
Link: https://patch.msgid.link/20240908140259.3859-1-tangbin@cmss.chinamobile.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire
Pull soundwire fix from Vinod Koul:
- Revert of earlier fix sent for non-continuous port map programming
which caused regression on Intel platforms
* tag 'soundwire-6.11-fixes_2' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
soundwire: stream: Revert "soundwire: stream: fix programming slave ports for non-continous port maps"
|
|
This file does not compile because <linux/mfd/sm5703.h> is missing.
In KConfig, it depends on MFD_SM5703.
Both MFD_SM5703 and the missing include rely on another patch that never
got merged. The last iteration related to this patch is [1].
So remove this dead-code and undo commit e8858ba89ca3 ("regulator:
sm5703-regulator: Add regulators support for SM5703 MFD")
[1]: https://lore.kernel.org/lkml/20220423085319.483524-5-markuss.broks@gmail.com/
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://patch.msgid.link/0f5da91a05e7343d290c88e3c583b674cf6219ac.1725910247.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Pull drm fixes from Dave Airlie:
"Regular fixes pull, the amdgpu JPEG engine fixes are probably the
biggest, they look to block some register accessing, otherwise there
are just minor fixes and regression fixes all over.
nouveau had a regression report going back a few kernels that finally
got fixed, Not entirely happy with so many changes so late, but they
all seem quite benign apart from the jpeg one.
dma-buf/heaps:
- fix off by one in CMA heap fault handler
syncobj:
- fix syncobj leak in drm_syncobj_eventfd_ioctl
amdgpu:
- Avoid races between set_drr() functions and dc_state_destruct()
- Fix regerssion related to zpos
- Fix regression related to overlay cursor
- SMU 14.x updates
- JPEG fixes
- Silence an UBSAN warning
amdkfd:
- Fetch cacheline size from IP discovery
i915:
- Prevent a possible int overflow in wq offsets
xe:
- Remove a double include
- Fix null checks and UAF
- Fix access_ok check in user_fence_create
- Fix compat IS_DISPLAY_STEP() range
- OA fix
- Fixes in show_meminfo
nouveau:
- fix GP10x regression on boot
stm:
- add COMMON_CLK dep
rockchip:
- iommu api change
tegra:
- iommu api change"
* tag 'drm-fixes-2024-09-13' of https://gitlab.freedesktop.org/drm/kernel: (25 commits)
drm/xe/client: add missing bo locking in show_meminfo()
drm/xe/client: fix deadlock in show_meminfo()
drm/xe/oa: Enable Xe2+ PES disaggregation
drm/xe/display: fix compat IS_DISPLAY_STEP() range end
drm/xe: Fix access_ok check in user_fence_create
drm/xe: Fix possible UAF in guc_exec_queue_process_msg
drm/xe: Remove fence check from send_tlb_invalidation
drm/xe/gt: Remove double include
drm/amd/display: Add all planes on CRTC to state for overlay cursor
drm/amdgpu/atomfirmware: Silence UBSAN warning
drm/amd/amdgpu: apply command submission parser for JPEG v1
drm/amd/amdgpu: apply command submission parser for JPEG v2+
drm/amd/pm: fix the pp_dpm_pcie issue on smu v14.0.2/3
drm/amd/pm: update the features set on smu v14.0.2/3
drm/amd/display: Do not reset planes based on crtc zpos_changed
drm/amd/display: Avoid race between dcn35_set_drr() and dc_state_destruct()
drm/amd/display: Avoid race between dcn10_set_drr() and dc_state_destruct()
drm/amdkfd: Add cache line size info
drm/tegra: Use iommu_paging_domain_alloc()
drm/rockchip: Use iommu_paging_domain_alloc()
...
|
|
_regulator_get() contains a lot of common code doing checks prior to the
regulator lookup and housekeeping work after the lookup. Almost all the
code could be shared with a OF-specific variant of _regulator_get().
Split out the common parts so that they can be reused. The OF-specific
version of _regulator_get() will be added in a subsequent patch.
No functional changes were made.
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20240911072751.365361-4-wenst@chromium.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Add calibration related kcontrol for speaker impedance calibration and
speaker leakage check for Chromebook.
Signed-off-by: Shenghao Ding <shenghao-ding@ti.com>
Link: https://patch.msgid.link/20240911232739.1509-1-shenghao-ding@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Merge series from Vijendar Mukunda <Vijendar.Mukunda@amd.com>:
This patch series moves common Soundwire endpoint parsing and dai
creation logic to common placeholder from Intel generic SoundWire
machine driver code to make it generic. AMD SoundWire machine driver
code is refactored to use these functions for SoundWire endpoint
parsing and dai creation logic.
Link: https://github.com/thesofproject/linux/pull/5171
|
|
The vdso.h header file, which is included at many places, includes
generated header files. This can easily lead to recursive header file
inclusions if the vdso code is changed.
Therefore move the vdso symbol code, which requires the generated
header files, to a separate header file, and include it at the two
locations which require it.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Implement the infrastructure required to allow alternatives in vdso code.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Provide find_section() helper function which can be used to find a
section by name, similar to other architectures.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Let test_facility() generate a branch instruction if the tested facility is
a constant, and where the result cannot be evaluated during compile
time. The branch instruction defaults to "false" and is patched to nop
(branch not taken) if the tested facility is available.
This avoids runtime checks and is similar to x86's static_cpu_has() and
arm64's alternative_has_cap_likely().
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Patch all alternatives which depend on facilities from the decompressor.
There is no technical reason which enforces to split patching of such
alternatives to the decompressor and the kernel.
This simplifies alternative handling a bit, since one alternative type is
removed.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Disable compile time optimizations of test_facility() for the
decompressor. The decompressor should not contain any optimized code
depending on the architecture level set the kernel image is compiled
for to avoid unexpected operation exceptions.
Add a __DECOMPRESSOR check to test_facility() to enforce that
facilities are always checked during runtime for the decompressor.
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Running vdso_test_correctness on s390x (aka s390 64 bit) emits a warning:
Warning: failed to find clock_gettime64 in vDSO
This is caused by the "#elif defined (__s390__)" check in vdso_config.h
which the defines VDSO_32BIT.
If __s390x__ is defined also __s390__ is defined. Therefore the correct
check must make sure that only __s390__ is defined.
Therefore add the missing !defined(__s390x__). Also use common
__s390x__ define instead of __s390X__.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Fixes: 693f5ca08ca0 ("kselftest: Extend vDSO selftest")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
The vDSO self tests fail on s390x for a vDSO linked with the GNU linker
ld as follows:
# ./vdso_test_gettimeofday
Floating point exception (core dumped)
On s390x the ELF hash table entries are 64 bits instead of 32 bits in
size (see Glibc sysdeps/unix/sysv/linux/s390/bits/elfclass.h).
Fixes: 40723419f407 ("kselftest: Enable vDSO test on non x86 platforms")
Reported-by: Heiko Carstens <hca@linux.ibm.com>
Tested-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Extend getrandom() vDSO implementation to VDSO64.
Tested on QEMU on both ppc64_defconfig and ppc64le_defconfig.
Results from a Power9 (PowerNV):
~ # ./vdso_test_getrandom bench-single
vdso: 25000000 times in 0.787943615 seconds
libc: 25000000 times in 14.101887252 seconds
syscall: 25000000 times in 14.047475082 seconds
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
To be consistent with other VDSO functions, the function is called
__kernel_getrandom()
__arch_chacha20_blocks_nostack() fonction is implemented basically
with 32 bits operations. It performs 4 QUARTERROUND operations in
parallele. There are enough registers to avoid using the stack:
On input:
r3: output bytes
r4: 32-byte key input
r5: 8-byte counter input/output
r6: number of 64-byte blocks to write to output
During operation:
stack: pointer to counter (r5) and non-volatile registers (r14-131)
r0: counter of blocks (initialised with r6)
r4: Value '4' after key has been read, used for indexing
r5-r12: key
r14-r15: block counter
r16-r31: chacha state
At the end:
r0, r6-r12: Zeroised
r5, r14-r31: Restored
Performance on powerpc 885 (using kernel selftest):
~# ./vdso_test_getrandom bench-single
vdso: 25000000 times in 62.938002291 seconds
libc: 25000000 times in 535.581916866 seconds
syscall: 25000000 times in 531.525042806 seconds
Performance on powerpc 8321 (using kernel selftest):
~# ./vdso_test_getrandom bench-single
vdso: 25000000 times in 16.899318858 seconds
libc: 25000000 times in 131.050596522 seconds
syscall: 25000000 times in 129.794790389 seconds
This first patch adds support for VDSO32. As selftests cannot easily
be generated only for VDSO32, and because the following patch brings
support for VDSO64 anyway, this patch opts out all code in
__arch_chacha20_blocks_nostack() so that vdso_test_chacha will not
fail to compile and will not crash on PPC64/PPC64LE, allthough the
selftest itself will fail.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
In order to avoid two much duplication when we add new VDSO
functionnalities in C like getrandom, refactor common CFLAGS.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Commit 08c18b63d965 ("powerpc/vdso32: Add missing _restgpr_31_x to fix
build failure") added _restgpr_31_x to the vdso for gettimeofday, but
the work on getrandom shows that we will need more of those functions.
Remove _restgpr_31_x and link in crtsavres.o so that we get all
save/restore functions when optimising the kernel for size.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Commit 9651fcedf7b9 ("mm: add MAP_DROPPABLE for designating always
lazily freeable mappings") only adds VM_DROPPABLE for 64 bits
architectures.
In order to also use the getrandom vDSO implementation on powerpc/32,
use VM_ARCH_1 for VM_DROPPABLE on powerpc/32. This is possible because
VM_ARCH_1 is used for VM_SAO on powerpc and VM_SAO is only for
powerpc/64. It is used in combination with PROT_SAO in some parts of
code that are restricted to CONFIG_PPC64 through #ifdefs, it is
therefore possible to define VM_SAO for CONFIG_PPC64 only.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
When running in a non-root time namespace, the global VDSO data page
is replaced by a dedicated namespace data page and the global data
page is mapped next to it. Detailed explanations can be found at
commit 660fd04f9317 ("lib/vdso: Prepare for time namespace support").
When it happens, __kernel_get_syscall_map and __kernel_get_tbfreq
and __kernel_sync_dicache don't work anymore because they read 0
instead of the data they need.
To address that, clock_mode has to be read. When it is set to
VDSO_CLOCKMODE_TIMENS, it means it is a dedicated namespace data page
and the global data is located on the following page.
Add a macro called get_realdatapage which reads clock_mode and add
PAGE_SIZE to the pointer provided by get_datapage macro when
clock_mode is equal to VDSO_CLOCKMODE_TIMENS. Use this new macro
instead of get_datapage macro except for time functions as they handle
it internally.
Fixes: 74205b3fc2ef ("powerpc/vdso: Add support for time namespaces")
Reported-by: Jason A. Donenfeld <Jason@zx2c4.com>
Closes: https://lore.kernel.org/all/ZtnYqZI-nrsNslwy@zx2c4.com/
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
It's not correct to use $(top_srcdir) for generated header files, for
builds that are done out of tree via O=, and $(objtree) isn't valid in
the selftests context. Instead, just obviate the need for these
generated header files by defining empty stubs in tools/include, which
is the same thing that's done for rwlock.h.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Hook up the generic vDSO implementation to the aarch64 vDSO data page.
The _vdso_rng_data required data is placed within the _vdso_data vvar
page, by using a offset larger than the vdso_data.
The vDSO function requires a ChaCha20 implementation that does not write
to the stack, and that can do an entire ChaCha20 permutation. The one
provided uses NEON on the permute operation, with a fallback to the
syscall for chips that do not support AdvSIMD.
This also passes the vdso_test_chacha test along with
vdso_test_getrandom. The vdso_test_getrandom bench-single result on
Neoverse-N1 shows:
vdso: 25000000 times in 0.783884250 seconds
libc: 25000000 times in 8.780275399 seconds
syscall: 25000000 times in 8.786581518 seconds
A small fixup to arch/arm64/include/asm/mman.h was required to avoid
pulling kernel code into the vDSO, similar to what's already done in
arch/arm64/include/asm/rwonce.h.
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Currently alternative_has_cap_unlikely() can be used in VDSO code, but
alternative_has_cap_likely() cannot as it references alt_cb_patch_nops,
which is not available when linking the VDSO. This is unfortunate as it
would be useful to have alternative_has_cap_likely() available in VDSO
code.
The use of alt_cb_patch_nops was added in commit:
d926079f17bf8aa4 ("arm64: alternatives: add shared NOP callback")
... as removing duplicate NOPs within the kernel Image saved areasonable
amount of space.
Given the VDSO code will have nowhere near as many alternative branches
as the main kernel image, this isn't much of a concern, and a few extra
nops isn't a massive problem.
Change alternative_has_cap_likely() to only use alt_cb_patch_nops for
the main kernel image, and allow duplicate NOPs in VDSO code.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
The chacha vDSO selftest doesn't check the way the counter is handled by
__arch_chacha20_blocks_nostack(). It indirectly checks that the counter
is writen on exit and read back on new entry, but it doesn't check that
the format is correct. When implementing this function on powerpc, I
missed a case where the counter was writen and read in wrong byte order.
Also, the counter uses two words, but the tests with a zero counter and
uses a small amount of blocks, so at the end the upper part of the
counter is always 0, so it is not checked.
Add a verification of counter's content in addition to the verification
of the output.
Also add two tests where the counter crosses the u32 upper limit. The
first test verifies that the function properly writes back the upper
word, the second test verifies that the function properly reads back the
upper word.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Without -O2, the generated code for testing chacha function is awful.
GCC even implements rol32() as a function of 20 instructions instead of
just using the rotlwi instruction.
~# time ./vdso_test_chacha
TAP version 13
1..1
ok 1 chacha: PASS
real 0m 37.16s
user 0m 36.89s
sys 0m 0.26s
Several other selftests directory add -O2, and the kernel is also
always built with optimisation active. Do the same for vDSO selftests.
With this patch the time is reduced by approximately 15%.
~# time ./vdso_test_chacha
TAP version 13
1..1
ok 1 chacha: PASS
real 0m 32.09s
user 0m 31.86s
sys 0m 0.22s
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|