summaryrefslogtreecommitdiff
path: root/lib/raid6
AgeCommit message (Collapse)Author
2023-12-11s390/fpu: get rid of MACHINE_HAS_VXHeiko Carstens
Get rid of MACHINE_HAS_VX and replace it with cpu_has_vx() which is a short readable wrapper for "test_facility(129)". Facility bit 129 is set if the vector facility is present. test_facility() returns also true for all bits which are set in the architecture level set of the cpu that the kernel is compiled for. This means that test_facility(129) is a compile time constant which returns true for z13 and later, since the vector facility bit is part of the z13 kernel ALS. In result the compiled code will have less runtime checks, and less code. Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-09-11lib/raid6: Drop IA64 supportArd Biesheuvel
Drop Itanium support from the RAID6 code, and along with it, the 16x and 32x unrolled versions, which were only used by IA64. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-09-06raid6: Add LoongArch SIMD recovery implementationWANG Xuerui
Similar to the syndrome calculation, the recovery algorithms also work on 64 bytes at a time to align with the L1 cache line size of current and future LoongArch cores (that we care about). Which means unrolled-by-4 LSX and unrolled-by-2 LASX code. The assembly is originally based on the x86 SSSE3/AVX2 ports, but register allocation has been redone to take advantage of LSX/LASX's 32 vector registers, and instruction sequence has been optimized to suit (e.g. LoongArch can perform per-byte srl and andi on vectors, but x86 cannot). Performance numbers measured by instrumenting the raid6test code, on a 3A5000 system clocked at 2.5GHz: > lasx 2data: 354.987 MiB/s > lasx datap: 350.430 MiB/s > lsx 2data: 340.026 MiB/s > lsx datap: 337.318 MiB/s > intx1 2data: 164.280 MiB/s > intx1 datap: 187.966 MiB/s Because recovery algorithms are chosen solely based on priority and availability, lasx is marked as priority 2 and lsx priority 1. At least for the current generation of LoongArch micro-architectures, LASX should always be faster than LSX whenever supported, and have similar power consumption characteristics (because the only known LASX-capable uarch, the LA464, always compute the full 256-bit result for vector ops). Acked-by: Song Liu <song@kernel.org> Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-06raid6: Add LoongArch SIMD syndrome calculationWANG Xuerui
The algorithms work on 64 bytes at a time, which is the L1 cache line size of all current and future LoongArch cores (that we care about), as confirmed by Huacai. The code is based on the generic int.uc algorithm, unrolled 4 times for LSX and 2 times for LASX. Further unrolling does not meaningfully improve the performance according to experiments. Performance numbers measured during system boot on a 3A5000 @ 2.5GHz: > raid6: lasx gen() 12726 MB/s > raid6: lsx gen() 10001 MB/s > raid6: int64x8 gen() 2876 MB/s > raid6: int64x4 gen() 3867 MB/s > raid6: int64x2 gen() 2531 MB/s > raid6: int64x1 gen() 1945 MB/s Comparison of xor() speeds (from different boots but meaningful anyway): > lasx: 11226 MB/s > lsx: 6395 MB/s > int64x4: 2147 MB/s Performance as measured by raid6test: > raid6: lasx gen() 25109 MB/s > raid6: lsx gen() 13233 MB/s > raid6: int64x8 gen() 4164 MB/s > raid6: int64x4 gen() 6005 MB/s > raid6: int64x2 gen() 5781 MB/s > raid6: int64x1 gen() 4119 MB/s > raid6: using algorithm lasx gen() 25109 MB/s > raid6: .... xor() 14439 MB/s, rmw enabled Acked-by: Song Liu <song@kernel.org> Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-15raid6: test: only check for Altivec if building on powerpc hostsWANG Xuerui
Altivec is only available for powerpc hosts, so only check for its availability when the host is powerpc, to avoid error messages being shown on architectures other than x86, arm or powerpc. Signed-off-by: WANG Xuerui <git@xen0n.name> Link: https://lore.kernel.org/r/20230731104911.411964-6-kernel@xen0n.name Signed-off-by: Song Liu <song@kernel.org>
2023-08-15raid6: test: make sure all intermediate and artifact files are .gitignoredWANG Xuerui
Currently when the raid6test utility is built, the resulting binary and an int.uc file are not being ignored, which can get inadvertently committed as a result when one works on the raid6 code. Ignore them to make `git status` clean at all times. Signed-off-by: WANG Xuerui <git@xen0n.name> Link: https://lore.kernel.org/r/20230731104911.411964-5-kernel@xen0n.name Signed-off-by: Song Liu <song@kernel.org>
2023-08-15raid6: test: cosmetic cleanups for the test MakefileWANG Xuerui
Use tabs/spaces consistently: hard tabs for marking recipe lines only, spaces for everything else. Also, the OPTFLAGS declaration actually included the tabs preceding the line comment, making compiler invocation lines unnecessarily long. As the entire block of declarations are meant for ad-hoc customization (otherwise they would probably make use of `?=` instead of `=`), move the "Adjust as desired" comment above the block too to fix the long invocation lines. Signed-off-by: WANG Xuerui <git@xen0n.name> Link: https://lore.kernel.org/r/20230731104911.411964-4-kernel@xen0n.name Signed-off-by: Song Liu <song@kernel.org>
2023-08-15raid6: guard the tables.c include of <linux/export.h> with __KERNEL__WANG Xuerui
The export directives for the tables are already emitted with __KERNEL__ guards, but the <linux/export.h> include is not, causing errors when building the raid6test program. Guard this include too to fix the raid6test build. Signed-off-by: WANG Xuerui <git@xen0n.name> Link: https://lore.kernel.org/r/20230731104911.411964-3-kernel@xen0n.name Signed-off-by: Song Liu <song@kernel.org>
2023-08-15raid6: remove the <linux/export.h> include from recov.cWANG Xuerui
There is no exported symbol left in recov.c, so the include is now unnecessary, and breaks the raid6test build. Remove it. Signed-off-by: WANG Xuerui <git@xen0n.name> Link: https://lore.kernel.org/r/20230731104911.411964-2-kernel@xen0n.name Signed-off-by: Song Liu <song@kernel.org>
2023-06-13raid6: neon: add missing prototypesArnd Bergmann
The raid6 syndrome functions are generated for different sizes and have no generic prototype, while in the inner functions have a prototype in a header that cannot be included from the correct file. In both cases, the compiler warns about missing prototypes: lib/raid6/recov_neon_inner.c:27:6: warning: no previous prototype for '__raid6_2data_recov_neon' [-Wmissing-prototypes] lib/raid6/recov_neon_inner.c:77:6: warning: no previous prototype for '__raid6_datap_recov_neon' [-Wmissing-prototypes] lib/raid6/neon1.c:56:6: warning: no previous prototype for 'raid6_neon1_gen_syndrome_real' [-Wmissing-prototypes] lib/raid6/neon1.c:86:6: warning: no previous prototype for 'raid6_neon1_xor_syndrome_real' [-Wmissing-prototypes] lib/raid6/neon2.c:56:6: warning: no previous prototype for 'raid6_neon2_gen_syndrome_real' [-Wmissing-prototypes] lib/raid6/neon2.c:97:6: warning: no previous prototype for 'raid6_neon2_xor_syndrome_real' [-Wmissing-prototypes] lib/raid6/neon4.c:56:6: warning: no previous prototype for 'raid6_neon4_gen_syndrome_real' [-Wmissing-prototypes] lib/raid6/neon4.c:119:6: warning: no previous prototype for 'raid6_neon4_xor_syndrome_real' [-Wmissing-prototypes] lib/raid6/neon8.c:56:6: warning: no previous prototype for 'raid6_neon8_gen_syndrome_real' [-Wmissing-prototypes] lib/raid6/neon8.c:163:6: warning: no previous prototype for 'raid6_neon8_xor_syndrome_real' [-Wmissing-prototypes] Add a new header file that contains the prototypes for both to avoid the warnings. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230517132220.937200-1-arnd@kernel.org
2022-12-13Merge tag 'for-6.2/block-2022-12-08' of git://git.kernel.dk/linuxLinus Torvalds
Pull block updates from Jens Axboe: - NVMe pull requests via Christoph: - Support some passthrough commands without CAP_SYS_ADMIN (Kanchan Joshi) - Refactor PCIe probing and reset (Christoph Hellwig) - Various fabrics authentication fixes and improvements (Sagi Grimberg) - Avoid fallback to sequential scan due to transient issues (Uday Shankar) - Implement support for the DEAC bit in Write Zeroes (Christoph Hellwig) - Allow overriding the IEEE OUI and firmware revision in configfs for nvmet (Aleksandr Miloserdov) - Force reconnect when number of queue changes in nvmet (Daniel Wagner) - Minor fixes and improvements (Uros Bizjak, Joel Granados, Sagi Grimberg, Christoph Hellwig, Christophe JAILLET) - Fix and cleanup nvme-fc req allocation (Chaitanya Kulkarni) - Use the common tagset helpers in nvme-pci driver (Christoph Hellwig) - Cleanup the nvme-pci removal path (Christoph Hellwig) - Use kstrtobool() instead of strtobool (Christophe JAILLET) - Allow unprivileged passthrough of Identify Controller (Joel Granados) - Support io stats on the mpath device (Sagi Grimberg) - Minor nvmet cleanup (Sagi Grimberg) - MD pull requests via Song: - Code cleanups (Christoph) - Various fixes - Floppy pull request from Denis: - Fix a memory leak in the init error path (Yuan) - Series fixing some batch wakeup issues with sbitmap (Gabriel) - Removal of the pktcdvd driver that was deprecated more than 5 years ago, and subsequent removal of the devnode callback in struct block_device_operations as no users are now left (Greg) - Fix for partition read on an exclusively opened bdev (Jan) - Series of elevator API cleanups (Jinlong, Christoph) - Series of fixes and cleanups for blk-iocost (Kemeng) - Series of fixes and cleanups for blk-throttle (Kemeng) - Series adding concurrent support for sync queues in BFQ (Yu) - Series bringing drbd a bit closer to the out-of-tree maintained version (Christian, Joel, Lars, Philipp) - Misc drbd fixes (Wang) - blk-wbt fixes and tweaks for enable/disable (Yu) - Fixes for mq-deadline for zoned devices (Damien) - Add support for read-only and offline zones for null_blk (Shin'ichiro) - Series fixing the delayed holder tracking, as used by DM (Yu, Christoph) - Series enabling bio alloc caching for IRQ based IO (Pavel) - Series enabling userspace peer-to-peer DMA (Logan) - BFQ waker fixes (Khazhismel) - Series fixing elevator refcount issues (Christoph, Jinlong) - Series cleaning up references around queue destruction (Christoph) - Series doing quiesce by tagset, enabling cleanups in drivers (Christoph, Chao) - Series untangling the queue kobject and queue references (Christoph) - Misc fixes and cleanups (Bart, David, Dawei, Jinlong, Kemeng, Ye, Yang, Waiman, Shin'ichiro, Randy, Pankaj, Christoph) * tag 'for-6.2/block-2022-12-08' of git://git.kernel.dk/linux: (247 commits) blktrace: Fix output non-blktrace event when blk_classic option enabled block: sed-opal: Don't include <linux/kernel.h> sed-opal: allow using IOC_OPAL_SAVE for locking too blk-cgroup: Fix typo in comment block: remove bio_set_op_attrs nvmet: don't open-code NVME_NS_ATTR_RO enumeration nvme-pci: use the tagset alloc/free helpers nvme: add the Apple shared tag workaround to nvme_alloc_io_tag_set nvme: only set reserved_tags in nvme_alloc_io_tag_set for fabrics controllers nvme: consolidate setting the tagset flags nvme: pass nr_maps explicitly to nvme_alloc_io_tag_set block: bio_copy_data_iter nvme-pci: split out a nvme_pci_ctrl_is_dead helper nvme-pci: return early on ctrl state mismatch in nvme_reset_work nvme-pci: rename nvme_disable_io_queues nvme-pci: cleanup nvme_suspend_queue nvme-pci: remove nvme_pci_disable nvme-pci: remove nvme_disable_admin_queue nvme: merge nvme_shutdown_ctrl into nvme_disable_ctrl nvme: use nvme_wait_ready in nvme_shutdown_ctrl ...
2022-12-06s390/vx: add vx-insn.h wrapper include fileHeiko Carstens
The vector instruction macros can also be used in inline assemblies. For this the magic asm(".include \"asm/vx-insn.h\"\n"); must be added to C files in order to avoid that the pre-processor eliminates the __ASSEMBLY__ guarded macros. This however comes with the problem that changes to asm/vx-insn.h do not cause a recompile of C files which have only this magic statement instead of a proper include statement. This can be observed with the arch/s390/kernel/fpu.c file. In order to fix this problem and also to avoid that the include must be specified twice, add a wrapper include header file which will do all necessary steps. This way only the vx-insn.h header file needs to be included and changes to the new vx-insn-asm.h header file cause a recompile of all dependent files like it should. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-11-14lib/raid6: drop RAID6_USE_EMPTY_ZERO_PAGEGiulio Benetti
RAID6_USE_EMPTY_ZERO_PAGE is unused and hardcoded to 0, so let's drop it. Signed-off-by: Giulio Benetti <giulio.benetti@benettiengineering.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Song Liu <song@kernel.org>
2022-03-08lib/raid6: Include <asm/ppc-opcode.h> for VPERMXORPaul Menzel
On Ubuntu 21.10 (ppc64le) building raid6test with gcc (Ubuntu 11.2.0-7ubuntu2) 11.2.0 fails with the error below. gcc -I.. -I ../../../include -g -O2 \ -I../../../arch/powerpc/include -DCONFIG_ALTIVEC \ -c -o vpermxor1.o vpermxor1.c vpermxor1.c: In function ‘raid6_vpermxor1_gen_syndrome_real’: vpermxor1.c:64:29: error: expected string literal before ‘VPERMXOR’ 64 | asm(VPERMXOR(%0,%1,%2,%3):"=v"(wq0):"v"(gf_high), "v"(gf_low), "v"(wq0)); | ^~~~~~~~ make: *** [Makefile:58: vpermxor1.o] Error 1 So, include the header asm/ppc-opcode.h defining this macro also when not building the Linux kernel but only this too. Cc: Matt Brown <matthew.brown.dev@gmail.com> Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Song Liu <song@kernel.org>
2022-03-08lib/raid6/test/Makefile: Use $(pound) instead of \# for Make 4.3Paul Menzel
Buidling raid6test on Ubuntu 21.10 (ppc64le) with GNU Make 4.3 shows the errors below: $ cd lib/raid6/test/ $ make <stdin>:1:1: error: stray ‘\’ in program <stdin>:1:2: error: stray ‘#’ in program <stdin>:1:11: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ \ before ‘<’ token [...] The errors come from the HAS_ALTIVEC test, which fails, and the POWER optimized versions are not built. That’s also reason nobody noticed on the other architectures. GNU Make 4.3 does not remove the backslash anymore. From the 4.3 release announcment: > * WARNING: Backward-incompatibility! > Number signs (#) appearing inside a macro reference or function invocation > no longer introduce comments and should not be escaped with backslashes: > thus a call such as: > foo := $(shell echo '#') > is legal. Previously the number sign needed to be escaped, for example: > foo := $(shell echo '\#') > Now this latter will resolve to "\#". If you want to write makefiles > portable to both versions, assign the number sign to a variable: > H := \# > foo := $(shell echo '$H') > This was claimed to be fixed in 3.81, but wasn't, for some reason. > To detect this change search for 'nocomment' in the .FEATURES variable. So, do the same as commit 9564a8cf422d ("Kbuild: fix # escaping in .cmd files for future Make") and commit 929bef467771 ("bpf: Use $(pound) instead of \# in Makefiles") and define and use a $(pound) variable. Reference for the change in make: https://git.savannah.gnu.org/cgit/make.git/commit/?id=c6966b323811c37acedff05b57 Cc: Matt Brown <matthew.brown.dev@gmail.com> Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Song Liu <song@kernel.org>
2022-03-08lib/raid6/test: fix multiple definition linking errorDirk Müller
GCC 10+ defaults to -fno-common, which enforces proper declaration of external references using "extern". without this change a link would fail with: lib/raid6/test/algos.c:28: multiple definition of `raid6_call'; lib/raid6/test/test.c:22: first defined here the pq.h header that is included already includes an extern declaration so we can just remove the redundant one here. Cc: <stable@vger.kernel.org> Signed-off-by: Dirk Müller <dmueller@suse.de> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Song Liu <song@kernel.org>
2022-01-06lib/raid6: Use strict priority ranking for pq gen() benchmarkingDirk Müller
On x86_64, currently 3 variants of AVX512, 3 variants of AVX2 and 3 variants of SSE2 are benchmarked on initialization, taking between 144-153 jiffies. Testing across a hardware pool of various generations of intel cpus I could not find a single case where SSE2 won over AVX2 or AVX512. There are cases where AVX2 wins over AVX512 however. Change "prefer" into an integer priority field (similar to how recov selection works) to have more than one ranking level available, which is backwards compatible with existing behavior. Give AVX2/512 variants higher priority over SSE2 in order to skip SSE testing when AVX is available. in a AVX2/x86_64/HZ=250 case this saves in the order of 200ms of initialization time. Signed-off-by: Dirk Müller <dmueller@suse.de> Acked-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Song Liu <song@kernel.org>
2022-01-06lib/raid6: skip benchmark of non-chosen xor_syndrome functionsDirk Müller
In commit fe5cbc6e06c7 ("md/raid6 algorithms: delta syndrome functions") a xor_syndrome() benchmarking was added also to the raid6_choose_gen() function. However, the results of that benchmarking were intentionally discarded and did not influence the choice. It picked the xor_syndrome() variant related to the best performing gen_syndrome(). Reduce runtime of raid6_choose_gen() without modifying its outcome by only benchmarking the xor_syndrome() of the best gen_syndrome() variant. For a HZ=250 x86_64 system with avx2 and without avx512 this removes 5 out of 6 xor() benchmarks, saving 340ms of raid6 initialization time. Signed-off-by: Dirk Müller <dmueller@suse.de> Signed-off-by: Song Liu <song@kernel.org>
2021-09-22isystem: delete global -isystem compile optionAlexey Dobriyan
Further isolate kernel from userspace, prevent accidental inclusion of undesireable headers, mainly float.h and stdatomic.h. nds32 keeps -isystem globally due to intrinsics used in entrenched header. -isystem is selectively reenabled for some files, again, for intrinsics. Compile tested on: hexagon-defconfig hexagon-allmodconfig alpha-allmodconfig alpha-allnoconfig alpha-defconfig arm64-allmodconfig arm64-allnoconfig arm64-defconfig arm-am200epdkit arm-aspeed_g4 arm-aspeed_g5 arm-assabet arm-at91_dt arm-axm55xx arm-badge4 arm-bcm2835 arm-cerfcube arm-clps711x arm-cm_x300 arm-cns3420vb arm-colibri_pxa270 arm-colibri_pxa300 arm-collie arm-corgi arm-davinci_all arm-dove arm-ep93xx arm-eseries_pxa arm-exynos arm-ezx arm-footbridge arm-gemini arm-h3600 arm-h5000 arm-hackkit arm-hisi arm-imote2 arm-imx_v4_v5 arm-imx_v6_v7 arm-integrator arm-iop32x arm-ixp4xx arm-jornada720 arm-keystone arm-lart arm-lpc18xx arm-lpc32xx arm-lpd270 arm-lubbock arm-magician arm-mainstone arm-milbeaut_m10v arm-mini2440 arm-mmp2 arm-moxart arm-mps2 arm-multi_v4t arm-multi_v5 arm-multi_v7 arm-mv78xx0 arm-mvebu_v5 arm-mvebu_v7 arm-mxs arm-neponset arm-netwinder arm-nhk8815 arm-omap1 arm-omap2plus arm-orion5x arm-oxnas_v6 arm-palmz72 arm-pcm027 arm-pleb arm-pxa arm-pxa168 arm-pxa255-idp arm-pxa3xx arm-pxa910 arm-qcom arm-realview arm-rpc arm-s3c2410 arm-s3c6400 arm-s5pv210 arm-sama5 arm-shannon arm-shmobile arm-simpad arm-socfpga arm-spear13xx arm-spear3xx arm-spear6xx arm-spitz arm-stm32 arm-sunxi arm-tct_hammer arm-tegra arm-trizeps4 arm-u8500 arm-versatile arm-vexpress arm-vf610m4 arm-viper arm-vt8500_v6_v7 arm-xcep arm-zeus csky-allmodconfig csky-allnoconfig csky-defconfig h8300-edosk2674 h8300-h8300h-sim h8300-h8s-sim i386-allmodconfig i386-allnoconfig i386-defconfig ia64-allmodconfig ia64-allnoconfig ia64-bigsur ia64-generic ia64-gensparse ia64-tiger ia64-zx1 m68k-amcore m68k-amiga m68k-apollo m68k-atari m68k-bvme6000 m68k-hp300 m68k-m5208evb m68k-m5249evb m68k-m5272c3 m68k-m5275evb m68k-m5307c3 m68k-m5407c3 m68k-m5475evb m68k-mac m68k-multi m68k-mvme147 m68k-mvme16x m68k-q40 m68k-stmark2 m68k-sun3 m68k-sun3x microblaze-allmodconfig microblaze-allnoconfig microblaze-mmu mips-ar7 mips-ath25 mips-ath79 mips-bcm47xx mips-bcm63xx mips-bigsur mips-bmips_be mips-bmips_stb mips-capcella mips-cavium_octeon mips-ci20 mips-cobalt mips-cu1000-neo mips-cu1830-neo mips-db1xxx mips-decstation mips-decstation_64 mips-decstation_r4k mips-e55 mips-fuloong2e mips-gcw0 mips-generic mips-gpr mips-ip22 mips-ip27 mips-ip28 mips-ip32 mips-jazz mips-jmr3927 mips-lemote2f mips-loongson1b mips-loongson1c mips-loongson2k mips-loongson3 mips-malta mips-maltaaprp mips-malta_kvm mips-malta_qemu_32r6 mips-maltasmvp mips-maltasmvp_eva mips-maltaup mips-maltaup_xpa mips-mpc30x mips-mtx1 mips-nlm_xlp mips-nlm_xlr mips-omega2p mips-pic32mzda mips-pistachio mips-qi_lb60 mips-rb532 mips-rbtx49xx mips-rm200 mips-rs90 mips-rt305x mips-sb1250_swarm mips-tb0219 mips-tb0226 mips-tb0287 mips-vocore2 mips-workpad mips-xway nds32-allmodconfig nds32-allnoconfig nds32-defconfig nios2-10m50 nios2-3c120 nios2-allmodconfig nios2-allnoconfig openrisc-allmodconfig openrisc-allnoconfig openrisc-or1klitex openrisc-or1ksim openrisc-simple_smp parisc-allnoconfig parisc-generic-32bit parisc-generic-64bit powerpc-acadia powerpc-adder875 powerpc-akebono powerpc-amigaone powerpc-arches powerpc-asp8347 powerpc-bamboo powerpc-bluestone powerpc-canyonlands powerpc-cell powerpc-chrp32 powerpc-cm5200 powerpc-currituck powerpc-ebony powerpc-eiger powerpc-ep8248e powerpc-ep88xc powerpc-fsp2 powerpc-g5 powerpc-gamecube powerpc-ge_imp3a powerpc-holly powerpc-icon powerpc-iss476-smp powerpc-katmai powerpc-kilauea powerpc-klondike powerpc-kmeter1 powerpc-ksi8560 powerpc-linkstation powerpc-lite5200b powerpc-makalu powerpc-maple powerpc-mgcoge powerpc-microwatt powerpc-motionpro powerpc-mpc512x powerpc-mpc5200 powerpc-mpc7448_hpc2 powerpc-mpc8272_ads powerpc-mpc8313_rdb powerpc-mpc8315_rdb powerpc-mpc832x_mds powerpc-mpc832x_rdb powerpc-mpc834x_itx powerpc-mpc834x_itxgp powerpc-mpc834x_mds powerpc-mpc836x_mds powerpc-mpc836x_rdk powerpc-mpc837x_mds powerpc-mpc837x_rdb powerpc-mpc83xx powerpc-mpc8540_ads powerpc-mpc8560_ads powerpc-mpc85xx_cds powerpc-mpc866_ads powerpc-mpc885_ads powerpc-mvme5100 powerpc-obs600 powerpc-pasemi powerpc-pcm030 powerpc-pmac32 powerpc-powernv powerpc-ppa8548 powerpc-ppc40x powerpc-ppc44x powerpc-ppc64 powerpc-ppc64e powerpc-ppc6xx powerpc-pq2fads powerpc-ps3 powerpc-pseries powerpc-rainier powerpc-redwood powerpc-sam440ep powerpc-sbc8548 powerpc-sequoia powerpc-skiroot powerpc-socrates powerpc-storcenter powerpc-stx_gp3 powerpc-taishan powerpc-tqm5200 powerpc-tqm8540 powerpc-tqm8541 powerpc-tqm8548 powerpc-tqm8555 powerpc-tqm8560 powerpc-tqm8xx powerpc-walnut powerpc-warp powerpc-wii powerpc-xes_mpc85xx riscv-allmodconfig riscv-allnoconfig riscv-nommu_k210 riscv-nommu_k210_sdcard riscv-nommu_virt riscv-rv32 s390-allmodconfig s390-allnoconfig s390-debug s390-zfcpdump sh-ap325rxa sh-apsh4a3a sh-apsh4ad0a sh-dreamcast sh-ecovec24 sh-ecovec24-romimage sh-edosk7705 sh-edosk7760 sh-espt sh-hp6xx sh-j2 sh-kfr2r09 sh-kfr2r09-romimage sh-landisk sh-lboxre2 sh-magicpanelr2 sh-microdev sh-migor sh-polaris sh-r7780mp sh-r7785rp sh-rsk7201 sh-rsk7203 sh-rsk7264 sh-rsk7269 sh-rts7751r2d1 sh-rts7751r2dplus sh-sdk7780 sh-sdk7786 sh-se7206 sh-se7343 sh-se7619 sh-se7705 sh-se7712 sh-se7721 sh-se7722 sh-se7724 sh-se7750 sh-se7751 sh-se7780 sh-secureedge5410 sh-sh03 sh-sh2007 sh-sh7710voipgw sh-sh7724_generic sh-sh7757lcr sh-sh7763rdp sh-sh7770_generic sh-sh7785lcr sh-sh7785lcr_32bit sh-shmin sh-shx3 sh-titan sh-ul2 sh-urquell sparc-allmodconfig sparc-allnoconfig sparc-sparc32 sparc-sparc64 um-i386-allmodconfig um-i386-allnoconfig um-i386-defconfig um-x86_64-allmodconfig um-x86_64-allnoconfig x86_64-allmodconfig x86_64-allnoconfig x86_64-defconfig xtensa-allmodconfig xtensa-allnoconfig xtensa-audio_kc705 xtensa-cadence_csp xtensa-common xtensa-generic_kc705 xtensa-iss xtensa-nommu_kc705 xtensa-smp_lx200 xtensa-virt xtensa-xip_kc705 Tested-by: Nathan Chancellor <nathan@kernel.org> # build (hexagon) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2021-01-04lib/raid6: Let $(UNROLL) rules work with macOS userlandJohn Millikin
Older versions of BSD awk are fussy about the order of '-v' and '-f' flags, and require a space after the flag name. This causes build failures on platforms with an old awk, such as macOS and NetBSD. Since GNU awk and modern versions of BSD awk (distributed with FreeBSD/OpenBSD) are fine with either form, the definition of 'cmd_unroll' can be trivially tweaked to let the lib/raid6 Makefile work with both old and new awk flag dialects. Signed-off-by: John Millikin <john@john-millikin.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-04-09x86: update AS_* macros to binutils >=2.23, supporting ADX and AVX2Jason A. Donenfeld
Now that the kernel specifies binutils 2.23 as the minimum version, we can remove ifdefs for AVX2 and ADX throughout. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-04-09x86: remove always-defined CONFIG_AS_SSSE3Masahiro Yamada
CONFIG_AS_SSSE3 was introduced by commit 75aaf4c3e6a4 ("x86/raid6: correctly check for assembler capabilities"). We raise the minimal supported binutils version from time to time. The last bump was commit 1fb12b35e5ff ("kbuild: Raise the minimum required binutils version to 2.21"). I confirmed the code in $(call as-instr,...) can be assembled by the binutils 2.21 assembler and also by LLVM integrated assembler. Remove CONFIG_AS_SSSE3, which is always defined. I added ifdef CONFIG_X86 to lib/raid6/algos.c to avoid link errors on non-x86 architectures. lib/raid6/algos.c is built not only for the kernel but also for testing the library code from userspace. I added -DCONFIG_X86 to lib/raid6/test/Makefile to cator to this usecase. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Ingo Molnar <mingo@kernel.org>
2020-04-09lib/raid6/test: fix build on distros whose /bin/sh is not bashMasahiro Yamada
You can build a user-space test program for the raid6 library code, like this: $ cd lib/raid6/test $ make The command in $(shell ...) function is evaluated by /bin/sh by default. (or, you can specify the shell by passing SHELL=<shell> from command line) Currently '>&/dev/null' is used to sink both stdout and stderr. Because this code is bash-ism, it only works when /bin/sh is a symbolic link to bash (this is the case on RHEL etc.) This does not work on Ubuntu where /bin/sh is a symbolic link to dash. I see lots of /bin/sh: 1: Syntax error: Bad fd number and warning "your version of binutils lacks ... support" Replace it with portable '>/dev/null 2>&1'. Fixes: 4f8c55c5ad49 ("lib/raid6: build proper files on corresponding arch") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: H. Peter Anvin (Intel) <hpa@zytor.com> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2020-03-25.gitignore: add SPDX License IdentifierMasahiro Yamada
Add SPDX License Identifier to all .gitignore files. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-04kbuild: rename hostprogs-y/always to hostprogs/always-yMasahiro Yamada
In old days, the "host-progs" syntax was used for specifying host programs. It was renamed to the current "hostprogs-y" in 2004. It is typically useful in scripts/Makefile because it allows Kbuild to selectively compile host programs based on the kernel configuration. This commit renames like follows: always -> always-y hostprogs-y -> hostprogs So, scripts/Makefile will look like this: always-$(CONFIG_BUILD_BIN2C) += ... always-$(CONFIG_KALLSYMS) += ... ... hostprogs := $(always-y) $(always-m) I think this makes more sense because a host program is always a host program, irrespective of the kernel configuration. We want to specify which ones to compile by CONFIG options, so always-y will be handier. The "always", "hostprogs-y", "hostprogs-m" will be kept for backward compatibility for a while. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-01-13md/raid6: fix algorithm choice under larger PAGE_SIZEZhengyuan Liu
There are several algorithms available for raid6 to generate xor and syndrome parity, including basic int1, int2 ... int32 and SIMD optimized implementation like sse and neon. To test and choose the best algorithms at the initial stage, we need provide enough disk data to feed the algorithms. However, the disk number we provided depends on page size and gfmul table, seeing bellow: const int disks = (65536/PAGE_SIZE) + 2; So when come to 64K PAGE_SIZE, there is only one data disk plus 2 parity disk, as a result the chosed algorithm is not reliable. For example, on my arm64 machine with 64K page enabled, it will choose intx32 as the best one, although the NEON implementation is better. This patch tries to fix the problem by defining a constant raid6 disk number to supporting arbitrary page size. Suggested-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13raid6/test: fix a compilation warningZhengyuan Liu
The compilation warning is redefination showed as following: In file included from tables.c:2: ../../../include/linux/export.h:180: warning: "EXPORT_SYMBOL" redefined #define EXPORT_SYMBOL(sym) __EXPORT_SYMBOL(sym, "") In file included from tables.c:1: ../../../include/linux/raid/pq.h:61: note: this is the location of the previous definition #define EXPORT_SYMBOL(sym) Fixes: 69a94abb82ee ("export.h, genksyms: do not make genksyms calculate CRC of trimmed symbols") Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Signed-off-by: Song Liu <songliubraving@fb.com>
2019-12-09lib: raid6: fix awk build warningsGreg Kroah-Hartman
Newer versions of awk spit out these fun warnings: awk: ../lib/raid6/unroll.awk:16: warning: regexp escape sequence `\#' is not a known regexp operator As commit 700c1018b86d ("x86/insn: Fix awk regexp warnings") showed, it turns out that there are a number of awk strings that do not need to be escaped and newer versions of awk now warn about this. Fix the string up so that no warning is produced. The exact same kernel module gets created before and after this patch, showing that it wasn't needed. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20191206152600.GA75093@kroah.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-01lib/raid6: fix unnecessary rebuild of vpermxor*.cMasahiro Yamada
The following four files are every time rebuilt: UNROLL lib/raid6/vpermxor1.c UNROLL lib/raid6/vpermxor2.c UNROLL lib/raid6/vpermxor4.c UNROLL lib/raid6/vpermxor8.c Fix the suffixes in the targets. Fixes: 72ad21075df8 ("lib/raid6: refactor unroll rules with pattern rules") Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-07-12Merge tag 'kbuild-v5.3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - remove headers_{install,check}_all targets - remove unreasonable 'depends on !UML' from CONFIG_SAMPLES - re-implement 'make headers_install' more cleanly - add new header-test-y syntax to compile-test headers - compile-test exported headers to ensure they are compilable in user-space - compile-test headers under include/ to ensure they are self-contained - remove -Waggregate-return, -Wno-uninitialized, -Wno-unused-value flags - add -Werror=unknown-warning-option for Clang - add 128-bit built-in types support to genksyms - fix missed rebuild of modules.builtin - propagate 'No space left on device' error in fixdep to Make - allow Clang to use its integrated assembler - improve some coccinelle scripts - add a new flag KBUILD_ABS_SRCTREE to request Kbuild to use absolute path for $(srctree). - do not ignore errors when compression utility is missing - misc cleanups * tag 'kbuild-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (49 commits) kbuild: use -- separater intead of $(filter-out ...) for cc-cross-prefix kbuild: Inform user to pass ARCH= for make mrproper kbuild: fix compression errors getting ignored kbuild: add a flag to force absolute path for srctree kbuild: replace KBUILD_SRCTREE with boolean building_out_of_srctree kbuild: remove src and obj from the top Makefile scripts/tags.sh: remove unused environment variables from comments scripts/tags.sh: drop SUBARCH support for ARM kbuild: compile-test kernel headers to ensure they are self-contained kheaders: include only headers into kheaders_data.tar.xz kheaders: remove meaningless -R option of 'ls' kbuild: support header-test-pattern-y kbuild: do not create wrappers for header-test-y kbuild: compile-test exported headers to ensure they are self-contained init/Kconfig: add CONFIG_CC_CAN_LINK kallsyms: exclude kasan local symbols on s390 kbuild: add more hints about SUBDIRS replacement coccinelle: api/stream_open: treat all wait_.*() calls as blocking coccinelle: put_device: Add a cast to an expression for an assignment coccinelle: put_device: Adjust a message construction ...
2019-07-08Merge tag 's390-5.3-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Improve stop_machine wait logic: replace cpu_relax_yield call in generic stop_machine function with a weak stop_machine_yield function. This is overridden on s390, which yields the current cpu to the neighbouring cpu after a couple of retries, instead of blindly giving up the cpu to the hipervisor. This significantly improves stop_machine performance on s390 in overcommitted scenarios. This includes common code changes which have been Acked by Peter Zijlstra and Thomas Gleixner. - Improve jump label transformation speed: transform jump labels without using stop_machine. - Refactoring of the vfio-ccw cp handling, simplifying the code and avoiding unneeded allocating/copying. - Various vfio-ccw fixes (ccw translation, state machine). - Add support for vfio-ap queue interrupt control in the guest. This includes s390 kvm changes which have been Acked by Christian Borntraeger. - Add protected virtualization support for virtio-ccw. - Enforce both CONFIG_SMP and CONFIG_HOTPLUG_CPU, which allows to remove some code which most likely isn't working at all, besides that s390 didn't even compile for !CONFIG_SMP. - Support for special flagged EP11 CPRBs for zcrypt. - Handle PCI devices with no support for new MIO instructions. - Avoid KASAN false positives in reworked stack unwinder. - Couple of fixes for the QDIO layer. - Convert s390 specific documentation to ReST format. - Let s390 crypto modules return -ENODEV instead of -EOPNOTSUPP if hardware is missing. This way our modules behave like most other modules and which is also what systemd's systemd-modules-load.service expects. - Replace defconfig with performance_defconfig, so there is one config file less to maintain. - Remove the SCLP call home device driver, which was never useful. - Cleanups all over the place. * tag 's390-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (83 commits) docs: s390: s390dbf: typos and formatting, update crash command docs: s390: unify and update s390dbf kdocs at debug.c docs: s390: restore important non-kdoc parts of s390dbf.rst vfio-ccw: Fix the conversion of Format-0 CCWs to Format-1 s390/pci: correctly handle MIO opt-out s390/pci: deal with devices that have no support for MIO instructions s390: ap: kvm: Enable PQAP/AQIC facility for the guest s390: ap: implement PAPQ AQIC interception in kernel vfio: ap: register IOMMU VFIO notifier s390: ap: kvm: add PQAP interception for AQIC s390/unwind: cleanup unused READ_ONCE_TASK_STACK s390/kasan: avoid false positives during stack unwind s390/qdio: don't touch the dsci in tiqdio_add_input_queues() s390/qdio: (re-)initialize tiqdio list entries s390/dasd: Fix a precision vs width bug in dasd_feature_list() s390/cio: introduce driver_override on the css bus vfio-ccw: make convert_ccw0_to_ccw1 static vfio-ccw: Remove copy_ccw_from_iova() vfio-ccw: Factor out the ccw0-to-ccw1 transition vfio-ccw: Copy CCW data outside length calculation ...
2019-06-24lib/raid6: refactor unroll rules with pattern rulesMasahiro Yamada
This Makefile repeats very similar rules. Let's use pattern rules. $(UNROLL) can be replaced with $*. No intended change in behavior. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-06-24lib/raid6: remove duplicated CFLAGS_REMOVE_altivec8.oMasahiro Yamada
No intended change in behavior. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-06-19treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500Thomas Gleixner
Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-11RAID/s390: remove invalid 'r' inline asm operand modifierVasily Gorbik
gcc silently ignores unsupported inline asm operand modifiers, effectively turning '%r0' into '%0', but upcoming clang 9 complains about them: lib/raid6/s390vx8.c:63:16: error: invalid operand in inline asm: 'VLM $2,$3,0,${1:r}' asm volatile ("VLM %2,%3,0,%r1" ^ Clean up what look like a typo 'r' inline asm operand modifier usage. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-06-05treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441Thomas Gleixner
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation version 2 of the license extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 315 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Armijn Hemel <armijn@tjaldur.nl> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190531190115.503150771@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 83Thomas Gleixner
Based on 1 normalized pattern(s): this file is part of the linux kernel and is made available under the terms of the gnu general public license version 2 or at your option any later version incorporated herein by reference extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 18 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Armijn Hemel <armijn@tjaldur.nl> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190520075211.321157221@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 48Thomas Gleixner
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation inc 53 temple place ste 330 boston ma 02111 1307 usa either version 2 of the license or at your option any later version incorporated herein by reference extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 13 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190520170858.645641371@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-15Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM updates from Russell King: - An improvement from Ard Biesheuvel, who noted that the identity map setup was taking a long time due to flush_cache_louis(). - Update a comment about dma_ops from Wolfram Sang. - Remove use of "-p" with ld, where this flag has been a no-op since 2004. - Remove the printing of the virtual memory layout, which is no longer useful since we hide pointers. - Correct SCU help text. - Remove legacy TWD registration method. - Add pgprot_device() implementation for mapping PCI sysfs resource files. - Initialise PFN limits earlier for kmemleak. - Fix argument count to match macro definition (affects clang builds) - Use unified assembler language almost everywhere for clang, and other clang improvements (from Stefan Agner, Nathan Chancellor). - Support security extension for noMMU and other noMMU cleanups (from Vladimir Murzin). - Remove unnecessary SMP bringup code (which was incorrectly copy'n' pasted from the ARM platform implementations) and remove it from the arch code to discourge further copys of it appearing. - Add Cortex A9 erratum preventing kexec working on some SoCs. - AMBA bus identification updates from Mike Leach. - More use of raw spinlocks to avoid -RT kernel issues (from Yang Shi and Sebastian Andrzej Siewior). - MCPM hyp/svc mode mismatch fixes from Marek Szyprowski. * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: (32 commits) ARM: 8849/1: NOMMU: Fix encodings for PMSAv8's PRBAR4/PRLAR4 ARM: 8848/1: virt: Align GIC version check with arm64 counterpart ARM: 8847/1: pm: fix HYP/SVC mode mismatch when MCPM is used ARM: 8845/1: use unified assembler in c files ARM: 8844/1: use unified assembler in assembly files ARM: 8843/1: use unified assembler in headers ARM: 8841/1: use unified assembler in macros ARM: 8840/1: use a raw_spinlock_t in unwind ARM: 8839/1: kprobe: make patch_lock a raw_spinlock_t ARM: 8837/1: coresight: etmv4: Update ID register table to add UCI support ARM: 8836/1: drivers: amba: Update component matching to use the CoreSight UCI values. ARM: 8838/1: drivers: amba: Updates to component identification for driver matching. ARM: 8833/1: Ensure that NEON code always compiles with Clang ARM: avoid Cortex-A9 livelock on tight dmb loops ARM: smp: remove arch-provided "pen_release" ARM: actions: remove boot_lock and pen_release ARM: oxnas: remove CPU hotplug implementation ARM: qcom: remove unnecessary boot_lock ARM: 8832/1: NOMMU: Limit visibility for CONFIG_FLASH_{MEM_BASE,SIZE} ARM: 8831/1: NOMMU: pmsa-v8: remove unneeded semicolon ...
2019-02-28lib/raid6: arm: optimize away a mask operation in NEON recovery routineArd Biesheuvel
The NEON recovery code was modeled after the x86 SIMD code, and for some reason, that code uses a 16 bit wide signed shift and a mask to perform what amounts to a 8 bit unsigned shift. So fold the ops together. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-02-28lib/raid6: use vdupq_n_u8 to avoid endianness warningsndesaulniers@google.com
Clang warns: vector initializers are not compatible with NEON intrinsics in big endian mode [-Wnonportable-vector-initialization] While this is usually the case, it's not an issue for this case since we're initializing the uint8x16_t (16x uint8_t's) with the same value. Instead, use vdupq_n_u8 which both compilers lower into a single movi instruction: https://godbolt.org/z/vBrgzt This avoids the static storage for a constant value. Link: https://github.com/ClangBuiltLinux/linux/issues/214 Suggested-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-02-12ARM: 8833/1: Ensure that NEON code always compiles with ClangNathan Chancellor
While building arm32 allyesconfig, I ran into the following errors: arch/arm/lib/xor-neon.c:17:2: error: You should compile this file with '-mfloat-abi=softfp -mfpu=neon' In file included from lib/raid6/neon1.c:27: /home/nathan/cbl/prebuilt/lib/clang/8.0.0/include/arm_neon.h:28:2: error: "NEON support not enabled" Building V=1 showed NEON_FLAGS getting passed along to Clang but __ARM_NEON__ was not getting defined. Ultimately, it boils down to Clang only defining __ARM_NEON__ when targeting armv7, rather than armv6k, which is the '-march' value for allyesconfig. >From lib/Basic/Targets/ARM.cpp in the Clang source: // This only gets set when Neon instructions are actually available, unlike // the VFP define, hence the soft float and arch check. This is subtly // different from gcc, we follow the intent which was that it should be set // when Neon instructions are actually available. if ((FPU & NeonFPU) && !SoftFloat && ArchVersion >= 7) { Builder.defineMacro("__ARM_NEON", "1"); Builder.defineMacro("__ARM_NEON__"); // current AArch32 NEON implementations do not support double-precision // floating-point even when it is present in VFP. Builder.defineMacro("__ARM_NEON_FP", "0x" + Twine::utohexstr(HW_FP & ~HW_FP_DP)); } Ard Biesheuvel recommended explicitly adding '-march=armv7-a' at the beginning of the NEON_FLAGS definitions so that __ARM_NEON__ always gets definined by Clang. This doesn't functionally change anything because that code will only run where NEON is supported, which is implicitly armv7. Link: https://github.com/ClangBuiltLinux/linux/issues/287 Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: Nicolas Pitre <nico@linaro.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Stefan Agner <stefan@agner.ch> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2019-01-06Merge tag 'kbuild-v4.21-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull more Kbuild updates from Masahiro Yamada: - improve boolinit.cocci and use_after_iter.cocci semantic patches - fix alignment for kallsyms - move 'asm goto' compiler test to Kconfig and clean up jump_label CONFIG option - generate asm-generic wrappers automatically if arch does not implement mandatory UAPI headers - remove redundant generic-y defines - misc cleanups * tag 'kbuild-v4.21-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kconfig: rename generated .*conf-cfg to *conf-cfg kbuild: remove unnecessary stubs for archheader and archscripts kbuild: use assignment instead of define ... endef for filechk_* rules arch: remove redundant UAPI generic-y defines kbuild: generate asm-generic wrappers if mandatory headers are missing arch: remove stale comments "UAPI Header export list" riscv: remove redundant kernel-space generic-y kbuild: change filechk to surround the given command with { } kbuild: remove redundant target cleaning on failure kbuild: clean up rule_dtc_dt_yaml kbuild: remove UIMAGE_IN and UIMAGE_OUT jump_label: move 'asm goto' support test to Kconfig kallsyms: lower alignment on ARM scripts: coccinelle: boolinit: drop warnings on named constants scripts: coccinelle: check for redeclaration kconfig: remove unused "file" field of yylval union nds32: remove redundant kernel-space generic-y nios2: remove unneeded HAS_DMA define
2019-01-06kbuild: remove redundant target cleaning on failureMasahiro Yamada
Since commit 9c2af1c7377a ("kbuild: add .DELETE_ON_ERROR special target"), the target file is automatically deleted on failure. The boilerplate code ... || { rm -f $@; false; } is unneeded. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-01-03Merge branch 'for-next' of ↵Jens Axboe
git://git.kernel.org/pub/scm/linux/kernel/git/shli/md into for-linus Pull the pending 4.21 changes for md from Shaohua. * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md: md: fix raid10 hang issue caused by barrier raid10: refactor common wait code from regular read/write request md: remvoe redundant condition check lib/raid6: add option to skip algo benchmarking lib/raid6: sort algos in rough performance order lib/raid6: check for assembler SSSE3 support lib/raid6: avoid __attribute_const__ redefinition lib/raid6: add missing include for raid6test md: remove set but not used variable 'bi_rdev'
2018-12-20lib/raid6: add option to skip algo benchmarkingDaniel Verkamp
This is helpful for systems where fast startup time is important. It is especially nice to avoid benchmarking RAID functions that are never used (for example, BTRFS selects RAID6_PQ even if the parity RAID mode is not in use). This saves 250+ milliseconds of boot time on modern x86 and ARM systems with a dozen or more available implementations. The new option is defaulted to 'y' to match the previous behavior of always benchmarking on init. Signed-off-by: Daniel Verkamp <dverkamp@chromium.org> Signed-off-by: Shaohua Li <shli@fb.com>
2018-12-20lib/raid6: sort algos in rough performance orderDaniel Verkamp
Sort the list of RAID6 algorithms in roughly decreasing order of expected performance: newer instruction sets first (within each architecture) and wider unrollings first. This doesn't make any difference right now, since all functions are benchmarked; a follow-up change will make use of this by optionally choosing the first valid function rather than testing all of them. The Itanium raid6_intx{16,32} entries are also moved down to be near the other raid6_intx entries for clarity. Signed-off-by: Daniel Verkamp <dverkamp@chromium.org> Signed-off-by: Shaohua Li <shli@fb.com>
2018-12-20lib/raid6: check for assembler SSSE3 supportDaniel Verkamp
Allow the x86 SSSE3 recovery function to be tested in raid6test. Signed-off-by: Daniel Verkamp <dverkamp@chromium.org> Signed-off-by: Shaohua Li <shli@fb.com>
2018-12-20raid6/ppc: Fix build for clangJoel Stanley
We cannot build these files with clang as it does not allow altivec instructions in assembly when -msoft-float is passed. Jinsong Ji <jji@us.ibm.com> wrote: > We currently disable Altivec/VSX support when enabling soft-float. So > any usage of vector builtins will break. > > Enable Altivec/VSX with soft-float may need quite some clean up work, so > I guess this is currently a limitation. > > Removing -msoft-float will make it work (and we are lucky that no > floating point instructions will be generated as well). This is a workaround until the issue is resolved in clang. Link: https://bugs.llvm.org/show_bug.cgi?id=31177 Link: https://github.com/ClangBuiltLinux/linux/issues/239 Signed-off-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-11-06lib/raid6: Fix arm64 test buildJeremy Linton
The lib/raid6/test fails to build the neon objects on arm64 because the correct machine type is 'aarch64'. Once this is correctly enabled, the neon recovery objects need to be added to the build. Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>