summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-01-13net: mvneta: change page pool nid to NUMA_NO_NODELorenzo Bianconi
With 'commit 44768decb7c0 ("page_pool: handle page recycle for NUMA_NO_NODE condition")' we can safely change nid to NUMA_NO_NODE and accommodate future NUMA aware hardware using mvneta network interface Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-01-13tools/bpf: Add runqslower tool to tools/bpfAndrii Nakryiko
Convert one of BCC tools (runqslower [0]) to BPF CO-RE + libbpf. It matches its BCC-based counterpart 1-to-1, supporting all the same parameters and functionality. runqslower tool utilizes BPF skeleton, auto-generated from BPF object file, as well as memory-mapped interface to global (read-only, in this case) data. Its Makefile also ensures auto-generation of "relocatable" vmlinux.h, which is necessary for BTF-typed raw tracepoints with direct memory access. [0] https://github.com/iovisor/bcc/blob/11bf5d02c895df9646c117c713082eb192825293/tools/runqslower.py Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200113073143.1779940-6-andriin@fb.com
2020-01-13bpftool: Apply preserve_access_index attribute to all types in BTF dumpAndrii Nakryiko
This patch makes structs and unions, emitted through BTF dump, automatically CO-RE-relocatable (unless disabled with `#define BPF_NO_PRESERVE_ACCESS_INDEX`, specified before including generated header file). This effectivaly turns usual bpf_probe_read() call into equivalent of bpf_core_read(), by automatically applying builtin_preserve_access_index to any field accesses of types in generated C types header. This is especially useful for tp_btf/fentry/fexit BPF program types. They allow direct memory access, so BPF C code just uses straightfoward a->b->c access pattern to read data from kernel. But without kernel structs marked as CO-RE relocatable through preserve_access_index attribute, one has to enclose all the data reads into a special __builtin_preserve_access_index code block, like so: __builtin_preserve_access_index(({ x = p->pid; /* where p is struct task_struct *, for example */ })); This is very inconvenient and obscures the logic quite a bit. By marking all auto-generated types with preserve_access_index attribute the above code is reduced to just a clean and natural `x = p->pid;`. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200113073143.1779940-5-andriin@fb.com
2020-01-13selftests/bpf: Conform selftests/bpf Makefile output to libbpf and bpftoolAndrii Nakryiko
Bring selftest/bpf's Makefile output to the same format used by libbpf and bpftool: 2 spaces of padding on the left + 8-character left-aligned build step identifier. Also, hide feature detection output by default. Can be enabled back by setting V=1. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200113073143.1779940-4-andriin@fb.com
2020-01-13libbpf: Clean up bpf_helper_defs.h generation outputAndrii Nakryiko
bpf_helpers_doc.py script, used to generate bpf_helper_defs.h, unconditionally emits one informational message to stderr. Remove it and preserve stderr to contain only relevant errors. Also make sure script invocations command is muted by default in libbpf's Makefile. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200113073143.1779940-3-andriin@fb.com
2020-01-13tools: Sync uapi/linux/if_link.hAndrii Nakryiko
Sync uapi/linux/if_link.h into tools to avoid out of sync warnings during libbpf build. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200113073143.1779940-2-andriin@fb.com
2020-01-13Merge branch 'md-next' of ↵Jens Axboe
git://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-5.6/drivers Pull MD changes from Song. * 'md-next' of git://git.kernel.org/pub/scm/linux/kernel/git/song/md: md/raid1: introduce wait_for_serialization md/raid1: use bucket based mechanism for IO serialization md: introduce a new struct for IO serialization md: don't destroy serial_info_pool if serialize_policy is true raid1: serialize the overlap write md: reorgnize mddev_create/destroy_serial_pool md: add serialize_policy sysfs node for raid1 md: prepare for enable raid1 io serialization md: fix a typo s/creat/create md: rename wb stuffs raid5: remove worker_cnt_per_group argument from alloc_thread_groups md/raid6: fix algorithm choice under larger PAGE_SIZE raid6/test: fix a compilation warning raid6/test: fix a compilation error md-bitmap: small cleanups
2020-01-13btrfs: relocation: fix reloc_root lifespan and accessQu Wenruo
[BUG] There are several different KASAN reports for balance + snapshot workloads. Involved call paths include: should_ignore_root+0x54/0xb0 [btrfs] build_backref_tree+0x11af/0x2280 [btrfs] relocate_tree_blocks+0x391/0xb80 [btrfs] relocate_block_group+0x3e5/0xa00 [btrfs] btrfs_relocate_block_group+0x240/0x4d0 [btrfs] btrfs_relocate_chunk+0x53/0xf0 [btrfs] btrfs_balance+0xc91/0x1840 [btrfs] btrfs_ioctl_balance+0x416/0x4e0 [btrfs] btrfs_ioctl+0x8af/0x3e60 [btrfs] do_vfs_ioctl+0x831/0xb10 create_reloc_root+0x9f/0x460 [btrfs] btrfs_reloc_post_snapshot+0xff/0x6c0 [btrfs] create_pending_snapshot+0xa9b/0x15f0 [btrfs] create_pending_snapshots+0x111/0x140 [btrfs] btrfs_commit_transaction+0x7a6/0x1360 [btrfs] btrfs_mksubvol+0x915/0x960 [btrfs] btrfs_ioctl_snap_create_transid+0x1d5/0x1e0 [btrfs] btrfs_ioctl_snap_create_v2+0x1d3/0x270 [btrfs] btrfs_ioctl+0x241b/0x3e60 [btrfs] do_vfs_ioctl+0x831/0xb10 btrfs_reloc_pre_snapshot+0x85/0xc0 [btrfs] create_pending_snapshot+0x209/0x15f0 [btrfs] create_pending_snapshots+0x111/0x140 [btrfs] btrfs_commit_transaction+0x7a6/0x1360 [btrfs] btrfs_mksubvol+0x915/0x960 [btrfs] btrfs_ioctl_snap_create_transid+0x1d5/0x1e0 [btrfs] btrfs_ioctl_snap_create_v2+0x1d3/0x270 [btrfs] btrfs_ioctl+0x241b/0x3e60 [btrfs] do_vfs_ioctl+0x831/0xb10 [CAUSE] All these call sites are only relying on root->reloc_root, which can undergo btrfs_drop_snapshot(), and since we don't have real refcount based protection to reloc roots, we can reach already dropped reloc root, triggering KASAN. [FIX] To avoid such access to unstable root->reloc_root, we should check BTRFS_ROOT_DEAD_RELOC_TREE bit first. This patch introduces wrappers that provide the correct way to check the bit with memory barriers protection. Most callers don't distinguish merged reloc tree and no reloc tree. The only exception is should_ignore_root(), as merged reloc tree can be ignored, while no reloc tree shouldn't. [CRITICAL SECTION ANALYSIS] Although test_bit()/set_bit()/clear_bit() doesn't imply a barrier, the DEAD_RELOC_TREE bit has extra help from transaction as a higher level barrier, the lifespan of root::reloc_root and DEAD_RELOC_TREE bit are: NULL: reloc_root is NULL PTR: reloc_root is not NULL 0: DEAD_RELOC_ROOT bit not set DEAD: DEAD_RELOC_ROOT bit set (NULL, 0) Initial state __ | /\ Section A btrfs_init_reloc_root() \/ | __ (PTR, 0) reloc_root initialized /\ | | btrfs_update_reloc_root() | Section B | | (PTR, DEAD) reloc_root has been merged \/ | __ === btrfs_commit_transaction() ==================== | /\ clean_dirty_subvols() | | | Section C (NULL, DEAD) reloc_root cleanup starts \/ | __ btrfs_drop_snapshot() /\ | | Section D (NULL, 0) Back to initial state \/ Every have_reloc_root() or test_bit(DEAD_RELOC_ROOT) caller holds transaction handle, so none of such caller can cross transaction boundary. In Section A, every caller just found no DEAD bit, and grab reloc_root. In the cross section A-B, caller may get no DEAD bit, but since reloc_root is still completely valid thus accessing reloc_root is completely safe. No test_bit() caller can cross the boundary of Section B and Section C. In Section C, every caller found the DEAD bit, so no one will access reloc_root. In the cross section C-D, either caller gets the DEAD bit set, avoiding access reloc_root no matter if it's safe or not. Or caller get the DEAD bit cleared, then access reloc_root, which is already NULL, nothing will be wrong. The memory write barriers are between the reloc_root updates and bit set/clear, the pairing read side is before test_bit. Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org> Fixes: d2311e698578 ("btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots") CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> [ barriers ] Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-13iio: imu: adis: use new `delay` structure for SPI transfer delaysAlexandru Ardelean
In a recent change to the SPI subsystem [1], a new `delay` struct was added to replace the `delay_usecs`. This change replaces the current `delay_usecs` with `delay` for this driver. The `spi_transfer_delay_exec()` function [in the SPI framework] makes sure that both `delay_usecs` & `delay` are used (in this order to preserve backwards compatibility). [1] commit bebcfd272df6485 ("spi: introduce `delay` field for `spi_transfer` + spi_transfer_delay_exec()") Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2020-01-13iio: adc: stm32-dfsdm: adapt sampling rate to oversampling ratioOlivier Moysan
Update sampling rate when oversampling ratio is changed through the IIO ABI. Signed-off-by: Olivier Moysan <olivier.moysan@st.com> Acked-by: Fabrice Gasnier <fabrice.gasnier@st.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2020-01-13iio: adc: stm32-dfsdm: fix single conversionOlivier Moysan
Apply data formatting to single conversion, as this is already done in continuous and trigger modes. Fixes: 102afde62937 ("iio: adc: stm32-dfsdm: manage data resolution in trigger mode") Signed-off-by: Olivier Moysan <olivier.moysan@st.com> Cc: <Stable@vger.kernel.org> Acked-by: Fabrice Gasnier <fabrice.gasnier@st.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2020-01-13iio: st_sensors: Make use of device propertiesAndy Shevchenko
Device property API allows to gather device resources from different sources, such as ACPI. Convert the drivers to unleash the power of device property API. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2020-01-13iio: st_sensors: Drop redundant parameter from st_sensors_of_name_probe()Andy Shevchenko
Since we have access to the struct device_driver and thus to the ID table, there is no need to supply special parameters to st_sensors_of_name_probe(). Besides that we have a common API to get driver match data, there is no need to do matching separately for OF and ACPI. Taking into consideration above, simplify the ST sensors code. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2020-01-13iio: st_gyro: Correct data for LSM9DS0 gyroAndy Shevchenko
The commit 41c128cb25ce ("iio: st_gyro: Add lsm9ds0-gyro support") assumes that gyro in LSM9DS0 is the same as others with 0xd4 WAI ID, but datasheet tells slight different story, i.e. the first scale factor for the chip is 245 dps, and not 250 dps. Correct this by introducing a separate settings for LSM9DS0. Fixes: 41c128cb25ce ("iio: st_gyro: Add lsm9ds0-gyro support") Depends-on: 45a4e4220bf4 ("iio: gyro: st_gyro: fix L3GD20H support") Cc: Leonard Crestez <leonard.crestez@nxp.com> Cc: Lorenzo Bianconi <lorenzo.bianconi83@gmail.com> Cc: <Stable@vger.kernel.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2020-01-13iio: imu: st_lsm6dsx: check return value from st_lsm6dsx_sensor_set_enableLorenzo Bianconi
Add missing return value check in st_lsm6dsx_read_oneshot disabling the sensor. The issue is reported by coverity with the following error: Unchecked return value: If the function returns an error value, the error value may be mistaken for a normal value. Addresses-Coverity-ID: 1446733 ("Unchecked return value") Fixes: b5969abfa8b8 ("iio: imu: st_lsm6dsx: add motion events") Fixes: 290a6ce11d93 ("iio: imu: add support to lsm6dsx driver") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2020-01-13pinctrl: dt-bindings: Fix some errors in the lgm and pinmux schemaLinus Walleij
This fixes some problems that caused build errors in the lgm-io schema file: - No "bindings" infix in the schema id - Move the allOf inclusion for pinconf and pinmux nodes into the patternProperties for the -pins node - We want "groups" not "group" to be compulsory for a pinmux node blended with a pin config node. - Fix the generic pinmux-schema to list "groups" rather than "group" for a pinmux node, this might have led to some confusion. This is a first user of the generic schema so a bit of a bumpy road. Cc: Rob Herring <robh+dt@kernel.org> Cc: Rahul Tanwar <rahul.tanwar@linux.intel.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2020-01-13dt-bindings: gpio: wcd934x: Add bindings for gpioSrinivas Kandagatla
Qualcomm Technologies Inc WCD9340/WCD9341 Audio Codec has integrated gpio controller to control 5 gpios on the chip. This patch adds required device tree bindings for it. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20200107130844.20763-2-srinivas.kandagatla@linaro.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2020-01-13md/raid1: introduce wait_for_serializationGuoqing Jiang
Previously, we call check_and_add_serial when serialization is enabled for write IO, but it could allocate and free memory back and forth. Now, let's just get an element from memory pool with the new function, then insert node to rb tree if no collision happens. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13md/raid1: use bucket based mechanism for IO serializationGuoqing Jiang
Since raid1 had already used bucket based mechanism to reduce the conflict between write IO and resync IO, it is possible to speed up performance for io serialization with refer to the same mechanism. To align with the barrier bucket mechanism, we created arrays (with the same number of BARRIER_BUCKETS_NR) for spinlock, rb tree and waitqueue. Then we can reduce lock competition with multiple spinlocks, boost search performance with multiple rb trees and also reduce thundering herd problem with multiple waitqueues. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13md: introduce a new struct for IO serializationGuoqing Jiang
Obviously, IO serialization could cause the degradation of performance a lot. In order to reduce the degradation, so a rb interval tree is added in raid1 to speed up the check of collision. So, a rb root is needed in md_rdev, then abstract all the serialize related members to a new struct (serial_in_rdev), embed it into md_rdev. Of course, we need to free the struct if it is not needed anymore, so rdev/rdevs_uninit_serial are added accordingly. And they should be called when destroty memory pool or can't alloc memory. And we need to consider to call mddev_destroy_serial_pool in case serialize_policy/write-behind is disabled, bitmap is destroyed or in __md_stop_writes. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13md: don't destroy serial_info_pool if serialize_policy is trueGuoqing Jiang
The serial_info_pool is needed if array sets serialize_policy to true, so don't destroy it. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13raid1: serialize the overlap writeGuoqing Jiang
Before dispatch write bio, raid1 array which enables serialize_policy need to check if overlap exists between this bio and previous on-flying bios. If there is overlap, then it has to wait until the collision is disappeared. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13md: reorgnize mddev_create/destroy_serial_poolGuoqing Jiang
So far, IO serialization is used for two scenarios: 1. raid1 which enables write-behind mode, and there is rdev in the array which is multi-queue device and flaged with writemostly. 2. IO serialization is enabled or disabled by change serialize_policy. So introduce rdev_need_serial to check the first scenario. And for 1, IO serialization is enabled automatically while 2 is controlled manually. And it is possible that both scenarios are true, so for create serial pool, rdev/rdevs_init_serial should be separate from check if the pool existed or not. Then for destroy pool, we need to check if the pool is needed by other rdevs due to the first scenario. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13md: add serialize_policy sysfs node for raid1Guoqing Jiang
With the new sysfs node, we can use it to control if raid1 array wants io serialization or not. So mddev_create_serial_pool and mddev_destroy_serial_pool are called in serialize_policy_store to enable or disable the serialization. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13md: prepare for enable raid1 io serializationGuoqing Jiang
1. The related resources (spin_lock, list and waitqueue) are needed for address raid1 reorder overlap issue too, in this case, rdev is set to NULL for mddev_create/destroy_serial_pool which implies all rdevs need to handle these resources. And also add "is_suspend" to mddev_destroy_serial_pool since it will be called under suspended situation, which also makes both create and destroy pool have same arguments. 2. Introduce rdevs_init_serial which is called if raid1 io serialization is enabled since all rdevs need to init related stuffs. 3. rdev_init_serial and clear_bit(CollisionCheck, &rdev->flags) should be called between suspend and resume. No need to export mddev_create_serial_pool since it is only called in md-mod module. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13md: fix a typo s/creat/createGuoqing Jiang
It actually means create here, so fix the typo. Reported-by: Song Liu <liu.song.a23@gmail.com> Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13md: rename wb stuffsGuoqing Jiang
Previously, wb_info_pool and wb_list stuffs are introduced to address potential data inconsistence issue for write behind device. Now rename them to serial related name, since the same mechanism will be used to address reorder overlap write issue for raid1. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13raid5: remove worker_cnt_per_group argument from alloc_thread_groupsGuoqing Jiang
We can use "cnt" directly to update conf->worker_cnt_per_group if alloc_thread_groups returns 0. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13md/raid6: fix algorithm choice under larger PAGE_SIZEZhengyuan Liu
There are several algorithms available for raid6 to generate xor and syndrome parity, including basic int1, int2 ... int32 and SIMD optimized implementation like sse and neon. To test and choose the best algorithms at the initial stage, we need provide enough disk data to feed the algorithms. However, the disk number we provided depends on page size and gfmul table, seeing bellow: const int disks = (65536/PAGE_SIZE) + 2; So when come to 64K PAGE_SIZE, there is only one data disk plus 2 parity disk, as a result the chosed algorithm is not reliable. For example, on my arm64 machine with 64K page enabled, it will choose intx32 as the best one, although the NEON implementation is better. This patch tries to fix the problem by defining a constant raid6 disk number to supporting arbitrary page size. Suggested-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13raid6/test: fix a compilation warningZhengyuan Liu
The compilation warning is redefination showed as following: In file included from tables.c:2: ../../../include/linux/export.h:180: warning: "EXPORT_SYMBOL" redefined #define EXPORT_SYMBOL(sym) __EXPORT_SYMBOL(sym, "") In file included from tables.c:1: ../../../include/linux/raid/pq.h:61: note: this is the location of the previous definition #define EXPORT_SYMBOL(sym) Fixes: 69a94abb82ee ("export.h, genksyms: do not make genksyms calculate CRC of trimmed symbols") Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13raid6/test: fix a compilation errorZhengyuan Liu
The compilation error is redeclaration showed as following: In file included from ../../../include/linux/limits.h:6, from /usr/include/x86_64-linux-gnu/bits/local_lim.h:38, from /usr/include/x86_64-linux-gnu/bits/posix1_lim.h:161, from /usr/include/limits.h:183, from /usr/lib/gcc/x86_64-linux-gnu/8/include-fixed/limits.h:194, from /usr/lib/gcc/x86_64-linux-gnu/8/include-fixed/syslimits.h:7, from /usr/lib/gcc/x86_64-linux-gnu/8/include-fixed/limits.h:34, from ../../../include/linux/raid/pq.h:30, from algos.c:14: ../../../include/linux/types.h:114:15: error: conflicting types for ‘int64_t’ typedef s64 int64_t; ^~~~~~~ In file included from /usr/include/stdint.h:34, from /usr/lib/gcc/x86_64-linux-gnu/8/include/stdint.h:9, from /usr/include/inttypes.h:27, from ../../../include/linux/raid/pq.h:29, from algos.c:14: /usr/include/x86_64-linux-gnu/bits/stdint-intn.h:27:19: note: previous \ declaration of ‘int64_t’ was here typedef __int64_t int64_t; Fixes: 54d50897d544 ("linux/kernel.h: split *_MAX and *_MIN macros into <linux/limits.h>") Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13md-bitmap: small cleanupsZhiqiang Liu
In md_bitmap_unplug, bitmap->storage.filemap is double checked. In md_bitmap_daemon_work, bitmap->storage.filemap should be checked before reference. Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13Documentation/ABI: Add missed attribute for mlxreg-io sysfs interfacesVadim Pasternak
Add missed "cpld4_version" attribute. Fixes: 52675da1d087 ("Documentation/ABI: Add new attribute for mlxreg-io sysfs interfaces") Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13Documentation/ABI: Fix documentation inconsistency for mlxreg-io sysfs ↵Vadim Pasternak
interfaces Fix attribute name from "jtag_enable", which described twice to "cpld3_version", which is expected to be instead of second appearance of "jtag_enable". Fixes: 2752e34442b5 ("Documentation/ABI: Add new attribute for mlxreg-io sysfs interfaces") Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13platform/x86: mlx-platform: Add support for next generation systemsVadim Pasternak
Add support for new Mellanox system types of basic class VMOD0010, containing new Mellanox systems equipped with new switch device Spectrum 3 (32x400GbE/64x200G/128x100G Ethernet switch). These are the Top of the Rack 1U/2U/4U systems, equipped with Mellanox Comex card and with the switch board with Mellanox Spectrum-3 device. This class of devices can be equipped with two PS units for 1U/2U or with four PS units for 4U systems. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13platform/mellanox: mlxreg-hotplug: Add support for new capability registerVadim Pasternak
Add support for capability register, which is used for detection of the actual number of interrupt capable components within the particular group, supported by the specific system. Such components could be for example the number of power units and interrupts related to these units. The motivation is to avoid adding a new code in the future in order to distinct between the systems type supported different number of the components like power supplies, FANs, ASICs, line cards. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13platform/x86: mlx-platform: Add support for new capability registerVadim Pasternak
Add support for capability register, which contains information about the number of PS units equipped on the system and about minimum I2C frequency supported by the all system's I2C devices. Utilization of this register allows to avoid necessity of providing new system description, in case it differs in number of PS units or in minimal I2C frequency. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13platform/x86: mlx-platform: Add support for new system typeVadim Pasternak
Add support for new Mellanox system types of basic class VMOD0009, containing Mellanox systems equipped with the switch devices Spectrum 1 (32x100GbE Ethernet switch) and Switch-IB/Switch-IB2 (36x100Gbe InfiniBand switch). These are the Top of the Rack system, equipped with Mellanox Comex card. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13platform/x86: mlx-platform: Set system mux configuration based on system typeVadim Pasternak
Separate assignment for systems mux configuration based on system type, instead of setting the same configuration for the all. The motivation is to allow introduction of new systems types with the different mux topology. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13Documentation/ABI: Add new attribute for mlxreg-io sysfs interfacesVadim Pasternak
Add documentation for the new attributes for: - Exposing reset causes types asserted by: platform reset, SoC reset, AC power failure, software power off request. - Setting and removing system VPD (EEPROM) hardware write protection. - Voltage regulator devices configuration update status and firmware version. - Setting PCIe ASIC reset to disable or enable state during PCIe root complex reset. - System static topology identification, like system's static I2C topology, number and type of FPGA devices within the system and so on. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13platform/x86: mlx-platform: Add more definitions for system attributesVadim Pasternak
Add new attributes for "next-generation" type systems: - Reset cause indication, when system reset has been caused by the platform reset request through CPLD, by AC power failure, by software power off request through CPLD. by signal asserted by SOC through ACPI register. It introduces more reset causes, which can be monitored after the reboots. - Setting and removing system VPD (EEPROM) hardware write protection. It allows to access VPD during production cycle and prevents VPD corruption on system in field. - Voltage regulator devices configuration update status and version. It allows to monitor configuration update status and current active version for such sort of devices. - PCIe ASIC reset disable - when set ASIC will go down upon PCIe root complex reset, otherwise ASIC will stay up during PCIe root complex reset. - System configuration Ids to provide system static topology identification, like system's static I2C topology, number and type of FPGA devices within the system and so on. Add the existing attribute for "iio" bank selection for system type "msn21xx". All the above attributes are exposed through "sysfs" by "mlxreg-io" driver. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13Documentation/ABI: Style changesVadim Pasternak
Remove blank lines between "What" and "Date" keywords. Start each section with "What" keyword. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13Documentation/ABI: Add missed attribute for mlxreg-io sysfs interfacesVadim Pasternak
Add missed "cpld4_version" attribute. Fixes: 52675da1d087 ("Documentation/ABI: Add new attribute for mlxreg-io sysfs interfaces") Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13Documentation/ABI: Fix documentation inconsistency for mlxreg-io sysfs ↵Vadim Pasternak
interfaces Fix attribute name from "jtag_enable", which described twice to "cpld3_version", which is expected to be instead of second appearance of "jtag_enable". Fixes: 2752e34442b5 ("Documentation/ABI: Add new attribute for mlxreg-io sysfs interfaces") Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13platform/x86: mlx-platform: Cosmetic changesVadim Pasternak
Remove redundant semicolons at the end of few functions. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13MAINTAINERS: Update for the intel uncore frequency controlSrinivas Pandruvada
Add an entry for drivers/platform/x86/intel-uncore-frequency.c. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13platform/x86: Add support for Uncore frequency controlSrinivas Pandruvada
Some server users set limits on the uncore frequency using MSR 620H, while running latency sensitive workloads. Here uncore frequency controls RING/LLC(last-level cache) clocks. But MSR control is not always possible from the user space, so this driver provides a sysfs interface to set max and min frequency limits. This MSR 620H is a die scoped in multi-die system or package scoped in non multi-die systems. When this driver is loaded, a new directory is created under /sys/devices/system/cpu. For example on a two package Skylake server: $cd /sys/devices/system/cpu/intel_uncore_frequency $ls package_00_die_00 package_01_die_00 $ls package_00_die_00 max_freq_khz min_freq_khz initial_max_freq_khz initial_min_freq_khz $grep . * max_freq_khz:2400000 min_freq_khz:1200000 initial_max_freq_khz:2400000 initial_min_freq_khz:1200000 Here, initial_max_freq_khz and initial_min_freq_khz are read only attributes to show power up or initial values of max and min frequencies respectively. Other attributes are read-write, so that users can modify. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13KVM: VMX: Allow KVM_INTEL when building for Centaur and/or Zhaoxin CPUsSean Christopherson
Change the dependency for KVM_INTEL, i.e. KVM w/ VMX, from Intel CPUs to any CPU that supports the IA32_FEAT_CTL MSR and thus VMX functionality. This effectively allows building KVM_INTEL for Centaur and Zhaoxin CPUs. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20191221044513.21680-20-sean.j.christopherson@intel.com
2020-01-13perf/x86: Provide stubs of KVM helpers for non-Intel CPUsSean Christopherson
Provide stubs for perf_guest_get_msrs() and intel_pt_handle_vmx() when building without support for Intel CPUs, i.e. CPU_SUP_INTEL=n. Lack of stubs is not currently a problem as the only user, KVM_INTEL, takes a dependency on CPU_SUP_INTEL=y. Provide the stubs for all CPUs so that KVM_INTEL can be built for any CPU with compatible hardware support, e.g. Centuar and Zhaoxin CPUs. Note, the existing stub for perf_guest_get_msrs() is essentially dead code as KVM selects CONFIG_PERF_EVENTS, i.e. the only user guarantees the full implementation is built. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20191221044513.21680-19-sean.j.christopherson@intel.com
2020-01-13KVM: VMX: Use VMX_FEATURE_* flags to define VMCS control bitsSean Christopherson
Define the VMCS execution control flags (consumed by KVM) using their associated VMX_FEATURE_* to provide a strong hint that new VMX features are expected to be added to VMX_FEATURE and considered for reporting via /proc/cpuinfo. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20191221044513.21680-18-sean.j.christopherson@intel.com