summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-10-25bpftool: Switch to libbpf's hashmap for PIDs/names referencesQuentin Monnet
In order to show PIDs and names for processes holding references to BPF programs, maps, links, or BTF objects, bpftool creates hash maps to store all relevant information. This commit is part of a set that transitions from the kernel's hash map implementation to the one coming with libbpf. The motivation is to make bpftool less dependent of kernel headers, to ease the path to a potential out-of-tree mirror, like libbpf has. This is the third and final step of the transition, in which we convert the hash maps used for storing the information about the processes holding references to BPF objects (programs, maps, links, BTF), and at last we drop the inclusion of tools/include/linux/hashtable.h. Note: Checkpatch complains about the use of __weak declarations, and the missing empty lines after the bunch of empty function declarations when compiling without the BPF skeletons (none of these were introduced in this patch). We want to keep things as they are, and the reports should be safe to ignore. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211023205154.6710-6-quentin@isovalent.com
2021-10-25bpftool: Switch to libbpf's hashmap for programs/maps in BTF listingQuentin Monnet
In order to show BPF programs and maps using BTF objects when the latter are being listed, bpftool creates hash maps to store all relevant items. This commit is part of a set that transitions from the kernel's hash map implementation to the one coming with libbpf. The motivation is to make bpftool less dependent of kernel headers, to ease the path to a potential out-of-tree mirror, like libbpf has. This commit focuses on the two hash maps used by bpftool when listing BTF objects to store references to programs and maps, and convert them to the libbpf's implementation. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211023205154.6710-5-quentin@isovalent.com
2021-10-25bpftool: Switch to libbpf's hashmap for pinned paths of BPF objectsQuentin Monnet
In order to show pinned paths for BPF programs, maps, or links when listing them with the "-f" option, bpftool creates hash maps to store all relevant paths under the bpffs. So far, it would rely on the kernel implementation (from tools/include/linux/hashtable.h). We can make bpftool rely on libbpf's implementation instead. The motivation is to make bpftool less dependent of kernel headers, to ease the path to a potential out-of-tree mirror, like libbpf has. This commit is the first step of the conversion: the hash maps for pinned paths for programs, maps, and links are converted to libbpf's hashmap.{c,h}. Other hash maps used for the PIDs of process holding references to BPF objects are left unchanged for now. On the build side, this requires adding a dependency to a second header internal to libbpf, and making it a dependency for the bootstrap bpftool version as well. The rest of the changes are a rather straightforward conversion. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211023205154.6710-4-quentin@isovalent.com
2021-10-25bpftool: Do not expose and init hash maps for pinned path in main.cQuentin Monnet
BPF programs, maps, and links, can all be listed with their pinned paths by bpftool, when the "-f" option is provided. To do so, bpftool builds hash maps containing all pinned paths for each kind of objects. These three hash maps are always initialised in main.c, and exposed through main.h. There appear to be no particular reason to do so: we can just as well make them static to the files that need them (prog.c, map.c, and link.c respectively), and initialise them only when we want to show objects and the "-f" switch is provided. This may prevent unnecessary memory allocations if the implementation of the hash maps was to change in the future. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211023205154.6710-3-quentin@isovalent.com
2021-10-25bpftool: Remove Makefile dep. on $(LIBBPF) for $(LIBBPF_INTERNAL_HDRS)Quentin Monnet
The dependency is only useful to make sure that the $(LIBBPF_HDRS_DIR) directory is created before we try to install locally the required libbpf internal header. Let's create this directory properly instead. This is in preparation of making $(LIBBPF_INTERNAL_HDRS) a dependency to the bootstrap bpftool version, in which case we want no dependency on $(LIBBPF). Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211023205154.6710-2-quentin@isovalent.com
2021-10-25nvdimm/pmem: stop using q_usage_count as external pgmap refcountChristoph Hellwig
Originally all DAX access when through block_device operations and thus needed a queue reference. But since commit cccbce671582 ("filesystem-dax: convert to dax_direct_access()") all this happens at the DAX device level which uses its own refcounting. Having the external refcount thus wasn't needed but has otherwise been harmless for long time. But now that "block: drain file system I/O on del_gendisk" waits for q_usage_count to reach 0 in del_gendisk this whole scheme can't work anymore (and pmem is the only driver abusing q_usage_count like that). So switch to the internal reference and remove the unbalanced blk_freeze_queue_start that is taken care of by del_gendisk. Fixes: 8e141f9eb803 ("block: drain file system I/O on del_gendisk") Reported-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211019073641.2323410-2-hch@lst.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-10-25fortify: strlen: Avoid shadowing previous localsQian Cai
The __compiletime_strlen() macro expansion will shadow p_size and p_len local variables. No callers currently use any of the shadowed names for their "p" variable, so there are no code generation problems. Add "__" prefixes to variable definitions __compiletime_strlen() to avoid new W=2 warnings: ./include/linux/fortify-string.h: In function 'strnlen': ./include/linux/fortify-string.h:17:9: warning: declaration of 'p_size' shadows a previous local [-Wshadow] 17 | size_t p_size = __builtin_object_size(p, 1); \ | ^~~~~~ ./include/linux/fortify-string.h:77:17: note: in expansion of macro '__compiletime_strlen' 77 | size_t p_len = __compiletime_strlen(p); | ^~~~~~~~~~~~~~~~~~~~ ./include/linux/fortify-string.h:76:9: note: shadowed declaration is here 76 | size_t p_size = __builtin_object_size(p, 1); | ^~~~~~ Signed-off-by: Qian Cai <quic_qiancai@quicinc.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20211025210528.261643-1-quic_qiancai@quicinc.com
2021-10-25Merge branch 'Parallelize verif_scale selftests'Alexei Starovoitov
Andrii Nakryiko says: ==================== Reduce amount of waiting time when running test_progs in parallel mode (-j) by splitting bpf_verif_scale selftests into multiple tests. Previously it was structured as a test with multiple subtests, but subtests are not easily parallelizable with test_progs' infra. Also in practice each scale subtest is really an independent test with nothing shared across all substest. This patch set changes how test_progs test discovery works. Now it is possible to define multiple tests within a single source code file. One of the patches also marks tc_redirect selftests as serial, because it's extremely harmful to the test system when run in parallel mode. ==================== Acked-by: Yucong Sun <sunyucong@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-10-25selftests/bpf: Split out bpf_verif_scale selftests into multiple testsAndrii Nakryiko
Instead of using subtests in bpf_verif_scale selftest, turn each scale sub-test into its own test. Each subtest is compltely independent and just reuses a bit of common test running logic, so the conversion is trivial. For convenience, keep all of BPF verifier scale tests in one file. This conversion shaves off a significant amount of time when running test_progs in parallel mode. E.g., just running scale tests (-t verif_scale): BEFORE ====== Summary: 24/0 PASSED, 0 SKIPPED, 0 FAILED real 0m22.894s user 0m0.012s sys 0m22.797s AFTER ===== Summary: 24/0 PASSED, 0 SKIPPED, 0 FAILED real 0m12.044s user 0m0.024s sys 0m27.869s Ten second saving right there. test_progs -j is not yet ready to be turned on by default, unfortunately, and some tests fail almost every time, but this is a good improvement nevertheless. Ignoring few failures, here is sequential vs parallel run times when running all tests now: SEQUENTIAL ========== Summary: 206/953 PASSED, 4 SKIPPED, 0 FAILED real 1m5.625s user 0m4.211s sys 0m31.650s PARALLEL ======== Summary: 204/952 PASSED, 4 SKIPPED, 2 FAILED real 0m35.550s user 0m4.998s sys 0m39.890s Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211022223228.99920-5-andrii@kernel.org
2021-10-25selftests/bpf: Mark tc_redirect selftest as serialAndrii Nakryiko
It seems to cause a lot of harm to kprobe/tracepoint selftests. Yucong mentioned before that it does manipulate sysfs, which might be the reason. So let's mark it as serial, though ideally it would be less intrusive on the system at test. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211022223228.99920-4-andrii@kernel.org
2021-10-25selftests/bpf: Support multiple tests per fileAndrii Nakryiko
Revamp how test discovery works for test_progs and allow multiple test entries per file. Any global void function with no arguments and serial_test_ or test_ prefix is considered a test. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211022223228.99920-3-andrii@kernel.org
2021-10-25selftests/bpf: Normalize selftest entry pointsAndrii Nakryiko
Ensure that all test entry points are global void functions with no input arguments. Mark few subtest entry points as static. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211022223228.99920-2-andrii@kernel.org
2021-10-25net/mlx5: SF_DEV Add SF device trace pointsParav Pandit
Add SF device add and delete specific trace points. echo mlx5:mlx5_sf_dev_add >> /sys/kernel/debug/tracing/set_event echo mlx5:mlx5_sf_dev_del >> /sys/kernel/debug/tracing/set_event echo mlx5:mlx5_sf_vhca_event >> /sys/kernel/debug/tracing/set_event Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: SF, Add SF trace pointsParav Pandit
Add support for trace events for SFs to improve debugging. This covers (a) port add and free trace points (b) device level trace points (c) SF hardware context add, free trace points. (d) SF function activate/deacticate and state trace points SF events examples: echo mlx5:mlx5_sf_add >> /sys/kernel/debug/tracing/set_event echo mlx5:mlx5_sf_free >> /sys/kernel/debug/tracing/set_event echo mlx5:mlx5_sf_hwc_alloc >> /sys/kernel/debug/tracing/set_event echo mlx5:mlx5_sf_hwc_free >> /sys/kernel/debug/tracing/set_event echo mlx5:mlx5_sf_hwc_deferred_free >> /sys/kernel/debug/tracing/set_event echo mlx5:mlx5_sf_update_state >> /sys/kernel/debug/tracing/set_event echo mlx5:mlx5_sf_activate >> /sys/kernel/debug/tracing/set_event echo mlx5:mlx5_sf_deactivate >> /sys/kernel/debug/tracing/set_event Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: Let user configure max_macs paramShay Drory
Currently, max_macs is taking 70Kbytes of memory per function. This size is not needed in all use cases, and is critical with large scale. Hence, allow user to configure the number of max_macs. For example, to reduce the number of max_macs to 1, execute:: $ devlink dev param set pci/0000:00:0b.0 name max_macs value 1 \ cmode driverinit $ devlink dev reload pci/0000:00:0b.0 Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: Let user configure event_eq_size paramShay Drory
Event EQ is an EQ which received the notification of almost all the events generated by the NIC. Currently, each event EQ is taking 512KB of memory. This size is not needed in most use cases, and is critical with large scale. Hence, allow user to configure the size of the event EQ. For example to reduce event EQ size to 64, execute:: $ devlink resource set pci/0000:00:0b.0 path /event_eq_size/ size 64 $ devlink dev reload pci/0000:00:0b.0 Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: Let user configure io_eq_size paramShay Drory
Currently, each I/O EQ is taking 128KB of memory. This size is not needed in all use cases, and is critical with large scale. Hence, allow user to configure the size of I/O EQs. For example, to reduce I/O EQ size to 64, execute: $ devlink resource set pci/0000:00:0b.0 path /io_eq_size/ size 64 $ devlink dev reload pci/0000:00:0b.0 Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: Bridge, support replacing existing FDB entryVlad Buslov
The SWITCHDEV_FDB_ADD_TO_DEVICE is used for both adding new and replacing existing entry. Implement support for replacing existing FDB entries in mlx5 offload code. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: Bridge, extract code to lookup and del/notify entryVlad Buslov
Following two patterns in bridge code are used in multiple places where similar code is duplicated: - Lookup FDB entry from hashtable by address+vid pair. - Notify software bridge and then delete existing FDB entry. In order to improve code quality and prepare for following patch series that also uses described patterns, extract the codes to dedicated helper functions. This commit doesn't change functionality. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: Add periodic update of host time to firmwareAya Levin
Firmware logs its asserts also to non-volatile memory. In order to reduce drift between the NIC and the host, the driver sets the host epoch-time to the firmware every hour. Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: Print health buffer by log levelAya Levin
Add log macro which gets log level as a parameter. Use the severity read from the health buffer and the new log macro to log the health buffer with severity as log level. Prior to this patch, health buffer was printed in error log level regardless of its severity. Now the user may filter dmesg (--level) or change kernel log level to focus on different severity levels of firmware errors. Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: Extend health buffer dumpAya Levin
Enhance health buffer to include: - assert_var5: expose the 6'th assert variable. - time: error's time-stamp in seconds (epoch time). - rfr: Recovery Flow Requiered. When set, indicates that the error cannot be recovered without flow involving reset. - severity: error's severity value, ranging from emergency to debug. Expose them in the health buffer dump (dmesg and devlink fw reporter). Health buffer in dmesg: mlx5_core 0000:08:00.0: print_health_info:425:(pid 912): Health issue observed, firmware internal error, severity(3) ERROR: mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[0] 0x08040700 mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[1] 0x00000000 mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[2] 0x00000000 mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[3] 0x00000000 mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[4] 0x00000000 mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[5] 0x00000000 mlx5_core 0000:08:00.0: print_health_info:432:(pid 912): assert_exit_ptr 0x00aaf800 mlx5_core 0000:08:00.0: print_health_info:434:(pid 912): assert_callra 0x00aaf70c mlx5_core 0000:08:00.0: print_health_info:436:(pid 912): fw_ver 16.32.492 mlx5_core 0000:08:00.0: print_health_info:437:(pid 912): time 1634819758 mlx5_core 0000:08:00.0: print_health_info:438:(pid 912): hw_id 0x0000020d mlx5_core 0000:08:00.0: print_health_info:439:(pid 912): rfr 0 mlx5_core 0000:08:00.0: print_health_info:440:(pid 912): severity 3 (ERROR) mlx5_core 0000:08:00.0: print_health_info:441:(pid 912): irisc_index 9 mlx5_core 0000:08:00.0: print_health_info:442:(pid 912): synd 0x1: firmware internal error mlx5_core 0000:08:00.0: print_health_info:444:(pid 912): ext_synd 0x802b mlx5_core 0000:08:00.0: print_health_info:445:(pid 912): raw fw_ver 0x102001ec Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: Reduce flow counters bulk query buffer size for SFsAvihai Horon
Currently, the flow counters bulk query buffer takes a little more than 512KB of memory, which is aligned to the next power of 2, to 1MB. The buffer size determines the maximum number of flow counters that can be queried at a time. Thus, having a bigger buffer can improve performance for users that need to query many flow counters. SFs don't use many flow counters and don't need a big buffer. Since this size is critical with large scale, reduce the size of the bulk query buffer for SFs. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: Fix unused function warning of mlx5i_flow_type_maskShay Drory
The cited commit is causing unused-function warning[1] when CONFIG_MLX5_EN_RXNFC is not set. Fix this by moving the function into the ifdef, where it's only used [1] warning: ‘mlx5i_flow_type_mask’ defined but not used [-Wunused-function] Fixes: 9fbe1c25ecca ("net/mlx5i: Enable Rx steering for IPoIB via ethtool") Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: Remove unnecessary checks for slow path flagPaul Blakey
After previous changes, caller (mlx5e_tc_offload_fdb_rules()) already checks for the slow path flag, and if set won't call offload/unoffload sample. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5e: don't write directly to netdev->dev_addrJakub Kicinski
Use a local buffer and eth_hw_addr_set() Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25ice: check whether PTP is initialized in ice_ptp_release()Yongxin Liu
PTP is currently only supported on E810 devices, it is checked in ice_ptp_init(). However, there is no check in ice_ptp_release(). For other E800 series devices, ice_ptp_release() will be wrongly executed. Fix the following calltrace. INFO: trying to register non-static key. The code is fine but needs lockdep annotation, or maybe you didn't initialize this object before use? turning off the locking correctness validator. Workqueue: ice ice_service_task [ice] Call Trace: dump_stack_lvl+0x5b/0x82 dump_stack+0x10/0x12 register_lock_class+0x495/0x4a0 ? find_held_lock+0x3c/0xb0 __lock_acquire+0x71/0x1830 lock_acquire+0x1e6/0x330 ? ice_ptp_release+0x3c/0x1e0 [ice] ? _raw_spin_lock+0x19/0x70 ? ice_ptp_release+0x3c/0x1e0 [ice] _raw_spin_lock+0x38/0x70 ? ice_ptp_release+0x3c/0x1e0 [ice] ice_ptp_release+0x3c/0x1e0 [ice] ice_prepare_for_reset+0xcb/0xe0 [ice] ice_do_reset+0x38/0x110 [ice] ice_service_task+0x138/0xf10 [ice] ? __this_cpu_preempt_check+0x13/0x20 process_one_work+0x26a/0x650 worker_thread+0x3f/0x3b0 ? __kthread_parkme+0x51/0xb0 ? process_one_work+0x650/0x650 kthread+0x161/0x190 ? set_kthread_struct+0x40/0x40 ret_from_fork+0x1f/0x30 Fixes: 4dd0d5c33c3e ("ice: add lock around Tx timestamp tracker flush") Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-25ice: Respond to a NETDEV_UNREGISTER event for LAGDave Ertman
When the PF is a member of a link aggregate, and the driver is removed, the process will hang unless we respond to the NETDEV_UNREGISTER event that is sent to the event_handler for LAG. Add a case statement for the ice_lag_event_handler to unlink the PF from the link aggregate. Also remove code that was incorrectly applying a dev_hold to peer_netdevs that were associated with the ice driver. Fixes: df006dd4b1dc ("ice: Add initial support framework for LAG") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-25Revert "arm64: dts: qcom: sm8250: remove bus clock from the mdss node for ↵Amit Pundir
sm8250 target" This reverts commit 001ce9785c0674d913531345e86222c965fc8bf4. This upstream commit broke AOSP (post Android 12 merge) build on RB5. The device either silently crashes into USB crash mode after android boot animation or we see a blank blue screen with following dpu errors in dmesg: [ T444] hw recovery is not complete for ctl:3 [ T444] [drm:dpu_encoder_phys_vid_prepare_for_kickoff:539] [dpu error]enc31 intf1 ctl 3 reset failure: -22 [ T444] [drm:dpu_encoder_phys_vid_wait_for_commit_done:513] [dpu error]vblank timeout [ T444] [drm:dpu_kms_wait_for_commit_done:454] [dpu error]wait for commit done returned -110 [ C7] [drm:dpu_encoder_frame_done_timeout:2127] [dpu error]enc31 frame done timeout [ T444] [drm:dpu_encoder_phys_vid_wait_for_commit_done:513] [dpu error]vblank timeout [ T444] [drm:dpu_kms_wait_for_commit_done:454] [dpu error]wait for commit done returned -110 Fixes: 001ce9785c06 ("arm64: dts: qcom: sm8250: remove bus clock from the mdss node for sm8250 target") Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20211014135410.4136412-1-dmitry.baryshkov@linaro.org
2021-10-25btrfs: subpage: introduce btrfs_subpage_bitmap_infoQu Wenruo
Currently we use fixed size u16 bitmap for subpage bitmap. This is fine for 4K sectorsize with 64K page size. But for 4K sectorsize and larger page size, the bitmap is too small, while for smaller page size like 16K, u16 bitmaps waste too much space. Here we introduce a new helper structure, btrfs_subpage_bitmap_info, to record the proper bitmap size, and where each bitmap should start at. By this, we can later compact all subpage bitmaps into one u32 bitmap. This patch is the first step. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25btrfs: subpage: make btrfs_alloc_subpage() return btrfs_subpage directlyQu Wenruo
The existing calling convention of btrfs_alloc_subpage() is pretty awful. Change it to a more common pattern by returning struct btrfs_subpage directly and let the caller to determine if the call succeeded. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25btrfs: subpage: only call btrfs_alloc_subpage() when sectorsize is smaller ↵Qu Wenruo
than PAGE_SIZE There are two call sites of btrfs_alloc_subpage(): - btrfs_attach_subpage() We have ensured sectorsize is smaller than PAGE_SIZE - alloc_extent_buffer() We call btrfs_alloc_subpage() unconditionally. The alloc_extent_buffer() forces us to check the sectorsize size against page size inside btrfs_alloc_subpage(). Since the function name, btrfs_alloc_subpage(), already indicates it should only get called for subpage cases, do the check in alloc_extent_buffer() and add an ASSERT() in btrfs_alloc_subpage(). Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25btrfs: update comment for fs_devices::seed_list in btrfs_rm_deviceSu Yue
Update it since commit 944d3f9fac61 ("btrfs: switch seed device to list api") did conversion from fs_devices::seed to fs_devices::seed_list. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Su Yue <l@damenly.su> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25btrfs: drop unnecessary ret in ioctl_quota_rescan_statusAnand Jain
There is no need for the variable ret after d66105cfa873 ("btrfs: allocate btrfs_ioctl_quota_rescan_args on stack"), remove it. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25btrfs: send: simplify send_create_inode_if_neededMarcos Paulo de Souza
The out label is being overused, we can simply return if the condition permits. No functional changes. Reviewed-by: Su Yue <l@damenly.su> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25btrfs: rename btrfs_alloc_chunk to btrfs_create_chunkNikolay Borisov
The user facing function used to allocate new chunks is btrfs_chunk_alloc, unfortunately there is yet another similar sounding function - btrfs_alloc_chunk. This creates confusion, especially since the latter function can be considered "private" in the sense that it implements the first stage of chunk creation and as such is called by btrfs_chunk_alloc. To avoid the awkwardness that comes with having similarly named but distinctly different in their purpose function rename btrfs_alloc_chunk to btrfs_create_chunk, given that the main purpose of this function is to orchestrate the whole process of allocating a chunk - reserving space into devices, deciding on characteristics of the stripe size and creating the in-memory structures. Reviewed-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25Linux 5.15-rc7Linus Torvalds
2021-10-25secretmem: Prevent secretmem_users from wrapping to zeroMatthew Wilcox (Oracle)
Commit 110860541f44 ("mm/secretmem: use refcount_t instead of atomic_t") attempted to fix the problem of secretmem_users wrapping to zero and allowing suspend once again. But it was reverted in commit 87066fdd2e30 ("Revert 'mm/secretmem: use refcount_t instead of atomic_t'") because of the problems it caused - a refcount_t was not semantically the right type to use. Instead prevent secretmem_users from wrapping to zero by forbidding new users if the number of users has wrapped from positive to negative. This stops a long way short of reaching the necessary 4 billion users where it wraps to zero again, so there's no need to be clever with special anti-wrap types or checking the return value from atomic_inc(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Jordy Zomer <jordy@pwning.systems> Cc: Kees Cook <keescook@chromium.org>, Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-25Merge branch 'bluetooth-don-t-write-directly-to-netdev-dev_addr'Jakub Kicinski
Jakub Kicinski says: ==================== bluetooth: don't write directly to netdev->dev_addr The usual conversions. ==================== Link: https://lore.kernel.org/r/20211022231834.2710245-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25bluetooth: use dev_addr_set()Jakub Kicinski
Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it go through appropriate helpers. Reviewed-by: Marcel Holtmann <marcel@holtmann.org> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25bluetooth: use eth_hw_addr_set()Jakub Kicinski
Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it go through appropriate helpers. Convert bluetooth from memcpy(... ETH_ADDR) to eth_hw_addr_set(): @@ expression dev, np; @@ - memcpy(dev->dev_addr, np, ETH_ALEN) + eth_hw_addr_set(dev, np) Reviewed-by: Marcel Holtmann <marcel@holtmann.org> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25spi: Fix tegra20 build with CONFIG_PM=n once againLinus Torvalds
Commit efafec27c565 ("spi: Fix tegra20 build with CONFIG_PM=n") already fixed the build without PM support once. There was an alternative fix by Guenter in commit 2bab94090b01 ("spi: tegra20-slink: Declare runtime suspend and resume functions conditionally"), and Mark then merged the two correctly in ffb1e76f4f32 ("Merge tag 'v5.15-rc2' into spi-5.15"). But for some inexplicable reason, Mark then merged things _again_ in commit 59c4e190b10c ("Merge tag 'v5.15-rc3' into spi-5.15"), and screwed things up at that point, and the __maybe_unused attribute on tegra_slink_runtime_resume() went missing. Reinstate it, so that alpha (and other architectures without PM support) builds cleanly again. Btw, this is another prime example of how random back-merges are not good. Just don't do them. Subsystem developers should not merge my tree in any normal circumstances. Both of those merge commits pointed to above are bad: even the one that got the merge result right doesn't even mention _why_ it was done, and the one that got it wrong is obviously broken. Reported-by: Guenter Roeck <linux@roeck-us.net> Cc: Mark Brown <broonie@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-25Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: - Fix clang-related relocation warning in futex code - Fix incorrect use of get_kernel_nofault() - Fix bad code generation in __get_user_check() when kasan is enabled - Ensure TLB function table is correctly aligned - Remove duplicated string function definitions in decompressor - Fix link-time orphan section warnings - Fix old-style function prototype for arch_init_kprobes() - Only warn about XIP address when not compile testing - Handle BE32 big endian for keystone2 remapping * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 9148/1: handle CONFIG_CPU_ENDIAN_BE32 in arch/arm/kernel/head.S ARM: 9141/1: only warn about XIP address when not compile testing ARM: 9139/1: kprobes: fix arch_init_kprobes() prototype ARM: 9138/1: fix link warning with XIP + frame-pointer ARM: 9134/1: remove duplicate memcpy() definition ARM: 9133/1: mm: proc-macros: ensure *_tlb_fns are 4B aligned ARM: 9132/1: Fix __get_user_check failure with ARM KASAN images ARM: 9125/1: fix incorrect use of get_kernel_nofault() ARM: 9122/1: select HAVE_FUTEX_CMPXCHG
2021-10-25fddi: defza: add missing pointer type castJakub Kicinski
hw_addr is a uint AKA unsigned int. dev_addr_set() takes a u8 *. drivers/net/fddi/defza.c:1383:27: error: passing argument 2 of 'dev_addr_set' from incompatible pointer type [-Werror=incompatible-pointer-types] Reported-by: kernel test robot <lkp@intel.com> Fixes: 1e9258c389ee ("fddi: defxx,defza: use dev_addr_set()") Acked-by: Maciej W. Rozycki <macro@orcam.me.uk> Link: https://lore.kernel.org/r/20211025160000.2803818-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25Merge tag 'libata-5.15-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull libata fix from Damien Le Moal: "A single fix in this pull request addressing an invalid error code return in the sata_mv driver (from Zheyu)" * tag 'libata-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata: sata_mv: Fix the error handling of mv_chip_id()
2021-10-25Merge tag 'pinctrl-v5.15-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: "Some late pin control fixes, the most generally annoying will probably be the AMD IRQ storm fix affecting the Microsoft surface. Summary: - Three fixes pertaining to Broadcom DT bindings. Some stuff didn't work out as inteded, we need to back out - A resume bug fix in the STM32 driver - Disable and mask the interrupts on probe in the AMD pinctrl driver, affecting Microsoft surface" * tag 'pinctrl-v5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: amd: disable and mask interrupts on probe pinctrl: stm32: use valid pin identifier in stm32_pinctrl_resume() Revert "pinctrl: bcm: ns: support updated DT binding as syscon subnode" dt-bindings: pinctrl: brcm,ns-pinmux: drop unneeded CRU from example Revert "dt-bindings: pinctrl: bcm4708-pinmux: rework binding to use syscon"
2021-10-25sbitmap: silence data race warningJens Axboe
KCSAN complaints about the sbitmap hint update: ================================================================== BUG: KCSAN: data-race in sbitmap_queue_clear / sbitmap_queue_clear write to 0xffffe8ffffd145b8 of 4 bytes by interrupt on cpu 1: sbitmap_queue_clear+0xca/0xf0 lib/sbitmap.c:606 blk_mq_put_tag+0x82/0x90 __blk_mq_free_request+0x114/0x180 block/blk-mq.c:507 blk_mq_free_request+0x2c8/0x340 block/blk-mq.c:541 __blk_mq_end_request+0x214/0x230 block/blk-mq.c:565 blk_mq_end_request+0x37/0x50 block/blk-mq.c:574 lo_complete_rq+0xca/0x170 drivers/block/loop.c:541 blk_complete_reqs block/blk-mq.c:584 [inline] blk_done_softirq+0x69/0x90 block/blk-mq.c:589 __do_softirq+0x12c/0x26e kernel/softirq.c:558 run_ksoftirqd+0x13/0x20 kernel/softirq.c:920 smpboot_thread_fn+0x22f/0x330 kernel/smpboot.c:164 kthread+0x262/0x280 kernel/kthread.c:319 ret_from_fork+0x1f/0x30 write to 0xffffe8ffffd145b8 of 4 bytes by interrupt on cpu 0: sbitmap_queue_clear+0xca/0xf0 lib/sbitmap.c:606 blk_mq_put_tag+0x82/0x90 __blk_mq_free_request+0x114/0x180 block/blk-mq.c:507 blk_mq_free_request+0x2c8/0x340 block/blk-mq.c:541 __blk_mq_end_request+0x214/0x230 block/blk-mq.c:565 blk_mq_end_request+0x37/0x50 block/blk-mq.c:574 lo_complete_rq+0xca/0x170 drivers/block/loop.c:541 blk_complete_reqs block/blk-mq.c:584 [inline] blk_done_softirq+0x69/0x90 block/blk-mq.c:589 __do_softirq+0x12c/0x26e kernel/softirq.c:558 run_ksoftirqd+0x13/0x20 kernel/softirq.c:920 smpboot_thread_fn+0x22f/0x330 kernel/smpboot.c:164 kthread+0x262/0x280 kernel/kthread.c:319 ret_from_fork+0x1f/0x30 value changed: 0x00000035 -> 0x00000044 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 10 Comm: ksoftirqd/0 Not tainted 5.15.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 ================================================================== which is a data race, but not an important one. This is just updating the percpu alloc hint, and the reader of that hint doesn't ever require it to be valid. Just annotate it with data_race() to silence this one. Reported-by: syzbot+4f8bfd804b4a1f95b8f6@syzkaller.appspotmail.com Acked-by: Marco Elver <elver@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-25fs: get rid of the res2 iocb->ki_complete argumentJens Axboe
The second argument was only used by the USB gadget code, yet everyone pays the overhead of passing a zero to be passed into aio, where it ends up being part of the aio res2 value. Now that everybody is passing in zero, kill off the extra argument. Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-25x86/sev: Expose sev_es_ghcb_hv_call() for use by HyperVTianyu Lan
Hyper-V needs to issue the GHCB HV call in order to read/write MSRs in Isolation VMs. For that, expose sev_es_ghcb_hv_call(). The Hyper-V Isolation VMs are unenlightened guests and run a paravisor at VMPL0 for communicating. GHCB pages are being allocated and set up by that paravisor. Linux gets the GHCB page's physical address via MSR_AMD64_SEV_ES_GHCB from the paravisor and should not change it. Add a @set_ghcb_msr parameter to sev_es_ghcb_hv_call() to control whether the function should set the GHCB's address prior to the call or not and export that function for use by HyperV. [ bp: - Massage commit message - add a struct ghcb forward declaration to fix randconfig builds. ] Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20211025122116.264793-6-ltykernel@gmail.com
2021-10-25usb: remove res2 argument from gadget code completionsJens Axboe
The USB gadget code is the only code that every tried to utilize the 2nd argument of the aio completions, but there are strong suspicions that it was never actually used by anything on the userspace side. Out of the 3 cases that touch it, two of them just pass in the same as res, and the last one passes in error/transfer in res like any other normal use case. Remove the 2nd argument, pass 0 like the rest of the in-kernel users of kiocb based IO. Link: https://lore.kernel.org/linux-block/20211021174021.273c82b1.john@metanate.com/ Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>