Age | Commit message | Author
2023-09-27 | accel/ivpu: Do not use wait event interruptible | Stanislaw Gruszka
If we receive a signal while waiting for an IPC message response in ivpu_ipc_receive(), we return an error and continue to operate. The driver can then send further IPC messages and re-use the occupied slot of a message still being processed by the firmware. This can corrupt firmware memory and lead to a subsequent FW crash with messages:

[ 3698.569719] intel_vpu 0000:00:0b.0: [drm] ivpu_ipc_send_receive_internal(): IPC receive failed: type 0x1103, ret -512
[ 3698.569747] intel_vpu 0000:00:0b.0: [drm] ivpu_jsm_unregister_db(): Failed to unregister doorbell 3: -512
[ 3698.569756] intel_vpu 0000:00:0b.0: [drm] ivpu_ipc_tx_prepare(): IPC message vpu:0x88980000 not released by firmware
[ 3698.569763] intel_vpu 0000:00:0b.0: [drm] ivpu_ipc_tx_prepare(): JSM message vpu:0x88980040 not released by firmware
[ 3698.570234] intel_vpu 0000:00:0b.0: [drm] ivpu_ipc_send_receive_internal(): IPC receive failed: type 0x110e, ret -512
[ 3698.570318] intel_vpu 0000:00:0b.0: [drm] *ERROR* ivpu_mmu_dump_event(): MMU EVTQ: 0x10 (Translation fault) SSID: 0 SID: 3, e[2] 00000000, e[3] 00000208, in addr: 0x88988000, fetch addr: 0x0

To fix the issue, don't use the interruptible variant of wait event, so that the firmware can finish IPC processing.

Fixes: 5d7422cfb498 ("accel/ivpu: Add IPC driver and JSM messages")
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230925121137.872158-2-stanislaw.gruszka@linux.intel.com
2023-09-27 | MAINTAINERS: update nouveau maintainers | Danilo Krummrich
Since I will continue to work on Nouveau consistently, also beyond my former and still ongoing VM_BIND/EXEC work, add myself to the list of Nouveau maintainers. Signed-off-by: Danilo Krummrich <dakr@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230926014913.7721-1-dakr@redhat.com
2023-09-26 | selftests: Fix wrong TARGET in kselftest top level Makefile | Juntong Deng
The 'uevents' subdirectory does not exist in tools/testing/selftests/ and adding 'uevents' to the TARGETS list results in the following error:

make[1]: Entering directory 'xx/tools/testing/selftests/uevents'
make[1]: *** No targets specified and no makefile found. Stop.
make[1]: Leaving directory 'xx/tools/testing/selftests/uevents'

What actually exists in tools/testing/selftests/ is the 'uevent' subdirectory.

Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-26 | ima: Finish deprecation of IMA_TRUSTED_KEYRING Kconfig | Oleksandr Tymoshenko
The removal of IMA_TRUSTED_KEYRING made IMA_LOAD_X509 and IMA_BLACKLIST_KEYRING unavailable because the latter two depend on the former. Since IMA_TRUSTED_KEYRING was deprecated in favor of INTEGRITY_TRUSTED_KEYRING use it as a dependency for the two Kconfigs affected by the deprecation. Fixes: 5087fd9e80e5 ("ima: Remove deprecated IMA_TRUSTED_KEYRING Kconfig") Signed-off-by: Oleksandr Tymoshenko <ovt@google.com> Reviewed-by: Nayna Jain <nayna@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2023-09-26 | Merge tag 'wq-for-6.6-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq | Linus Torvalds
Pull workqueue fixes from Tejun Heo:
- Remove double allocation of wq_update_pod_attrs_buf
- Fix missing allocation of pwq_release_worker when wq_cpu_intensive_thresh_us is set to a custom value

* tag 'wq-for-6.6-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: Fix missed pwq_release_worker creation in wq_cpu_intensive_thresh_init()
  workqueue: Removed double allocation of wq_update_pod_attrs_buf
2023-09-26 | i915/guc: Get runtime pm in busyness worker only if already active | Umesh Nerlige Ramappa
Ideally the busyness worker should take a gt pm wakeref because the worker only needs to be active while gt is awake. However, the gt_park path cancels the worker synchronously and this complicates the flow if the worker is also running at the same time. The cancel waits for the worker and when the worker releases the wakeref, that would call gt_park and would lead to a deadlock.

The resolution is to take the global pm wakeref if runtime pm is already active. If not, we don't need to update the busyness stats as the stats would already be updated when the gt was parked.

Note:
- We do not requeue the worker if we cannot take a reference to runtime pm since intel_guc_busyness_unpark would requeue the worker in the resume path.
- If the gt was parked longer than time taken for GT timestamp to roll over, we ignore those rollovers since we don't care about tracking the exact GT time. We only care about roll overs when the gt is active and running workloads.
- There is a window of time between gt_park and runtime suspend, where the worker may run. This is acceptable since the worker will not find any new data to update busyness.

v2: (Daniele)
- Edit commit message and code comment
- Use runtime pm in the worker
- Put runtime pm after enabling the worker
- Use Link tag and add Fixes tag

v3: (Daniele)
- Reword commit and comments and add details

Link: https://gitlab.freedesktop.org/drm/intel/-/issues/7077
Fixes: 77cdd054dd2c ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu")
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230925192117.2497058-1-umesh.nerlige.ramappa@intel.com
(cherry picked from commit e2f99b79d4c594cdf7ab449e338d4947f5ea8903)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
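The "take a reference only if runtime pm is already active" idea from the message above can be modelled in user space. This is a minimal sketch, not the i915 code: `RuntimePm`, `get_if_active` and `busyness_worker` are hypothetical names invented for illustration, and the counter only imitates a usage count.

```python
class RuntimePm:
    """Toy stand-in for a runtime-PM usage counter (hypothetical)."""

    def __init__(self):
        self.usage = 0

    def get(self):
        # Unconditional get: in a real driver this could wake the device.
        self.usage += 1

    def get_if_active(self):
        # Take a reference only when somebody else already holds one.
        if self.usage == 0:
            return False
        self.usage += 1
        return True

    def put(self):
        assert self.usage > 0
        self.usage -= 1


def busyness_worker(pm, stats):
    """Update stats only while the device is already awake.

    Returns False when the device is suspended; in that case the worker
    neither touches the stats nor asks to be requeued.
    """
    if not pm.get_if_active():
        return False
    try:
        stats["updates"] += 1
    finally:
        pm.put()
    return True
```

Because the worker never wakes the device itself, releasing its reference can never be the event that parks the device, which is the property the commit message relies on.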
2023-09-26 | drm/i915/gt: Fix reservation address in ggtt_reserve_guc_top | Javier Pello
There is an assertion in ggtt_reserve_guc_top that the global GTT is of size at least GUC_GGTT_TOP, which is not the case on a 32-bit platform; see commit 562d55d991b39ce376c492df2f7890fd6a541ffc ("drm/i915/bdw: Only use 2g GGTT for 32b platforms"). If GEM_BUG_ON is enabled, this triggers a BUG(); if GEM_BUG_ON is disabled, the subsequent reservation fails and the driver fails to initialise the device:

i915 0000:00:02.0: [drm:i915_init_ggtt [i915]] Failed to reserve top of GGTT for GuC
i915 0000:00:02.0: Device initialization failed (-28)
i915 0000:00:02.0: Please file a bug on drm/i915; see https://gitlab.freedesktop.org/drm/intel/-/wikis/How-to-file-i915-bugs for details.
i915: probe of 0000:00:02.0 failed with error -28

Make the reservation at the top of the available space, whatever that is, instead of assuming that the top will be GUC_GGTT_TOP.

Fixes: 911800765ef6 ("drm/i915/uc: Reserve upper range of GGTT")
Link: https://gitlab.freedesktop.org/drm/intel/-/issues/9080
Signed-off-by: Javier Pello <devel@otheo.eu>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Fernando Pacheco <fernando.pacheco@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: intel-gfx@lists.freedesktop.org
Cc: stable@vger.kernel.org # v5.3+
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230902171039.2229126186d697dbcf62d6d8@otheo.eu
(cherry picked from commit 0f3fa942d91165c2702577e9274d2ee1c7212afc)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
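The difference between "reserve from a fixed top address" and "reserve at the top of whatever space exists" can be shown with plain arithmetic. The sketch below is a user-space model under stated assumptions: the numeric value of `GUC_GGTT_TOP`, the reservation size and both function names are illustrative and are not taken from the driver.

```python
GUC_GGTT_TOP = 0xFEE00000  # assumed value, for illustration only


def reservation_fixed_top(total):
    """Model of the old assumption: the reserved range starts at GUC_GGTT_TOP."""
    if total < GUC_GGTT_TOP:
        # Models the failure described above (the log shows error -28).
        raise ValueError("address space ends below GUC_GGTT_TOP")
    return GUC_GGTT_TOP, total - GUC_GGTT_TOP


def reservation_at_top(total, size):
    """Model of the fix: reserve the last `size` bytes of the available space."""
    if total <= size:
        raise ValueError("address space smaller than the reservation")
    return total - size, size
```

With a 2 GiB address space the first function fails because the space ends below the fixed address, while the second still returns a range that ends exactly at the top of the space.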
2023-09-26 | i915: Limit the length of an sg list to the requested length | Matthew Wilcox (Oracle)
The folio conversion changed the behaviour of shmem_sg_alloc_table() to put the entire length of the last folio into the sg list, even if the sg list should have been shorter. gen8_ggtt_insert_entries() relied on the list being the right length and would overrun the end of the page tables. Other functions may also have been affected. Clamp the length of the last entry in the sg list to be the expected length. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Fixes: 0b62af28f249 ("i915: convert shmem_sg_free_table() to use a folio_batch") Cc: stable@vger.kernel.org # 6.5.x Link: https://gitlab.freedesktop.org/drm/intel/-/issues/9256 Link: https://lore.kernel.org/lkml/6287208.lOV4Wx5bFT@natalenko.name/ Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230919194855.347582-1-willy@infradead.org (cherry picked from commit 26a8e32e6d77900819c0c730fbfb393692dbbeea) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
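Clamping the last entry of a scatter list can be sketched as follows. This is a simplified user-space model, assuming each folio contributes one entry; `sg_lengths` and its parameters are hypothetical names and do not mirror shmem_sg_alloc_table().

```python
def sg_lengths(folio_sizes, requested, clamp=True):
    """Return the per-entry lengths of a toy scatter list.

    folio_sizes: size in bytes of each backing folio, in order.
    requested:   number of bytes the list is supposed to cover.
    clamp:       when False, model the described bug, where the whole last
                 folio is added even if fewer bytes were requested.
    """
    lengths = []
    remaining = requested
    for size in folio_sizes:
        if remaining <= 0:
            break
        length = min(size, remaining) if clamp else size
        lengths.append(length)
        remaining -= length
    return lengths
```

For two 16 KiB folios and a 20 KiB request, the clamped list covers exactly 20 KiB, whereas the unclamped one covers 32 KiB; a consumer that trusts the list length would then walk 12 KiB too far.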
2023-09-26 | Merge tag 'for-6.6-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux | Linus Torvalds
Pull btrfs fixes from David Sterba:
- delayed refs fixes:
  - fix race when refilling delayed refs block reserve
  - prevent transaction block reserve underflow when starting transaction
  - error message and value adjustments
- fix build warnings with CONFIG_CC_OPTIMIZE_FOR_SIZE and -Wmaybe-uninitialized
- fix for smatch report where uninitialized data from invalid extent buffer range could be returned to the caller
- fix numeric overflow in statfs when calculating lower threshold for a full filesystem

* tag 'for-6.6-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: initialize start_slot in btrfs_log_prealloc_extents
  btrfs: make sure to initialize start and len in find_free_dev_extent
  btrfs: reset destination buffer when read_extent_buffer() gets invalid range
  btrfs: properly report 0 avail for very full file systems
  btrfs: log message if extent item not found when running delayed extent op
  btrfs: remove redundant BUG_ON() from __btrfs_inc_extent_ref()
  btrfs: return -EUCLEAN for delayed tree ref with a ref count not equals to 1
  btrfs: prevent transaction block reserve underflow when starting transaction
  btrfs: fix race when refilling delayed refs block reserve
2023-09-26 | Merge tag 'linux-kselftest-fixes-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest | Linus Torvalds
Pull kselftest fix from Shuah Khan:
 "One single fix to unmount tracefs when test created mount"

* tag 'linux-kselftest-fixes-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/user_events: Fix to unmount tracefs when test created mount
2023-09-26 | Merge tag 'v6.6-rc4.vfs.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs | Linus Torvalds
Pull vfs fixes from Christian Brauner:
 "This contains the usual miscellaneous fixes and cleanups for vfs and individual fses:

  Fixes:
  - Revert ki_pos on error from buffered writes for direct io fallback
  - Add missing documentation for block device and superblock handling for changes merged this cycle
  - Fix reiserfs flexible array usage
  - Ensure that overlayfs sets ctime when setting mtime and atime
  - Disable deferred caller completions with overlayfs writes until proper support exists

  Cleanups:
  - Remove duplicate initialization in pipe code
  - Annotate aio kioctx_table with __counted_by"

* tag 'v6.6-rc4.vfs.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
  overlayfs: set ctime when setting mtime and atime
  ntfs3: put resources during ntfs_fill_super()
  ovl: disable IOCB_DIO_CALLER_COMP
  porting: document superblock as block device holder
  porting: document new block device opening order
  fs/pipe: remove duplicate "offset" initializer
  fs-writeback: do not requeue a clean inode having skipped pages
  aio: Annotate struct kioctx_table with __counted_by
  direct_write_fallback(): on error revert the ->ki_pos update from buffered write
  reiserfs: Replace 1-element array with C99 style flex-array
2023-09-26 | Merge tag 'perf-tools-fixes-for-v6.6-1-2023-09-25' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools | Linus Torvalds
Pull perf tools fixes from Namhyung Kim:
 "Build:
  - Update header files in the tools/**/include directory to sync with the kernel sources as usual.
  - Remove unused bpf-prologue files. While this is not strictly a fix, the functionality was removed in this cycle, so it is better to remove the code together with it.
  - Other minor build fixes.

  Misc:
  - Fix uninitialized memory access in PMU parsing code
  - Fix segfaults on software event"

* tag 'perf-tools-fixes-for-v6.6-1-2023-09-25' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
  perf jevent: fix core dump on software events on s390
  perf pmu: Ensure all alias variables are initialized
  perf jevents metric: Fix type of strcmp_cpuid_str
  perf trace: Avoid compile error wrt redefining bool
  perf bpf-prologue: Remove unused file
  tools headers UAPI: Update tools's copy of drm.h headers
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  perf bench sched-seccomp-notify: Use the tools copy of seccomp.h UAPI
  tools headers UAPI: Copy seccomp.h to be able to build 'perf bench' in older systems
  tools headers UAPI: Sync files changed by new fchmodat2 and map_shadow_stack syscalls with the kernel sources
  perf tools: Update copy of libbpf's hashmap.c
2023-09-26 | regulator/core: Revert "fix kobject release warning and memory leak in regulator_register()" | Michał Mirosław
This reverts commit 5f4b204b6b8153923d5be8002c5f7082985d153f.

Since rdev->dev now has a release() callback, the proper way of freeing the initialized device can be restored.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Link: https://lore.kernel.org/r/d7f469f3f7b1f0e1d52f9a7ede3f3c5703382090.1695077303.git.mirq-linux@rere.qmqm.pl
Signed-off-by: Mark Brown <broonie@kernel.org>
2023-09-26 | regulator/core: regulator_register: set device->class earlier | Michał Mirosław
When fixing a memory leak in commit d3c731564e09 ("regulator: plug of_node leak in regulator_register()'s error path") it moved the device_initialize() call earlier, but did not move the `dev->class` initialization. The bug was spotted and fixed by reverting part of the commit (in commit 5f4b204b6b81 "regulator: core: fix kobject release warning and memory leak in regulator_register()") but introducing a different bug: now early error paths use `kfree(dev)` instead of `put_device()` for an already initialized `struct device`. Move the missing assignments to just after `device_initialize()`. Fixes: d3c731564e09 ("regulator: plug of_node leak in regulator_register()'s error path") Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Link: https://lore.kernel.org/r/b5b19cb458c40c9d02f3d5a7bd1ba7d97ba17279.1695077303.git.mirq-linux@rere.qmqm.pl Signed-off-by: Mark Brown <broonie@kernel.org>
2023-09-26 | MAINTAINERS: aspeed: Update Andrew's email address | Andrew Jeffery
I've changed employers, have company email that deals with patch-based workflows without too much of a headache, and am trying to steer some content out of my personal mail. Signed-off-by: Andrew Jeffery <andrew@codeconstruct.com.au> Link: https://lore.kernel.org/r/20230925030647.40283-1-andrew@codeconstruct.com.au Signed-off-by: Joel Stanley <joel@jms.id.au>
2023-09-26 | MAINTAINERS: aspeed: Update git tree URL | Zev Weiss
The description for joel/aspeed.git on git.kernel.org currently says: Old Aspeed tree. Please see joel/bmc.git Let's update MAINTAINERS accordingly. Signed-off-by: Zev Weiss <zev@bewilderbeest.net> Acked-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/20230922223405.24717-2-zev@bewilderbeest.net Signed-off-by: Joel Stanley <joel@jms.id.au>
2023-09-26 | RDMA/mlx5: Fix mkey cache possible deadlock on cleanup | Shay Drory
Fix the deadlock by refactoring the MR cache cleanup flow to flush the workqueue without holding the rb_lock. This adds a race between cache cleanup and creation of new entries, which we solve by denying creation of new entries after cache cleanup has started.

Lockdep:
WARNING: possible circular locking dependency detected
[ 2785.326074 ] 6.2.0-rc6_for_upstream_debug_2023_01_31_14_02 #1 Not tainted
[ 2785.339778 ] ------------------------------------------------------
[ 2785.340848 ] devlink/53872 is trying to acquire lock:
[ 2785.341701 ] ffff888124f8c0c8 ((work_completion)(&(&ent->dwork)->work)){+.+.}-{0:0}, at: __flush_work+0xc8/0x900
[ 2785.343403 ]
[ 2785.343403 ] but task is already holding lock:
[ 2785.344464 ] ffff88817e8f1260 (&dev->cache.rb_lock){+.+.}-{3:3}, at: mlx5_mkey_cache_cleanup+0x77/0x250 [mlx5_ib]
[ 2785.346273 ]
[ 2785.346273 ] which lock already depends on the new lock.
[ 2785.346273 ]
[ 2785.347720 ]
[ 2785.347720 ] the existing dependency chain (in reverse order) is:
[ 2785.349003 ]
[ 2785.349003 ] -> #1 (&dev->cache.rb_lock){+.+.}-{3:3}:
[ 2785.350160 ]        __mutex_lock+0x14c/0x15c0
[ 2785.350962 ]        delayed_cache_work_func+0x2d1/0x610 [mlx5_ib]
[ 2785.352044 ]        process_one_work+0x7c2/0x1310
[ 2785.352879 ]        worker_thread+0x59d/0xec0
[ 2785.353636 ]        kthread+0x28f/0x330
[ 2785.354370 ]        ret_from_fork+0x1f/0x30
[ 2785.355135 ]
[ 2785.355135 ] -> #0 ((work_completion)(&(&ent->dwork)->work)){+.+.}-{0:0}:
[ 2785.356515 ]        __lock_acquire+0x2d8a/0x5fe0
[ 2785.357349 ]        lock_acquire+0x1c1/0x540
[ 2785.358121 ]        __flush_work+0xe8/0x900
[ 2785.358852 ]        __cancel_work_timer+0x2c7/0x3f0
[ 2785.359711 ]        mlx5_mkey_cache_cleanup+0xfb/0x250 [mlx5_ib]
[ 2785.360781 ]        mlx5_ib_stage_pre_ib_reg_umr_cleanup+0x16/0x30 [mlx5_ib]
[ 2785.361969 ]        __mlx5_ib_remove+0x68/0x120 [mlx5_ib]
[ 2785.362960 ]        mlx5r_remove+0x63/0x80 [mlx5_ib]
[ 2785.363870 ]        auxiliary_bus_remove+0x52/0x70
[ 2785.364715 ]        device_release_driver_internal+0x3c1/0x600
[ 2785.365695 ]        bus_remove_device+0x2a5/0x560
[ 2785.366525 ]        device_del+0x492/0xb80
[ 2785.367276 ]        mlx5_detach_device+0x1a9/0x360 [mlx5_core]
[ 2785.368615 ]        mlx5_unload_one_devl_locked+0x5a/0x110 [mlx5_core]
[ 2785.369934 ]        mlx5_devlink_reload_down+0x292/0x580 [mlx5_core]
[ 2785.371292 ]        devlink_reload+0x439/0x590
[ 2785.372075 ]        devlink_nl_cmd_reload+0xaef/0xff0
[ 2785.372973 ]        genl_family_rcv_msg_doit.isra.0+0x1bd/0x290
[ 2785.374011 ]        genl_rcv_msg+0x3ca/0x6c0
[ 2785.374798 ]        netlink_rcv_skb+0x12c/0x360
[ 2785.375612 ]        genl_rcv+0x24/0x40
[ 2785.376295 ]        netlink_unicast+0x438/0x710
[ 2785.377121 ]        netlink_sendmsg+0x7a1/0xca0
[ 2785.377926 ]        sock_sendmsg+0xc5/0x190
[ 2785.378668 ]        __sys_sendto+0x1bc/0x290
[ 2785.379440 ]        __x64_sys_sendto+0xdc/0x1b0
[ 2785.380255 ]        do_syscall_64+0x3d/0x90
[ 2785.381031 ]        entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 2785.381967 ]
[ 2785.381967 ] other info that might help us debug this:
[ 2785.381967 ]
[ 2785.383448 ]  Possible unsafe locking scenario:
[ 2785.383448 ]
[ 2785.384544 ]        CPU0                    CPU1
[ 2785.385383 ]        ----                    ----
[ 2785.386193 ]   lock(&dev->cache.rb_lock);
[ 2785.386940 ]                                lock((work_completion)(&(&ent->dwork)->work));
[ 2785.388327 ]                                lock(&dev->cache.rb_lock);
[ 2785.389425 ]   lock((work_completion)(&(&ent->dwork)->work));
[ 2785.390414 ]
[ 2785.390414 ]  *** DEADLOCK ***
[ 2785.390414 ]
[ 2785.391579 ] 6 locks held by devlink/53872:
[ 2785.392341 ]  #0: ffffffff84c17a50 (cb_lock){++++}-{3:3}, at: genl_rcv+0x15/0x40
[ 2785.393630 ]  #1: ffff888142280218 (&devlink->lock_key){+.+.}-{3:3}, at: devlink_get_from_attrs_lock+0x12d/0x2d0
[ 2785.395324 ]  #2: ffff8881422d3c38 (&dev->lock_key){+.+.}-{3:3}, at: mlx5_unload_one_devl_locked+0x4a/0x110 [mlx5_core]
[ 2785.397322 ]  #3: ffffffffa0e59068 (mlx5_intf_mutex){+.+.}-{3:3}, at: mlx5_detach_device+0x60/0x360 [mlx5_core]
[ 2785.399231 ]  #4: ffff88810e3cb0e8 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0x8d/0x600
[ 2785.400864 ]  #5: ffff88817e8f1260 (&dev->cache.rb_lock){+.+.}-{3:3}, at: mlx5_mkey_cache_cleanup+0x77/0x250 [mlx5_ib]

Fixes: b95845178328 ("RDMA/mlx5: Change the cache structure to an RB-tree")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
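The two ideas in this fix (wait for outstanding work without holding the lock that the work itself needs, and refuse new entries once cleanup has begun) can be sketched with ordinary threads. This is a user-space model only: the `Cache` class, its `disabled` flag and the use of `threading.Thread` in place of a workqueue are assumptions made for illustration, not the mlx5 implementation.

```python
import threading


class Cache:
    """Toy cache whose background work takes the same lock as cleanup."""

    def __init__(self):
        self.lock = threading.Lock()  # plays the role of rb_lock
        self.disabled = False         # set once cleanup has started
        self.entries = []
        self.workers = []

    def _work(self, item):
        # Work items need the lock, so a waiter must not hold it.
        with self.lock:
            if not self.disabled:
                self.entries.append(item)

    def queue_work(self, item):
        with self.lock:
            if self.disabled:
                return False          # creation denied after cleanup started
            worker = threading.Thread(target=self._work, args=(item,))
            self.workers.append(worker)
            worker.start()            # the worker blocks until we unlock
        return True

    def cleanup(self):
        with self.lock:
            self.disabled = True
            pending = list(self.workers)
        # Wait with the lock released; joining while holding it would
        # deadlock against any worker still waiting in _work().
        for worker in pending:
            worker.join()
        with self.lock:
            self.entries.clear()
```

Setting the flag before dropping the lock closes the race the commit message mentions: any work that runs after that point sees `disabled` and adds nothing, so the final clear leaves the cache empty.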
2023-09-26 | RDMA/mlx5: Fix NULL string error | Shay Drory
checkpatch is complaining about a NULL string; change it to 'Unknown'. Fixes: 37aa5c36aa70 ("IB/mlx5: Add UARs write-combining and non-cached mapping") Signed-off-by: Shay Drory <shayd@nvidia.com> Link: https://lore.kernel.org/r/8638e5c14fadbde5fa9961874feae917073af920.1695203958.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-26 | RDMA/mlx5: Fix mutex unlocking on error flow for steering anchor creation | Hamdan Igbaria
The mutex was not unlocked on some of the error flows. Moved the unlock location to include all the error flow scenarios. Fixes: e1f4a52ac171 ("RDMA/mlx5: Create an indirect flow table for steering anchor") Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Hamdan Igbaria <hamdani@nvidia.com> Link: https://lore.kernel.org/r/1244a69d783da997c0af0b827c622eb00495492e.1695203958.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-26 | RDMA/mlx5: Fix assigning access flags to cache mkeys | Michael Guralnik
After the change to use a dynamic cache structure, new cache entries can be added and the mkey allocation can no longer assume that all mkeys created for the cache have access_flags equal to zero. Example of a flow that exposes the issue: A user registers an MR with RO on an HCA that cannot UMR RO and the mkey is created outside of the cache. When the user deregisters the MR, a new cache entry is created to store mkeys with RO. Later, the user registers 2 MRs with RO. The first MR is reused from the new cache entry. When we try to get the second mkey from the cache we see the entry is empty, so we go to the MR cache mkey allocation flow, which would have allocated an mkey with no access flags, resulting in the user getting an MR without RO. Fixes: dd1b913fb0d0 ("RDMA/mlx5: Cache all user cacheable mkeys on dereg MR flow") Reviewed-by: Edward Srouji <edwards@nvidia.com> Signed-off-by: Michael Guralnik <michaelgur@nvidia.com> Link: https://lore.kernel.org/r/8a802700b82def3ace3f77cd7a9ad9d734af87e7.1695203958.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-26 | ASoC: fsl-asoc-card: use integer type for fll_id and pll_id | Shengjiu Wang
Since commit 2bbc2df46e67 ("ASoC: wm8960: Make automatic the default clocking mode"), fll_id and pll_id can be zero (WM8960_SYSCLK_AUTO). The machine driver then skips calling set_sysclk() and set_pll() for the codec, and when the sysclk rate differs from what wm8960 read at probe, the output sound frequency is wrong. So change the initial values of fll_id and pll_id, keeping the machine driver's behavior the same as before. Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Link: https://lore.kernel.org/r/1695202992-24864-1-git-send-email-shengjiu.wang@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-09-26 | rbd: take header_rwsem in rbd_dev_refresh() only when updating | Ilya Dryomov
rbd_dev_refresh() has been holding header_rwsem across header and parent info read-in unnecessarily for ages. With commit 870611e4877e ("rbd: get snapshot context after exclusive lock is ensured to be held"), the potential for deadlocks became much more real owing to a) header_rwsem now nesting inside lock_rwsem and b) rw_semaphores not allowing new readers after a writer is registered.

For example, assuming that I/O request 1, I/O request 2 and header read-in request all target the same OSD:

1. I/O request 1 comes in and gets submitted
2. watch error occurs
3. rbd_watch_errcb() takes lock_rwsem for write, clears owner_cid and releases lock_rwsem
4. after reestablishing the watch, rbd_reregister_watch() calls rbd_dev_refresh() which takes header_rwsem for write and submits a header read-in request
5. I/O request 2 comes in: after taking lock_rwsem for read in __rbd_img_handle_request(), it blocks trying to take header_rwsem for read in rbd_img_object_requests()
6. another watch error occurs
7. rbd_watch_errcb() blocks trying to take lock_rwsem for write
8. I/O request 1 completion is received by the messenger but can't be processed because lock_rwsem won't be granted anymore
9. header read-in request completion can't be received, let alone processed, because the messenger is stranded

Change rbd_dev_refresh() to take header_rwsem only for actually updating rbd_dev->header. Header and parent info read-in don't need any locking.

Cc: stable@vger.kernel.org # 0b035401c570: rbd: move rbd_dev_refresh() definition
Cc: stable@vger.kernel.org # 510a7330c82a: rbd: decouple header read-in from updating rbd_dev->header
Cc: stable@vger.kernel.org # c10311776f0a: rbd: decouple parent info read-in from updating rbd_dev
Cc: stable@vger.kernel.org
Fixes: 870611e4877e ("rbd: get snapshot context after exclusive lock is ensured to be held")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
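The shape of the change, doing the slow read-in with no lock held and taking the lock only to publish the result, can be sketched in user space. This is a model under assumptions: `Device`, `refresh` and the `read_header` callback are hypothetical, and a plain `threading.Lock` stands in for header_rwsem (Python's standard library has no reader/writer semaphore).

```python
import threading


class Device:
    """Toy device with a header protected by a lock."""

    def __init__(self, header):
        self.header_lock = threading.Lock()  # stands in for header_rwsem
        self.header = dict(header)


def refresh(dev, read_header):
    """Refresh dev.header from read_header().

    read_header models the read-in step, which in a real driver may wait
    on I/O. It runs without the lock so that other lock users are never
    stuck behind a pending read.
    """
    new_header = read_header()
    # Only the short in-memory update happens under the lock.
    with dev.header_lock:
        dev.header.update(new_header)
```

A caller can verify the property by checking `dev.header_lock.locked()` from inside the callback; it is expected to be false while the read is in progress.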
2023-09-26 | rbd: decouple parent info read-in from updating rbd_dev | Ilya Dryomov
Unlike header read-in, parent info read-in is already decoupled in get_parent_info(), but it's buried in rbd_dev_v2_parent_info() along with the processing logic. Separate the initial read-in and update read-in logic into rbd_dev_setup_parent() and rbd_dev_update_parent() respectively and have rbd_dev_v2_parent_info() just populate struct parent_image_info (i.e. what get_parent_info() did). Some existing QoI issues, like flatten of a standalone clone being disregarded on refresh, remain. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
2023-09-26 | rbd: decouple header read-in from updating rbd_dev->header | Ilya Dryomov
Make rbd_dev_header_info() populate a passed struct rbd_image_header instead of rbd_dev->header and introduce rbd_dev_update_header() for updating mutable fields in rbd_dev->header upon refresh. The initial read-in of both mutable and immutable fields in rbd_dev_image_probe() passes in rbd_dev->header so no update step is required there. rbd_init_layout() is now called directly from rbd_dev_image_probe() instead of individually in format 1 and format 2 implementations. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
2023-09-26 | rbd: move rbd_dev_refresh() definition | Ilya Dryomov
Move rbd_dev_refresh() definition further down to avoid having to move struct parent_image_info definition in the next commit. This spares some forward declarations too. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
2023-09-26 | wifi: mac80211: expand __ieee80211_data_to_8023() status | Johannes Berg
Make __ieee80211_data_to_8023() return more individual drop reasons instead of just doing RX_DROP_U_INVALID_8023. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-09-26 | wifi: mac80211: split ieee80211_drop_unencrypted_mgmt() return value | Johannes Berg
This has many different reasons, split the return value into the individual reasons for better traceability. Also, since symbolic tracing doesn't work for these, add a few comments for the numbering. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-09-26 | wifi: mac80211: remove RX_DROP_UNUSABLE | Johannes Berg
Convert all instances of RX_DROP_UNUSABLE to indicate a better reason, and then remove RX_DROP_UNUSABLE. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-09-26 | wifi: mac80211: fix check for unusable RX result | Johannes Berg
If we just check "result & RX_DROP_UNUSABLE", this really only works by accident, because SKB_DROP_REASON_SUBSYS_MAC80211_UNUSABLE got to have the value 1, and SKB_DROP_REASON_SUBSYS_MAC80211_MONITOR is 2. Fix this to really check the entire subsys mask for the value, so it doesn't matter what the subsystem value is. Fixes: 7f4e09700bdc ("wifi: mac80211: report all unusable beacon frames") Signed-off-by: Johannes Berg <johannes.berg@intel.com>
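Why a bitwise AND against one subsystem value only works "by accident" can be demonstrated with a small model. The sketch assumes the subsystem number lives in the upper 16 bits of the result; `SUBSYS_SHIFT`, `SUBSYS_MASK` and all function names are illustrative stand-ins chosen for this example, not the kernel's definitions.

```python
SUBSYS_SHIFT = 16          # assumed layout: subsystem in the upper 16 bits
SUBSYS_MASK = 0xFFFF0000


def make_result(subsys, code):
    """Combine a subsystem number and a per-subsystem reason code."""
    return (subsys << SUBSYS_SHIFT) | code


def is_unusable_and_test(result, unusable_subsys):
    """The fragile check: a plain AND against the subsystem bits."""
    return bool(result & (unusable_subsys << SUBSYS_SHIFT))


def is_unusable_masked(result, unusable_subsys):
    """The robust check: compare the whole subsystem field."""
    return (result & SUBSYS_MASK) == (unusable_subsys << SUBSYS_SHIFT)
```

With subsystem values 1 (unusable) and 2 (monitor), as in the message above, both checks agree because the two values share no bits. If the values were 2 and 3 instead, the AND test would misclassify a monitor result as unusable, while the masked comparison would still be correct.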
2023-09-26 | wifi: cfg80211: add local_state_change to deauth trace | Johannes Berg
Add the local_state_change request to the deauth trace for easier debugging. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-09-26 | wifi: mac80211: Create resources for disabled links | Benjamin Berg
When associating to an MLD AP, links may be disabled. Create all resources associated with a disabled link so that we can later enable it without having to create these resources on the fly. Fixes: 6d543b34dbcf ("wifi: mac80211: Support disabled links during association") Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://lore.kernel.org/r/20230925173028.f9afdb26f6c7.I4e6e199aaefc1bf017362d64f3869645fa6830b5@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-09-26 | wifi: cfg80211: avoid leaking stack data into trace | Benjamin Berg
If the structure is not initialized then boolean types might be copied into the tracing data without being initialised. This causes data from the stack to leak into the trace and also triggers a UBSAN failure which can easily be avoided here. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://lore.kernel.org/r/20230925171855.a9271ef53b05.I8180bae663984c91a3e036b87f36a640ba409817@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-09-26 | block: fix kernel-doc for disk_force_media_change() | Randy Dunlap
Drop one function parameter's kernel-doc comment since the parameter was removed. This prevents a kernel-doc warning: block/disk-events.c:300: warning: Excess function parameter 'events' description in 'disk_force_media_change' Fixes: ab6860f62bfe ("block: simplify the disk_force_media_change interface") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Closes: lore.kernel.org/r/202309060957.vfl0mUur-lkp@intel.com Cc: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: linux-block@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230926005232.23666-1-rdunlap@infradead.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-09-25 | Merge branch 'bpf: Add missed stats for kprobes' | Andrii Nakryiko
Jiri Olsa says:

====================
hi,
at the moment we can't retrieve the number of missed kprobe executions and subsequent execution of BPF programs.

This patchset adds:
- counting of missed execution on attach layer for:
  . kprobes attached through perf link (kprobe/ftrace)
  . kprobes attached through kprobe.multi link (fprobe)
- counting of recursion_misses for BPF kprobe programs

It's still technically possible to create kprobe without perf link (using SET_BPF perf ioctl) in which case we don't have a way to retrieve the kprobe's 'missed' count. However both libbpf and cilium/ebpf libraries use perf link if it's available, and for old kernels without perf link support we can use BPF program to retrieve the kprobe missed count.

v3 changes:
- added acks [Song]
- make test_missed not serial [Andrii]

Also available at:
  https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git bpf/missed_stats

thanks,
jirka
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2023-09-25 | selftests/bpf: Add test for recursion counts of perf event link tracepoint | Jiri Olsa
Adding selftest that puts kprobe on bpf_fentry_test1 that calls bpf_printk and invokes bpf_trace_printk tracepoint. The bpf_trace_printk tracepoint has test[234] programs attached to it. Because kprobe execution goes through bpf_prog_active check, programs attached to the tracepoint will fail the recursion check and increment the recursion_misses stats. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Song Liu <song@kernel.org> Reviewed-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/20230920213145.1941596-10-jolsa@kernel.org
2023-09-25 | selftests/bpf: Add test for recursion counts of perf event link kprobe | Jiri Olsa
Adding selftest that puts kprobe.multi on bpf_fentry_test1 that calls bpf_kfunc_common_test kfunc which has 3 perf event kprobes and 1 kprobe.multi attached. Because fprobe (kprobe.multi attach layer) does not have a strict recursion check, the kprobe's bpf_prog_active check is hit for test2-5. Disabling this test for arm64, because there's no fprobe support yet. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Song Liu <song@kernel.org> Reviewed-by: Song Liu <song@kernel.org> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/bpf/20230920213145.1941596-9-jolsa@kernel.org
2023-09-25selftests/bpf: Add test for missed counts of perf event link kprobeJiri Olsa
Adding test that puts kprobe on bpf_fentry_test1 that calls bpf_kfunc_common_test kfunc, which also has a kprobe attached. The latter won't get triggered due to the kprobe recursion check, and the kprobe missed counter is incremented instead. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/bpf/20230920213145.1941596-8-jolsa@kernel.org
2023-09-25bpftool: Display missed count for kprobe perf linkJiri Olsa
Adding 'missed' field to display missed counts for kprobes attached by perf event link, like:

  # bpftool link
  5: perf_event prog 82
          kprobe ffffffff815203e0 ksys_write
  6: perf_event prog 83
          kprobe ffffffff811d1e50 scheduler_tick missed 682217

  # bpftool link -jp
  [{
          "id": 5,
          "type": "perf_event",
          "prog_id": 82,
          "retprobe": false,
          "addr": 18446744071584220128,
          "func": "ksys_write",
          "offset": 0,
          "missed": 0
      },{
          "id": 6,
          "type": "perf_event",
          "prog_id": 83,
          "retprobe": false,
          "addr": 18446744071580753488,
          "func": "scheduler_tick",
          "offset": 0,
          "missed": 693469
      }
  ]

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20230920213145.1941596-7-jolsa@kernel.org
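The plain output above prints the kprobe address in hexadecimal, while the JSON output prints the same address as an unsigned decimal. A minimal sketch of that correspondence follows; the helper names hex_addr_to_u64 and format_json_addr are hypothetical and are not part of bpftool.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse a hexadecimal kernel address as shown in the
 * plain "bpftool link" output. */
uint64_t hex_addr_to_u64(const char *hex)
{
	return (uint64_t)strtoull(hex, NULL, 16);
}

/* Hypothetical helper: print an address as an unsigned 64-bit decimal,
 * which is how the "addr" field appears in the JSON output. */
int format_json_addr(uint64_t addr, char *buf, size_t len)
{
	return snprintf(buf, len, "%" PRIu64, addr);
}
```

With these helpers, "ffffffff815203e0" (ksys_write) maps to 18446744071584220128 and "ffffffff811d1e50" (scheduler_tick) maps to 18446744071580753488, matching the two listings in the commit message.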
2023-09-25bpftool: Display missed count for kprobe_multi linkJiri Olsa
Adding 'missed' field to display missed counts for kprobes attached by kprobe multi link, like:

  # bpftool link
  5: kprobe_multi prog 76
          kprobe.multi func_cnt 1 missed 1
          addr func [module]
          ffffffffa039c030 fp3_test [fprobe_test]

  # bpftool link -jp
  [{
          "id": 5,
          "type": "kprobe_multi",
          "prog_id": 76,
          "retprobe": false,
          "func_cnt": 1,
          "missed": 1,
          "funcs": [{
                  "addr": 18446744072102723632,
                  "func": "fp3_test",
                  "module": "fprobe_test"
              }
          ]
      }
  ]

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20230920213145.1941596-6-jolsa@kernel.org
2023-09-25bpf: Count missed stats in trace_call_bpfJiri Olsa
Increase misses stats in case bpf array execution is skipped because of the recursion check in trace_call_bpf. Adding bpf_prog_inc_misses_counters that increases the misses counts for all bpf programs in a bpf_prog_array. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Song Liu <song@kernel.org> Reviewed-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/20230920213145.1941596-5-jolsa@kernel.org
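The behaviour described in this commit can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: the model_* types and functions are hypothetical, the program array is a fixed-size NULL-terminated list, and a plain int stands in for the per-CPU bpf_prog_active counter.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_PROGS 8

/* Hypothetical stand-in for a BPF program; only the misses counter is modelled. */
struct model_prog {
	uint64_t misses;
};

/* Hypothetical stand-in for bpf_prog_array: a NULL-terminated list of programs. */
struct model_prog_array {
	struct model_prog *progs[MODEL_MAX_PROGS];
};

/* Bump the misses counter of every program in the array. */
void model_inc_misses_counters(struct model_prog_array *array)
{
	for (size_t i = 0; i < MODEL_MAX_PROGS && array->progs[i]; i++)
		array->progs[i]->misses++;
}

/*
 * Model of the dispatch path. 'active' plays the role of the per-CPU
 * recursion counter. Returns 1 if the programs would run, 0 if the whole
 * array was skipped and the miss was counted on each program.
 */
int model_trace_call(struct model_prog_array *array, int *active)
{
	int ran = 0;

	if (++(*active) != 1) {
		/* Something is already running on this CPU: count the miss. */
		model_inc_misses_counters(array);
	} else {
		ran = 1; /* the real code would execute each program here */
	}
	--(*active);
	return ran;
}
```

Calling model_trace_call with *active already at 1 skips execution and raises misses by one on every program in the array, which is the statistic this commit makes visible.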
2023-09-25bpf: Add missed value to kprobe perf link infoJiri Olsa
Add missed value to kprobe attached through perf link info to hold the stats of missed kprobe handler execution. The kprobe's missed counter gets incremented when kprobe handler is not executed due to another kprobe running on the same cpu. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230920213145.1941596-4-jolsa@kernel.org
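To make the "another kprobe running on the same cpu" condition concrete, here is a simplified user-space model. It is a sketch only: struct model_kprobe, struct model_cpu and model_kprobe_hit are hypothetical names, and the 'nested' pointer is an illustrative way to simulate a handler that reaches a second probed function.

```c
#include <stddef.h>

/* Hypothetical model of a kprobe with a hit counter and a missed counter. */
struct model_kprobe {
	unsigned long hits;
	unsigned long nmissed;
	/* Probe that gets hit from inside this probe's handler, if any. */
	struct model_kprobe *nested;
};

/* Hypothetical per-CPU state: the kprobe currently being handled, if any. */
struct model_cpu {
	struct model_kprobe *current_kprobe;
};

/* Returns 1 if the handler ran, 0 if the hit was counted as missed. */
int model_kprobe_hit(struct model_cpu *cpu, struct model_kprobe *p)
{
	if (cpu->current_kprobe) {
		/* A kprobe is already being handled on this CPU. */
		p->nmissed++;
		return 0;
	}

	cpu->current_kprobe = p;
	p->hits++;
	if (p->nested)
		model_kprobe_hit(cpu, p->nested); /* handler reaches another probe */
	cpu->current_kprobe = NULL;
	return 1;
}
```

In this model, hitting an outer probe whose handler reaches an inner probe leaves the inner probe with hits == 0 and nmissed == 1, which is the kind of value the new link info field reports.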
2023-09-25bpf: Add missed value to kprobe_multi link infoJiri Olsa
Add missed value to kprobe_multi link info to hold the stats of missed kprobe_multi probe. The missed counter gets incremented when fprobe fails the recursion check or there's no rethook available for return probe. In either case the attached bpf program is not executed. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Song Liu <song@kernel.org> Reviewed-by: Song Liu <song@kernel.org> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/bpf/20230920213145.1941596-3-jolsa@kernel.org
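The two miss conditions named in this commit (failed recursion check, or no rethook available for a return probe) can be sketched as a small model. All names below are hypothetical, and the recursion result and the pool of free rethooks are passed in as plain values rather than taken from any kernel state.

```c
#include <stdbool.h>

/* Hypothetical model of an fprobe with a missed counter. */
struct model_fprobe {
	unsigned long nmissed;
	bool is_return_probe;
};

/*
 * Model of function entry. Returns true if the probe proceeds (so the
 * attached program would run), false if the hit is counted as missed.
 */
bool model_fprobe_enter(struct model_fprobe *fp, bool recursion_ok,
			unsigned int *free_rethooks)
{
	if (!recursion_ok) {
		/* Recursion check failed. */
		fp->nmissed++;
		return false;
	}

	if (fp->is_return_probe) {
		if (*free_rethooks == 0) {
			/* No rethook left to arm the return probe. */
			fp->nmissed++;
			return false;
		}
		(*free_rethooks)--;
	}
	return true;
}
```

Either failing branch increments nmissed and skips the attached program, matching the "in either case" wording of the commit message.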
2023-09-25bpf: Count stats for kprobe_multi programsJiri Olsa
Adding support to gather missed stats for kprobe_multi programs due to bpf_prog_active protection. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Song Liu <song@kernel.org> Reviewed-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/20230920213145.1941596-2-jolsa@kernel.org
2023-09-25Merge branch 'add libbpf getters for individual ringbuffers'Andrii Nakryiko
Martin Kelly says:

====================
This patch series adds a new ring__ API to libbpf exposing getters for accessing the individual ringbuffers inside a struct ring_buffer. This is useful for polling individually, getting available data, or similar use cases. The API looks like this, and was roughly proposed by Andrii Nakryiko in another thread:

Getting a ring struct:
struct ring *ring_buffer__ring(struct ring_buffer *rb, unsigned int idx);

Using the ring struct:
unsigned long ring__consumer_pos(const struct ring *r);
unsigned long ring__producer_pos(const struct ring *r);
size_t ring__avail_data_size(const struct ring *r);
size_t ring__size(const struct ring *r);
int ring__map_fd(const struct ring *r);
int ring__consume(struct ring *r);

Changes in v2:
- Addressed all feedback from Andrii Nakryiko
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
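How the position getters relate to the available data size can be sketched with a stand-alone model. This does not call libbpf: struct model_ring and the model_ring_* functions are hypothetical, and the sketch assumes that the consumer and producer positions are monotonically increasing counters and that the ring size is a power of two stored as a mask.

```c
#include <stddef.h>

/* Hypothetical model of one ring inside a ring buffer manager. */
struct model_ring {
	unsigned long consumer_pos; /* logical position read so far */
	unsigned long producer_pos; /* logical position written so far */
	unsigned long mask;         /* ring size minus one (size is a power of two) */
};

/* Total size of the ring's data area in bytes. */
size_t model_ring_size(const struct model_ring *r)
{
	return r->mask + 1;
}

/*
 * Bytes produced but not yet consumed. Unsigned subtraction keeps the
 * result correct even after the counters wrap around.
 */
size_t model_ring_avail_data_size(const struct model_ring *r)
{
	return r->producer_pos - r->consumer_pos;
}
```

For a ring with mask 4095, consumer position 4096 and producer position 4160, the model reports a size of 4096 bytes with 64 bytes waiting to be consumed.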
2023-09-25selftests/bpf: Add tests for ring__consumeMartin Kelly
Add tests for new API ring__consume. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-15-martin.kelly@crowdstrike.com
2023-09-25libbpf: Add ring__consumeMartin Kelly
Add ring__consume to consume a single ringbuffer, analogous to ring_buffer__consume. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-14-martin.kelly@crowdstrike.com
2023-09-25selftests/bpf: Add tests for ring__map_fdMartin Kelly
Add tests for the new API ring__map_fd. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-13-martin.kelly@crowdstrike.com
2023-09-25libbpf: Add ring__map_fdMartin Kelly
Add ring__map_fd to get the file descriptor underlying a given ringbuffer. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-12-martin.kelly@crowdstrike.com
2023-09-25selftests/bpf: Add tests for ring__sizeMartin Kelly
Add tests for the new API ring__size. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-11-martin.kelly@crowdstrike.com
2023-09-25libbpf: Add ring__sizeMartin Kelly
Add ring__size to get the total size of a given ringbuffer. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-10-martin.kelly@crowdstrike.com