|
Use readl_poll_timeout_atomic(), because msdc_reset_hw() may be
invoked from the IRQ handler in the following context:
msdc_irq() -> msdc_cmd_done() -> msdc_reset_hw()
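A minimal sketch of the atomic polling idea (the register offset and
bit name below are illustrative, not the actual mtk-sd definitions):

  #include <linux/bits.h>
  #include <linux/io.h>
  #include <linux/iopoll.h>
  #include <linux/printk.h>

  #define MSDC_CFG_SKETCH	0x0	/* illustrative register offset */
  #define MSDC_CFG_RST_SKETCH	BIT(2)	/* illustrative reset bit */

  static void msdc_reset_hw_sketch(void __iomem *base)
  {
  	u32 val;

  	/*
  	 * readl_poll_timeout() sleeps via usleep_range() and must not be
  	 * used in IRQ context; the _atomic variant busy-waits instead.
  	 */
  	if (readl_poll_timeout_atomic(base + MSDC_CFG_SKETCH, val,
  				      !(val & MSDC_CFG_RST_SKETCH), 0, 1000))
  		pr_warn("msdc: reset polling timed out\n");
  }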
The following kernel BUG stack trace can be observed on
Genio 1200 EVK after initializing MSDC1 hardware during kernel boot:
[ 1.187441] BUG: scheduling while atomic: swapper/0/0/0x00010002
[ 1.189157] Modules linked in:
[ 1.204633] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 5.15.42-mtk+modified #1
[ 1.205713] Hardware name: MediaTek Genio 1200 EVK-P1V2-EMMC (DT)
[ 1.206484] Call trace:
[ 1.206796] dump_backtrace+0x0/0x1ac
[ 1.207266] show_stack+0x24/0x30
[ 1.207692] dump_stack_lvl+0x68/0x84
[ 1.208162] dump_stack+0x1c/0x38
[ 1.208587] __schedule_bug+0x68/0x80
[ 1.209056] __schedule+0x6ec/0x7c0
[ 1.209502] schedule+0x7c/0x110
[ 1.209915] schedule_hrtimeout_range_clock+0xc4/0x1f0
[ 1.210569] schedule_hrtimeout_range+0x20/0x30
[ 1.211148] usleep_range_state+0x84/0xc0
[ 1.211661] msdc_reset_hw+0xc8/0x1b0
[ 1.212134] msdc_cmd_done.isra.0+0x4ac/0x5f0
[ 1.212693] msdc_irq+0x104/0x2d4
[ 1.213121] __handle_irq_event_percpu+0x68/0x280
[ 1.213725] handle_irq_event+0x70/0x15c
[ 1.214230] handle_fasteoi_irq+0xb0/0x1a4
[ 1.214755] handle_domain_irq+0x6c/0x9c
[ 1.215260] gic_handle_irq+0xc4/0x180
[ 1.215741] call_on_irq_stack+0x2c/0x54
[ 1.216245] do_interrupt_handler+0x5c/0x70
[ 1.216782] el1_interrupt+0x30/0x80
[ 1.217242] el1h_64_irq_handler+0x1c/0x2c
[ 1.217769] el1h_64_irq+0x78/0x7c
[ 1.218206] cpuidle_enter_state+0xc8/0x600
[ 1.218744] cpuidle_enter+0x44/0x5c
[ 1.219205] do_idle+0x224/0x2d0
[ 1.219624] cpu_startup_entry+0x30/0x80
[ 1.220129] rest_init+0x108/0x134
[ 1.220568] arch_call_rest_init+0x1c/0x28
[ 1.221094] start_kernel+0x6c0/0x700
[ 1.221564] __primary_switched+0xc0/0xc8
Fixes: ffaea6ebfe9c ("mmc: mtk-sd: Use readl_poll_timeout instead of open-coded polling")
Signed-off-by: Pablo Sun <pablo.sun@mediatek.com>
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioachino.delregno@collabora.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230922095348.22182-1-pablo.sun@mediatek.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Userspace currently has no way of checking the internal R1 response
error bits for some commands. This is a problem for commands like
RPMB, for example: typically, we may detect that the busy completion
has successfully ended, while in fact the card did not complete the
requested operation.
To fix the problem, let's always poll with CMD13 for these commands and
during the polling, let's also aggregate the R1 response bits. Before
completing the ioctl request, let's propagate the R1 response bits too.
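A minimal sketch of the aggregation idea, assuming a hypothetical
helper (this is not the actual mmc_blk implementation):

  #include <linux/mmc/card.h>
  #include <linux/mmc/core.h>
  #include <linux/mmc/mmc.h>

  static int poll_busy_aggregate_r1(struct mmc_card *card, u32 *resp_bits)
  {
  	struct mmc_command cmd = {
  		.opcode = MMC_SEND_STATUS,	/* CMD13 */
  		.arg = card->rca << 16,
  		.flags = MMC_RSP_SPI_R2 | MMC_RSP_R1 | MMC_CMD_AC,
  	};
  	int err;

  	do {
  		err = mmc_wait_for_cmd(card->host, &cmd, 0);
  		if (err)
  			return err;
  		/* OR in every status word so error bits are not lost. */
  		*resp_bits |= cmd.resp[0];
  	} while (!(cmd.resp[0] & R1_READY_FOR_DATA) ||
  		 R1_CURRENT_STATE(cmd.resp[0]) == R1_STATE_PRG);

  	return 0;
  }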
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Co-developed-by: Christian Loehle <christian.loehle@arm.com>
Signed-off-by: Christian Loehle <christian.loehle@arm.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230913112921.553019-1-ulf.hansson@linaro.org
|
|
Return an error code if sdhci_sprd_get_best_clk_sample() fails.
Currently, it returns success.
Fixes: d83d251bf3c2 ("mmc: sdhci-sprd: Add SD HS mode online tuning")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Wenchao Chen <wenchao.chen@unisoc.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/a8af0a08-8405-43cc-bd83-85ff25f572ca@moroto.mountain
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
To improve the r/w performance of GL9763E, the current driver inhibits LPM
negotiation while the device is active.
This prevents a large number of SoCs from suspending, notably x86 systems
which commonly use S0ix as the suspend mechanism - for example, Intel
Alder Lake and Raptor Lake processors.
Failure description:
1. Userspace initiates s2idle suspend (e.g. via writing to
/sys/power/state)
2. This switches the runtime_pm device state to active, which disables
LPM negotiation, then calls the "regular" suspend callback
3. With LPM negotiation disabled, the bus cannot enter low-power state
4. On a large number of SoCs, if the bus is not in a low-power state,
S0ix cannot be entered, which in turn prevents the SoC from entering
suspend.
Fix by re-enabling LPM negotiation in the device's suspend callback.
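A rough sketch of the fix's shape; the helper name
gl9763e_set_low_power_negotiation() and the exact callback wiring are
assumptions, not necessarily the driver's real symbols:

  static int gl9763e_suspend_sketch(struct sdhci_pci_chip *chip)
  {
  	struct sdhci_pci_slot *slot = chip->slots[0];

  	/*
  	 * Runtime PM kept LPM negotiation disabled while active; turn it
  	 * back on so the bus can reach a low-power state and S0ix entry
  	 * is not blocked.
  	 */
  	gl9763e_set_low_power_negotiation(slot, true);	/* assumed helper */
  	return sdhci_suspend_host(slot->host);
  }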
Suggested-by: Stanislaw Kardach <skardach@google.com>
Fixes: f9e5b33934ce ("mmc: host: Improve I/O read/write performance for GL9763E")
Cc: stable@vger.kernel.org
Signed-off-by: Sven van Ashbrook <svenva@chromium.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20230831160055.v3.1.I7ed1ca09797be2dd76ca914c57d88b32d24dac88@changeid
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Tuning is only supported in 4-bit or 8-bit mode, so retuning needs
to be held in 1-bit mode.
This issue was found when using the manual tuning method on i.MX93:
when the system resumes, the SDIO WiFi tries to switch back to 4-bit
mode, which first triggers retuning, and all tuning commands fail.
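A conceptual sketch of holding retuning around 1-bit operation, using
the MMC core's retune hold/release helpers; the actual fix may be
structured differently:

  static void update_retune_for_bus_width(struct mmc_host *host,
  					unsigned int bus_width)
  {
  	if (bus_width == MMC_BUS_WIDTH_1)
  		mmc_retune_hold(host);		/* tuning commands would fail in 1-bit mode */
  	else
  		mmc_retune_release(host);	/* 4-/8-bit mode: retuning allowed again */
  }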
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: dfa13ebbe334 ("mmc: host: Add facility to support re-tuning")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230830093922.3095850-1-haibo.chen@nxp.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
In the TXT binding before conversion, the "xo" clock was listed as
optional. Conversion kept it optional in "clock-names", but not in
"clocks". This fixes dbts_check warnings like:
qcom-sdx65-mtp.dtb: mmc@8804000: clocks: [[13, 59], [13, 58]] is too short
Cc: <stable@vger.kernel.org>
Fixes: a45537723f4b ("dt-bindings: mmc: sdhci-msm: Convert bindings to yaml")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230825135503.282135-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
I've changed employers, have company email that deals with patch-based
workflows without too much of a headache, and am trying to steer some
content out of my personal mail.
Signed-off-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Link: https://lore.kernel.org/r/20230925030647.40283-1-andrew@codeconstruct.com.au
Signed-off-by: Joel Stanley <joel@jms.id.au>
|
|
The description for joel/aspeed.git on git.kernel.org currently says:
Old Aspeed tree. Please see joel/bmc.git
Let's update MAINTAINERS accordingly.
Signed-off-by: Zev Weiss <zev@bewilderbeest.net>
Acked-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20230922223405.24717-2-zev@bewilderbeest.net
Signed-off-by: Joel Stanley <joel@jms.id.au>
|
|
Fix the deadlock by refactoring the MR cache cleanup flow to flush the
workqueue without holding the rb_lock.
This adds a race between cache cleanup and the creation of new
entries, which we solve by denying creation of new entries after
cache cleanup has started.
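A simplified sketch of the reworked ordering (the lock and work names
follow the lockdep report below; the other field names are
illustrative):

  static void mkey_cache_cleanup_sketch(struct mlx5_ib_dev *dev)
  {
  	struct mlx5_cache_ent *ent;
  	struct rb_node *node;

  	mutex_lock(&dev->cache.rb_lock);
  	dev->cache.disable = true;	/* deny creation of new entries */
  	mutex_unlock(&dev->cache.rb_lock);

  	/*
  	 * rb_lock is released here, so the delayed work can take it
  	 * itself without deadlocking; 'disable' keeps the tree stable.
  	 */
  	for (node = rb_first(&dev->cache.rb_root); node; node = rb_next(node)) {
  		ent = rb_entry(node, struct mlx5_cache_ent, node);
  		cancel_delayed_work_sync(&ent->dwork);
  	}
  }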
Lockdep:
WARNING: possible circular locking dependency detected
[ 2785.326074 ] 6.2.0-rc6_for_upstream_debug_2023_01_31_14_02 #1 Not tainted
[ 2785.339778 ] ------------------------------------------------------
[ 2785.340848 ] devlink/53872 is trying to acquire lock:
[ 2785.341701 ] ffff888124f8c0c8 ((work_completion)(&(&ent->dwork)->work)){+.+.}-{0:0}, at: __flush_work+0xc8/0x900
[ 2785.343403 ]
[ 2785.343403 ] but task is already holding lock:
[ 2785.344464 ] ffff88817e8f1260 (&dev->cache.rb_lock){+.+.}-{3:3}, at: mlx5_mkey_cache_cleanup+0x77/0x250 [mlx5_ib]
[ 2785.346273 ]
[ 2785.346273 ] which lock already depends on the new lock.
[ 2785.346273 ]
[ 2785.347720 ]
[ 2785.347720 ] the existing dependency chain (in reverse order) is:
[ 2785.349003 ]
[ 2785.349003 ] -> #1 (&dev->cache.rb_lock){+.+.}-{3:3}:
[ 2785.350160 ] __mutex_lock+0x14c/0x15c0
[ 2785.350962 ] delayed_cache_work_func+0x2d1/0x610 [mlx5_ib]
[ 2785.352044 ] process_one_work+0x7c2/0x1310
[ 2785.352879 ] worker_thread+0x59d/0xec0
[ 2785.353636 ] kthread+0x28f/0x330
[ 2785.354370 ] ret_from_fork+0x1f/0x30
[ 2785.355135 ]
[ 2785.355135 ] -> #0 ((work_completion)(&(&ent->dwork)->work)){+.+.}-{0:0}:
[ 2785.356515 ] __lock_acquire+0x2d8a/0x5fe0
[ 2785.357349 ] lock_acquire+0x1c1/0x540
[ 2785.358121 ] __flush_work+0xe8/0x900
[ 2785.358852 ] __cancel_work_timer+0x2c7/0x3f0
[ 2785.359711 ] mlx5_mkey_cache_cleanup+0xfb/0x250 [mlx5_ib]
[ 2785.360781 ] mlx5_ib_stage_pre_ib_reg_umr_cleanup+0x16/0x30 [mlx5_ib]
[ 2785.361969 ] __mlx5_ib_remove+0x68/0x120 [mlx5_ib]
[ 2785.362960 ] mlx5r_remove+0x63/0x80 [mlx5_ib]
[ 2785.363870 ] auxiliary_bus_remove+0x52/0x70
[ 2785.364715 ] device_release_driver_internal+0x3c1/0x600
[ 2785.365695 ] bus_remove_device+0x2a5/0x560
[ 2785.366525 ] device_del+0x492/0xb80
[ 2785.367276 ] mlx5_detach_device+0x1a9/0x360 [mlx5_core]
[ 2785.368615 ] mlx5_unload_one_devl_locked+0x5a/0x110 [mlx5_core]
[ 2785.369934 ] mlx5_devlink_reload_down+0x292/0x580 [mlx5_core]
[ 2785.371292 ] devlink_reload+0x439/0x590
[ 2785.372075 ] devlink_nl_cmd_reload+0xaef/0xff0
[ 2785.372973 ] genl_family_rcv_msg_doit.isra.0+0x1bd/0x290
[ 2785.374011 ] genl_rcv_msg+0x3ca/0x6c0
[ 2785.374798 ] netlink_rcv_skb+0x12c/0x360
[ 2785.375612 ] genl_rcv+0x24/0x40
[ 2785.376295 ] netlink_unicast+0x438/0x710
[ 2785.377121 ] netlink_sendmsg+0x7a1/0xca0
[ 2785.377926 ] sock_sendmsg+0xc5/0x190
[ 2785.378668 ] __sys_sendto+0x1bc/0x290
[ 2785.379440 ] __x64_sys_sendto+0xdc/0x1b0
[ 2785.380255 ] do_syscall_64+0x3d/0x90
[ 2785.381031 ] entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 2785.381967 ]
[ 2785.381967 ] other info that might help us debug this:
[ 2785.381967 ]
[ 2785.383448 ] Possible unsafe locking scenario:
[ 2785.383448 ]
[ 2785.384544 ] CPU0 CPU1
[ 2785.385383 ] ---- ----
[ 2785.386193 ] lock(&dev->cache.rb_lock);
[ 2785.386940 ] lock((work_completion)(&(&ent->dwork)->work));
[ 2785.388327 ] lock(&dev->cache.rb_lock);
[ 2785.389425 ] lock((work_completion)(&(&ent->dwork)->work));
[ 2785.390414 ]
[ 2785.390414 ] *** DEADLOCK ***
[ 2785.390414 ]
[ 2785.391579 ] 6 locks held by devlink/53872:
[ 2785.392341 ] #0: ffffffff84c17a50 (cb_lock){++++}-{3:3}, at: genl_rcv+0x15/0x40
[ 2785.393630 ] #1: ffff888142280218 (&devlink->lock_key){+.+.}-{3:3}, at: devlink_get_from_attrs_lock+0x12d/0x2d0
[ 2785.395324 ] #2: ffff8881422d3c38 (&dev->lock_key){+.+.}-{3:3}, at: mlx5_unload_one_devl_locked+0x4a/0x110 [mlx5_core]
[ 2785.397322 ] #3: ffffffffa0e59068 (mlx5_intf_mutex){+.+.}-{3:3}, at: mlx5_detach_device+0x60/0x360 [mlx5_core]
[ 2785.399231 ] #4: ffff88810e3cb0e8 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0x8d/0x600
[ 2785.400864 ] #5: ffff88817e8f1260 (&dev->cache.rb_lock){+.+.}-{3:3}, at: mlx5_mkey_cache_cleanup+0x77/0x250 [mlx5_ib]
Fixes: b95845178328 ("RDMA/mlx5: Change the cache structure to an RB-tree")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
checkpatch is complaining about a NULL string, so change it to 'Unknown'.
Fixes: 37aa5c36aa70 ("IB/mlx5: Add UARs write-combining and non-cached mapping")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Link: https://lore.kernel.org/r/8638e5c14fadbde5fa9961874feae917073af920.1695203958.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
The mutex was not unlocked on some of the error flows.
Moved the unlock location to include all the error flow scenarios.
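A generic sketch of the pattern (the step names are illustrative, not
the actual steering-anchor code):

  static int create_anchor_sketch(struct mlx5_ib_dev *dev)
  {
  	int err;

  	mutex_lock(&dev->flow_db->lock);

  	err = create_indirect_flow_table_sketch(dev);	/* illustrative step */
  	if (err)
  		goto unlock;

  	err = attach_anchor_rule_sketch(dev);		/* illustrative step */

  unlock:
  	mutex_unlock(&dev->flow_db->lock);	/* single unlock covers all flows */
  	return err;
  }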
Fixes: e1f4a52ac171 ("RDMA/mlx5: Create an indirect flow table for steering anchor")
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Hamdan Igbaria <hamdani@nvidia.com>
Link: https://lore.kernel.org/r/1244a69d783da997c0af0b827c622eb00495492e.1695203958.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
After the change to use dynamic cache structure, new cache entries
can be added and the mkey allocation can no longer assume that all
mkeys created for the cache have access_flags equal to zero.
Example of a flow that exposes the issue:
A user registers MR with RO on a HCA that cannot UMR RO and the mkey is
created outside of the cache. When the user deregisters the MR, a new
cache entry is created to store mkeys with RO.
Later, the user registers 2 MRs with RO. The first MR is reused from the
new cache entry. When we try to get the second mkey from the cache, we
see that the entry is empty, so we go to the MR cache mkey allocation
flow, which would have allocated an mkey with no access flags,
resulting in the user getting an MR without RO.
Fixes: dd1b913fb0d0 ("RDMA/mlx5: Cache all user cacheable mkeys on dereg MR flow")
Reviewed-by: Edward Srouji <edwards@nvidia.com>
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Link: https://lore.kernel.org/r/8a802700b82def3ace3f77cd7a9ad9d734af87e7.1695203958.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
As the fll_id and pll_id can be zero (WM8960_SYSCLK_AUTO) since
commit 2bbc2df46e67 ("ASoC: wm8960: Make automatic the default
clocking mode"), the machine driver will skip calling set_sysclk()
and set_pll() for the codec. When the sysclk rate differs from what
wm8960 read at probe, the output sound frequency is wrong.
So change the fll_id and pll_id initial values, while keeping the
machine driver's behavior the same as before.
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://lore.kernel.org/r/1695202992-24864-1-git-send-email-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
rbd_dev_refresh() has been holding header_rwsem across header and
parent info read-in unnecessarily for ages. With commit 870611e4877e
("rbd: get snapshot context after exclusive lock is ensured to be
held"), the potential for deadlocks became much more real owning to
a) header_rwsem now nesting inside lock_rwsem and b) rw_semaphores
not allowing new readers after a writer is registered.
For example, assuming that I/O request 1, I/O request 2 and header
read-in request all target the same OSD:
1. I/O request 1 comes in and gets submitted
2. watch error occurs
3. rbd_watch_errcb() takes lock_rwsem for write, clears owner_cid and
releases lock_rwsem
4. after reestablishing the watch, rbd_reregister_watch() calls
rbd_dev_refresh() which takes header_rwsem for write and submits
a header read-in request
5. I/O request 2 comes in: after taking lock_rwsem for read in
__rbd_img_handle_request(), it blocks trying to take header_rwsem
for read in rbd_img_object_requests()
6. another watch error occurs
7. rbd_watch_errcb() blocks trying to take lock_rwsem for write
8. I/O request 1 completion is received by the messenger but can't be
processed because lock_rwsem won't be granted anymore
9. header read-in request completion can't be received, let alone
processed, because the messenger is stranded
Change rbd_dev_refresh() to take header_rwsem only for actually
updating rbd_dev->header. Header and parent info read-in don't need
any locking.
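A simplified sketch of the resulting flow (the signatures of the
helpers introduced in the commits below are assumed):

  static int rbd_dev_refresh_sketch(struct rbd_device *rbd_dev)
  {
  	struct rbd_image_header header = { 0 };
  	int ret;

  	/* Header (and parent info) read-in: no locks held. */
  	ret = rbd_dev_header_info(rbd_dev, &header);
  	if (ret)
  		return ret;

  	/* Only the in-place update of rbd_dev->header needs the lock. */
  	down_write(&rbd_dev->header_rwsem);
  	rbd_dev_update_header(rbd_dev, &header);
  	up_write(&rbd_dev->header_rwsem);

  	return 0;
  }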
Cc: stable@vger.kernel.org # 0b035401c570: rbd: move rbd_dev_refresh() definition
Cc: stable@vger.kernel.org # 510a7330c82a: rbd: decouple header read-in from updating rbd_dev->header
Cc: stable@vger.kernel.org # c10311776f0a: rbd: decouple parent info read-in from updating rbd_dev
Cc: stable@vger.kernel.org
Fixes: 870611e4877e ("rbd: get snapshot context after exclusive lock is ensured to be held")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
|
|
Unlike header read-in, parent info read-in is already decoupled in
get_parent_info(), but it's buried in rbd_dev_v2_parent_info() along
with the processing logic.
Separate the initial read-in and update read-in logic into
rbd_dev_setup_parent() and rbd_dev_update_parent() respectively and
have rbd_dev_v2_parent_info() just populate struct parent_image_info
(i.e. what get_parent_info() did). Some existing QoI issues, like
flatten of a standalone clone being disregarded on refresh, remain.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
|
|
Make rbd_dev_header_info() populate a passed struct rbd_image_header
instead of rbd_dev->header and introduce rbd_dev_update_header() for
updating mutable fields in rbd_dev->header upon refresh. The initial
read-in of both mutable and immutable fields in rbd_dev_image_probe()
passes in rbd_dev->header so no update step is required there.
rbd_init_layout() is now called directly from rbd_dev_image_probe()
instead of individually in format 1 and format 2 implementations.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
|
|
Move rbd_dev_refresh() definition further down to avoid having to
move struct parent_image_info definition in the next commit. This
spares some forward declarations too.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
|
|
Make __ieee80211_data_to_8023() return more individual drop
reasons instead of just doing RX_DROP_U_INVALID_8023.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
This has many different reasons, split the return value into
the individual reasons for better traceability. Also, since
symbolic tracing doesn't work for these, add a few comments
for the numbering.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Convert all instances of RX_DROP_UNUSABLE to indicate a
better reason, and then remove RX_DROP_UNUSABLE.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
If we just check "result & RX_DROP_UNUSABLE", this really only works
by accident, because SKB_DROP_REASON_SUBSYS_MAC80211_UNUSABLE happens
to have the value 1, and SKB_DROP_REASON_SUBSYS_MAC80211_MONITOR is 2.
Fix this to really check the entire subsys mask for the value, so it
doesn't matter what the subsystem value is.
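A sketch of the corrected check using the generic dropreason
subsystem-field helpers (exact macro spellings may differ in-tree):

  static bool drop_is_mac80211_unusable(u32 result)
  {
  	/*
  	 * Compare the whole subsystem field rather than a single bit, so
  	 * the check does not depend on MAC80211_UNUSABLE happening to be 1.
  	 */
  	return (result & SKB_DROP_REASON_SUBSYS_MASK) ==
  	       ((u32)SKB_DROP_REASON_SUBSYS_MAC80211_UNUSABLE <<
  		SKB_DROP_REASON_SUBSYS_SHIFT);
  }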
Fixes: 7f4e09700bdc ("wifi: mac80211: report all unusable beacon frames")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Add the local_state_change request to the deauth trace for
easier debugging.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
When associating to an MLD AP, links may be disabled. Create all
resources associated with a disabled link so that we can later enable it
without having to create these resources on the fly.
Fixes: 6d543b34dbcf ("wifi: mac80211: Support disabled links during association")
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://lore.kernel.org/r/20230925173028.f9afdb26f6c7.I4e6e199aaefc1bf017362d64f3869645fa6830b5@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
If the structure is not initialized then boolean types might be copied
into the tracing data without being initialised. This causes data from
the stack to leak into the trace and also triggers a UBSAN failure which
can easily be avoided here.
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://lore.kernel.org/r/20230925171855.a9271ef53b05.I8180bae663984c91a3e036b87f36a640ba409817@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Drop one function parameter's kernel-doc comment since the parameter
was removed. This prevents a kernel-doc warning:
block/disk-events.c:300: warning: Excess function parameter 'events' description in 'disk_force_media_change'
Fixes: ab6860f62bfe ("block: simplify the disk_force_media_change interface")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Closes: lore.kernel.org/r/202309060957.vfl0mUur-lkp@intel.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20230926005232.23666-1-rdunlap@infradead.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Jiri Olsa says:
====================
hi,
at the moment we can't retrieve the number of missed kprobe
executions and subsequent execution of BPF programs.
This patchset adds:
- counting of missed execution on attach layer for:
. kprobes attached through perf link (kprobe/ftrace)
. kprobes attached through kprobe.multi link (fprobe)
- counting of recursion_misses for BPF kprobe programs
It's still technically possible to create kprobe without perf link (using
SET_BPF perf ioctl) in which case we don't have a way to retrieve the kprobe's
'missed' count. However both libbpf and cilium/ebpf libraries use perf link
if it's available, and for old kernels without perf link support we can use
BPF program to retrieve the kprobe missed count.
v3 changes:
- added acks [Song]
- make test_missed not serial [Andrii]
Also available at:
https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
bpf/missed_stats
thanks,
jirka
====================
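A userspace sketch of reading the new counter via link info (assumes
the 'missed' fields added by this series are present in the installed
UAPI headers):

  #include <stdio.h>
  #include <bpf/bpf.h>

  static void print_link_missed(int link_fd)
  {
  	struct bpf_link_info info = {};
  	__u32 len = sizeof(info);

  	if (bpf_link_get_info_by_fd(link_fd, &info, &len))
  		return;

  	if (info.type == BPF_LINK_TYPE_KPROBE_MULTI)
  		printf("missed: %llu\n",
  		       (unsigned long long)info.kprobe_multi.missed);
  	else if (info.type == BPF_LINK_TYPE_PERF_EVENT)
  		printf("missed: %llu\n",
  		       (unsigned long long)info.perf_event.kprobe.missed);
  }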
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
|
|
Adding selftest that puts kprobe on bpf_fentry_test1 that calls bpf_printk
and invokes bpf_trace_printk tracepoint. The bpf_trace_printk tracepoint
has test[234] programs attached to it.
Because kprobe execution goes through bpf_prog_active check, programs
attached to the tracepoint will fail the recursion check and increment the
recursion_misses stats.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Song Liu <song@kernel.org>
Reviewed-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/bpf/20230920213145.1941596-10-jolsa@kernel.org
|
|
Adding selftest that puts kprobe.multi on bpf_fentry_test1 that
calls bpf_kfunc_common_test kfunc which has 3 perf event kprobes
and 1 kprobe.multi attached.
Because fprobe (the kprobe.multi attach layer) does not have a strict
recursion check, the kprobe's bpf_prog_active check is hit for test2-5.
Disabling this test for arm64, because there's no fprobe support yet.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Song Liu <song@kernel.org>
Reviewed-by: Song Liu <song@kernel.org>
Acked-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/bpf/20230920213145.1941596-9-jolsa@kernel.org
|
|
Adding test that puts a kprobe on bpf_fentry_test1, which calls the
bpf_kfunc_common_test kfunc that also has a kprobe on it.
The latter won't get triggered due to the kprobe recursion check, and
the kprobe's missed counter is incremented.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/bpf/20230920213145.1941596-8-jolsa@kernel.org
|
|
Adding 'missed' field to display missed counts for kprobes
attached by perf event link, like:
# bpftool link
5: perf_event prog 82
kprobe ffffffff815203e0 ksys_write
6: perf_event prog 83
kprobe ffffffff811d1e50 scheduler_tick missed 682217
# bpftool link -jp
[{
"id": 5,
"type": "perf_event",
"prog_id": 82,
"retprobe": false,
"addr": 18446744071584220128,
"func": "ksys_write",
"offset": 0,
"missed": 0
},{
"id": 6,
"type": "perf_event",
"prog_id": 83,
"retprobe": false,
"addr": 18446744071580753488,
"func": "scheduler_tick",
"offset": 0,
"missed": 693469
}
]
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20230920213145.1941596-7-jolsa@kernel.org
|
|
Adding 'missed' field to display missed counts for kprobes
attached by kprobe multi link, like:
# bpftool link
5: kprobe_multi prog 76
kprobe.multi func_cnt 1 missed 1
addr func [module]
ffffffffa039c030 fp3_test [fprobe_test]
# bpftool link -jp
[{
"id": 5,
"type": "kprobe_multi",
"prog_id": 76,
"retprobe": false,
"func_cnt": 1,
"missed": 1,
"funcs": [{
"addr": 18446744072102723632,
"func": "fp3_test",
"module": "fprobe_test"
}
]
}
]
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20230920213145.1941596-6-jolsa@kernel.org
|
|
Increase the misses stats in case bpf prog array execution is skipped
because of the recursion check in trace_call_bpf().
Adding bpf_prog_inc_misses_counters that increases the misses counts
for all bpf programs in a bpf_prog_array.
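A kernel-side sketch of the helper described above (the in-tree
version may differ in detail):

  static void bpf_prog_inc_misses_counters_sketch(const struct bpf_prog_array *array)
  {
  	const struct bpf_prog_array_item *item;
  	struct bpf_prog *prog;

  	if (unlikely(!array))
  		return;

  	for (item = &array->items[0]; (prog = READ_ONCE(item->prog)); item++)
  		bpf_prog_inc_misses_counter(prog);	/* bump per-prog misses stat */
  }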
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Song Liu <song@kernel.org>
Reviewed-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/bpf/20230920213145.1941596-5-jolsa@kernel.org
|
|
Add missed value to kprobe attached through perf link info to
hold the stats of missed kprobe handler execution.
The kprobe's missed counter gets incremented when kprobe handler
is not executed due to another kprobe running on the same cpu.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230920213145.1941596-4-jolsa@kernel.org
|
|
Add missed value to kprobe_multi link info to hold the stats of missed
kprobe_multi probe.
The missed counter gets incremented when fprobe fails the recursion
check or there's no rethook available for return probe. In either
case the attached bpf program is not executed.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Song Liu <song@kernel.org>
Reviewed-by: Song Liu <song@kernel.org>
Acked-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/bpf/20230920213145.1941596-3-jolsa@kernel.org
|
|
Adding support to gather missed stats for kprobe_multi
programs due to bpf_prog_active protection.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Song Liu <song@kernel.org>
Reviewed-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/bpf/20230920213145.1941596-2-jolsa@kernel.org
|
|
Martin Kelly says:
====================
This patch series adds a new ring__ API to libbpf exposing getters for
accessing the individual ringbuffers inside a struct ring_buffer. This is
useful for polling individually, getting available data, or similar use
cases. The API looks like this, and was roughly proposed by Andrii Nakryiko
in another thread:
Getting a ring struct:
struct ring *ring_buffer__ring(struct ring_buffer *rb, unsigned int idx);
Using the ring struct:
unsigned long ring__consumer_pos(const struct ring *r);
unsigned long ring__producer_pos(const struct ring *r);
size_t ring__avail_data_size(const struct ring *r);
size_t ring__size(const struct ring *r);
int ring__map_fd(const struct ring *r);
int ring__consume(struct ring *r);
Changes in v2:
- Addressed all feedback from Andrii Nakryiko
====================
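A short usage sketch built from the signatures listed above (error
handling trimmed for brevity):

  #include <bpf/libbpf.h>

  static int handle_event(void *ctx, void *data, size_t size)
  {
  	return 0;	/* consume the sample and continue */
  }

  static void drain_first_ring(int map_fd)
  {
  	struct ring_buffer *rb = ring_buffer__new(map_fd, handle_event, NULL, NULL);
  	struct ring *r;

  	if (!rb)
  		return;

  	r = ring_buffer__ring(rb, 0);		/* first ringbuffer in rb */
  	if (r && ring__avail_data_size(r) > 0)
  		ring__consume(r);		/* consume just this one ring */

  	ring_buffer__free(rb);
  }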
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
|
|
Add tests for new API ring__consume.
Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230925215045.2375758-15-martin.kelly@crowdstrike.com
|
|
Add ring__consume to consume a single ringbuffer, analogous to
ring_buffer__consume.
Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230925215045.2375758-14-martin.kelly@crowdstrike.com
|
|
Add tests for the new API ring__map_fd.
Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230925215045.2375758-13-martin.kelly@crowdstrike.com
|
|
Add ring__map_fd to get the file descriptor underlying a given
ringbuffer.
Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230925215045.2375758-12-martin.kelly@crowdstrike.com
|
|
Add tests for the new API ring__size.
Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230925215045.2375758-11-martin.kelly@crowdstrike.com
|
|
Add ring__size to get the total size of a given ringbuffer.
Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230925215045.2375758-10-martin.kelly@crowdstrike.com
|
|
Add test for the new API ring__avail_data_size.
Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230925215045.2375758-9-martin.kelly@crowdstrike.com
|
|
Add ring__avail_data_size for querying the currently available data in
the ringbuffer, similar to the BPF_RB_AVAIL_DATA flag in
bpf_ringbuf_query. This is racy during ongoing operations but is still
useful for overall information on how a ringbuffer is behaving.
Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230925215045.2375758-8-martin.kelly@crowdstrike.com
|
|
Add tests for the new APIs ring__producer_pos and ring__consumer_pos.
Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230925215045.2375758-7-martin.kelly@crowdstrike.com
|
|
Add APIs to get the producer and consumer position for a given
ringbuffer.
Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230925215045.2375758-6-martin.kelly@crowdstrike.com
|
|
Add tests for the new API ring_buffer__ring.
Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230925215045.2375758-5-martin.kelly@crowdstrike.com
|
|
Add a new function ring_buffer__ring, which exposes struct ring * to the
user, representing a single ringbuffer.
Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230925215045.2375758-4-martin.kelly@crowdstrike.com
|
|
Switch rb->rings to be an array of pointers instead of a contiguous
block. This allows for each ring pointer to be stable after
ring_buffer__add is called, which allows us to expose struct ring * to
the user without gotchas. Without this change, the realloc in
ring_buffer__add could invalidate a struct ring *, making it unsafe to
give to the user.
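A toy illustration of the pointer-stability issue (not libbpf code):

  #include <stdlib.h>

  struct demo_ring { unsigned long consumer_pos; };

  /*
   * Array of structs: handing out &rings[i] is unsafe, because the next
   * realloc() may move the whole block and invalidate every such pointer.
   *
   * Array of pointers: realloc() only moves the pointer table; each
   * individually allocated struct keeps its address, so rings[i] stays
   * valid after later additions.
   */
  static struct demo_ring **rings;

  static struct demo_ring *add_ring_stable(size_t *cnt)
  {
  	struct demo_ring **tmp = realloc(rings, (*cnt + 1) * sizeof(*rings));

  	if (!tmp)
  		return NULL;
  	rings = tmp;
  	rings[*cnt] = calloc(1, sizeof(**rings));
  	if (!rings[*cnt])
  		return NULL;
  	return rings[(*cnt)++];
  }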
Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230925215045.2375758-3-martin.kelly@crowdstrike.com
|
|
Refactor the cleanup code in ring_buffer__add to use a unified err_out
label. This reduces code duplication, as well as plugging a potential
leak if mmap_sz != (__u64)(size_t)mmap_sz (currently this would miss
unmapping tmp because ringbuf_unmap_ring isn't called).
Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230925215045.2375758-2-martin.kelly@crowdstrike.com
|