linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2022-04-26	mmc: improve API to make clear hw_reset callback is for cards	Wolfram Sang
	To make it unambiguous that the hw_reset callback is for cards and not for controllers, we add 'card' to the callback name and convert all users in one go. We keep the argument as mmc_host, though, because the callback is used very early when mmc_card is not yet populated. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Link: https://lore.kernel.org/r/20220408080045.6497-4-wsa+renesas@sang-engineering.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-04-26	mmc: core: improve API to make clear that mmc_sw_reset is for cards	Wolfram Sang
	To make it unambiguous that mmc_sw_reset() is for cards and not for controllers, we make the function argument mmc_card instead of mmc_host. There are no users to convert currently. Suggested-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Link: https://lore.kernel.org/r/20220408080045.6497-3-wsa+renesas@sang-engineering.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-04-26	MAINTAINERS: Add linux-renesas-soc@vger.kernel.org list for Renesas ↵	Lad Prabhakar
	TMIO/SDHI driver Add linux-renesas-soc@vger.kernel.org list entry for Renesas TMIO/SDHI driver. Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Link: https://lore.kernel.org/r/20220404174159.571-1-prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-04-26	mmc: renesas_sdhi: remove superfluous specific M3W entry	Wolfram Sang
	We don't need to specify the Gen3 compatible entry for M3W because it will be provided by the generic Gen3 fallback. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20220404130551.20209-1-wsa+renesas@sang-engineering.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-04-26	mmc: renesas_sdhi: R-Car V3H ES2.0 gained HS400 support	Wolfram Sang
	The hardware evolved, so we only need to disable HS400 support on ES1.* revisions. Update the code. Signed-off-by: Takeshi Saito <takeshi.saito.xv@renesas.com> [wsa: refactored to top-of-tree] Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20220404123404.16289-1-wsa+renesas@sang-engineering.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-04-26	mmc: omap: Make it CCF clk API compatible	Janusz Krzysztofik
	The driver, OMAP specific, now omits clk_prepare/unprepare() steps, not supported by OMAP custom implementation of clock API. However, non-CCF stubs of those functions exist for use on such platforms until converted to CCF. Update the driver to be compatible with CCF implementation of clock API. Signed-off-by: Janusz Krzysztofik <jmkrzyszt@gmail.com> Link: https://lore.kernel.org/r/20220402112004.129886-1-jmkrzyszt@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-04-26	mmc: mmc_spi: parse speed mode options	Christian Löhle
	Since SD and MMC Highspeed modes are also valid for SPI let's parse them too. Signed-off-by: Christian Loehle <cloehle@hyperstone.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20c6efa9a4c7423bbfb9352705c4a53a@hyperstone.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-04-26	mmc: core: block: fix sloppy typing in mmc_blk_ioctl_multi_cmd()	Sergey Shtylyov
	Despite mmc_ioc_multi_cmd::num_of_cmds is a 64-bit field, its maximum value is limited to MMC_IOC_MAX_CMDS (only 255); using a 64-bit local variable to hold a copy of that field leads to gcc generating ineffective loop code: despite the source code using an int variable for the loop counters, the 32-bit object code uses 64-bit unsigned counters. Also, gcc has to drop the most significant word of that 64-bit variable when calling kcalloc() and assigning to mmc_queue_req::ioc_count anyway. Using the unsigned int variable instead results in a better code. Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool. Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/eea3b0bd-6091-f005-7189-b5b7868abdb6@omp.ru Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-04-26	dt-bindings: mmc: mtk-sd: increase reg items	Tinghan Shen
	MediaTek has a new version of mmc IP since mt8183. Some IO registers are moved to top to improve hardware design and named as "host top registers". Add host top register in the reg binding description for mt8183 and successors. Signed-off-by: Wenbin Mei <wenbin.mei@mediatek.com> Signed-off-by: Tinghan Shen <tinghan.shen@mediatek.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20220330094532.21721-2-tinghan.shen@mediatek.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-04-26	dt-bindings: mmc: xenon: Convert to JSON schema	Chris Packham
	Convert the marvell,xenon-sdhci binding to JSON schema. Currently the in-tree dts files don't validate because they use sdhci@ instead of mmc@ as required by the generic mmc-controller schema. The compatible "marvell,sdhci-xenon" was not documented in the old binding but it accompanies the of "marvell,armada-3700-sdhci" in the armada-37xx SoC dtsi so this combination is added to the new binding document. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20220329220544.2132135-3-chris.packham@alliedtelesis.co.nz Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-04-26	mmc: renesas_sdhi: R-Car V3M also has no HS400	Wolfram Sang
	Further digging in the datasheets revealed that R-Car V3M also has no HS400 support. Reported-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Link: https://lore.kernel.org/r/20220404105831.5096-1-wsa+renesas@sang-engineering.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-04-26	mmc: renesas_sdhi: Add missing checks for the presence of quirks	Geert Uytterhoeven
	When running on an system without any quirks (e.g. R-Car V3U), the kernel crashes with a NULL pointer dereference: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000002 ... Hardware name: Renesas Falcon CPU and Breakout boards based on r8a779a0 (DT) Workqueue: events_freezable mmc_rescan ... Call trace: renesas_sdhi_internal_dmac_start_dma+0x54/0x12c tmio_process_mrq+0x124/0x274 Fix this by adding the missing checks for the validatity of the priv->quirks pointer. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Link: https://lore.kernel.org/r/cc3178c2ff60f640f4d5a071d51f6b0b1db37656.1648822020.git.geert+renesas@glider.be Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-04-26	mmc: mmci: stm32: use a buffer for unaligned DMA requests	Yann Gautier
	In SDIO mode, the sg list for requests can be unaligned with what the STM32 SDMMC internal DMA can support. In that case, instead of failing, use a temporary bounce buffer to copy from/to the sg list. This buffer is limited to 1MB. But for that we need to also limit max_req_size to 1MB. It has not shown any throughput penalties for SD-cards or eMMC. Signed-off-by: Yann Gautier <yann.gautier@foss.st.com> Link: https://lore.kernel.org/r/20220328145114.334577-1-yann.gautier@foss.st.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-04-26	mmc: renesas_sdhi: style fix for proper function bodies	Wolfram Sang
	Put the braces to the proper position to make reading the code easier. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Link: https://lore.kernel.org/r/20220320124538.62028-1-wsa+renesas@sang-engineering.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-04-26	mmc: renesas_sdhi: make 'dmac_only_one_rx' a quirk	Wolfram Sang
	After Shimoda-san's much appreciated refactoring of the quirk handling, we can convert now 'dmac_only_one_rx' from an ugly global flag to a regular quirk. This makes quirk handling more consistent and easier to maintain. After this patch, soc_dma_quirks is completely gone, hooray! Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Link: https://lore.kernel.org/r/20220320123016.57991-7-wsa+renesas@sang-engineering.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-04-26	mmc: renesas_sdhi: make 'fixed_addr_mode' a quirk	Wolfram Sang
	After Shimoda-san's much appreciated refactoring of the quirk handling, we can convert now the 'fixed_addr_mode' from an ugly global flag to a regular quirk. This makes quirk handling more consistent and easier to maintain. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20220320123016.57991-6-wsa+renesas@sang-engineering.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-04-26	mmc: renesas_sdhi: remove a stale comment	Wolfram Sang
	The whitelist has been refactored away with a0fb3fc8af01 ("mmc: renesas_sdhi: remove whitelist for internal DMAC") so the comment doesn't make any sense anymore. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20220320123016.57991-5-wsa+renesas@sang-engineering.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-04-26	mmc: renesas_sdhi: make setup selection more understandable	Wolfram Sang
	When I read 'no_fallback', I forgot what fallback even though I was the author of this change. Name it better to make the code easier to understand. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20220320123016.57991-4-wsa+renesas@sang-engineering.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-04-26	mmc: renesas_sdhi: R-Car D3 also has no HS400	Wolfram Sang
	It is not explicitly expressed in the docs, but the needed data strobe pin is indeed missing for D3. The BSP disables HS400 as well. This means a little refactoring to reuse an already existing setup. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20220320123016.57991-3-wsa+renesas@sang-engineering.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-04-26	mmc: renesas_sdhi: remove outdated headers	Wolfram Sang
	We moved quirk handling out of the SDHI core to the individual drivers. So, no need to include headers needed for soc_device_match et al. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20220320123016.57991-2-wsa+renesas@sang-engineering.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-04-26	mmc: core: Set HS clock speed before sending HS CMD13	Brian Norris
	Way back in commit 4f25580fb84d ("mmc: core: changes frequency to hs_max_dtr when selecting hs400es"), Rockchip engineers noticed that some eMMC don't respond to SEND_STATUS commands very reliably if they're still running at a low initial frequency. As mentioned in that commit, JESD84-B51 P49 suggests a sequence in which the host: 1. sets HS_TIMING 2. bumps the clock ("<= 52 MHz") 3. sends further commands It doesn't exactly require that we don't use a lower-than-52MHz frequency, but in practice, these eMMC don't like it. The aforementioned commit tried to get that right for HS400ES, although it's unclear whether this ever truly worked as committed into mainline, as other changes/refactoring adjusted the sequence in conflicting ways: 08573eaf1a70 ("mmc: mmc: do not use CMD13 to get status after speed mode switch") 53e60650f74e ("mmc: core: Allow CMD13 polling when switching to HS mode for mmc") In any case, today we do step 3 before step 2. Let's fix that, and also apply the same logic to HS200/400, where this eMMC has problems too. Resolves errors like this seen when booting some RK3399 Gru/Scarlet systems: [ 2.058881] mmc1: CQHCI version 5.10 [ 2.097545] mmc1: SDHCI controller on fe330000.mmc [fe330000.mmc] using ADMA [ 2.209804] mmc1: mmc_select_hs400es failed, error -84 [ 2.215597] mmc1: error -84 whilst initialising MMC card [ 2.417514] mmc1: mmc_select_hs400es failed, error -110 [ 2.423373] mmc1: error -110 whilst initialising MMC card [ 2.605052] mmc1: mmc_select_hs400es failed, error -110 [ 2.617944] mmc1: error -110 whilst initialising MMC card [ 2.835884] mmc1: mmc_select_hs400es failed, error -110 [ 2.841751] mmc1: error -110 whilst initialising MMC card Ealier versions of this patch bumped to 200MHz/HS200 speeds too early, which caused issues on, e.g., qcom-msm8974-fairphone-fp2. (Thanks for the report Luca!) After a second look, it appears that aligns with JESD84 / page 45 / table 28, so we need to keep to lower (HS / 52 MHz) rates first. Fixes: 08573eaf1a70 ("mmc: mmc: do not use CMD13 to get status after speed mode switch") Fixes: 53e60650f74e ("mmc: core: Allow CMD13 polling when switching to HS mode for mmc") Fixes: 4f25580fb84d ("mmc: core: changes frequency to hs_max_dtr when selecting hs400es") Cc: Shawn Lin <shawn.lin@rock-chips.com> Link: https://lore.kernel.org/linux-mmc/11962455.O9o76ZdvQC@g550jk/ Reported-by: Luca Weiss <luca@z3ntu.xyz> Signed-off-by: Brian Norris <briannorris@chromium.org> Tested-by: Luca Weiss <luca@z3ntu.xyz> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220422100824.v4.1.I484f4ee35609f78b932bd50feed639c29e64997e@changeid Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-04-26	usb: dwc3: pci: add support for the Intel Meteor Lake-P	Heikki Krogerus
	This patch adds the necessary PCI IDs for Intel Meteor Lake-P devices. Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Cc: stable <stable@kernel.org> Link: https://lore.kernel.org/r/20220425103518.44028-1-heikki.krogerus@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-26	virtio_net: fix wrong buf address calculation when using xdp	Nikolay Aleksandrov
	We received a report[1] of kernel crashes when Cilium is used in XDP mode with virtio_net after updating to newer kernels. After investigating the reason it turned out that when using mergeable bufs with an XDP program which adjusts xdp.data or xdp.data_meta page_to_buf() calculates the build_skb address wrong because the offset can become less than the headroom so it gets the address of the previous page (-X bytes depending on how lower offset is): page_to_skb: page addr ffff9eb2923e2000 buf ffff9eb2923e1ffc offset 252 headroom 256 This is a pr_err() I added in the beginning of page_to_skb which clearly shows offset that is less than headroom by adding 4 bytes of metadata via an xdp prog. The calculations done are: receive_mergeable(): headroom = VIRTIO_XDP_HEADROOM; // VIRTIO_XDP_HEADROOM == 256 bytes offset = xdp.data - page_address(xdp_page) - vi->hdr_len - metasize; page_to_skb(): p = page_address(page) + offset; ... buf = p - headroom; Now buf goes -4 bytes from the page's starting address as can be seen above which is set as skb->head and skb->data by build_skb later. Depending on what's done with the skb (when it's freed most often) we get all kinds of corruptions and BUG_ON() triggers in mm[2]. We have to recalculate the new headroom after the xdp program has run, similar to how offset and len are recalculated. Headroom is directly related to data_hard_start, data and data_meta, so we use them to get the new size. The result is correct (similar pr_err() in page_to_skb, one case of xdp_page and one case of virtnet buf): a) Case with 4 bytes of metadata [ 115.949641] page_to_skb: page addr ffff8b4dcfad2000 offset 252 headroom 252 [ 121.084105] page_to_skb: page addr ffff8b4dcf018000 offset 20732 headroom 252 b) Case of pushing data +32 bytes [ 153.181401] page_to_skb: page addr ffff8b4dd0c4d000 offset 288 headroom 288 [ 158.480421] page_to_skb: page addr ffff8b4dd00b0000 offset 24864 headroom 288 c) Case of pushing data -33 bytes [ 835.906830] page_to_skb: page addr ffff8b4dd3270000 offset 223 headroom 223 [ 840.839910] page_to_skb: page addr ffff8b4dcdd68000 offset 12511 headroom 223 Offset and headroom are equal because offset points to the start of reserved bytes for the virtio_net header which are at buf start + headroom, while data points at buf start + vnet hdr size + headroom so when data or data_meta are adjusted by the xdp prog both the headroom size and the offset change equally. We can use data_hard_start to compute the new headroom after the xdp prog (linearized / page start case, the virtnet buf case is similar just with bigger base offset): xdp.data_hard_start = page_address + vnet_hdr xdp.data = page_address + vnet_hdr + headroom new headroom after xdp prog = xdp.data - xdp.data_hard_start - metasize An example reproducer xdp prog[3] is below. [1] https://github.com/cilium/cilium/issues/19453 [2] Two of the many traces: [ 40.437400] BUG: Bad page state in process swapper/0 pfn:14940 [ 40.916726] BUG: Bad page state in process systemd-resolve pfn:053b7 [ 41.300891] kernel BUG at include/linux/mm.h:720! [ 41.301801] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 41.302784] CPU: 1 PID: 1181 Comm: kubelet Kdump: loaded Tainted: G B W 5.18.0-rc1+ #37 [ 41.304458] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014 [ 41.306018] RIP: 0010:page_frag_free+0x79/0xe0 [ 41.306836] Code: 00 00 75 ea 48 8b 07 a9 00 00 01 00 74 e0 48 8b 47 48 48 8d 50 ff a8 01 48 0f 45 fa eb d0 48 c7 c6 18 b8 30 a6 e8 d7 f8 fc ff <0f> 0b 48 8d 78 ff eb bc 48 8b 07 a9 00 00 01 00 74 3a 66 90 0f b6 [ 41.310235] RSP: 0018:ffffac05c2a6bc78 EFLAGS: 00010292 [ 41.311201] RAX: 000000000000003e RBX: 0000000000000000 RCX: 0000000000000000 [ 41.312502] RDX: 0000000000000001 RSI: ffffffffa6423004 RDI: 00000000ffffffff [ 41.313794] RBP: ffff993c98823600 R08: 0000000000000000 R09: 00000000ffffdfff [ 41.315089] R10: ffffac05c2a6ba68 R11: ffffffffa698ca28 R12: ffff993c98823600 [ 41.316398] R13: ffff993c86311ebc R14: 0000000000000000 R15: 000000000000005c [ 41.317700] FS: 00007fe13fc56740(0000) GS:ffff993cdd900000(0000) knlGS:0000000000000000 [ 41.319150] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 41.320152] CR2: 000000c00008a000 CR3: 0000000014908000 CR4: 0000000000350ee0 [ 41.321387] Call Trace: [ 41.321819] <TASK> [ 41.322193] skb_release_data+0x13f/0x1c0 [ 41.322902] __kfree_skb+0x20/0x30 [ 41.343870] tcp_recvmsg_locked+0x671/0x880 [ 41.363764] tcp_recvmsg+0x5e/0x1c0 [ 41.384102] inet_recvmsg+0x42/0x100 [ 41.406783] ? sock_recvmsg+0x1d/0x70 [ 41.428201] sock_read_iter+0x84/0xd0 [ 41.445592] ? 0xffffffffa3000000 [ 41.462442] new_sync_read+0x148/0x160 [ 41.479314] ? 0xffffffffa3000000 [ 41.496937] vfs_read+0x138/0x190 [ 41.517198] ksys_read+0x87/0xc0 [ 41.535336] do_syscall_64+0x3b/0x90 [ 41.551637] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 41.568050] RIP: 0033:0x48765b [ 41.583955] Code: e8 4a 35 fe ff eb 88 cc cc cc cc cc cc cc cc e8 fb 7a fe ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30 [ 41.632818] RSP: 002b:000000c000a2f5b8 EFLAGS: 00000212 ORIG_RAX: 0000000000000000 [ 41.664588] RAX: ffffffffffffffda RBX: 000000c000062000 RCX: 000000000048765b [ 41.681205] RDX: 0000000000005e54 RSI: 000000c000e66000 RDI: 0000000000000016 [ 41.697164] RBP: 000000c000a2f608 R08: 0000000000000001 R09: 00000000000001b4 [ 41.713034] R10: 00000000000000b6 R11: 0000000000000212 R12: 00000000000000e9 [ 41.728755] R13: 0000000000000001 R14: 000000c000a92000 R15: ffffffffffffffff [ 41.744254] </TASK> [ 41.758585] Modules linked in: br_netfilter bridge veth netconsole virtio_net and [ 33.524802] BUG: Bad page state in process systemd-network pfn:11e60 [ 33.528617] page ffffe05dc0147b00 ffffe05dc04e7a00 ffff8ae9851ec000 (1) len 82 offset 252 metasize 4 hroom 0 hdr_len 12 data ffff8ae9851ec10c data_meta ffff8ae9851ec108 data_end ffff8ae9851ec14e [ 33.529764] page:000000003792b5ba refcount:0 mapcount:-512 mapping:0000000000000000 index:0x0 pfn:0x11e60 [ 33.532463] flags: 0xfffffc0000000(node=0\|zone=1\|lastcpupid=0x1fffff) [ 33.532468] raw: 000fffffc0000000 0000000000000000 dead000000000122 0000000000000000 [ 33.532470] raw: 0000000000000000 0000000000000000 00000000fffffdff 0000000000000000 [ 33.532471] page dumped because: nonzero mapcount [ 33.532472] Modules linked in: br_netfilter bridge veth netconsole virtio_net [ 33.532479] CPU: 0 PID: 791 Comm: systemd-network Kdump: loaded Not tainted 5.18.0-rc1+ #37 [ 33.532482] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014 [ 33.532484] Call Trace: [ 33.532496] <TASK> [ 33.532500] dump_stack_lvl+0x45/0x5a [ 33.532506] bad_page.cold+0x63/0x94 [ 33.532510] free_pcp_prepare+0x290/0x420 [ 33.532515] free_unref_page+0x1b/0x100 [ 33.532518] skb_release_data+0x13f/0x1c0 [ 33.532524] kfree_skb_reason+0x3e/0xc0 [ 33.532527] ip6_mc_input+0x23c/0x2b0 [ 33.532531] ip6_sublist_rcv_finish+0x83/0x90 [ 33.532534] ip6_sublist_rcv+0x22b/0x2b0 [3] XDP program to reproduce(xdp_pass.c): #include <linux/bpf.h> #include <bpf/bpf_helpers.h> SEC("xdp_pass") int xdp_pkt_pass(struct xdp_md *ctx) { bpf_xdp_adjust_head(ctx, -(int)32); return XDP_PASS; } char _license[] SEC("license") = "GPL"; compile: clang -O2 -g -Wall -target bpf -c xdp_pass.c -o xdp_pass.o load on virtio_net: ip link set enp1s0 xdpdrv obj xdp_pass.o sec xdp_pass CC: stable@vger.kernel.org CC: Jason Wang <jasowang@redhat.com> CC: Xuan Zhuo <xuanzhuo@linux.alibaba.com> CC: Daniel Borkmann <daniel@iogearbox.net> CC: "Michael S. Tsirkin" <mst@redhat.com> CC: virtualization@lists.linux-foundation.org Fixes: 8fb7da9e9907 ("virtio_net: get build_skb() buf by data ptr") Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20220425103703.3067292-1-razor@blackwall.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-26	net: dsa: mv88e6xxx: Fix port_hidden_wait to account for port_base_addr	Nathan Rossi
	The other port_hidden functions rely on the port_read/port_write functions to access the hidden control port. These functions apply the offset for port_base_addr where applicable. Update port_hidden_wait to use the port_wait_bit so that port_base_addr offsets are accounted for when waiting for the busy bit to change. Without the offset the port_hidden_wait function would timeout on devices that have a non-zero port_base_addr (e.g. MV88E6141), however devices that have a zero port_base_addr would operate correctly (e.g. MV88E6390). Fixes: 609070133aff ("net: dsa: mv88e6xxx: update code operating on hidden registers") Signed-off-by: Nathan Rossi <nathan@nathanrossi.com> Reviewed-by: Marek Behún <kabel@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220425070454.348584-1-nathan@nathanrossi.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-26	net: phy: marvell10g: fix return value on error	Baruch Siach
	Return back the error value that we get from phy_read_mmd(). Fixes: c84786fa8f91 ("net: phy: marvell10g: read copper results from CSSR1") Signed-off-by: Baruch Siach <baruch.siach@siklu.com> Reviewed-by: Marek Behún <kabel@kernel.org> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/f47cb031aeae873bb008ba35001607304a171a20.1650868058.git.baruch@tkos.co.il Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-26	bug: Have __warn() prototype defined unconditionally	Shida Zhang
	The __warn() prototype is declared in CONFIG_BUG scope but the function definition in panic.c is unconditional. The IBT enablement started using it unconditionally but a CONFIG_X86_KERNEL_IBT=y, CONFIG_BUG=n .config will trigger a arch/x86/kernel/traps.c: In function ‘__exc_control_protection’: arch/x86/kernel/traps.c:249:17: error: implicit declaration of function \ ‘__warn’; did you mean ‘pr_warn’? [-Werror=implicit-function-declaration] Pull up the declarations so that they're unconditionally visible too. [ bp: Rewrite commit message. ] Fixes: 991625f3dd2c ("x86/ibt: Add IBT feature, MSR and #CP handling") Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Shida Zhang <zhangshida@kylinos.cn> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20220426032007.510245-1-starzhangzsd@gmail.com
2022-04-26	net: bcmgenet: hide status block before TX timestamping	Jonathan Lemon
	The hardware checksum offloading requires use of a transmit status block inserted before the outgoing frame data, this was updated in '9a9ba2a4aaaa ("net: bcmgenet: always enable status blocks")' However, skb_tx_timestamp() assumes that it is passed a raw frame and PTP parsing chokes on this status block. Fix this by calling __skb_pull(), which hides the TSB before calling skb_tx_timestamp(), so an outgoing PTP packet is parsed correctly. As the data in the skb has already been set up for DMA, and the dma_unmap_* calls use a separately stored address, there is no no effective change in the data transmission. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220424165307.591145-1-jonathan.lemon@gmail.com Fixes: d03825fba459 ("net: bcmgenet: add skb_tx_timestamp call") Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-26	MAINTAINERS: omap: remove me as a maintainer	Rajendra Nayak
	The codeaurora.org domain is no longer valid, remove my id from the maintainers for OMAP PM frameworks. I haven't contributed to them in years, neither do I plan to in the future so not updating this with my new quicinc id. Signed-off-by: Rajendra Nayak <quic_rjendra@quicinc.com> Message-Id: <1649824983-29400-1-git-send-email-quic_rjendra@quicinc.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2022-04-26	mtd: mtdoops: Add a timestamp to the mtdoops header.	Jean-Marc Eurin
	On some systems, the oops only has relative time from boot. Signed-off-by: Jean-Marc Eurin <jmeurin@google.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20220425160927.3823016-1-jmeurin@google.com
2022-04-26	mtd: mtdoops: Create a header structure for the saved mtdoops.	Jean-Marc Eurin
	Create a dump header to enable the addition of fields without having to modify the rest of the code. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Jean-Marc Eurin <jmeurin@google.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20220421234244.2172003-3-jmeurin@google.com
2022-04-26	mtd: mtdoops: Fix the size of the header read buffer.	Jean-Marc Eurin
	The read buffer size depends on the MTDOOPS_HEADER_SIZE. Tested: Changed the header size, it doesn't panic, header is still read/written correctly. Signed-off-by: Jean-Marc Eurin <jmeurin@google.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20220421234244.2172003-2-jmeurin@google.com
2022-04-26	mctp: defer the kfree of object mdev->addrs	Lin Ma
	The function mctp_unregister() reclaims the device's relevant resource when a netcard detaches. However, a running routine may be unaware of this and cause the use-after-free of the mdev->addrs object. The race condition can be demonstrated below cleanup thread another thread \| unregister_netdev() \| mctp_sendmsg() ... \| ... mctp_unregister() \| rt = mctp_route_lookup() ... \| mctl_local_output() kfree(mdev->addrs) \| ... \| saddr = rt->dev->addrs[0]; \| An attacker can adopt the (recent provided) mtcpserial driver with pty to fake the device detaching and use the userfaultfd to increase the race success chance (in mctp_sendmsg). The KASan report for such a POC is shown below: [ 86.051955] ================================================================== [ 86.051955] BUG: KASAN: use-after-free in mctp_local_output+0x4e9/0xb7d [ 86.051955] Read of size 1 at addr ffff888005f298c0 by task poc/295 [ 86.051955] [ 86.051955] Call Trace: [ 86.051955] <TASK> [ 86.051955] dump_stack_lvl+0x33/0x42 [ 86.051955] print_report.cold.13+0xb2/0x6b3 [ 86.051955] ? preempt_schedule_irq+0x57/0x80 [ 86.051955] ? mctp_local_output+0x4e9/0xb7d [ 86.051955] kasan_report+0xa5/0x120 [ 86.051955] ? mctp_local_output+0x4e9/0xb7d [ 86.051955] mctp_local_output+0x4e9/0xb7d [ 86.051955] ? mctp_dev_set_key+0x79/0x79 [ 86.051955] ? copyin+0x38/0x50 [ 86.051955] ? _copy_from_iter+0x1b6/0xf20 [ 86.051955] ? sysvec_apic_timer_interrupt+0x97/0xb0 [ 86.051955] ? asm_sysvec_apic_timer_interrupt+0x12/0x20 [ 86.051955] ? mctp_local_output+0x1/0xb7d [ 86.051955] mctp_sendmsg+0x64d/0xdb0 [ 86.051955] ? mctp_sk_close+0x20/0x20 [ 86.051955] ? __fget_light+0x2fd/0x4f0 [ 86.051955] ? mctp_sk_close+0x20/0x20 [ 86.051955] sock_sendmsg+0xdd/0x110 [ 86.051955] __sys_sendto+0x1cc/0x2a0 [ 86.051955] ? __ia32_sys_getpeername+0xa0/0xa0 [ 86.051955] ? new_sync_write+0x335/0x550 [ 86.051955] ? alloc_file+0x22f/0x500 [ 86.051955] ? __ip_do_redirect+0x820/0x1820 [ 86.051955] ? vfs_write+0x44d/0x7b0 [ 86.051955] ? vfs_write+0x44d/0x7b0 [ 86.051955] ? fput_many+0x15/0x120 [ 86.051955] ? ksys_write+0x155/0x1b0 [ 86.051955] ? __ia32_sys_read+0xa0/0xa0 [ 86.051955] __x64_sys_sendto+0xd8/0x1b0 [ 86.051955] ? exit_to_user_mode_prepare+0x2f/0x120 [ 86.051955] ? syscall_exit_to_user_mode+0x12/0x20 [ 86.051955] do_syscall_64+0x3a/0x80 [ 86.051955] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 86.051955] RIP: 0033:0x7f82118a56b3 [ 86.051955] RSP: 002b:00007ffdb154b110 EFLAGS: 00000293 ORIG_RAX: 000000000000002c [ 86.051955] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f82118a56b3 [ 86.051955] RDX: 0000000000000010 RSI: 00007f8211cd4000 RDI: 0000000000000007 [ 86.051955] RBP: 00007ffdb154c1d0 R08: 00007ffdb154b164 R09: 000000000000000c [ 86.051955] R10: 0000000000000000 R11: 0000000000000293 R12: 000055d779800db0 [ 86.051955] R13: 00007ffdb154c2b0 R14: 0000000000000000 R15: 0000000000000000 [ 86.051955] </TASK> [ 86.051955] [ 86.051955] Allocated by task 295: [ 86.051955] kasan_save_stack+0x1c/0x40 [ 86.051955] __kasan_kmalloc+0x84/0xa0 [ 86.051955] mctp_rtm_newaddr+0x242/0x610 [ 86.051955] rtnetlink_rcv_msg+0x2fd/0x8b0 [ 86.051955] netlink_rcv_skb+0x11c/0x340 [ 86.051955] netlink_unicast+0x439/0x630 [ 86.051955] netlink_sendmsg+0x752/0xc00 [ 86.051955] sock_sendmsg+0xdd/0x110 [ 86.051955] __sys_sendto+0x1cc/0x2a0 [ 86.051955] __x64_sys_sendto+0xd8/0x1b0 [ 86.051955] do_syscall_64+0x3a/0x80 [ 86.051955] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 86.051955] [ 86.051955] Freed by task 301: [ 86.051955] kasan_save_stack+0x1c/0x40 [ 86.051955] kasan_set_track+0x21/0x30 [ 86.051955] kasan_set_free_info+0x20/0x30 [ 86.051955] __kasan_slab_free+0x104/0x170 [ 86.051955] kfree+0x8c/0x290 [ 86.051955] mctp_dev_notify+0x161/0x2c0 [ 86.051955] raw_notifier_call_chain+0x8b/0xc0 [ 86.051955] unregister_netdevice_many+0x299/0x1180 [ 86.051955] unregister_netdevice_queue+0x210/0x2f0 [ 86.051955] unregister_netdev+0x13/0x20 [ 86.051955] mctp_serial_close+0x6d/0xa0 [ 86.051955] tty_ldisc_kill+0x31/0xa0 [ 86.051955] tty_ldisc_hangup+0x24f/0x560 [ 86.051955] __tty_hangup.part.28+0x2ce/0x6b0 [ 86.051955] tty_release+0x327/0xc70 [ 86.051955] __fput+0x1df/0x8b0 [ 86.051955] task_work_run+0xca/0x150 [ 86.051955] exit_to_user_mode_prepare+0x114/0x120 [ 86.051955] syscall_exit_to_user_mode+0x12/0x20 [ 86.051955] do_syscall_64+0x46/0x80 [ 86.051955] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 86.051955] [ 86.051955] The buggy address belongs to the object at ffff888005f298c0 [ 86.051955] which belongs to the cache kmalloc-8 of size 8 [ 86.051955] The buggy address is located 0 bytes inside of [ 86.051955] 8-byte region [ffff888005f298c0, ffff888005f298c8) [ 86.051955] [ 86.051955] The buggy address belongs to the physical page: [ 86.051955] flags: 0x100000000000200(slab\|node=0\|zone=1) [ 86.051955] raw: 0100000000000200 dead000000000100 dead000000000122 ffff888005c42280 [ 86.051955] raw: 0000000000000000 0000000080660066 00000001ffffffff 0000000000000000 [ 86.051955] page dumped because: kasan: bad access detected [ 86.051955] [ 86.051955] Memory state around the buggy address: [ 86.051955] ffff888005f29780: 00 fc fc fc fc 00 fc fc fc fc 00 fc fc fc fc 00 [ 86.051955] ffff888005f29800: fc fc fc fc 00 fc fc fc fc 00 fc fc fc fc 00 fc [ 86.051955] >ffff888005f29880: fc fc fc fb fc fc fc fc fa fc fc fc fc fa fc fc [ 86.051955] ^ [ 86.051955] ffff888005f29900: fc fc 00 fc fc fc fc 00 fc fc fc fc 00 fc fc fc [ 86.051955] ffff888005f29980: fc 00 fc fc fc fc 00 fc fc fc fc 00 fc fc fc fc [ 86.051955] ================================================================== To this end, just like the commit e04480920d1e ("Bluetooth: defer cleanup of resources in hci_unregister_dev()") this patch defers the destructive kfree(mdev->addrs) in mctp_unregister to the mctp_dev_put, where the refcount of mdev is zero and the entire device is reclaimed. This prevents the use-after-free because the sendmsg thread holds the reference of mdev in the mctp_route object. Fixes: 583be982d934 (mctp: Add device handling and netlink interface) Signed-off-by: Lin Ma <linma@zju.edu.cn> Acked-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://lore.kernel.org/r/20220422114340.32346-1-linma@zju.edu.cn Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-26	drm/i915/fbc: Consult hw.crtc instead of uapi.crtc	Ville Syrjälä
	plane_state->uapi.crtc is not what we want to be looking at. If bigjoiner is used hw.crtc is what tells us what crtc the plane is supposedly using. Not an actual problem on current hardware as the only FBC capable pipe (A) can't be a bigjoiner slave and thus uapi.crtc==hw.crtc always here. But when we get more FBC instances this will become actually important. Fixes: 2e6c99f88679 ("drm/i915/fbc: Nuke lots of crap from intel_fbc_state_cache") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220413152852.7336-1-ville.syrjala@linux.intel.com Reviewed-by: Manasi Navare <manasi.d.navare@intel.com> (cherry picked from commit 3e1faae3398789abe8d4797255bfe28d95d81308) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2022-04-26	drm/i915: Fix SEL_FETCH_PLANE_*(PIPE_B+) register addresses	Imre Deak
	Fix typo in the _SEL_FETCH_PLANE_BASE_1_B register base address. Fixes: a5523e2ff074a5 ("drm/i915: Add PSR2 selective fetch registers") References: https://gitlab.freedesktop.org/drm/intel/-/issues/5400 Cc: José Roberto de Souza <jose.souza@intel.com> Cc: <stable@vger.kernel.org> # v5.9+ Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220421162221.2261895-1-imre.deak@intel.com (cherry picked from commit af2cbc6ef967f61711a3c40fca5366ea0bc7fecc) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2022-04-26	cpufreq: qcom-cpufreq-hw: Clear dcvs interrupts	Vladimir Zapolskiy
	It's noted that dcvs interrupts are not self-clearing, thus an interrupt handler runs constantly, which leads to a severe regression in runtime. To fix the problem an explicit write to clear interrupt register is required, note that on OSM platforms the register may not be present. Fixes: 275157b367f4 ("cpufreq: qcom-cpufreq-hw: Add dcvs interrupt support") Signed-off-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2022-04-26	tty: n_gsm: fix sometimes uninitialized warning in gsm_dlci_modem_output()	Daniel Starke
	'size' may be used uninitialized in gsm_dlci_modem_output() if called with an adaption that is neither 1 nor 2. The function is currently only called by gsm_modem_upd_via_data() and only for adaption 2. Properly handle every invalid case by returning -EINVAL to silence the compiler warning and avoid future regressions. Fixes: c19ffe00fed6 ("tty: n_gsm: fix invalid use of MSC in advanced option") Cc: stable@vger.kernel.org Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Daniel Starke <daniel.starke@siemens.com> Link: https://lore.kernel.org/r/20220425104726.7986-1-daniel.starke@siemens.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-26	documentation: zonefs: Document sysfs attributes	Damien Le Moal
	Document the max_wro_seq_files, nr_wro_seq_files, max_active_seq_files and nr_active_seq_files sysfs attributes. Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
2022-04-25	null-blk: save memory footprint for struct nullb_cmd	Yu Kuai
	Total 16 bytes can be saved in two ways: 1) The field 'bio' will only be used in bio based mode, and the field 'rq' will only be used in mq mode. Since they won't be used in the same time, declare a union for them. 2) The field 'bool fake_timeout' can be placed in the hole after the field 'error'. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20220426022133.3999006-1-yukuai3@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-25	NFSv4: Don't invalidate inode attributes on delegation return	Trond Myklebust
	There is no need to declare attributes such as the ctime, mtime and block size invalid when we're just returning a delegation, so it is inappropriate to call nfs_post_op_update_inode_force_wcc(). Instead, just call nfs_refresh_inode() after faking up the change attribute. We know that the GETATTR op occurs before the DELEGRETURN, so we are safe when doing this. Fixes: 0bc2c9b4dca9 ("NFSv4: Don't discard the attributes returned by asynchronous DELEGRETURN") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-04-25	Merge tag 'sunxi-clk-fixes-for-5.18-2' of ↵	Stephen Boyd
	https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into clk-fixes Pull Allwinner clk fixes from Jernej Skrabec: - Add missing sentinel - check return value for platform_get_resource() - mark rtc-32k as critical * tag 'sunxi-clk-fixes-for-5.18-2' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux: clk: sunxi: sun9i-mmc: check return value after calling platform_get_resource() clk: sunxi-ng: sun6i-rtc: Mark rtc-32k as critical clk: sunxi-ng: fix not NULL terminated coccicheck error
2022-04-25	io_uring: fix compile warning for 32-bit builds	Jens Axboe
	If IO_URING_SCM_ALL isn't set, as it would not be on 32-bit builds, then we trigger a warning: fs/io_uring.c: In function '__io_sqe_files_unregister': fs/io_uring.c:8992:13: warning: unused variable 'i' [-Wunused-variable] 8992 \| int i; \| ^ Move the ifdef up to include the 'i' variable declaration. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: 5e45690a1cb8 ("io_uring: store SCM state in io_fixed_file->file_ptr") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-25	md: Replace role magic numbers with defined constants	David Sloan
	There are several instances where magic numbers are used in md.c instead of the defined constants in md_p.h. This patch set improves code readability by replacing all occurrences of 0xffff, 0xfffe, and 0xfffd when relating to md roles with their equivalent defined constant. Signed-off-by: David Sloan <david.sloan@eideticom.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Song Liu <song@kernel.org>
2022-04-25	md/raid0: Ignore RAID0 layout if the second zone has only one device	Pascal Hambourg
	The RAID0 layout is irrelevant if all members have the same size so the array has only one zone. It is also irrelevant if the array has two zones and the second zone has only one device, for example if the array has two members of different sizes. So in that case it makes sense to allow assembly even when the layout is undefined, like what is done when the array has only one zone. Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Pascal Hambourg <pascal@plouf.fr.eu.org> Signed-off-by: Song Liu <song@kernel.org>
2022-04-25	md/raid5: Annotate functions that hold device_lock with __must_hold	Logan Gunthorpe
	A handful of functions note the device_lock must be held with a comment but this is not comprehensive. Many other functions hold the lock when taken so add an __must_hold() to each call to annotate when the lock is held. This makes it a bit easier to analyse device_lock. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Song Liu <song@kernel.org>
2022-04-25	md/raid5-ppl: Annotate with rcu_dereference_protected()	Logan Gunthorpe
	To suppress the last remaining sparse warnings about accessing rdev, add rcu_dereference_protected calls to a couple places in raid5-ppl. All of these places are called under raid5_run and therefore are occurring before the array has started and is thus safe. There's no sensible check to do for the second argument of rcu_dereference_protected() so a comment is added instead. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Song Liu <song@kernel.org>
2022-04-25	md/raid5: Annotate rdev/replacement access when mddev_lock is held	Logan Gunthorpe
	The mddev_lock should be held during raid5_remove_disk() which is when the rdev/replacement pointers are modified. So any access to these pointers marked __rcu should be safe whenever the mddev_lock is held. There are numerous such access that currently produce sparse warnings. Add a helper function, rdev_mdlock_deref() that wraps rcu_dereference_protected() in all these instances. This annotation fixes a number of sparse warnings. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Song Liu <song@kernel.org>
2022-04-25	md/raid5: Annotate rdev/replacement accesses when nr_pending is elevated	Logan Gunthorpe
	There are a number of accesses to __rcu variables that should be safe because nr_pending in the disk is known to be elevated. Create a wrapper around rcu_dereference_protected() to annotate these accesses and verify that nr_pending is non-zero. This fixes a number of sparse warnings. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Song Liu <song@kernel.org>
2022-04-25	md/raid5: Add __rcu annotation to struct disk_info	Logan Gunthorpe
	rdev and replacement are protected in some circumstances with rcu_dereference and synchronize_rcu (in raid5_remove_disk()). However, they were not annotated with __rcu so a sparse warning is emitted for every rcu_dereference() call. Add the __rcu annotation and fix up the initialization with RCU_INIT_POINTER, all pointer modifications with rcu_assign_pointer(), a few cases where the pointer value is tested with rcu_access_pointer() and one case where READ_ONCE() is used instead of rcu_dereference(), a case in print_raid5_conf() that should have rcu_dereference() and rcu_read_[un]lock() calls. Additional sparse issues will be fixed up in further commits. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Song Liu <song@kernel.org>
2022-04-25	md/raid5: Un-nest struct raid5_percpu definition	Logan Gunthorpe
	Sparse reports many warnings of the form: drivers/md/raid5.c:1476:16: warning: dereference of noderef expression This is because all struct raid5_percpu definitions get marked as __percpu when really only the pointer in r5conf should have that annotation. Fix this by moving the defnition of raid5_precpu out of the definition of struct r5conf. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Song Liu <song@kernel.org>
2022-04-25	md/raid5: Cleanup setup_conf() error returns	Logan Gunthorpe
	Be more careful about the error returns. Most errors in this function are actually ENOMEM, but it forcibly returns EIO if conf has been allocated. Instead return ret and ensure it is set appropriately before each goto abort. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Song Liu <song@kernel.org>