linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2023-08-18	dt-bindings: crypto: qcom,prng: Add SM8450	Konrad Dybcio
	SM8450's PRNG does not require a core clock reference. Add a new compatible with a qcom,prng-ee fallback and handle that. Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Acked-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-08-18	crypto: drivers - avoid memcpy size warning	Arnd Bergmann
	Some configurations with gcc-12 or gcc-13 produce a warning for the source and destination of a memcpy() in atmel_sha_hmac_compute_ipad_hash() potentially overlapping: In file included from include/linux/string.h:254, from drivers/crypto/atmel-sha.c:15: drivers/crypto/atmel-sha.c: In function 'atmel_sha_hmac_compute_ipad_hash': include/linux/fortify-string.h:57:33: error: '__builtin_memcpy' accessing 129 or more bytes at offsets 408 and 280 overlaps 1 or more bytes at offset 408 [-Werror=restrict] 57 \| #define __underlying_memcpy __builtin_memcpy \| ^ include/linux/fortify-string.h:648:9: note: in expansion of macro '__underlying_memcpy' 648 \| __underlying_##op(p, q, __fortify_size); \ \| ^~~~~~~~~~~~~ include/linux/fortify-string.h:693:26: note: in expansion of macro '__fortify_memcpy_chk' 693 \| #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \ \| ^~~~~~~~~~~~~~~~~~~~ drivers/crypto/atmel-sha.c:1773:9: note: in expansion of macro 'memcpy' 1773 \| memcpy(hmac->opad, hmac->ipad, bs); \| ^~~~~~ The same thing happens in two more drivers that have the same logic: drivers/crypto/chelsio/chcr_algo.c: In function 'chcr_ahash_setkey': include/linux/fortify-string.h:57:33: error: '__builtin_memcpy' accessing 129 or more bytes at offsets 260 and 132 overlaps 1 or more bytes at offset 260 [-Werror=restrict] drivers/crypto/bcm/cipher.c: In function 'ahash_hmac_setkey': include/linux/fortify-string.h:57:33: error: '__builtin_memcpy' accessing between 129 and 4294967295 bytes at offsets 840 and 712 overlaps between 1 and 4294967167 bytes at offset 840 [-Werror=restrict] I don't think it can actually happen because the size is strictly bounded to the available block sizes, at most 128 bytes, though inlining decisions could lead gcc to not see that. Use the unsafe_memcpy() helper instead of memcpy(), with the only difference being that this skips the hardening checks that produce the warning. Suggested-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-08-18	hwrng: iproc-rng200 - Implement suspend and resume calls	Florian Fainelli
	Chips such as BCM7278 support system wide suspend/resume which will cause the HWRNG block to lose its state and reset to its power on reset register values. We need to cleanup and re-initialize the HWRNG for it to be functional coming out of a system suspend cycle. Fixes: c3577f6100ca ("hwrng: iproc-rng200 - Add support for BCM7278") Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-08-18	hwrng: core - Remove duplicated include	GUO Zihua
	Remove duplicated include of linux/random.h. Resolves checkincludes message. And adjust includes in alphabetical order. Signed-off-by: GUO Zihua <guozihua@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-08-18	crypto: exynos - fix Wvoid-pointer-to-enum-cast warning	Krzysztof Kozlowski
	'type' is an enum, thus cast of pointer on 64-bit compile test with W=1 causes: exynos-rng.c:280:14: error: cast to smaller integer type 'enum exynos_prng_type' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-08-18	crypto: qat - Remove unused function declarations	Yue Haibing
	Commit d8cba25d2c68 ("crypto: qat - Intel(R) QAT driver framework") declared but never implemented these functions. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Acked-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-08-18	crypto: allwinner - Remove unused function declarations	Yue Haibing
	Commit 06f751b61329 ("crypto: allwinner - Add sun8i-ce Crypto Engine") declared but never implemented sun8i_ce_enqueue(). Commit 56f6d5aee88d ("crypto: sun8i-ce - support hash algorithms") declared but never implemented sun8i_ce_hash(). Commit f08fcced6d00 ("crypto: allwinner - Add sun8i-ss cryptographic offloader") declared but never implemented sun8i_ss_enqueue(). Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-08-18	crypto: caam/jr - fix shared IRQ line handling	Horia Geantă
	There are cases when the interrupt status register (JRINTR) is non-zero, even though: 1. An interrupt was generated, but it was masked OR 2. There was no interrupt generated at all for the corresponding job ring. 1. The case when interrupt is masked (JRCFGR_LS[IMSK]=1b'1) while other events have happened and are being accounted for, e.g. -JRINTR[HALT]=2b'10 - input job ring underwent a flush of all on-going jobs and processing of still-existing jobs (sitting in the ring) has been halted -JRINTR[HALT]=2b'01 - input job ring is currently undergoing a flush -JRINTR[ENTER_FAIL]=1b'1 - SecMon / SNVS transitioned to FAIL MODE It doesn't matter whether these events would assert the interrupt signal or not, interrupt is anyhow masked. 2. The case when interrupt is not masked (JRCFGR_LS[IMSK]=1b'0), however the events accounted for in JRINTR do not generate interrupts, e.g.: -JRINTR[HALT]=2b'01 -JRINTR[ENTER_FAIL]=1b'1 and JRCFGR_MS[FAIL_MODE]=1b'0 Currently in these cases, when the JR interrupt handler is invoked (as a consequence of JR sharing the interrupt line with other devices - e.g. the two JRs on i.MX7ULP) it continues execution instead of returning IRQ_NONE. This could lead to situations like interrupt handler clearing JRINTR (and thus also the JRINTR[HALT] field) while corresponding job ring is suspended and then that job ring failing on resume path, due to expecting JRINTR[HALT]=b'10 and reading instead JRINTR[HALT]=b'00. Fix this by checking status of JRINTR[JRI] in the JR interrupt handler. If JRINTR[JRI]=1b'0, there was no interrupt generated for this JR and handler must return IRQ_NONE. Signed-off-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Meenakshi Aggarwal <meenakshi.aggarwal@nxp.com> Reviewed-by: Gaurav Jain <gaurav.jain@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-08-18	crypto: caam - increase the domain of write memory barrier to full system	Iuliana Prodan
	In caam_jr_enqueue, under heavy DDR load, smp_wmb() or dma_wmb() fail to make the input ring be updated before the CAAM starts reading it. So, CAAM will process, again, an old descriptor address and will put it in the output ring. This will make caam_jr_dequeue() to fail, since this old descriptor is not in the software ring. To fix this, use wmb() which works on the full system instead of inner/outer shareable domains. Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com> Signed-off-by: Meenakshi Aggarwal <meenakshi.aggarwal@nxp.com> Reviewed-by: Gaurav Jain <gaurav.jain@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-08-18	crypto: caam - fix unchecked return value error	Gaurav Jain
	error: Unchecked return value (CHECKED_RETURN) check_return: Calling sg_miter_next without checking return value fix: added check if(!sg_miter_next) Fixes: 8a2a0dd35f2e ("crypto: caam - strip input zeros from RSA input buffer") Signed-off-by: Gaurav Jain <gaurav.jain@nxp.com> Signed-off-by: Meenakshi Aggarwal <meenakshi.aggarwal@nxp.com> Reviewed-by: Gaurav Jain <gaurav.jain@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-08-18	crypto: caam - fix PM operations definition	Arnd Bergmann
	The newly added PM operations use the deprecated SIMPLE_DEV_PM_OPS() macro, causing a warning in some configurations: drivers/crypto/caam/ctrl.c:828:12: error: 'caam_ctrl_resume' defined but not used [-Werror=unused-function] 828 \| static int caam_ctrl_resume(struct device dev) \| ^~~~~~~~~~~~~~~~ drivers/crypto/caam/ctrl.c:818:12: error: 'caam_ctrl_suspend' defined but not used [-Werror=unused-function] 818 \| static int caam_ctrl_suspend(struct device dev) \| ^~~~~~~~~~~~~~~~~ drivers/crypto/caam/jr.c:732:12: error: 'caam_jr_resume' defined but not used [-Werror=unused-function] 732 \| static int caam_jr_resume(struct device dev) \| ^~~~~~~~~~~~~~ drivers/crypto/caam/jr.c:687:12: error: 'caam_jr_suspend' defined but not used [-Werror=unused-function] 687 \| static int caam_jr_suspend(struct device dev) \| ^~~~~~~~~~~~~~~ Use the normal DEFINE_SIMPLE_DEV_PM_OPS() variant now, and use pm_ptr() to completely eliminate the structure in configs without CONFIG_PM. Fixes: 322d74752c28a ("crypto: caam - add power management support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Meenakshi Aggarwal <meenakshi.aggarwal@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-08-18	Revert "crypto: caam - adjust RNG timing to support more devices"	Herbert Xu
	This reverts commit ef492d080302913e85122a2d92efa2ca174930f8. This patch breaks the RNG on i.MX8MM. Reported-by: Bastian Krause <bst@pengutronix.de> Link: https://lore.kernel.org/all/e1f3f073-9d5e-1bae-f4f8-08dc48adad62@pengutronix.de/ Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-08-18	drm: bridge: samsung-dsim: Fix init during host transfer	Frieder Schrempf
	In case the downstream bridge or panel uses DSI transfers before the DSI host was actually initialized through samsung_dsim_atomic_enable() which clears the stop state (LP11) mode, all transfers will fail. This happens with downstream bridges that are controlled by DSI commands such as the tc358762. As documented in [1] DSI hosts are expected to allow transfers outside the normal bridge enable/disable flow. To fix this make sure that stop state is cleared in samsung_dsim_host_transfer() which restores the previous behavior. We also factor out the common code to enable/disable stop state to samsung_dsim_set_stop_state(). [1] https://docs.kernel.org/gpu/drm-kms-helpers.html#mipi-dsi-bridge-operation Fixes: 0c14d3130654 ("drm: bridge: samsung-dsim: Fix i.MX8M enable flow to meet spec") Reported-by: Tim Harvey <tharvey@gateworks.com> Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Tested-by: Tim Harvey <tharvey@gateworks.com> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20230724151640.555490-1-frieder@fris.de
2023-08-18	gpio: sim: simplify code with cleanup helpers	Bartosz Golaszewski
	Use macros defined in linux/cleanup.h to automate resource lifetime control in gpio-sim. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Acked-by: Linus Walleij <linus.walleij@linaro.org>
2023-08-18	OPP: Fix argument name in doc comment	Viresh Kumar
	The name of the argument is "opp_table" and not "table", fix the comment. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202308172310.FzcidE4c-lkp@intel.com/ Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2023-08-18	Merge tag 'net-6.5-rc7' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from ipsec and netfilter. No known outstanding regressions. Fixes to fixes: - virtio-net: set queues after driver_ok, avoid a potential race added by recent fix - Revert "vlan: Fix VLAN 0 memory leak", it may lead to a warning when VLAN 0 is registered explicitly - nf_tables: - fix false-positive lockdep splat in recent fixes - don't fail inserts if duplicate has expired (fix test failures) - fix races between garbage collection and netns dismantle Current release - new code bugs: - mlx5: Fix mlx5_cmd_update_root_ft() error flow Previous releases - regressions: - phy: fix IRQ-based wake-on-lan over hibernate / power off Previous releases - always broken: - sock: fix misuse of sk_under_memory_pressure() preventing system from exiting global TCP memory pressure if a single cgroup is under pressure - fix the RTO timer retransmitting skb every 1ms if linear option is enabled - af_key: fix sadb_x_filter validation, amment netlink policy - ipsec: fix slab-use-after-free in decode_session6() - macb: in ZynqMP resume always configure PS GTR for non-wakeup source Misc: - netfilter: set default timeout to 3 secs for sctp shutdown send and recv state (from 300ms), align with protocol timers" * tag 'net-6.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (49 commits) ice: Block switchdev mode when ADQ is active and vice versa qede: fix firmware halt over suspend and resume net: do not allow gso_size to be set to GSO_BY_FRAGS sock: Fix misuse of sk_under_memory_pressure() sfc: don't fail probe if MAE/TC setup fails sfc: don't unregister flow_indr if it was never registered net: dsa: mv88e6xxx: Wait for EEPROM done before HW reset net/mlx5: Fix mlx5_cmd_update_root_ft() error flow net/mlx5e: XDP, Fix fifo overrun on XDP_REDIRECT i40e: fix misleading debug logs iavf: fix FDIR rule fields masks validation ipv6: fix indentation of a config attribute mailmap: add entries for Simon Horman broadcom: b44: Use b44_writephy() return value net: openvswitch: reject negative ifindex team: Fix incorrect deletion of ETH_P_8021AD protocol vid from slaves net: phy: broadcom: stub c45 read/write for 54810 netfilter: nft_dynset: disallow object maps netfilter: nf_tables: GC transaction race with netns dismantle netfilter: nf_tables: fix GC transaction races with netns and netlink event exit path ...
2023-08-18	Merge tag 'drm-fixes-2023-08-18-1' of git://anongit.freedesktop.org/drm/drm	Linus Torvalds
	Pull drm fixes from Dave Airlie: "Regular enough week, mostly the usual amdgpu and i915 fixes. Also qaic, nouveau, qxl and a revert for an EDID patch that had some side effects, along with a couple of panel fixes. edid: - revert mode parsing fix that had side effects. i915: - Fix the flow for ignoring GuC SLPC efficient frequency selection - Fix SDVO panel_type initialization - Fix display probe for IVB Q and IVB D GT2 server nouveau: - fix use-after-free in connector code qaic: - integer overflow check fix - fix slicing memory leak panel: - fix JDI LT070ME05000 probing - fix AUO G121EAN01 timings amdgpu: - SMU 13.x fixes - Fix mcbp parameter for gfx9 - SMU 11.x fixes - Temporary fix for large numbers of XCP partitions - S0ix fixes - DCN 2.0 fix qxl: - fix use after free race in dumb object allocation" * tag 'drm-fixes-2023-08-18-1' of git://anongit.freedesktop.org/drm/drm: drm/qxl: fix UAF on handle creation Revert "drm/edid: Fix csync detailed mode parsing" drm/nouveau/disp: fix use-after-free in error handling of nouveau_connector_create Revert "Revert "drm/amdgpu/display: change pipe policy for DCN 2.0"" drm/amd: flush any delayed gfxoff on suspend entry drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix drm/amdgpu: skip xcp drm device allocation when out of drm resource drm/amd/pm: Update pci link width for smu v13.0.6 drm/amd/pm: Fix temperature unit of SMU v13.0.6 drm/amdgpu/pm: fix throttle_status for other than MP1 11.0.7 drm/amdgpu: disable mcbp if parameter zero is set drm/amd/pm: disallow the fan setting if there is no fan on smu 13.0.0 accel/qaic: Clean up integer overflow checking in map_user_pages() accel/qaic: Fix slicing memory leak drm/i915: fix display probe for IVB Q and IVB D GT2 server drm/i915/sdvo: fix panel_type initialization drm/i915/guc/slpc: Restore efficient freq earlier drm/panel: simple: Fix AUO G121EAN01 panel timings according to the docs drm/panel: JDI LT070ME05000 simplify with dev_err_probe()
2023-08-17	md: raid0: account for split bio in iostat accounting	David Jeffery
	When a bio is split by md raid0, the newly created bio will not be tracked by md for I/O accounting. Only the portion of I/O still assigned to the original bio which was reduced by the split will be accounted for. This results in md iostat data sometimes showing I/O values far below the actual amount of data being sent through md. md_account_bio() needs to be called for all bio generated by the bio split. A simple example of the issue was generated using a raid0 device on partitions to the same device. Since all raid0 I/O then goes to one device, it makes it easy to see a gap between the md device and its sd storage. Reading an lvm device on top of the md device, the iostat output (some 0 columns and extra devices removed to make the data more compact) was: Device tps kB_read/s kB_wrtn/s kB_dscd/s kB_read md2 0.00 0.00 0.00 0.00 0 sde 0.00 0.00 0.00 0.00 0 md2 1364.00 411496.00 0.00 0.00 411496 sde 1734.00 646144.00 0.00 0.00 646144 md2 1699.00 510680.00 0.00 0.00 510680 sde 2155.00 802784.00 0.00 0.00 802784 md2 803.00 241480.00 0.00 0.00 241480 sde 1016.00 377888.00 0.00 0.00 377888 md2 0.00 0.00 0.00 0.00 0 sde 0.00 0.00 0.00 0.00 0 I/O was generated doing large direct I/O reads (12M) with dd to a linear lvm volume on top of the 4 leg raid0 device. The md2 reads were showing as roughly 2/3 of the reads to the sde device containing all of md2's raid partitions. The sum of reads to sde was 1826816 kB, which was the expected amount as it was the amount read by dd. With the patch, the total reads from md will match the reads from sde and be consistent with the amount of I/O generated. Fixes: 10764815ff47 ("md: add io accounting for raid0 and raid5") Signed-off-by: David Jeffery <djeffery@redhat.com> Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230816181433.13289-1-djeffery@redhat.com
2023-08-17	md/raid0: Fix performance regression for large sequential writes	Jan Kara
	Commit f00d7c85be9e ("md/raid0: fix up bio splitting.") among other things changed how bio that needs to be split is submitted. Before this commit, we have split the bio, mapped and submitted each part. After this commit, we map only the first part of the split bio and submit the second part unmapped. Due to bio sorting in __submit_bio_noacct() this results in the following request ordering: 9,0 18 1181 0.525037895 15995 Q WS 1479315464 + 63392 Split off chunk-sized (1024 sectors) request: 9,0 18 1182 0.629019647 15995 X WS 1479315464 / 1479316488 Request is unaligned to the chunk so it's split in raid0_make_request(). This is the first part mapped and punted to bio_list: 8,0 18 7053 0.629020455 15995 A WS 739921928 + 1016 <- (9,0) 1479315464 Now raid0_make_request() returns, second part is postponed on bio_list. __submit_bio_noacct() resorts the bio_list, mapped request is submitted to the underlying device: 8,0 18 7054 0.629022782 15995 G WS 739921928 + 1016 Now we take another request from the bio_list which is the remainder of the original huge request. Split off another chunk-sized bit from it and the situation repeats: 9,0 18 1183 0.629024499 15995 X WS 1479316488 / 1479317512 8,16 18 6998 0.629025110 15995 A WS 739921928 + 1016 <- (9,0) 1479316488 8,16 18 6999 0.629026728 15995 G WS 739921928 + 1016 ... 9,0 18 1184 0.629032940 15995 X WS 1479317512 / 1479318536 [libnetacq-write] 8,0 18 7059 0.629033294 15995 A WS 739922952 + 1016 <- (9,0) 1479317512 8,0 18 7060 0.629033902 15995 G WS 739922952 + 1016 ... This repeats until we consume the whole original huge request. Now we finally get to processing the second parts of the split off requests (in reverse order): 8,16 18 7181 0.629161384 15995 A WS 739952640 + 8 <- (9,0) 1479377920 8,0 18 7239 0.629162140 15995 A WS 739952640 + 8 <- (9,0) 1479376896 8,16 18 7186 0.629163881 15995 A WS 739951616 + 8 <- (9,0) 1479375872 8,0 18 7242 0.629164421 15995 A WS 739951616 + 8 <- (9,0) 1479374848 ... I guess it is obvious that this IO pattern is extremely inefficient way to perform sequential IO. It also makes bio_list to grow to rather long lengths. Change raid0_make_request() to map both parts of the split bio. Since we know we are provided with at most chunk-sized bios, we will always need to split the incoming bio at most once. Fixes: f00d7c85be9e ("md/raid0: fix up bio splitting.") Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20230814092720.3931-2-jack@suse.cz Signed-off-by: Song Liu <song@kernel.org>
2023-08-17	md/raid0: Factor out helper for mapping and submitting a bio	Jan Kara
	Factor out helper function for mapping and submitting a bio out of raid0_make_request(). We will use it later for submitting both parts of a split bio. Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20230814092720.3931-1-jack@suse.cz Signed-off-by: Song Liu <song@kernel.org>
2023-08-17	md raid1: allow writebehind to work on any leg device set WriteMostly	Heinz Mauelshagen
	As the WriteMostly flag can be set on any component device of a RAID1 array, remove the constraint that it only works if set on the first one. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Tested-by: Xiao Ni <xni@redhat.com> Link: https://lore.kernel.org/r/2a9592bf3340f34bf588eec984b23ee219f3985e.1692013451.git.heinzm@redhat.com Signed-off-by: Song Liu <song@kernel.org>
2023-08-17	md/raid1: hold the barrier until handle_read_error() finishes	Xueshi Hu
	handle_read_error() will call allow_barrier() to match the former barrier raising. However, it should put the allow_barrier() at the end to avoid a concurrent raid reshape. Fixes: 689389a06ce7 ("md/raid1: simplify handle_read_error().") Reviewed-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xueshi Hu <xueshi.hu@smartx.com> Link: https://lore.kernel.org/r/20230814135356.1113639-4-xueshi.hu@smartx.com Signed-off-by: Song Liu <song@kernel.org>
2023-08-17	md/raid1: free the r1bio before waiting for blocked rdev	Xueshi Hu
	Raid1 reshape will change mempool and r1conf::raid_disks which are needed to free r1bio. allow_barrier() make a concurrent raid1_reshape() possible. So, free the in-flight r1bio before waiting blocked rdev. Fixes: 6bfe0b499082 ("md: support blocking writes to an array on device failure") Reviewed-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xueshi Hu <xueshi.hu@smartx.com> Link: https://lore.kernel.org/r/20230814135356.1113639-3-xueshi.hu@smartx.com Signed-off-by: Song Liu <song@kernel.org>
2023-08-17	md/raid1: call free_r1bio() before allow_barrier() in raid_end_bio_io()	Xueshi Hu
	After allow_barrier, a concurrent raid1_reshape() will replace old mempool and r1conf::raid_disks. Move allow_barrier() to the end of raid_end_bio_io(), so that r1bio can be freed safely. Reviewed-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xueshi Hu <xueshi.hu@smartx.com> Link: https://lore.kernel.org/r/20230814135356.1113639-2-xueshi.hu@smartx.com Signed-off-by: Song Liu <song@kernel.org>
2023-08-17	Merge branch 'netconsole-enable-compile-time-configuration'	Jakub Kicinski
	Breno Leitao says: ==================== netconsole: Enable compile time configuration Enable netconsole features to be set at compilation time. Create two Kconfig options that allow users to set extended logs and release prepending features at compilation time. The first patch de-duplicates the initialization code, and the second patch adds the support in the de-duplicated code, avoiding touching two different functions with the same change. ==================== Link: https://lore.kernel.org/r/20230811093158.1678322-1-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-17	netconsole: Enable compile time configuration	Breno Leitao
	Enable netconsole features to be set at compilation time. Create two Kconfig options that allow users to set extended logs and release prepending features at compilation time. Right now, the user needs to pass command line parameters to netconsole, such as "+"/"r" to enable extended logs and version prepending features. With these two options, the user could set the default values for the features at compile time, and don't need to pass it in the command line to get them enabled, simplifying the command line. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230811093158.1678322-3-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-17	netconsole: Create a allocation helper	Breno Leitao
	De-duplicate the initialization and allocation code for struct netconsole_target. The same allocation and initialization code is duplicated in two different places in the netconsole subsystem, when the netconsole target is initialized by command line parameters (alloc_param_target()), and dynamically by sysfs (make_netconsole_target()). Create a helper function, and call it from the two different functions. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230811093158.1678322-2-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-17	net: mdio: fix -Wvoid-pointer-to-enum-cast warning	Justin Stitt
	When building with clang 18 I see the following warning: \| drivers/net/mdio/mdio-xgene.c:338:13: warning: cast to smaller integer \| type 'enum xgene_mdio_id' from 'const void ' [-Wvoid-pointer-to-enum-cast] \| 338 \| mdio_id = (enum xgene_mdio_id)of_id->data; This is due to the fact that `of_id->data` is a void while `enum xgene_mdio_id` has the size of an int. This leads to truncation and possible data loss. Link: https://github.com/ClangBuiltLinux/linux/issues/1910 Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/20230815-void-drivers-net-mdio-mdio-xgene-v1-1-5304342e0659@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-17	Merge branch 'netem-use-a-seeded-prng-for-loss-and-corruption-events'	Jakub Kicinski
	François Michel says: ==================== netem: use a seeded PRNG for loss and corruption events In order to reproduce bugs or performance evaluation of network protocols and applications, it is useful to have reproducible test suites and tools. This patch adds a way to specify a PRNG seed through the TCA_NETEM_PRNG_SEED attribute for generating netem loss and corruption events. Initializing the qdisc with the same seed leads to the exact same loss and corruption patterns. If no seed is explicitly specified, the qdisc generates a random seed using get_random_u64(). This patch can be and has been tested using tc from the following iproute2-next fork: https://github.com/francoismichel/iproute2-next For instance, setting the seed 42424242 on the loopback with a loss rate of 10% will systematically drop the 5th, 12th and 24th packet when sending 25 packets. ==================== Link: https://lore.kernel.org/r/20230815092348.1449179-1-francois.michel@uclouvain.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-17	netem: use seeded PRNG for correlated loss events	François Michel
	Use prandom_u32_state() instead of get_random_u32() to generate the correlated loss events of netem. Signed-off-by: François Michel <francois.michel@uclouvain.be> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Link: https://lore.kernel.org/r/20230815092348.1449179-4-francois.michel@uclouvain.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-17	netem: use a seeded PRNG for generating random losses	François Michel
	Use prandom_u32_state() instead of get_random_u32() to generate the random loss events of netem. The state of the prng is part of the prng attribute of struct netem_sched_data. Signed-off-by: François Michel <francois.michel@uclouvain.be> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Link: https://lore.kernel.org/r/20230815092348.1449179-3-francois.michel@uclouvain.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-17	netem: add prng attribute to netem_sched_data	François Michel
	Add prng attribute to struct netem_sched_data and allows setting the seed of the PRNG through netlink using the new TCA_NETEM_PRNG_SEED attribute. The PRNG attribute is not actually used yet. Signed-off-by: François Michel <francois.michel@uclouvain.be> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Link: https://lore.kernel.org/r/20230815092348.1449179-2-francois.michel@uclouvain.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-17	net: ena: Use pci_dev_id() to simplify the code	Jialin Zhang
	PCI core API pci_dev_id() can be used to get the BDF number for a pci device. We don't need to compose it manually. Use pci_dev_id() to simplify the code a little bit. Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Shay Agroskin <shayagr@amazon.com> Link: https://lore.kernel.org/r/20230815024248.3519068-1-zhangjialin11@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-17	tun: add __exit annotations to module exit func tun_cleanup()	Ziyang Xuan
	Add missing __exit annotations to module exit func tun_cleanup(). Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20230814083000.3893589-1-william.xuanziyang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-17	blk-cgroup: Fix NULL deref caused by blkg_policy_data being installed before ↵	Tejun Heo
	init blk-iocost sometimes causes the following crash: BUG: kernel NULL pointer dereference, address: 00000000000000e0 ... RIP: 0010:_raw_spin_lock+0x17/0x30 Code: be 01 02 00 00 e8 79 38 39 ff 31 d2 89 d0 5d c3 0f 1f 00 0f 1f 44 00 00 55 48 89 e5 65 ff 05 48 d0 34 7e b9 01 00 00 00 31 c0 <f0> 0f b1 0f 75 02 5d c3 89 c6 e8 ea 04 00 00 5d c3 0f 1f 84 00 00 RSP: 0018:ffffc900023b3d40 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 00000000000000e0 RCX: 0000000000000001 RDX: ffffc900023b3d20 RSI: ffffc900023b3cf0 RDI: 00000000000000e0 RBP: ffffc900023b3d40 R08: ffffc900023b3c10 R09: 0000000000000003 R10: 0000000000000064 R11: 000000000000000a R12: ffff888102337000 R13: fffffffffffffff2 R14: ffff88810af408c8 R15: ffff8881070c3600 FS: 00007faaaf364fc0(0000) GS:ffff88842fdc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000e0 CR3: 00000001097b1000 CR4: 0000000000350ea0 Call Trace: <TASK> ioc_weight_write+0x13d/0x410 cgroup_file_write+0x7a/0x130 kernfs_fop_write_iter+0xf5/0x170 vfs_write+0x298/0x370 ksys_write+0x5f/0xb0 __x64_sys_write+0x1b/0x20 do_syscall_64+0x3d/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 This happens because iocg->ioc is NULL. The field is initialized by ioc_pd_init() and never cleared. The NULL deref is caused by blkcg_activate_policy() installing blkg_policy_data before initializing it. blkcg_activate_policy() was doing the following: 1. Allocate pd's for all existing blkg's and install them in blkg->pd[]. 2. Initialize all pd's. 3. Online all pd's. blkcg_activate_policy() only grabs the queue_lock and may release and re-acquire the lock as allocation may need to sleep. ioc_weight_write() grabs blkcg->lock and iterates all its blkg's. The two can race and if ioc_weight_write() runs during #1 or between #1 and #2, it can encounter a pd which is not initialized yet, leading to crash. The crash can be reproduced with the following script: #!/bin/bash echo +io > /sys/fs/cgroup/cgroup.subtree_control systemd-run --unit touch-sda --scope dd if=/dev/sda of=/dev/null bs=1M count=1 iflag=direct echo 100 > /sys/fs/cgroup/system.slice/io.weight bash -c "echo '8:0 enable=1' > /sys/fs/cgroup/io.cost.qos" & sleep .2 echo 100 > /sys/fs/cgroup/system.slice/io.weight with the following patch applied: > diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c > index fc49be622e05..38d671d5e10c 100644 > --- a/block/blk-cgroup.c > +++ b/block/blk-cgroup.c > @@ -1553,6 +1553,12 @@ int blkcg_activate_policy(struct gendisk disk, const struct blkcg_policy pol) > pd->online = false; > } > > + if (system_state == SYSTEM_RUNNING) { > + spin_unlock_irq(&q->queue_lock); > + ssleep(1); > + spin_lock_irq(&q->queue_lock); > + } > + > /* all allocated, init in the same order */ > if (pol->pd_init_fn) > list_for_each_entry_reverse(blkg, &q->blkg_list, q_node) I don't see a reason why all pd's should be allocated, initialized and onlined together. The only ordering requirement is that parent blkgs to be initialized and onlined before children, which is guaranteed from the walking order. Let's fix the bug by allocating, initializing and onlining pd for each blkg and holding blkcg->lock over initialization and onlining. This ensures that an installed blkg is always fully initialized and onlined removing the the race window. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Breno Leitao <leitao@debian.org> Fixes: 9d179b865449 ("blkcg: Fix multiple bugs in blkcg_activate_policy()") Link: https://lore.kernel.org/r/ZN0p5_W-Q9mAHBVY@slm.duckdns.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-08-17	io_uring/rsrc: Annotate struct io_mapped_ubuf with __counted_by	Kees Cook
	Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct io_mapped_ubuf. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Jens Axboe <axboe@kernel.dk> Cc: Pavel Begunkov <asml.silence@gmail.com> Cc: io-uring@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: "Gustavo A. R. Silva" <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20230817212146.never.853-kees@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-08-17	lkdtm: Add FAM_BOUNDS test for __counted_by	Kees Cook
	Add new CONFIG_UBSAN_BOUNDS test for __counted_by attribute. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Kees Cook <keescook@chromium.org>
2023-08-17	Compiler Attributes: counted_by: Adjust name and identifier expansion	Kees Cook
	GCC and Clang's current RFCs name this attribute "counted_by", and have moved away from using a string for the member name. Update the kernel's macros to match. Additionally provide a UAPI no-op macro for UAPI structs that will gain annotations. Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Fixes: dd06e72e68bc ("Compiler Attributes: Add __counted_by macro") Acked-by: Miguel Ojeda <ojeda@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20230817200558.never.077-kees@kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
2023-08-17	pstore: Support record sizes larger than kmalloc() limit	Yuxiao Zhang
	Currently pstore record buffers are allocated using kmalloc() which has a maximum size based on page size. If a large "pmsg-size" module parameter is specified, pmsg will fail to copy the contents since memdup_user() is limited to kmalloc() allocation sizes. Since we don't need physically contiguous memory for any of the pstore record buffers, use kvzalloc() to avoid such limitations in the core of pstore and in the ram backend, and explicitly read from userspace using vmemdup_user(). This also means that any other backends that want to (or do already) support larger record sizes will Just Work now. Signed-off-by: Yuxiao Zhang <yuxiaozhang@google.com> Link: https://lore.kernel.org/r/20230627202540.881909-2-yuxiaozhang@google.com Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2023-08-17	Merge branch '40GbE' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-08-16 (iavf, i40e) This series contains updates to iavf and i40e drivers. Piotr adds checks for unsupported Flow Director rules on iavf. Andrii replaces incorrect 'write' messaging on read operations for i40e. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: i40e: fix misleading debug logs iavf: fix FDIR rule fields masks validation ==================== Link: https://lore.kernel.org/r/20230816193308.1307535-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-17	ice: split ice_aq_wait_for_event() func into two	Przemek Kitszel
	Mitigate race between registering on wait list and receiving AQ Response from FW. ice_aq_prep_for_event() should be called before sending AQ command, ice_aq_wait_for_event() should be called after sending AQ command, to wait for AQ Response. Please note, that this was found by reading the code, an actual race has not yet materialized. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-08-17	ice: embed &ice_rq_event_info event into struct ice_aq_task	Przemek Kitszel
	Expose struct ice_aq_task to callers, what takes burden of memory ownership out from AQ-wait family of functions, and reduces need for heap-based allocations. Embed struct ice_rq_event_info event into struct ice_aq_task (instead of it being a ptr) to remove some more code from the callers. Subsequent commit will improve more based on this one. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-08-17	ice: ice_aq_check_events: fix off-by-one check when filling buffer	Przemek Kitszel
	Allow task's event buffer to be filled also in the case that it's size is exactly the size of the message. Fixes: d69ea414c9b4 ("ice: implement device flash update via devlink") Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-08-17	ice: drop two params from ice_aq_alloc_free_res()	Przemek Kitszel
	Drop @num_entries and @cd params, latter of which was always NULL. Number of entities to alloc is passed in internal buffer, the outer layer (that @num_entries was assigned to) meaning is closer to "the number of requests", which was =1 in all cases. ice_free_hw_res() was always called with 1 as its @num arg. Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-08-17	ice: use list_for_each_entry() helper	Yang Yingliang
	Convert list_for_each() to list_for_each_entry() where applicable. No functional changed. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-08-17	ice: Remove redundant VSI configuration in eswitch setup	Marcin Szycik
	Remove a call to disable VLAN stripping on switchdev control plane VSI, as it is disabled by default. Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-08-17	ice: move E810T functions to before device agnostic ones	Jacob Keller
	Commit 885fe6932a11 ("ice: Add support for SMA control multiplexer") accidentally placed all of the E810T SMA control functions in the middle of the device agnostic functions section of ice_ptp_hw.c This works fine, but makes it harder for readers to follow. The ice_ptp_hw.c file is laid out such that each hardware family has the specific functions in one block, with the access functions placed at the end of the file. Move the E810T functions so that they are in a block just after the E810 functions. Also move the ice_get_phy_tx_tstamp_ready_e810 which got added at the end of the E810T block. This keeps the functions laid out in a logical order and avoids intermixing the generic access functions with the device specific implementations. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel) Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-08-17	ice: refactor ice_vsi_is_vlan_pruning_ena	Jan Sokolowski
	As this method became static, and is already called with check for vsi being non-null, an unnecessary check along with superfluous parentheses is removed. Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-08-17	ice: refactor ice_ptp_hw to make functions static	Jan Sokolowski
	As following methods are not used outside ice_ptp_hw, they can be made static: ice_read_phy_reg_e822 ice_write_phy_reg_e822 ice_ptp_prep_port_adj_e822 Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-08-17	ice: refactor ice_sched to make functions static	Jan Sokolowski
	As ice_sched_set_node_bw_lmt_per_tc is not used outside of ice_sched, it can be made static. Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>