2023-10-04  Merge branch 'bpf-xsk-sh-umem'  Daniel Borkmann
Tushar Vyavahare says:
====================
Implement a test for the SHARED_UMEM feature in this patch set and make necessary changes/improvements. Ensure that the framework now supports different streams for different sockets.

v2->v3:
- Set the sock_num at the end of the while loop.
- Declare xsk at the top of the while loop.

v1->v2:
- Remove generate_mac_addresses() and generate mac addresses based on the number of sockets in __test_spec_init() function. [Magnus]
- Update Makefile to include find_bit.c for compiling xskxceiver.
- Add bitmap_full() function to verify all bits are set to break the while loop in the receive_pkts() and send_pkts() functions.
- Replace __test_and_set_bit() function with __set_bit() function.
- Add single return check for wait_for_tx_completion() function call.

Patch series summary:
1: Move the packet stream from the ifobject struct to the xsk_socket_info struct to enable the use of different streams for different sockets. This will facilitate the sending and receiving of data from multiple sockets simultaneously using the SHARED_XDP_UMEM feature. It gives the flexibility to send/receive individual traffic on a particular socket.
2: Rename the header file to a generic name so that it can be used by all future XDP programs.
3: Move the src_mac and dst_mac fields from the ifobject structure to the xsk_socket_info structure to achieve per-socket MAC address assignment. Require this in order to steer traffic to various sockets in subsequent patches.
4: Improve the receive_pkt() function to enable it to receive packets from multiple sockets. Define a sock_num variable to iterate through all the sockets in the Rx path. Add nb_valid_entries to check that all the expected number of packets are received.
5: The pkt_set() function no longer needs the umem parameter. This commit removes the umem parameter from the pkt_set() function.
6: Iterate over all the sockets in the send pkts function. Update send_pkts() to handle multiple sockets for sending packets. Multiple TX sockets are utilized alternately based on the batch size to improve packet transmission.
7: Modify xsk_update_xskmap() to accept the index as an argument, enabling the addition of multiple sockets to xskmap.
8: Add a new test for testing shared umem feature. This is accomplished by adding a new XDP program and using the multiple sockets. The new XDP program redirects the packets based on the destination MAC address.
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2023-10-04  selftests/xsk: Add a test for shared umem feature  Tushar Vyavahare
Add a new test for testing shared umem feature. This is accomplished by adding a new XDP program and using the multiple sockets. The new XDP program redirects the packets based on the destination MAC address. Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20230927135241.2287547-9-tushar.vyavahare@intel.com
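The redirect-by-destination-MAC idea can be sketched in plain user-space C. This is only an illustration and not the selftest's XDP program: the rule that the last octet of the destination MAC selects the socket slot, and the names `MAX_SOCKETS` and `pick_socket_index()`, are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_ALEN 6
#define MAX_SOCKETS 2   /* hypothetical number of sockets sharing the umem */

/*
 * Hypothetical steering rule: the last octet of the destination MAC
 * selects the socket slot. Returns -1 when no slot matches; a real XDP
 * program would have to decide what to do with such a packet.
 */
static int pick_socket_index(const uint8_t dst_mac[ETH_ALEN])
{
	assert(dst_mac != NULL);

	if (dst_mac[ETH_ALEN - 1] >= MAX_SOCKETS)
		return -1;
	return dst_mac[ETH_ALEN - 1];
}
```

With per-socket MAC addresses that differ only in the last octet, each packet lands on exactly one slot, which is what lets a test give every socket its own packet stream.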
2023-10-04  selftests/xsk: Modify xsk_update_xskmap() to accept the index as an argument  Tushar Vyavahare
Modify xsk_update_xskmap() to accept the index as an argument, enabling the addition of multiple sockets to xskmap. Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20230927135241.2287547-8-tushar.vyavahare@intel.com
2023-10-04  selftests/xsk: Iterate over all the sockets in the send pkts function  Tushar Vyavahare
Update send_pkts() to handle multiple sockets for sending packets. Multiple TX sockets are utilized alternately based on the batch size to improve packet transmission. Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20230927135241.2287547-7-tushar.vyavahare@intel.com
2023-10-04  selftests/xsk: Remove unnecessary parameter from pkt_set() function call  Tushar Vyavahare
The pkt_set() function no longer needs the umem parameter. This commit removes the umem parameter from the pkt_set() function. Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20230927135241.2287547-6-tushar.vyavahare@intel.com
2023-10-04  selftests/xsk: Iterate over all the sockets in the receive pkts function  Tushar Vyavahare
Improve the receive_pkt() function to enable it to receive packets from multiple sockets. Define a sock_num variable to iterate through all the sockets in the Rx path. Add nb_valid_entries to check that all the expected number of packets are received. Revise the function __receive_pkts() to only inspect the receive ring once, handle any received packets, and promptly return. Implement a bitmap to store the value of number of sockets. Update Makefile to include find_bit.c for compiling xskxceiver. Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20230927135241.2287547-5-tushar.vyavahare@intel.com
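The receive loop described above (inspect one socket's ring once, return promptly, move to the next socket, stop when a bitmap says every socket is done) can be sketched as follows. This is a simplified user-space model under stated assumptions: `struct fake_sock`, `poll_once()` and `receive_all()` are hypothetical names, and a single `unsigned long` stands in for the bitmap helpers used by the selftest.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-socket receive state. */
struct fake_sock {
	unsigned int expected;  /* packets this socket should receive */
	unsigned int received;  /* packets seen so far */
	unsigned int pending;   /* packets waiting in the fake Rx ring */
};

/* Inspect the "ring" once, consume at most one batch, and return. */
static void poll_once(struct fake_sock *s, unsigned int batch)
{
	unsigned int n = s->pending < batch ? s->pending : batch;

	s->pending -= n;
	s->received += n;
}

/*
 * Visit the sockets round-robin until every socket has received what it
 * expects. Returns false if max_rounds polls were not enough.
 */
static bool receive_all(struct fake_sock *socks, unsigned int nb_socks,
			unsigned int batch, unsigned int max_rounds)
{
	unsigned long done = 0;  /* bit i set: socket i is complete */
	unsigned long full;
	unsigned int sock_num = 0;

	assert(nb_socks > 0 && nb_socks < 8 * sizeof(unsigned long));
	full = (1UL << nb_socks) - 1;

	while (done != full) {
		struct fake_sock *s = &socks[sock_num];

		if (max_rounds-- == 0)
			return false;
		poll_once(s, batch);
		if (s->received >= s->expected)
			done |= 1UL << sock_num;
		/* Advance to the next socket at the end of the loop body. */
		sock_num = (sock_num + 1) % nb_socks;
	}
	return true;
}
```

Polling each socket only once per visit keeps one slow socket from starving the others, and the bitmap gives a single cheap exit condition for the loop.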
2023-10-04  selftests/xsk: Move src_mac and dst_mac to the xsk_socket_info  Tushar Vyavahare
Move the src_mac and dst_mac fields from the ifobject structure to the xsk_socket_info structure to achieve per-socket MAC address assignment. Require this in order to steer traffic to various sockets in subsequent patches. Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20230927135241.2287547-4-tushar.vyavahare@intel.com
2023-10-04  selftests/xsk: Rename xsk_xdp_metadata.h to xsk_xdp_common.h  Tushar Vyavahare
Rename the header file to a generic name so that it can be used by all future XDP programs. Ensure that the xsk_xdp_common.h header file includes include guards. Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20230927135241.2287547-3-tushar.vyavahare@intel.com
2023-10-04  selftests/xsk: Move pkt_stream to the xsk_socket_info  Tushar Vyavahare
Move the packet stream from the ifobject struct to the xsk_socket_info struct to enable the use of different streams for different sockets. This will facilitate the sending and receiving of data from multiple sockets simultaneously using the SHARED_XDP_UMEM feature. Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20230927135241.2287547-2-tushar.vyavahare@intel.com
2023-10-04  drm/i915: Invalidate the TLBs on each GT  Chris Wilson
With multi-GT devices, the object may have been bound on each GT and so we need to invalidate the TLBs across all GT before releasing the pages back to the system. Fixes: d6c531ab4820 ("drm/i915: Invalidate the TLBs on each GT") Signed-off-by: Chris Wilson <chris.p.wilson@intel.com> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> CC: Matt Roper <matthew.d.roper@intel.com> CC: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231002140742.933530-1-jonathan.cavitt@intel.com (cherry picked from commit 6b8ace7a14e7926b7b914ccd96a8ac657c0d518c) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-10-04  drm/i915: Register engines early to avoid type confusion  Mathias Krause
Commit 1ec23ed7126e ("drm/i915: Use uabi engines for the default engine map") switched from using for_each_engine() to for_each_uabi_engine() to iterate over the user engines. While this seems to be a sensible change, it's only safe to do when the engines are actually chained using the rb-tree structure which is not the case during early driver initialization where it can be either a lock-less list or regular double-linked list. In fact, the modesetting initialization code may end up calling default_engines() through the fb helper code while the engines list is still llist_node-based: i915_driver_probe() -> intel_display_driver_probe() -> intel_fbdev_init() -> drm_fb_helper_init() -> drm_client_init() -> drm_client_open() -> drm_file_alloc() -> i915_driver_open() -> i915_gem_open() -> i915_gem_context_open() -> i915_gem_create_context() -> default_engines() Using for_each_uabi_engine() in default_engines() is therefore wrong, as it would try to interpret the llist as rb-tree, making it find no engine at all, as the rb_left and rb_right members will still be NULL, as they haven't been initialized yet. To fix this type confusion register the engines earlier and at the same time reduce the amount of code that has to deal with the intermediate llist state. Reported-by: sanity checks in grsecurity Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 1ec23ed7126e ("drm/i915: Use uabi engines for the default engine map") Signed-off-by: Mathias Krause <minipli@grsecurity.net> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230928182019.10256-2-minipli@grsecurity.net [tursulin: fixed commit tag typo] (cherry picked from commit 2b562f032fc2594fb3fac22b7a2eb3c1969a7ba3) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-10-04  drm/i915: Don't set PIPE_CONTROL_FLUSH_L3 for aux inval  Nirmoy Das
PIPE_CONTROL_FLUSH_L3 is not needed for aux invalidation so don't set that. Fixes: 78a6ccd65fa3 ("drm/i915/gt: Ensure memory quiesced before invalidation") Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.8+ Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Tejas Upadhyay <tejas.upadhyay@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com> Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Mark Janes <mark.janes@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Acked-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Tested-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230926142401.25687-1-nirmoy.das@intel.com (cherry picked from commit 03d681412b38558aefe4fb0f46e36efa94bb21ef) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-10-04  ASoC: dt-bindings: fsl,micfil: Document #sound-dai-cells  Fabio Estevam
imx8mp.dtsi passes #sound-dai-cells = <0> in the micfil node. Document #sound-dai-cells to fix the following schema warning: audio-controller@30ca0000: '#sound-dai-cells' does not match any of the regexes: 'pinctrl-[0-9]+' from schema $id: http://devicetree.org/schemas/sound/fsl,micfil.yaml# Signed-off-by: Fabio Estevam <festevam@denx.de> Reviewed-by: Adam Ford <aford173@gmail.com> Link: https://lore.kernel.org/r/20231004122935.2250889-1-festevam@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-10-04  selftests: netfilter: Extend nft_audit.sh  Phil Sutter
Add tests for sets and elements and deletion of all kinds. Also reorder rule reset tests: By moving the bulk rule add command up, the two 'reset rules' tests become identical. While at it, fix for a failing bulk rule add test's error status getting lost due to its use in a pipe. Avoid this by using a temporary file. Headings in diff output for failing tests contain no useful data, strip them. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-04  selftests: netfilter: test for sctp collision processing in nf_conntrack  Xin Long
This patch adds a test case to reproduce the SCTP DATA chunk retransmission timeout issue caused by the improper SCTP collision processing in netfilter nf_conntrack_proto_sctp. In this test, the client sends an INIT chunk, but the INIT_ACK replied from the server is delayed until the server sends an INIT chunk to start a new connection from its side. After the connection is complete from the server side, the delayed INIT_ACK arrives in nf_conntrack_proto_sctp. The delayed INIT_ACK should be dropped in nf_conntrack_proto_sctp instead of updating the vtag with the out-of-date init_tag; otherwise, the vtag in DATA chunks later sent by the client doesn't match the vtag in the conntrack entry and the DATA chunks get dropped. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-04  netfilter: handle the connecting collision properly in nf_conntrack_proto_sctp  Xin Long
In Scenario A and B below, as the delayed INIT_ACK always changes the peer vtag, SCTP ct with the incorrect vtag may cause packet loss. Scenario A: INIT_ACK is delayed until the peer receives its own INIT_ACK 192.168.1.2 > 192.168.1.1: [INIT] [init tag: 1328086772] 192.168.1.1 > 192.168.1.2: [INIT] [init tag: 1414468151] 192.168.1.2 > 192.168.1.1: [INIT ACK] [init tag: 1328086772] 192.168.1.1 > 192.168.1.2: [INIT ACK] [init tag: 1650211246] * 192.168.1.2 > 192.168.1.1: [COOKIE ECHO] 192.168.1.1 > 192.168.1.2: [COOKIE ECHO] 192.168.1.2 > 192.168.1.1: [COOKIE ACK] Scenario B: INIT_ACK is delayed until the peer completes its own handshake 192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 3922216408] 192.168.1.1 > 192.168.1.2: sctp (1) [INIT] [init tag: 144230885] 192.168.1.2 > 192.168.1.1: sctp (1) [INIT ACK] [init tag: 3922216408] 192.168.1.1 > 192.168.1.2: sctp (1) [COOKIE ECHO] 192.168.1.2 > 192.168.1.1: sctp (1) [COOKIE ACK] 192.168.1.1 > 192.168.1.2: sctp (1) [INIT ACK] [init tag: 3914796021] * This patch fixes it as below: In SCTP_CID_INIT processing: - clear ct->proto.sctp.init[!dir] if ct->proto.sctp.init[dir] && ct->proto.sctp.init[!dir]. (Scenario E) - set ct->proto.sctp.init[dir]. In SCTP_CID_INIT_ACK processing: - drop it if !ct->proto.sctp.init[!dir] && ct->proto.sctp.vtag[!dir] && ct->proto.sctp.vtag[!dir] != ih->init_tag. (Scenario B, Scenario C) - drop it if ct->proto.sctp.init[dir] && ct->proto.sctp.init[!dir] && ct->proto.sctp.vtag[!dir] != ih->init_tag. (Scenario A) In SCTP_CID_COOKIE_ACK processing: - clear ct->proto.sctp.init[dir] and ct->proto.sctp.init[!dir]. (Scenario D) Also, it's important to allow the ct state to move forward with cookie_echo and cookie_ack from the opposite dir for the collision scenarios. There are also other Scenarios where it should allow the packet through, addressed by the processing above: Scenario C: new CT is created by INIT_ACK. Scenario D: start INIT on the existing ESTABLISHED ct. 
Scenario E: start INIT after the old collision on the existing ESTABLISHED ct. 192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 3922216408] 192.168.1.1 > 192.168.1.2: sctp (1) [INIT] [init tag: 144230885] (both side are stopped, then start new connection again in hours) 192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 242308742] Fixes: 9fb9cbb1082d ("[NETFILTER]: Add nf_conntrack subsystem.") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de>
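The per-chunk rules listed in this commit message can be restated as a small user-space model. This is a sketch under stated assumptions, not the conntrack code: `struct sctp_ct` and the helper names are hypothetical, and the model assumes that `vtag[!dir]` already holds the init tag taken from the INIT seen in direction `dir`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified mirror of the per-connection SCTP state. */
struct sctp_ct {
	bool init[2];      /* INIT seen from each direction */
	uint32_t vtag[2];  /* verification tag recorded per direction */
};

/* SCTP_CID_INIT: forget the stale INIT of the other side (Scenario E). */
static void on_init(struct sctp_ct *ct, int dir)
{
	assert(dir == 0 || dir == 1);

	if (ct->init[dir] && ct->init[!dir])
		ct->init[!dir] = false;
	ct->init[dir] = true;
}

/* SCTP_CID_INIT_ACK: true when the chunk must be dropped. */
static bool drop_init_ack(const struct sctp_ct *ct, int dir, uint32_t init_tag)
{
	assert(dir == 0 || dir == 1);

	/* Scenario B and C: no INIT pending from the peer, tag mismatch. */
	if (!ct->init[!dir] && ct->vtag[!dir] && ct->vtag[!dir] != init_tag)
		return true;
	/* Scenario A: both sides sent INIT, tag mismatch. */
	if (ct->init[dir] && ct->init[!dir] && ct->vtag[!dir] != init_tag)
		return true;
	return false;
}

/* SCTP_CID_COOKIE_ACK: the handshake is done, clear both flags (Scenario D). */
static void on_cookie_ack(struct sctp_ct *ct)
{
	ct->init[0] = false;
	ct->init[1] = false;
}
```

Fed with the tags from Scenario A and Scenario B, the model lets the timely INIT_ACK through and drops the delayed one marked with `*` in the traces.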
2023-10-04  netfilter: nft_payload: rebuild vlan header on h_proto access  Florian Westphal
nft can perform merging of adjacent payload requests. This means that: ether saddr 00:11 ... ether type 8021ad ... is a single payload expression, for 8 bytes, starting at the ethernet source offset. Check that offset+length is fully within the source/destination MAC addresses. This bug prevents 'ether type' from matching the correct h_proto in case the vlan tag got stripped. Fixes: de6843be3082 ("netfilter: nft_payload: rebuild vlan header when needed") Reported-by: David Ward <david.ward@ll.mit.edu> Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-04  can: raw: Remove NULL check before dev_{put, hold}  Jiapeng Chong
dev_{put, hold} call netdev_{put, hold}, which already check for NULL, so there is no need to check before using dev_{put, hold}. Remove the check to silence the warning: ./net/can/raw.c:497:2-9: WARNING: NULL check before dev_{put, hold} functions is not needed. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6231 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Reported-by: Simon Horman <horms@kernel.org> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/all/20230825064656.87751-1-jiapeng.chong@linux.alibaba.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2023-10-04  Merge branch 'bnxt_en-hwmon-SRIOV'  David S. Miller
Michael Chan says: ==================== bnxt_en: hwmon and SRIOV updates The first 7 patches are v2 of the hwmon patches posted about 6 weeks ago on Aug 14. The last 2 patches are SRIOV related updates. Link to v1 hwmon patches: https://lore.kernel.org/netdev/20230815045658.80494-11-michael.chan@broadcom.com/ ==================== Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-04  bnxt_en: Update VNIC resource calculation for VFs  Vikas Gupta
Newer versions of firmware will pre-reserve 1 VNIC for every possible PF and VF function. Update the driver logic to take this into account when assigning VNICs to the VFs. These pre-reserved VNICs for the inactive VFs should be subtracted from the global pool before assigning them to the active VFs. Not doing so may cause discrepancies that ultimately may cause some VFs to have insufficient VNICs to support features such as aRFS. Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-04  bnxt_en: Support QOS and TPID settings for the SRIOV VLAN  Sreekanth Reddy
Add these missing settings in the .ndo_set_vf_vlan() method. Older firmware does not support the TPID setting so check for proper support. Remove the unused BNXT_VF_QOS flag. Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-04  bnxt_en: Event handler for Thermal event  Kalesh AP
Newer FW will send a new async event when it detects that the chip's temperature has crossed the configured threshold value. The driver will now notify hwmon and will log a warning message. Link: https://lore.kernel.org/netdev/20230815045658.80494-13-michael.chan@broadcom.com/ Cc: Jean Delvare <jdelvare@suse.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: linux-hwmon@vger.kernel.org Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-04  bnxt_en: Use non-standard attribute to expose shutdown temperature  Kalesh AP
Implement the sysfs attributes directly in the driver for shutdown threshold temperature and pass an extra attribute group to the hwmon core when registering the hwmon device. Link: https://lore.kernel.org/netdev/20230815045658.80494-12-michael.chan@broadcom.com/ Cc: Jean Delvare <jdelvare@suse.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: linux-hwmon@vger.kernel.org Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-04  bnxt_en: Expose threshold temperatures through hwmon  Kalesh AP
HWRM_TEMP_MONITOR_QUERY response now indicates various threshold temperatures. Expose these threshold temperatures through the hwmon sysfs using this mapping: hwmon_temp_max : bp->warn_thresh_temp hwmon_temp_crit : bp->crit_thresh_temp hwmon_temp_emergency : bp->fatal_thresh_temp hwmon_temp_max_alarm : temp >= bp->warn_thresh_temp hwmon_temp_crit_alarm : temp >= bp->crit_thresh_temp hwmon_temp_emergency_alarm : temp >= bp->fatal_thresh_temp Link: https://lore.kernel.org/netdev/20230815045658.80494-12-michael.chan@broadcom.com/ Cc: Jean Delvare <jdelvare@suse.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: linux-hwmon@vger.kernel.org Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
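The threshold-to-alarm mapping above amounts to three comparisons. A minimal sketch, assuming a hypothetical `struct thermal_thresholds` in place of the driver's `bp->*_thresh_temp` fields and a hypothetical `alarm_active()` helper (the hwmon callback plumbing is left out):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the thresholds stored in the driver. */
struct thermal_thresholds {
	int warn;   /* exposed as the temp "max" value */
	int crit;   /* exposed as the temp "crit" value */
	int fatal;  /* exposed as the temp "emergency" value */
};

enum alarm_kind { ALARM_MAX, ALARM_CRIT, ALARM_EMERGENCY };

/* An alarm is active once the temperature reaches its threshold. */
static bool alarm_active(const struct thermal_thresholds *t,
			 enum alarm_kind kind, int temp)
{
	assert(t != NULL);

	switch (kind) {
	case ALARM_MAX:
		return temp >= t->warn;
	case ALARM_CRIT:
		return temp >= t->crit;
	case ALARM_EMERGENCY:
		return temp >= t->fatal;
	}
	return false;
}
```

The comparison is inclusive (`>=`), matching the mapping in the commit message, so a reading exactly at the threshold already raises the alarm.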
2023-10-04  bnxt_en: Modify the driver to use hwmon_device_register_with_info  Kalesh AP
The use of hwmon_device_register_with_groups() is deprecated. Modify the driver to use hwmon_device_register_with_info(). The driver currently exports only temp1_input through the hwmon sysfs interface, but the FW has been modified to report more threshold temperatures and the driver wants to report them through the hwmon interface. Cc: Jean Delvare <jdelvare@suse.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: linux-hwmon@vger.kernel.org Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-04  bnxt_en: Move hwmon functions into a dedicated file  Kalesh AP
This is in preparation for upcoming patches in the series. The driver has to expose more threshold temperatures through the hwmon sysfs interface. More code will be added, and we do not want to overload bnxt.c. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Cc: Jean Delvare <jdelvare@suse.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: linux-hwmon@vger.kernel.org Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-04  bnxt_en: Enhance hwmon temperature reporting  Kalesh AP
Driver currently does hwmon device register and unregister in open and close() respectively. As a result, user will not be able to query hwmon temperature when interface is in ifdown state. Enhance it by moving the hwmon register/unregister to the probe/remove functions. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-04  bnxt_en: Update firmware interface to 1.10.2.171  Michael Chan
The main changes are the additional thermal thresholds in hwrm_temp_monitor_query_output and the new async event to report thermal errors. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-04  ibmveth: Remove condition to recompute TCP header checksum.  David Wilder
In some OVS environments the TCP pseudo header checksum may need to be recomputed. Currently this is only done when the interface instance is configured for "Trunk Mode". We found that the issue also occurs in some Kubernetes environments that do not use "Trunk Mode", so the condition is removed. Performance tests with this change show only a fractional decrease in throughput (< 0.2%). Fixes: 7525de2516fb ("ibmveth: Set CHECKSUM_PARTIAL if NULL TCP CSUM.") Signed-off-by: David Wilder <dwilder@us.ibm.com> Reviewed-by: Nick Child <nnac123@linux.ibm.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-04  platform/x86: touchscreen_dmi: Add info for the BUSH Bush Windows tablet  Tomasz Swiatek
Add touchscreen info for the BUSH Bush Windows tablet. It was tested using gslx680_ts_acpi module and on patched kernel installed on device. Link: https://github.com/onitake/gsl-firmware/pull/215 Link: https://github.com/systemd/systemd/pull/29268 Signed-off-by: Tomasz Swiatek <swiatektomasz99@gmail.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-10-04  Merge patch series "can: etas_es58x: clean-up of new GCC W=1 and old checkpatch warnings"  Marc Kleine-Budde
Vincent Mailhol <mailhol.vincent@wanadoo.fr> says: The kernel recently added new warnings, one of which triggers a known false positive on the etas_es58x module. In an effort to keep etas_es58x free of any W=12 warnings (excluding those produced by foreign headers), add a workaround to silence it. While at it, this series also fixes a checkpatch warning which I knew existed for a long time but was too lazy to tackle. v2 -> v3: * if the parsing of one of the version/revision numbers fails, es58x_parse_product_info() immediately returns. If this occurs early, the other version/revision numbers would still be set to zero (which is now considered a valid version number). Set the version and revision to an invalid number before starting the parsing so that everything is set even if an early return occurs. v1 -> v2: * v1 had two different check logics for the version numbers: - check that none of the sub-version numbers are zero to make sure the parsing succeeded - check that all of the sub-version numbers fit the expected digit range to please GCC. v2 simplifies things by merging those two logics together. Link: https://lore.kernel.org/all/20230924110914.183898-1-mailhol.vincent@wanadoo.fr [mkl: fixed typos] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2023-10-04  platform/mellanox: tmfifo: fix kernel-doc warnings  Randy Dunlap
Fix kernel-doc notation for structs and struct members to prevent these warnings: mlxbf-tmfifo.c:73: warning: cannot understand function prototype: 'struct mlxbf_tmfifo_vring ' mlxbf-tmfifo.c:128: warning: cannot understand function prototype: 'struct mlxbf_tmfifo_vdev ' mlxbf-tmfifo.c:146: warning: cannot understand function prototype: 'struct mlxbf_tmfifo_irq_info ' mlxbf-tmfifo.c:158: warning: cannot understand function prototype: 'struct mlxbf_tmfifo_io ' mlxbf-tmfifo.c:182: warning: cannot understand function prototype: 'struct mlxbf_tmfifo ' mlxbf-tmfifo.c:208: warning: cannot understand function prototype: 'struct mlxbf_tmfifo_msg_hdr ' mlxbf-tmfifo.c:138: warning: Function parameter or member 'config' not described in 'mlxbf_tmfifo_vdev' mlxbf-tmfifo.c:212: warning: Function parameter or member 'unused' not described in 'mlxbf_tmfifo_msg_hdr' Fixes: 1357dfd7261f ("platform/mellanox: Add TmFifo driver for Mellanox BlueField Soc") Fixes: bc05ea63b394 ("platform/mellanox: Add BlueField-3 support in the tmfifo driver") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Closes: lore.kernel.org/r/202309252330.saRU491h-lkp@intel.com Cc: Liming Sun <lsun@mellanox.com> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Cc: Mark Gross <markgross@kernel.org> Cc: Vadim Pasternak <vadimp@nvidia.com> Cc: platform-driver-x86@vger.kernel.org Link: https://lore.kernel.org/r/20230926054013.11450-1-rdunlap@infradead.org Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-10-04  platform/x86/intel/ifs: release cpus_read_lock()  Jithu Joseph
A couple of error paths in do_core_test() were returning directly without doing the necessary cpus_read_unlock(). The following lockdep warning was observed when exercising these scenarios with PROVE_RAW_LOCK_NESTING enabled: [ 139.304775] ================================================ [ 139.311185] WARNING: lock held when returning to user space! [ 139.317593] 6.6.0-rc2ifs01+ #11 Tainted: G S W I [ 139.324499] ------------------------------------------------ [ 139.330908] bash/11476 is leaving the kernel with locks still held! [ 139.338000] 1 lock held by bash/11476: [ 139.342262] #0: ffffffffaa26c930 (cpu_hotplug_lock){++++}-{0:0}, at: do_core_test+0x35/0x1c0 [intel_ifs] Fix the flow so that all scenarios release the lock prior to returning from the function. Fixes: 5210fb4e1880 ("platform/x86/intel/ifs: Sysfs interface for Array BIST") Cc: stable@vger.kernel.org Signed-off-by: Jithu Joseph <jithu.joseph@intel.com> Link: https://lore.kernel.org/r/20230927184824.2566086-1-jithu.joseph@intel.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
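The usual way to guarantee that every path releases the lock is a single exit label. The sketch below models that pattern in user space; it is not the ifs driver code. The counter-based `fake_read_lock()` / `fake_read_unlock()` pair stands in for cpus_read_lock() / cpus_read_unlock(), and `run_core_test()` with its two parameters is a hypothetical function.

```c
#include <assert.h>
#include <errno.h>

static int lock_depth;  /* greater than zero while the fake lock is held */

static void fake_read_lock(void)
{
	lock_depth++;
}

static void fake_read_unlock(void)
{
	assert(lock_depth > 0);  /* unlocking an unheld lock is a bug */
	lock_depth--;
}

/* Every error path jumps to "out", so the lock is always dropped. */
static int run_core_test(int cpu_online, int hw_ready)
{
	int ret = 0;

	fake_read_lock();

	if (!cpu_online) {
		ret = -ENODEV;
		goto out;  /* a bare "return -ENODEV" here would leak the lock */
	}
	if (!hw_ready) {
		ret = -EIO;
		goto out;
	}
	/* ... the actual test would run here ... */
out:
	fake_read_unlock();
	return ret;
}
```

After any call, `lock_depth` is back to zero, which is the user-space analogue of lockdep no longer reporting a lock held when returning to user space.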
2023-10-04  platform/x86: hp-bioscfg: Fix reference leak  Armin Wolf
If a duplicate attribute is found using kset_find_obj(), a reference to that attribute is returned which needs to be disposed accordingly using kobject_put(). Use kobject_put() to dispose the duplicate attribute in such a case. As a side note, a very similar bug was fixed in commit 7295a996fdab ("platform/x86: dell-sysman: Fix reference leak"), so it seems that the bug was copied from that driver. Compile-tested only. Fixes: a34fc329b189 ("platform/x86: hp-bioscfg: bioscfg") Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Armin Wolf <W_Armin@gmx.de> Reviewed-by: Jorge Lopez <jorge.lopez2@hp.com> Link: https://lore.kernel.org/r/20230925142819.74525-3-W_Armin@gmx.de Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-10-04  platform/x86: think-lmi: Fix reference leak  Armin Wolf
If a duplicate attribute is found using kset_find_obj(), a reference to that attribute is returned which needs to be disposed accordingly using kobject_put(). Move the setting name validation into a separate function to allow for this change without having to duplicate the cleanup code for this setting. As a side note, a very similar bug was fixed in commit 7295a996fdab ("platform/x86: dell-sysman: Fix reference leak"), so it seems that the bug was copied from that driver. Compile-tested only. Fixes: 1bcad8e510b2 ("platform/x86: think-lmi: Fix issues with duplicate attributes") Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Armin Wolf <W_Armin@gmx.de> Link: https://lore.kernel.org/r/20230925142819.74525-2-W_Armin@gmx.de Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
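The leak and its fix can be modelled with a plain reference counter. This is a user-space sketch under stated assumptions: `struct fake_obj`, `fake_find_obj()`, `fake_put()` and `name_is_unique()` are hypothetical stand-ins for a kobject, kset_find_obj(), kobject_put() and the separate validation helper mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a reference-counted kobject. */
struct fake_obj {
	const char *name;
	int refcount;
};

/* Like kset_find_obj(): a successful lookup returns a new reference. */
static struct fake_obj *fake_find_obj(struct fake_obj *set, size_t n,
				      const char *name)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(set[i].name, name) == 0) {
			set[i].refcount++;
			return &set[i];
		}
	}
	return NULL;
}

/* Like kobject_put(): drop one reference. */
static void fake_put(struct fake_obj *obj)
{
	assert(obj->refcount > 0);
	obj->refcount--;
}

/* Validation helper: reports duplicates without leaking the reference. */
static bool name_is_unique(struct fake_obj *set, size_t n, const char *name)
{
	struct fake_obj *dup = fake_find_obj(set, n, name);

	if (dup) {
		fake_put(dup);  /* the fix: release the looked-up reference */
		return false;
	}
	return true;
}
```

Without the `fake_put()` call, each duplicate lookup would leave the counter one higher than before, so the object could never be freed.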
2023-10-04  can: etas_es58x: add missing a blank line after declaration  Vincent Mailhol
Fix below checkpatch warning: WARNING: Missing a blank line after declarations #2233: FILE: drivers/net/can/usb/etas_es58x/es58x_core.c:2233: + int ret = es58x_init_netdev(es58x_dev, ch_idx); + if (ret) { Fixes: d8f26fd689dd ("can: etas_es58x: remove es58x_get_product_info()") Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://lore.kernel.org/all/20230924110914.183898-3-mailhol.vincent@wanadoo.fr Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2023-10-04  can: etas_es58x: rework the version check logic to silence -Wformat-truncation  Vincent Mailhol
Following [1], es58x_devlink.c now triggers the following format-truncation GCC warnings:

  drivers/net/can/usb/etas_es58x/es58x_devlink.c: In function ‘es58x_devlink_info_get’:
  drivers/net/can/usb/etas_es58x/es58x_devlink.c:201:41: warning: ‘%02u’ directive output may be truncated writing between 2 and 3 bytes into a region of size between 1 and 3 [-Wformat-truncation=]
    201 |   snprintf(buf, sizeof(buf), "%02u.%02u.%02u",
        |                                         ^~~~
  drivers/net/can/usb/etas_es58x/es58x_devlink.c:201:30: note: directive argument in the range [0, 255]
    201 |   snprintf(buf, sizeof(buf), "%02u.%02u.%02u",
        |                              ^~~~~~~~~~~~~~~~
  drivers/net/can/usb/etas_es58x/es58x_devlink.c:201:3: note: ‘snprintf’ output between 9 and 12 bytes into a destination of size 9
    201 |   snprintf(buf, sizeof(buf), "%02u.%02u.%02u",
        |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    202 |            fw_ver->major, fw_ver->minor, fw_ver->revision);
        |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  drivers/net/can/usb/etas_es58x/es58x_devlink.c:211:41: warning: ‘%02u’ directive output may be truncated writing between 2 and 3 bytes into a region of size between 1 and 3 [-Wformat-truncation=]
    211 |   snprintf(buf, sizeof(buf), "%02u.%02u.%02u",
        |                                         ^~~~
  drivers/net/can/usb/etas_es58x/es58x_devlink.c:211:30: note: directive argument in the range [0, 255]
    211 |   snprintf(buf, sizeof(buf), "%02u.%02u.%02u",
        |                              ^~~~~~~~~~~~~~~~
  drivers/net/can/usb/etas_es58x/es58x_devlink.c:211:3: note: ‘snprintf’ output between 9 and 12 bytes into a destination of size 9
    211 |   snprintf(buf, sizeof(buf), "%02u.%02u.%02u",
        |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    212 |            bl_ver->major, bl_ver->minor, bl_ver->revision);
        |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  drivers/net/can/usb/etas_es58x/es58x_devlink.c:221:38: warning: ‘%03u’ directive output may be truncated writing between 3 and 5 bytes into a region of size between 2 and 4 [-Wformat-truncation=]
    221 |   snprintf(buf, sizeof(buf), "%c%03u/%03u",
        |                                      ^~~~
  drivers/net/can/usb/etas_es58x/es58x_devlink.c:221:30: note: directive argument in the range [0, 65535]
    221 |   snprintf(buf, sizeof(buf), "%c%03u/%03u",
        |                              ^~~~~~~~~~~~~
  drivers/net/can/usb/etas_es58x/es58x_devlink.c:221:3: note: ‘snprintf’ output between 9 and 13 bytes into a destination of size 9
    221 |   snprintf(buf, sizeof(buf), "%c%03u/%03u",
        |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    222 |            hw_rev->letter, hw_rev->major, hw_rev->minor);
        |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is not an actual bug because the sscanf() parsing makes sure that the u8 are only two digits long and the u16 only three digits long. Thus, the declaration below:

  char buf[max(sizeof("xx.xx.xx"), sizeof("axxx/xxx"))];

allocates just what is needed to represent either of the versions.

This warning was known but ignored because, at the time of writing, -Wformat-truncation was not present in the kernel, not even at W=3 [2].

One way to silence this warning is to check that all sub-version numbers are within a valid range: [0, 99] for u8 and [0, 999] for u16.

The module already has logic that considers the version number as not set when all the sub-version numbers are zero. Note that, not having access to the device specification, this was an arbitrary decision. This logic can thus be removed in favor of a global check that covers both cases:

  - the version number is not set (parsing failed)
  - the version number is not valid (paranoid check to please gcc)

Before starting to parse the product info string, set the version sub-numbers to the maximum unsigned integer, thus violating the definitions of struct es58x_sw_version or struct es58x_hw_revision.

Then, rework the es58x_sw_version_is_set() and es58x_hw_revision_is_set() functions: remove the check that the sub-numbers are non-zero and replace it with a check that they fit in the expected number of digits. This done, rename the functions to reflect the change and rewrite the documentation. While doing so, also add a description of the return value.
Finally, the previous version only checked that &es58x_hw_revision.letter was not the null character. Replace this check by an alphanumeric character check to make sure that we never return a special character or a non-printable one and update the documentation of struct es58x_hw_revision accordingly. All those extra checks are paranoid but have the merit to silence the newly introduced W=1 format-truncation warning [1]. [1] commit 6d4ab2e97dcf ("extrawarn: enable format and stringop overflow warnings in W=1") Link: https://git.kernel.org/torvalds/c/6d4ab2e97dcf [2] https://lore.kernel.org/all/CAMZ6Rq+K+6gbaZ35SOJcR9qQaTJ7KR0jW=XoDKFkobjhj8CHhw@mail.gmail.com/ Reported-by: Marc Kleine-Budde <mkl@pengutronix.de> Closes: https://lore.kernel.org/linux-can/20230914-carrousel-wrecker-720a08e173e9-mkl@pengutronix.de/ Fixes: 9f06631c3f1f ("can: etas_es58x: export product information through devlink_ops::info_get()") Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://lore.kernel.org/all/20230924110914.183898-2-mailhol.vincent@wanadoo.fr Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
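The range check described in the commit message above can be illustrated in userspace C. This is only a minimal sketch, not the driver's code: struct sw_version, sw_version_is_valid() and sw_version_format() are hypothetical names, and the "unset" state is modelled by filling the fields with UINT8_MAX, as the message suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a three-part software version. */
struct sw_version {
	uint8_t major;
	uint8_t minor;
	uint8_t revision;
};

/* True when every sub-number fits in two decimal digits. */
bool sw_version_is_valid(const struct sw_version *v)
{
	return v->major < 100 && v->minor < 100 && v->revision < 100;
}

/*
 * Format the version as "xx.xx.xx".
 * Returns 0 on success, -1 if the version is unset or out of range.
 */
int sw_version_format(const struct sw_version *v, char *buf, size_t len)
{
	assert(v != NULL && buf != NULL);

	if (!sw_version_is_valid(v))
		return -1;

	/* After the range check, the output is at most 8 characters plus NUL. */
	snprintf(buf, len, "%02u.%02u.%02u", (unsigned int)v->major,
		 (unsigned int)v->minor, (unsigned int)v->revision);
	return 0;
}
```

A caller would declare the buffer as char buf[sizeof("xx.xx.xx")] and skip reporting the version whenever sw_version_format() returns -1.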
2023-10-04can: sja1000: Fix commentMiquel Raynal
There is likely a copy-paste error here, as the exact same comment appears below in this function, one time calling set_reset_mode(), the other set_normal_mode(). Fixes: 429da1cc841b ("can: Driver for the SJA1000 CAN controller") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/all/20230922155130.592187-1-miquel.raynal@bootlin.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2023-10-04dmaengine: ti: k3-udma-glue: clean up k3_udma_glue_tx_get_irq() returnDan Carpenter
The k3_udma_glue_tx_get_irq() function currently returns either a negative error code or zero on error, and a positive value on success. This complicates life for callers that need to propagate the error code. Also, GCC will not warn about unsigned comparisons when you check: if (unsigned_irq <= 0) All the callers have been fixed now, but let's make this easier going forward. Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Roger Quadros <rogerq@kernel.org> Acked-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-04net: ti: icssg-prueth: Fix signedness bug in prueth_init_tx_chns()Dan Carpenter
The "tx_chn->irq" variable is unsigned so the error checking does not work correctly. Fixes: 128d5874c082 ("net: ti: icssg-prueth: Add ICSSG ethernet driver") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
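The signedness problem can be reproduced in a few lines of userspace C. The sketch below is illustrative only: fake_get_irq() is a hypothetical stand-in for an IRQ lookup that returns a negative errno on failure, and the two check functions are made up for the comparison.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical lookup: a positive IRQ number, or a negative errno. */
int fake_get_irq(bool fail)
{
	return fail ? -ENODEV : 42;
}

/* Buggy: the result lands in an unsigned variable before the check. */
bool buggy_check_detects_error(bool fail)
{
	unsigned int irq = fake_get_irq(fail);

	/* For an unsigned value, "<= 0" is only true when irq == 0. */
	return irq <= 0;
}

/* Fixed: keep the result signed until it has been validated. */
bool fixed_check_detects_error(bool fail)
{
	int ret = fake_get_irq(fail);

	return ret < 0;
}

/* Self-check that can be called from main(). */
void signedness_demo(void)
{
	assert(!buggy_check_detects_error(true)); /* error goes unnoticed */
	assert(fixed_check_detects_error(true));  /* error is caught */
}
```

The negative errno wraps to a large positive value when stored in the unsigned variable, which is why the buggy check lets the failure through.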
2023-10-04net: ethernet: ti: am65-cpsw: Fix error code in am65_cpsw_nuss_init_tx_chns()Dan Carpenter
This accidentally returns success, but it should return a negative error code. Fixes: 93a76530316a ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-04dmaengine: idxd: use spin_lock_irqsave before wait_event_lock_irqRex Zhang
In idxd_cmd_exec(), wait_event_lock_irq() explicitly calls spin_unlock_irq()/spin_lock_irq(). If interrupts are enabled before entering wait_event_lock_irq(), they are disabled by the time it returns. Later, wait_for_completion() may go to sleep while interrupts are disabled, a scenario that might_sleep() warns about.

Fix it by using spin_lock_irqsave() instead of the primitive spin_lock() to save the irq status before entering wait_event_lock_irq() and using spin_unlock_irqrestore() instead of the primitive spin_unlock() to restore the irq status before entering wait_for_completion().

Before the change:
idxd_cmd_exec() {
	interrupt is on
	spin_lock()                     // interrupt is on
	wait_event_lock_irq()
		spin_unlock_irq()       // interrupt is enabled
		...
		spin_lock_irq()         // interrupt is disabled
	spin_unlock()                   // interrupt is still disabled
	wait_for_completion()           // report "BUG: sleeping function
	                                // called from invalid context...
	                                // in_atomic() irqs_disabled()"
}

After applying spin_lock_irqsave():
idxd_cmd_exec() {
	interrupt is on
	spin_lock_irqsave()             // save the on state
	                                // interrupt is disabled
	wait_event_lock_irq()
		spin_unlock_irq()       // interrupt is enabled
		...
		spin_lock_irq()         // interrupt is disabled
	spin_unlock_irqrestore()        // interrupt is restored to on
	wait_for_completion()           // No Call trace
}

Fixes: f9f4082dbc56 ("dmaengine: idxd: remove interrupt disable for cmd_lock")
Signed-off-by: Rex Zhang <rex.zhang@intel.com>
Signed-off-by: Lijun Pan <lijun.pan@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230916060619.3744220-1-rex.zhang@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
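The interrupt-state bookkeeping in this commit can be followed with a toy userspace model. Everything below is hypothetical: a single boolean stands in for the CPU's local interrupt state, and the toy_* helpers only mimic how the real locking primitives change that state. No actual locking takes place.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: one flag stands in for the local interrupt state. */
static bool irqs_enabled = true;

static void toy_spin_lock(void) { /* interrupt state untouched */ }
static void toy_spin_unlock(void) { /* interrupt state untouched */ }
static void toy_spin_lock_irq(void) { irqs_enabled = false; }
static void toy_spin_unlock_irq(void) { irqs_enabled = true; }

/* Save the current state, then disable interrupts. */
static bool toy_spin_lock_irqsave(void)
{
	bool flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

/* Restore whatever state was saved. */
static void toy_spin_unlock_irqrestore(bool flags)
{
	irqs_enabled = flags;
}

/* Mimics the unlock_irq/lock_irq pair: always returns with irqs off. */
static void toy_wait_event_lock_irq(void)
{
	toy_spin_unlock_irq();
	/* ... the real primitive would sleep here ... */
	toy_spin_lock_irq();
}

/* Returns whether interrupts are on when the caller would sleep. */
bool irqs_on_before_fix(void)
{
	irqs_enabled = true;
	toy_spin_lock();
	toy_wait_event_lock_irq();
	toy_spin_unlock();
	return irqs_enabled;
}

bool irqs_on_after_fix(void)
{
	bool flags;

	irqs_enabled = true;
	flags = toy_spin_lock_irqsave();
	toy_wait_event_lock_irq();
	toy_spin_unlock_irqrestore(flags);
	return irqs_enabled;
}

/* Self-check that can be called from main(). */
void irq_state_demo(void)
{
	assert(!irqs_on_before_fix()); /* would sleep with irqs off */
	assert(irqs_on_after_fix());   /* saved "on" state is restored */
}
```

In the model, the plain lock/unlock pair leaves interrupts in whatever state the wait left them, while the irqsave/irqrestore pair puts the saved state back before the sleeping call.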
2023-10-04vringh: don't use vringh_kiov_advance() in vringh_iov_xfer()Stefano Garzarella
In the while loop of vringh_iov_xfer(), `partlen` could be 0 if one of the `iov` elements has length 0. In this case, we should skip that iov and go to the next one. But calling vringh_kiov_advance() with length 0 does not advance, since it returns immediately if asked to advance by 0 bytes. Let's restore the code that was there before commit b8c06ad4d67d ("vringh: implement vringh_kiov_advance()"), avoiding using vringh_kiov_advance(). Fixes: b8c06ad4d67d ("vringh: implement vringh_kiov_advance()") Cc: stable@vger.kernel.org Reported-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
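The need to step past zero-length elements can be shown with a small userspace loop. This is a sketch under assumptions, not the vringh code: struct toy_iov and toy_iov_pull() are hypothetical, and the loop simply copies bytes out of an array of buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scatter element: a base pointer and a length. */
struct toy_iov {
	const char *base;
	size_t len;
};

/*
 * Copy up to len bytes from the iov array into dst.
 * Returns the number of bytes copied.
 */
size_t toy_iov_pull(const struct toy_iov *iov, size_t count,
		    char *dst, size_t len)
{
	size_t i = 0, off = 0, done = 0;

	assert(iov != NULL && dst != NULL);

	while (done < len && i < count) {
		size_t partlen = iov[i].len - off;

		if (partlen > len - done)
			partlen = len - done;
		if (partlen)
			memcpy(dst + done, iov[i].base + off, partlen);
		done += partlen;
		off += partlen;

		/*
		 * Move to the next element whenever this one is used up,
		 * including when it was empty from the start (partlen == 0).
		 * Advancing only by partlen bytes would spin here forever.
		 */
		if (off == iov[i].len) {
			i++;
			off = 0;
		}
	}
	return done;
}
```

The element index is advanced based on whether the current element is exhausted, not on how many bytes were just copied, so an empty element in the middle of the array does not stall the transfer.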
2023-10-04MAINTAINERS: adjust header file entry in DPLL SUBSYSTEMLukas Bulwahn
Commit 9431063ad323 ("dpll: core: Add DPLL framework base functions") adds the section DPLL SUBSYSTEM in MAINTAINERS and includes a file entry to the non-existing file 'include/net/dpll.h'. Hence, ./scripts/get_maintainer.pl --self-test=patterns complains about a broken reference. Looking at the file stat of the commit above, this entry clearly intended to refer to 'include/linux/dpll.h'. Adjust this header file entry in DPLL SUBSYSTEM. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-04Merge tag 'xfs-fstrim-busy-tag' of ↵Chandan Babu R
git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs into xfs-6.6-fixesC xfs: reduce AGF hold times during fstrim operations A recent log space overflow and recovery failure was root caused to a long running truncate blocking on the AGF and ending up pinning the tail of the log. The filesystem then hung, the machine was rebooted, and log recovery then refused to run because there wasn't enough space in the log for EFI transaction reservation. The reason the long running truncate got blocked on the AGF for so long was that an fstrim was being run. The underlying block device was large and very slow (10TB ceph rbd volume) and so discarding all the free space in the AG took a really long time. The current fstrim implementation holds the AGF across the entire operation - both the free space scan and the issuing of all the discards. The discards are synchronous and single depth, so if there are millions of free spaces, we hold the AGF lock across millions of discard operations. It doesn't really need to be said that this is a Bad Thing. This series reworks the fstrim discard path to use the same mechanisms as online discard. This allows discards to be issued asynchronously without holding the AGF locked, enabling higher discard queue depths (much faster on fast devices) and only requiring the AGF lock to be held whilst we are scanning free space. To do this, we make use of busy extents - we lock the AGF, mark all the extents we want to discard as "busy under discard" so that nothing will be allowed to allocate them, and then drop the AGF lock. We then issue discards on the gathered busy extents and on discard completion remove them from the busy list. This results in AGF lock hold times for fstrim dropping to a few milliseconds for each batch of free extents we scan, and so the hours long hold times that can currently occur on large, slow, badly fragmented devices no longer occur. 
Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org> * tag 'xfs-fstrim-busy-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: xfs: abort fstrim if kernel is suspending xfs: reduce AGF hold times during fstrim operations xfs: move log discard work to xfs_discard.c
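The two-phase scheme described in this merge (mark extents busy under the lock, discard them after dropping it) can be sketched as a toy userspace model. All names here are hypothetical and nothing below reflects the actual XFS data structures: the "lock" is a plain flag, and the discard itself is left as a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum extent_state { EXT_FREE, EXT_BUSY_DISCARD };

/* Hypothetical allocation group with a handful of extents. */
struct toy_ag {
	bool agf_locked;            /* stands in for the AGF lock */
	enum extent_state ext[8];
	size_t nr_ext;
	size_t discards_under_lock; /* discards issued while locked */
};

/* Phase 1: hold the lock only while marking free extents busy. */
size_t toy_gather_busy(struct toy_ag *ag)
{
	size_t i, gathered = 0;

	assert(ag != NULL);
	ag->agf_locked = true;
	for (i = 0; i < ag->nr_ext; i++) {
		if (ag->ext[i] == EXT_FREE) {
			/* Busy extents cannot be allocated by anyone else. */
			ag->ext[i] = EXT_BUSY_DISCARD;
			gathered++;
		}
	}
	ag->agf_locked = false;
	return gathered;
}

/* Phase 2: issue discards with the lock dropped, then clear busy state. */
size_t toy_discard_busy(struct toy_ag *ag)
{
	size_t i, discarded = 0;

	assert(ag != NULL);
	for (i = 0; i < ag->nr_ext; i++) {
		if (ag->ext[i] != EXT_BUSY_DISCARD)
			continue;
		if (ag->agf_locked)
			ag->discards_under_lock++;
		/* ... the slow discard would be issued here ... */
		ag->ext[i] = EXT_FREE;
		discarded++;
	}
	return discarded;
}
```

The point of the split is that the slow part, the discards, never runs while the lock flag is set; the counter discards_under_lock stays at zero in this model.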
2023-10-03nbd: don't call blk_mark_disk_dead nbd_clear_sock_ioctlChristoph Hellwig
blk_mark_disk_dead is the proper interface to shut down a block device, but it also makes the disk unusable forever. nbd_clear_sock_ioctl on the other hand wants to shut down the file system, but allow the block device to be used again when connected to another socket. Switch nbd to use disk_force_media_change and nbd_bdev_reset to go back to the behavior of the old __invalidate_device call, with the added benefit of incrementing the device generation as there is no guarantee the old content comes back when the device is reconnected. Reported-by: Samuel Holland <samuel.holland@sifive.com> Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Fixes: 0c1c9a27ce90 ("nbd: call blk_mark_disk_dead in nbd_clear_sock_ioctl") Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20231003153106.1331363-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-10-04btrfs: error out when reallocating block for defrag using a stale transactionFilipe Manana
At btrfs_realloc_node() we have these checks to verify we are not using a stale transaction (a past transaction with an unblocked state or higher), and the only thing we do is to trigger two WARN_ON(). This however is a critical problem, highly unexpected and if it happens it's most likely due to a bug, so we should error out and turn the fs into error state so that such an issue is much more easily noticed if it's triggered. The problem is critical because in btrfs_realloc_node() we COW tree blocks, and using such a stale transaction will lead to not persisting the extent buffers used for the COW operations, as allocating a tree block adds the range of the respective extent buffers to the ->dirty_pages iotree of the transaction, and a stale transaction, in the unblocked state or higher, will not flush dirty extent buffers anymore, therefore resulting in not persisting the tree block and resource leaks (not cleaning the dirty_pages iotree for example). So do the following changes: 1) Return -EUCLEAN if we find a stale transaction; 2) Turn the fs into error state, with error -EUCLEAN, so that no transaction can be committed, and generate a stack trace; 3) Combine both conditions into a single if statement, as both are related and have the same error message; 4) Mark the check as unlikely, since this is not expected to ever happen. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-04btrfs: error when COWing block from a root that is being deletedFilipe Manana
At btrfs_cow_block() we check if the block being COWed belongs to a root that is being deleted and if so we log an error message. However this is an unexpected case and it indicates a bug somewhere, so we should return an error and abort the transaction. So change this in the following ways: 1) Abort the transaction with -EUCLEAN, so that if the issue ever happens it can easily be noticed; 2) Change the logged message level from error to critical, and change the message itself to print the block's logical address and the ID of the root; 3) Return -EUCLEAN to the caller; 4) As this is an unexpected scenario, that should never happen, mark the check as unlikely, allowing the compiler to potentially generate better code. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-04btrfs: error out when COWing block using a stale transactionFilipe Manana
At btrfs_cow_block() we have these checks to verify we are not using a stale transaction (a past transaction with an unblocked state or higher), and the only thing we do is to trigger a WARN with a message and a stack trace. This however is a critical problem, highly unexpected and if it happens it's most likely due to a bug, so we should error out and turn the fs into error state so that such an issue is much more easily noticed if it's triggered. The problem is critical because using such a stale transaction will lead to not persisting the extent buffer used for the COW operation, as allocating a tree block adds the range of the respective extent buffer to the ->dirty_pages iotree of the transaction, and a stale transaction, in the unblocked state or higher, will not flush dirty extent buffers anymore, therefore resulting in not persisting the tree block and resource leaks (not cleaning the dirty_pages iotree for example). So do the following changes: 1) Return -EUCLEAN if we find a stale transaction; 2) Turn the fs into error state, with error -EUCLEAN, so that no transaction can be committed, and generate a stack trace; 3) Combine both conditions into a single if statement, as both are related and have the same error message; 4) Mark the check as unlikely, since this is not expected to ever happen. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
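The shape of the combined check listed in the four steps above can be sketched in userspace C. This is an illustration under assumptions, not the btrfs code: struct toy_fs, struct toy_trans, toy_cow_check() and the two compared fields are hypothetical, and setting fs_error merely stands in for putting the filesystem into an error state.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#ifndef EUCLEAN
#define EUCLEAN 117 /* fallback for platforms that lack this errno */
#endif

/* Branch hint in the spirit of the kernel's unlikely(); GCC/Clang builtin. */
#define toy_unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical filesystem state. */
struct toy_fs {
	const void *running_transaction;
	uint64_t generation;
	int fs_error; /* 0 while healthy, negative errno once in error state */
};

/* Hypothetical transaction handle. */
struct toy_trans {
	const void *transaction;
	uint64_t transid;
};

/* Returns 0 if the handle is current, -EUCLEAN if it is stale. */
int toy_cow_check(struct toy_fs *fs, const struct toy_trans *trans)
{
	assert(fs != NULL && trans != NULL);

	/* Both conditions share one branch and one error path. */
	if (toy_unlikely(trans->transaction != fs->running_transaction ||
			 trans->transid != fs->generation)) {
		fs->fs_error = -EUCLEAN; /* stand-in for the error state */
		return -EUCLEAN;
	}
	return 0;
}
```

Returning the error, rather than only warning, lets the caller stop before it dirties blocks that a stale transaction would never write out.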
2023-10-04btrfs: always print transaction aborted messages with an error levelFilipe Manana
Commit b7af0635c87f ("btrfs: print transaction aborted messages with an error level") changed the log level of transaction aborted messages from a debug level to an error level, so that such messages are always visible even on production systems where the log level is normally above the debug level (and also on some syzbot reports). Later, commit fccf0c842ed4 ("btrfs: move btrfs_abort_transaction to transaction.c") changed the log level back to debug level when the error number for a transaction abort should not have a stack trace printed. This happened for absolutely no reason. It's always useful to print transaction abort messages with an error level, regardless of whether the error number should cause a stack trace or not. So change back the log level to error level. Fixes: fccf0c842ed4 ("btrfs: move btrfs_abort_transaction to transaction.c") CC: stable@vger.kernel.org # 6.5+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>