linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2021-12-17	bpftool: Reimplement large insn size limit feature probing	Andrii Nakryiko
	Reimplement bpf_probe_large_insn_limit() in bpftool, as that libbpf API is scheduled for deprecation in v0.8. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20211217171202.3352835-4-andrii@kernel.org
2021-12-17	selftests/bpf: Add libbpf feature-probing API selftests	Andrii Nakryiko
	Add selftests for prog/map/prog+helper feature probing APIs. Prog and map selftests are designed in such a way that they will always test all the possible prog/map types, based on running kernel's vmlinux BTF enum definition. This way we'll always be sure that when adding new BPF program types or map types, libbpf will be always updated accordingly to be able to feature-detect them. BPF prog_helper selftest will have to be manually extended with interesting and important prog+helper combinations, it's easy, but can't be completely automated. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20211217171202.3352835-3-andrii@kernel.org
2021-12-17	libbpf: Rework feature-probing APIs	Andrii Nakryiko
	Create three extensible alternatives to inconsistently named feature-probing APIs: - libbpf_probe_bpf_prog_type() instead of bpf_probe_prog_type(); - libbpf_probe_bpf_map_type() instead of bpf_probe_map_type(); - libbpf_probe_bpf_helper() instead of bpf_probe_helper(). Set up return values such that libbpf can report errors (e.g., if some combination of input arguments isn't possible to validate, etc), in addition to whether the feature is supported (return value 1) or not supported (return value 0). Also schedule deprecation of those three APIs. Also schedule deprecation of bpf_probe_large_insn_limit(). Also fix all the existing detection logic for various program and map types that never worked: - BPF_PROG_TYPE_LIRC_MODE2; - BPF_PROG_TYPE_TRACING; - BPF_PROG_TYPE_LSM; - BPF_PROG_TYPE_EXT; - BPF_PROG_TYPE_SYSCALL; - BPF_PROG_TYPE_STRUCT_OPS; - BPF_MAP_TYPE_STRUCT_OPS; - BPF_MAP_TYPE_BLOOM_FILTER. Above prog/map types needed special setups and detection logic to work. Subsequent patch adds selftests that will make sure that all the detection logic keeps working for all current and future program and map types, avoiding otherwise inevitable bit rot. [0] Closes: https://github.com/libbpf/libbpf/issues/312 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Cc: Julia Kartseva <hex@fb.com> Link: https://lore.kernel.org/bpf/20211217171202.3352835-2-andrii@kernel.org
2021-12-17	Revert "xsk: Do not sleep in poll() when need_wakeup set"	Magnus Karlsson
	This reverts commit bd0687c18e635b63233dc87f38058cd728802ab4. This patch causes a Tx only workload to go to sleep even when it does not have to, leading to misserable performance in skb mode. It fixed one rare problem but created a much worse one, so this need to be reverted while I try to craft a proper solution to the original problem. Fixes: bd0687c18e63 ("xsk: Do not sleep in poll() when need_wakeup set") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211217145646.26449-1-magnus.karlsson@gmail.com
2021-12-17	timekeeping: Really make sure wall_to_monotonic isn't positive	Yu Liao
	Even after commit e1d7ba873555 ("time: Always make sure wall_to_monotonic isn't positive") it is still possible to make wall_to_monotonic positive by running the following code: int main(void) { struct timespec time; clock_gettime(CLOCK_MONOTONIC, &time); time.tv_nsec = 0; clock_settime(CLOCK_REALTIME, &time); return 0; } The reason is that the second parameter of timespec64_compare(), ts_delta, may be unnormalized because the delta is calculated with an open coded substraction which causes the comparison of tv_sec to yield the wrong result: wall_to_monotonic = { .tv_sec = -10, .tv_nsec = 900000000 } ts_delta = { .tv_sec = -9, .tv_nsec = -900000000 } That makes timespec64_compare() claim that wall_to_monotonic < ts_delta, but actually the result should be wall_to_monotonic > ts_delta. After normalization, the result of timespec64_compare() is correct because the tv_sec comparison is not longer misleading: wall_to_monotonic = { .tv_sec = -10, .tv_nsec = 900000000 } ts_delta = { .tv_sec = -10, .tv_nsec = 100000000 } Use timespec64_sub() to ensure that ts_delta is normalized, which fixes the issue. Fixes: e1d7ba873555 ("time: Always make sure wall_to_monotonic isn't positive") Signed-off-by: Yu Liao <liaoyu15@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211213135727.1656662-1-liaoyu15@huawei.com
2021-12-17	Merge tag 'scsi-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fix from James Bottomley: "One driver fix: the pm8001 has never actually worked on a system with an IOMMU and this fixes that use case" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: pm8001: Fix phys_to_virt() usage on dma_addr_t
2021-12-17	Merge tag 'for-5.16-rc5-tag' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "A few more fixes, almost all error handling one-liners and for stable. - regression fix in directory logging items - regression fix of extent buffer status bits handling after an error - fix memory leak in error handling path in tree-log - fix freeing invalid anon device number when handling errors during subvolume creation - fix warning when freeing leaf after subvolume creation failure - fix missing blkdev put in device scan error handling - fix invalid delayed ref after subvolume creation failure" * tag 'for-5.16-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix missing blkdev_put() call in btrfs_scan_one_device() btrfs: fix warning when freeing leaf after subvolume creation failure btrfs: fix invalid delayed ref after subvolume creation failure btrfs: check WRITE_ERR when trying to read an extent buffer btrfs: fix missing last dir item offset update when logging directory btrfs: fix double free of anon_dev after failure to create subvolume btrfs: fix memory leak in __add_inode_ref()
2021-12-17	ipmi: fix initialization when workqueue allocation fails	Thadeu Lima de Souza Cascardo
	If the workqueue allocation fails, the driver is marked as not initialized, and timer and panic_notifier will be left registered. Instead of removing those when workqueue allocation fails, do the workqueue initialization before doing it, and cleanup srcu_struct if it fails. Fixes: 1d49eb91e86e ("ipmi: Move remove_work to dedicated workqueue") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Cc: Corey Minyard <cminyard@mvista.com> Cc: Ioanna Alifieraki <ioanna-maria.alifieraki@canonical.com> Cc: stable@vger.kernel.org Message-Id: <20211217154410.1228673-2-cascardo@canonical.com> Signed-off-by: Corey Minyard <cminyard@mvista.com>
2021-12-17	ipmi: bail out if init_srcu_struct fails	Thadeu Lima de Souza Cascardo
	In case, init_srcu_struct fails (because of memory allocation failure), we might proceed with the driver initialization despite srcu_struct not being entirely initialized. Fixes: 913a89f009d9 ("ipmi: Don't initialize anything in the core until something uses it") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Cc: Corey Minyard <cminyard@mvista.com> Cc: stable@vger.kernel.org Message-Id: <20211217154410.1228673-1-cascardo@canonical.com> Signed-off-by: Corey Minyard <cminyard@mvista.com>
2021-12-17	iavf: Restrict maximum VLAN filters for VIRTCHNL_VF_OFFLOAD_VLAN_V2	Brett Creeley
	For VIRTCHNL_VF_OFFLOAD_VLAN, PF's would limit the number of VLAN filters a VF was allowed to add. However, by the time the opcode failed, the VLAN netdev had already been added. VIRTCHNL_VF_OFFLOAD_VLAN_V2 added the ability for a PF to tell the VF how many VLAN filters it's allowed to add. Make changes to support that functionality. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17	iavf: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2 offload enable/disable	Brett Creeley
	The new VIRTCHNL_VF_OFFLOAD_VLAN_V2 capability added support that allows the VF to support 802.1Q and 802.1ad VLAN insertion and stripping if successfully negotiated via VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS. Multiple changes were needed to support this new functionality. 1. Added new aq_required flags to support any kind of VLAN stripping and insertion offload requests via virtchnl. 2. Added the new method iavf_set_vlan_offload_features() that's used during VF initialization, VF reset, and iavf_set_features() to set the aq_required bits based on the current VLAN offload configuration of the VF's netdev. 3. Added virtchnl handling for VIRTCHNL_OP_ENABLE_STRIPPING_V2, VIRTCHNL_OP_DISABLE_STRIPPING_V2, VIRTCHNL_OP_ENABLE_INSERTION_V2, and VIRTCHNL_OP_ENABLE_INSERTION_V2. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17	iavf: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2 hotpath	Brett Creeley
	The new VIRTCHNL_VF_OFFLOAD_VLAN_V2 capability added support that allows the PF to set the location of the Tx and Rx VLAN tag for insertion and stripping offloads. In order to support this functionality a few changes are needed. 1. Add a new method to cache the VLAN tag location based on negotiated capabilities for the Tx and Rx ring flags. This needs to be called in the initialization and reset paths. 2. Refactor the transmit hotpath to account for the new Tx ring flags. When IAVF_TXR_FLAGS_VLAN_LOC_L2TAG2 is set, then the driver needs to insert the VLAN tag in the L2TAG2 field of the transmit descriptor. When the IAVF_TXRX_FLAGS_VLAN_LOC_L2TAG1 is set, then the driver needs to use the l2tag1 field of the data descriptor (same behavior as before). 3. Refactor the iavf_tx_prepare_vlan_flags() function to simplify transmit hardware VLAN offload functionality by only depending on the skb_vlan_tag_present() function. This can be done because the OS won't request transmit offload for a VLAN unless the driver told the OS it's supported and enabled. 4. Refactor the receive hotpath to account for the new Rx ring flags and VLAN ethertypes. This requires checking the Rx ring flags and descriptor status bits to determine the location of the VLAN tag. Also, since only a single ethertype can be supported at a time, check the enabled netdev features before specifying a VLAN ethertype in __vlan_hwaccel_put_tag(). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17	iavf: Add support VIRTCHNL_VF_OFFLOAD_VLAN_V2 during netdev config	Brett Creeley
	Based on VIRTCHNL_VF_OFFLOAD_VLAN_V2, the VF can now support more VLAN capabilities (i.e. 802.1AD offloads and filtering). In order to communicate these capabilities to the netdev layer, the VF needs to parse its VLAN capabilities based on whether it was able to negotiation VIRTCHNL_VF_OFFLOAD_VLAN or VIRTCHNL_VF_OFFLOAD_VLAN_V2 or neither of these. In order to support this, add the following functionality: iavf_get_netdev_vlan_hw_features() - This is used to determine the VLAN features that the underlying hardware supports and that can be toggled off/on based on the negotiated capabiltiies. For example, if VIRTCHNL_VF_OFFLOAD_VLAN_V2 was negotiated, then any capability marked with VIRTCHNL_VLAN_TOGGLE can be toggled on/off by the VF. If VIRTCHNL_VF_OFFLOAD_VLAN was negotiated, then only VLAN insertion and/or stripping can be toggled on/off. iavf_get_netdev_vlan_features() - This is used to determine the VLAN features that the underlying hardware supports and that should be enabled by default. For example, if VIRTHCNL_VF_OFFLOAD_VLAN_V2 was negotiated, then any supported capability that has its ethertype_init filed set should be enabled by default. If VIRTCHNL_VF_OFFLOAD_VLAN was negotiated, then filtering, stripping, and insertion should be enabled by default. Also, refactor iavf_fix_features() to take into account the new capabilities. To do this, query all the supported features (enabled by default and toggleable) and make sure the requested change is supported. If VIRTCHNL_VF_OFFLOAD_VLAN_V2 is successfully negotiated, there is no need to check VIRTCHNL_VLAN_TOGGLE here because the driver already told the netdev layer which features can be toggled via netdev->hw_features during iavf_process_config(), so only those features will be requested to change. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17	iavf: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2 negotiation	Brett Creeley
	In order to support the new VIRTCHNL_VF_OFFLOAD_VLAN_V2 capability the VF driver needs to rework it's initialization state machine and reset flow. This has to be done because successful negotiation of VIRTCHNL_VF_OFFLOAD_VLAN_V2 requires the VF driver to perform a second capability request via VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS before configuring the adapter and its netdev. Add the VIRTCHNL_VF_OFFLOAD_VLAN_V2 bit when sending the VIRTHCNL_OP_GET_VF_RESOURECES message. The underlying PF will either support VIRTCHNL_VF_OFFLOAD_VLAN or VIRTCHNL_VF_OFFLOAD_VLAN_V2 or neither. Both of these offloads should never be supported together. Based on this, add 2 new states to the initialization state machine: __IAVF_INIT_GET_OFFLOAD_VLAN_V2_CAPS __IAVF_INIT_CONFIG_ADAPTER The __IAVF_INIT_GET_OFFLOAD_VLAN_V2_CAPS state is used to request/store the new VLAN capabilities if and only if VIRTCHNL_VLAN_OFFLOAD_VLAN_V2 was successfully negotiated in the __IAVF_INIT_GET_RESOURCES state. The __IAVF_INIT_CONFIG_ADAPTER state is used to configure the adapter/netdev after the resource requests have finished. The VF will move into this state regardless of whether it successfully negotiated VIRTCHNL_VF_OFFLOAD_VLAN or VIRTCHNL_VF_OFFLOAD_VLAN_V2. Also, add a the new flag IAVF_FLAG_AQ_GET_OFFLOAD_VLAN_V2_CAPS and set it during VF reset. If VIRTCHNL_VF_OFFLOAD_VLAN_V2 was successfully negotiated then the VF will request its VLAN capabilities via VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS during the reset. This is needed because the PF may change/modify the VF's configuration during VF reset (i.e. modifying the VF's port VLAN configuration). This also, required the VF to call netdev_update_features() since its VLAN features may change during VF reset. Make sure to call this under rtnl_lock(). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17	virtchnl: Add support for new VLAN capabilities	Brett Creeley
	Currently VIRTCHNL only allows for VLAN filtering and offloads to happen on a single 802.1Q VLAN. Add support to filter and offload on inner, outer, and/or inner + outer VLANs. This is done by introducing the new capability VIRTCHNL_VF_OFFLOAD_VLAN_V2. The flow to negotiate this new capability is shown below. 1. VF - sets the VIRTCHNL_VF_OFFLOAD_VLAN_V2 bit in the virtchnl_vf_resource.vf_caps_flags during the VIRTCHNL_OP_GET_VF_RESOURCES request message. The VF should also set the VIRTCHNL_VF_OFFLOAD_VLAN bit in case the PF driver doesn't support the new capability. 2. PF - sets the VLAN capability bit it supports in the VIRTCHNL_OP_GET_VF_RESOURCES response message. This will either be VIRTCHNL_VF_OFFLOAD_VLAN_V2, VIRTCHNL_VF_OFFLOAD_VLAN, or none. 3. VF - If the VIRTCHNL_VF_OFFLOAD_VLAN_V2 capability was ACK'd by the PF, then the VF needs to request the VLAN capabilities of the PF/Device by issuing a VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS request. If the VIRTCHNL_VF_OFFLOAD_VLAN capability was ACK'd then the VF knows only single 802.1Q VLAN filtering/offloads are supported. If no VLAN capability is ACK'd then the PF/Device doesn't support hardware VLAN filtering/offloads for this VF. 4. PF - Populates the virtchnl_vlan_caps structure based on what it allows/supports for that VF and sends that response via VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS. After VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS is successfully negotiated the VF driver needs to interpret the capabilities supported by the underlying PF/Device. The VF will be allowed to filter/offload the inner 802.1Q, outer (various ethertype), inner 802.1Q + outer (various ethertypes), or none based on which fields are set. The VF will also need to interpret where the VLAN tag should be inserted and/or stripped based on the negotiated capabilities. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17	Merge tag 'selinux-pr-20211217' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux fix from Paul Moore: "Another small SELinux fix for v5.16 to ensure that we don't block on memory allocations while holding a spinlock. This passes all our tests without problem" * tag 'selinux-pr-20211217' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: fix sleeping function called from invalid context
2021-12-17	Merge tag 'riscv-for-linus-5.16-rc6' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - A handful of DT updates for the SiFive HiFive Unmatched, that fix the regulator handling. These should stop some warning spew. - A pair of fixes for both the SiFive Hifive Unleashed and Unmatched, that correctly hook up the MMC card detect signal. * tag 'riscv-for-linus-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: dts: sifive unmatched: Link the tmp451 with its power supply riscv: dts: sifive unmatched: Fix regulator for board rev3 riscv: dts: sifive unmatched: Expose the PMIC sub-functions riscv: dts: sifive unmatched: Expose the board ID eeprom riscv: dts: sifive unmatched: Name gpio lines riscv: dts: unmatched: Add gpio card detect to mmc-spi-slot riscv: dts: unleashed: Add gpio card detect to mmc-spi-slot
2021-12-17	Merge tag 'block-5.16-2021-12-17' of git://git.kernel.dk/linux-block	Linus Torvalds
	Pull block fixes from Jens Axboe: - Fix for hammering on the delayed run queue timer (me) - bcache regression fix for this merge window (Lin) - Fix a divide-by-zero in the blk-iocost code (Tejun) * tag 'block-5.16-2021-12-17' of git://git.kernel.dk/linux-block: bcache: fix NULL pointer reference in cached_dev_detach_finish block: reduce kblockd_mod_delayed_work_on() CPU consumption iocost: Fix divide-by-zero on donation from low hweight cgroup
2021-12-17	Merge tag 'io_uring-5.16-2021-12-17' of git://git.kernel.dk/linux-block	Linus Torvalds
	Pull io_uring fix from Jens Axboe: "Just a single fix, fixing an issue with the worker creation change that was merged last week" * tag 'io_uring-5.16-2021-12-17' of git://git.kernel.dk/linux-block: io-wq: drop wqe lock before creating new worker
2021-12-17	Merge tag 'dmaengine-fix-5.16' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine fixes from Vinod Koul: "A bunch of driver fixes, notably: - uninit variable fix for dw-axi-dmac driver - return value check dw-edma driver - calling wq quiesce inside spinlock and missed completion for idxd driver - mod alias fix for st_fdma driver" * tag 'dmaengine-fix-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: dmaengine: st_fdma: fix MODULE_ALIAS dmaengine: idxd: fix missed completion on abort path dmaengine: ti: k3-udma: Fix smatch warnings dmaengine: idxd: fix calling wq quiesce inside spinlock dmaengine: dw-edma: Fix return value check for dma_set_mask_and_coherent() dmaengine: dw-axi-dmac: Fix uninitialized variable in axi_chan_block_xfer_start()
2021-12-17	ice: xsk: fix cleaned_count setting	Maciej Fijalkowski
	Currently cleaned_count is initialized to ICE_DESC_UNUSED(rx_ring) and later on during the Rx processing it is incremented per each frame that driver consumed. This can result in excessive buffers requested from xsk pool based on that value. To address this, just drop cleaned_count and pass ICE_DESC_UNUSED(rx_ring) directly as a function argument to ice_alloc_rx_bufs_zc(). Idea is to ask for buffers as many as consumed. Let us also call ice_alloc_rx_bufs_zc unconditionally at the end of ice_clean_rx_irq_zc. This has been changed in that way for corresponding ice_clean_rx_irq, but not here. Fixes: 2d4238f55697 ("ice: Add support for AF_XDP") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17	ice: xsk: allow empty Rx descriptors on XSK ZC data path	Maciej Fijalkowski
	Commit ac6f733a7bd5 ("ice: allow empty Rx descriptors") stated that ice HW can produce empty descriptors that are valid and they should be processed. Add this support to xsk ZC path to avoid potential processing problems. Fixes: 2d4238f55697 ("ice: Add support for AF_XDP") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17	ice: xsk: do not clear status_error0 for ntu + nb_buffs descriptor	Maciej Fijalkowski
	The descriptor that ntu is pointing at when we exit ice_alloc_rx_bufs_zc() should not have its corresponding DD bit cleared as descriptor is not allocated in there and it is not valid for HW usage. The allocation routine at the entry will fill the descriptor that ntu points to after it was set to ntu + nb_buffs on previous call. Even the spec says: "The tail pointer should be set to one descriptor beyond the last empty descriptor in host descriptor ring." Therefore, step away from clearing the status_error0 on ntu + nb_buffs descriptor. Fixes: db804cfc21e9 ("ice: Use the xsk batched rx allocation interface") Reported-by: Elza Mathew <elza.mathew@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17	ice: remove dead store on XSK hotpath	Alexander Lobakin
	The 'if (ntu == rx_ring->count)' block in ice_alloc_rx_buffers_zc() was previously residing in the loop, but after introducing the batched interface it is used only to wrap-around the NTU descriptor, thus no more need to assign 'xdp'. Fixes: db804cfc21e9 ("ice: Use the xsk batched rx allocation interface") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17	ice: xsk: allocate separate memory for XDP SW ring	Maciej Fijalkowski
	Currently, the zero-copy data path is reusing the memory region that was initially allocated for an array of struct ice_rx_buf for its own purposes. This is error prone as it is based on the ice_rx_buf struct always being the same size or bigger than what the zero-copy path needs. There can also be old values present in that array giving rise to errors when the zero-copy path uses it. Fix this by freeing the ice_rx_buf region and allocating a new array for the zero-copy path that has the right length and is initialized to zero. Fixes: 57f7f8b6bc0b ("ice: Use xdp_buf instead of rx_buf for xsk zero-copy") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17	Merge tag 'drm-fixes-2021-12-17-1' of git://anongit.freedesktop.org/drm/drm	Linus Torvalds
	Pull drm fixes from Dave Airlie: "Mostly amdgpu fixes this week scattered around the driver, otherwise one i915, one ast, one simpledrm. There is a revert in the fb-helper for places userspace was using a string that we tried to change. i915: - Fix a bound check in the DMC fw load. ast: - NULL ptr deref fix simpledrm: - pixel clock units fix fb-helper: - userspace regression revert amdgpu: - Fix RLC register offset - GMC fix - Properly cache SMU FW version on Yellow Carp - Fix missing callback on DCN3.1 - Reset DMCUB before HW init - Fix for GMC powergating on PCO - Fix a possible memory leak in GPU metrics table handling on RN" * tag 'drm-fixes-2021-12-17-1' of git://anongit.freedesktop.org/drm/drm: drm/amd/pm: fix a potential gpu_metrics_table memory leak drm/amdgpu: correct the wrong cached state for GMC on PICASSO drm/amd/display: Reset DMCUB before HW init drm/amd/display: Set exit_optimized_pwr_state for DCN31 drm/amd/pm: fix reading SMU FW version from amdgpu_firmware_info on YC drm/amdgpu: don't override default ECO_BITs setting drm/amdgpu: correct register access for RLC_JUMP_TABLE_RESTORE drm/i915/display: Fix an unsigned subtraction which can never be negative. drm/ast: potential dereference of null pointer drm: simpledrm: fix wrong unit with pixel clock Revert "drm/fb-helper: improve DRM fbdev emulation device names"
2021-12-17	ice: xsk: return xsk buffers back to pool when cleaning the ring	Maciej Fijalkowski
	Currently we only NULL the xdp_buff pointer in the internal SW ring but we never give it back to the xsk buffer pool. This means that buffers can be leaked out of the buff pool and never be used again. Add missing xsk_buff_free() call to the routine that is supposed to clean the entries that are left in the ring so that these buffers in the umem can be used by other sockets. Also, only go through the space that is actually left to be cleaned instead of a whole ring. Fixes: 2d4238f55697 ("ice: Add support for AF_XDP") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17	mmc: mxc: Use the new PM macros	Paul Cercueil
	Use DEFINE_SIMPLE_DEV_PM_OPS() instead of the SIMPLE_DEV_PM_OPS() macro, along with using pm_sleep_ptr() as this driver doesn't handle runtime PM. This makes it possible to remove the #ifdef CONFIG_PM guard around the suspend/resume functions. Signed-off-by: Paul Cercueil <paul@crapouillou.net> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-17	mmc: jz4740: Use the new PM macros	Paul Cercueil
	- Use DEFINE_SIMPLE_DEV_PM_OPS() instead of the SIMPLE_DEV_PM_OPS() macro. This makes it possible to remove the __maybe_unused flags on the callback functions. - Since we only have callbacks for suspend/resume, we can conditionally compile the dev_pm_ops structure for when CONFIG_PM_SLEEP is enabled; so use the pm_sleep_ptr() macro instead of pm_ptr(). Signed-off-by: Paul Cercueil <paul@crapouillou.net> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-17	ACPI: NUMA: Process hotpluggable memblocks when !CONFIG_MEMORY_HOTPLUG	Vitaly Kuznetsov
	Some systems (e.g. Hyper-V guests) have all their memory marked as hotpluggable in SRAT. acpi_numa_memory_affinity_init(), however, ignores all such regions when !CONFIG_MEMORY_HOTPLUG and this is unfortunate as memory affinity (NUMA) information gets lost. 'Hot Pluggable' flag in SRAT only means that "system hardware supports hot-add and hot-remove of this memory region", it doesn't prevent memory from being cold-plugged there. Ignore 'Hot Pluggable' bit instead of skipping the whole memory affinity information when !CONFIG_MEMORY_HOTPLUG. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-17	ACPI: PM: Remove redundant cache flushing	Kirill A. Shutemov
	ACPICA code takes care about cache flushing on S1/S2/S3 in acpi_hw_extended_sleep() and acpi_hw_legacy_sleep(). acpi_suspend_enter() calls into ACPICA code via acpi_enter_sleep_state() for S1 or x86_acpi_suspend_lowlevel() for S3. acpi_sleep_prepare() call tree: __acpi_pm_prepare() acpi_pm_prepare() acpi_suspend_ops::prepare_late() acpi_hibernation_ops::pre_snapshot() acpi_hibernation_ops::prepare() acpi_suspend_begin_old() acpi_suspend_begin_old::begin() acpi_hibernation_begin_old() acpi_hibernation_ops_old::acpi_hibernation_begin_old() acpi_power_off_prepare() pm_power_off_prepare() Hibernation (S4) and Power Off (S5) don't require cache flushing, so the only interesting callsites are acpi_suspend_ops::prepare_late() and acpi_suspend_begin_old::begin(). Both of them have cache flush on ->enter() operation in acpi_suspend_enter(). Remove redundant ACPI_FLUSH_CPU_CACHE() in acpi_sleep_prepare() and acpi_suspend_enter(). Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-17	ACPI: processor: idle: Only flush cache on entering C3	Kirill A. Shutemov
	According to ACPI 6.4, Section 8.2, CPU cache flushing required on entering the C3 power state. Avoid flushing the cache on entering other C-states. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-17	drm/amdgpu: add support for IP discovery gc_info table v2	Alex Deucher
	Used on gfx9 based systems. Fixes incorrect CU counts reported in the kernel log. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1833 Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-12-17	drm/amdgpu: When the VCN(1.0) block is suspended, powergating is explicitly ↵	chen gong
	enabled Play a video on the raven (or PCO, raven2) platform, and then do the S3 test. When resume, the following error will be reported: amdgpu 0000:02:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] ERROR ring vcn_dec test failed (-110) [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] ERROR resume of IP block <vcn_v1_0> failed -110 amdgpu 0000:02:00.0: amdgpu: amdgpu_device_ip_resume failed (-110). PM: dpm_run_callback(): pci_pm_resume+0x0/0x90 returns -110 [why] When playing the video: The power state flag of the vcn block is set to POWER_STATE_ON. When doing suspend: There is no change to the power state flag of the vcn block, it is still POWER_STATE_ON. When doing resume: Need to open the power gate of the vcn block and set the power state flag of the VCN block to POWER_STATE_ON. But at this time, the power state flag of the vcn block is already POWER_STATE_ON. The power status flag check in the "8f2cdef drm/amd/pm: avoid duplicate powergate/ungate setting" patch will return the amdgpu_dpm_set_powergating_by_smu function directly. As a result, the gate of the power was not opened, causing the subsequent ring test to fail. [how] In the suspend function of the vcn block, explicitly change the power state flag of the vcn block to POWER_STATE_OFF. BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1828 Signed-off-by: chen gong <curry.gong@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-12-17	drm/amd/pm: Fix xgmi link control on aldebaran	Lijo Lazar
	Fix the message argument. 0: Allow power down 1: Disallow power down Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-17	drm/amdgpu: introduce new amdgpu_fence object to indicate the job embedded fence	Huang Rui
	The job embedded fence donesn't initialize the flags at dma_fence_init(). Then we will go a wrong way in amdgpu_fence_get_timeline_name callback and trigger a null pointer panic once we enabled the trace event here. So introduce new amdgpu_fence object to indicate the job embedded fence. [ 156.131790] BUG: kernel NULL pointer dereference, address: 00000000000002a0 [ 156.131804] #PF: supervisor read access in kernel mode [ 156.131811] #PF: error_code(0x0000) - not-present page [ 156.131817] PGD 0 P4D 0 [ 156.131824] Oops: 0000 [#1] PREEMPT SMP PTI [ 156.131832] CPU: 6 PID: 1404 Comm: sdma0 Tainted: G OE 5.16.0-rc1-custom #1 [ 156.131842] Hardware name: Gigabyte Technology Co., Ltd. Z170XP-SLI/Z170XP-SLI-CF, BIOS F20 11/04/2016 [ 156.131848] RIP: 0010:strlen+0x0/0x20 [ 156.131859] Code: 89 c0 c3 0f 1f 80 00 00 00 00 48 01 fe eb 0f 0f b6 07 38 d0 74 10 48 83 c7 01 84 c0 74 05 48 39 f7 75 ec 31 c0 c3 48 89 f8 c3 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31 [ 156.131872] RSP: 0018:ffff9bd0018dbcf8 EFLAGS: 00010206 [ 156.131880] RAX: 00000000000002a0 RBX: ffff8d0305ef01b0 RCX: 000000000000000b [ 156.131888] RDX: ffff8d03772ab924 RSI: ffff8d0305ef01b0 RDI: 00000000000002a0 [ 156.131895] RBP: ffff9bd0018dbd60 R08: ffff8d03002094d0 R09: 0000000000000000 [ 156.131901] R10: 000000000000005e R11: 0000000000000065 R12: ffff8d03002094d0 [ 156.131907] R13: 000000000000001f R14: 0000000000070018 R15: 0000000000000007 [ 156.131914] FS: 0000000000000000(0000) GS:ffff8d062ed80000(0000) knlGS:0000000000000000 [ 156.131923] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 156.131929] CR2: 00000000000002a0 CR3: 000000001120a005 CR4: 00000000003706e0 [ 156.131937] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 156.131942] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 156.131949] Call Trace: [ 156.131953] <TASK> [ 156.131957] ? trace_event_raw_event_dma_fence+0xcc/0x200 [ 156.131973] ? ring_buffer_unlock_commit+0x23/0x130 [ 156.131982] dma_fence_init+0x92/0xb0 [ 156.131993] amdgpu_fence_emit+0x10d/0x2b0 [amdgpu] [ 156.132302] amdgpu_ib_schedule+0x2f9/0x580 [amdgpu] [ 156.132586] amdgpu_job_run+0xed/0x220 [amdgpu] v2: fix mismatch warning between the prototype and function name (Ray, kernel test robot) Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-17	ACPI: Use acpi_fetch_acpi_dev() instead of acpi_bus_get_device()	Rafael J. Wysocki
	Modify the ACPI code to use acpi_fetch_acpi_dev() instead of acpi_bus_get_device() where applicable. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com>
2021-12-17	ACPI: scan: Introduce acpi_fetch_acpi_dev()	Rafael J. Wysocki
	Introduce acpi_fetch_acpi_dev() as a more reasonable replacement for acpi_bus_get_device() and modify the code in scan.c to use it instead of the latter. No expected functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com>
2021-12-17	device property: Drop fwnode_graph_get_remote_node()	Sakari Ailus
	fwnode_graph_get_remote_node() is only used by the tegra-video driver. Convert it to use newer fwnode_graph_get_endpoint_by_id() and drop now-unused fwnode_graph_get_remote_node(). Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-17	device property: Use fwnode_graph_for_each_endpoint() macro	Sakari Ailus
	Now that we have fwnode_graph_for_each_endpoint() macro, use it instead of calling fwnode_graph_get_next_endpoint() directly. It manages the iterator variable for the user without manual intervention. Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-17	device property: Implement fwnode_graph_get_endpoint_count()	Sakari Ailus
	Add fwnode_graph_get_endpoint_count() function to provide generic implementation of of_graph_get_endpoint_count(). The former by default only counts endpoints to available devices which is consistent with the rest of the fwnode graph API. By providing FWNODE_GRAPH_DEVICE_DISABLED flag, also unconnected endpoints and endpoints to disabled devices are counted. Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-17	Documentation: ACPI: Update references	Sakari Ailus
	Update references for the ACPI _DSD documentation. In particular: - Substitute _DSD property and hierarchical data extension documents with the newer DSD guide that replaces both, and use its HTML form. - Refer to the latest ACPI spec. - Add data node reference documentation reference to graph documentation. Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-17	Documentation: ACPI: Fix data node reference documentation	Sakari Ailus
	The data node reference documentation was missing a package that must contain the property values, instead property name and multiple values being present in a single package. This is not aligned with the _DSD spec. Fix it by adding the package for the values. Also add the missing "reg" properties to two numbered nodes. Fixes: b10134a3643d ("ACPI: property: Document hierarchical data extension references") Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-17	device property: Fix documentation for FWNODE_GRAPH_DEVICE_DISABLED	Sakari Ailus
	FWNODE_GRAPH_DEVICE_DISABLED flag was meant for also returning endpoints connected to disabled devices, but it also may return endpoints that are not connected. Fix this in documentation. Also fwnode_graph_get_endpoint_by_id() was affeced by this. Also improve the language a little bit. Fixes: 0fcc2bdc8aff ("device property: Add fwnode_graph_get_endpoint_by_id()") Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-17	device property: Fix fwnode_graph_devcon_match() fwnode leak	Sakari Ailus
	For each endpoint it encounters, fwnode_graph_devcon_match() checks whether the endpoint's remote port parent device is available. If it is not, it ignores the endpoint but does not put the reference to the remote endpoint port parent fwnode. For available devices the fwnode handle reference is put as expected. Put the reference for unavailable devices now. Fixes: 637e9e52b185 ("device connection: Find device connections also from device graphs") Cc: 5.1+ <stable@vger.kernel.org> # 5.1+ Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-17	PM: sleep: Fix error handling in dpm_prepare()	Rafael J. Wysocki
	Commit 2aa36604e824 ("PM: sleep: Avoid calling put_device() under dpm_list_mtx") forgot to update the while () loop termination condition to also break the loop if error is nonzero, which causes the loop to become infinite if device_prepare() returns an error for one device. Add the missing !error check. Fixes: 2aa36604e824 ("PM: sleep: Avoid calling put_device() under dpm_list_mtx") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reported-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Cc: All applicable <stable@vger.kernel.org>
2021-12-17	ARM: dts: armada-38x: Add generic compatible to UART nodes	Marek Behún
	Add generic compatible string "ns16550a" to serial port nodes of Armada 38x. This makes it possible to use earlycon. Fixes: 0d3d96ab0059 ("ARM: mvebu: add Device Tree description of the Armada 380/385 SoCs") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2021-12-17	arm64: dts: marvell: cn9130: enable CP0 GPIO controllers	Robert Marko
	CN9130 has a built-in CP115 which has 2 GPIO controllers, but unlike in Armada 7k and 8k both are left disabled by the SoC DTSI. This first of all makes no sense as they are always present due to being SoC built-in and its an issue as boards like CN9130-CRB use the CPO GPIO2 pins for regulators and SD card support without enabling them first. So, enable both of them like Armada 7k and 8k do. Fixes: 6b8970bd8d7a ("arm64: dts: marvell: Add support for Marvell CN9130 SoC support") Signed-off-by: Robert Marko <robert.marko@sartura.hr> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2021-12-17	arm64: dts: marvell: cn9130: add GPIO and SPI aliases	Robert Marko
	CN9130 has one CP115 built in, which like the CP110 has 2 GPIO and 2 SPI controllers built-in. However, unlike the Armada 7k and 8k the SoC DTSI doesn't add the required aliases as both the Orion SPI driver and MVEBU GPIO drivers require the aliases to be present. So add the required aliases for GPIO and SPI controllers. Fixes: 6b8970bd8d7a ("arm64: dts: marvell: Add support for Marvell CN9130 SoC support") Signed-off-by: Robert Marko <robert.marko@sartura.hr> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2021-12-17	arm64: dts: marvell: armada-37xx: Add xtal clock to comphy node	Pali Rohár
	Kernel driver phy-mvebu-a3700-comphy.c needs to know the rate of the reference xtal clock. So add missing xtal clock source into comphy device tree node. If the property is not present, the driver defaults to 25 MHz xtal rate (which, as far as we know, is used by all the existing boards). Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>