linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2022-07-28	drm/amd/display: Clean up some inconsistent indenting	Jiapeng Chong
	No functional modification involved. smatch warning: drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn201/dcn201_clk_mgr.c:107 dcn201_update_clocks() warn: inconsistent indenting. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28	drm/amd/display: Clean up some inconsistent indenting	Jiapeng Chong
	No functional modification involved. smatch warnings: drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn314/dcn314_clk_mgr.c:716 dcn314_clk_mgr_construct() warn: inconsistent indenting. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28	drm/amd/display: Clean up some inconsistent indenting	Jiapeng Chong
	No functional modification involved. smatch warnings: drivers/gpu/drm/amd/amdgpu/../display/dc/dcn32/dcn32_mpc.c:306 mpc32_get_shaper_current() warn: inconsistent indenting. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28	drm/amd/display: Clean up some inconsistent indenting	Jiapeng Chong
	No functional modification involved. smatch warning: drivers/gpu/drm/amd/amdgpu/../display/dc/dcn32/dcn32_hwseq.c:910 dcn32_init_hw() warn: inconsistent indenting. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28	drm/amdkfd: Split giant svm range	Philip Yang
	Giant svm range split to smaller ranges, align the range start address to max svm range pages to improve MMU TLB usage. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28	drm/amdkfd: Set svm range max pages	Philip Yang
	This will be used to split giant svm range into smaller ranges, to support VRAM overcommitment by giant range and improve GPU retry fault recover on giant range. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28	drm/amdgpu: Allow TTM to evict svm bo from same process	Philip Yang
	To support SVM range VRAM overcommitment, TTM should be able to evict svm bo of same process to system memory, to get space to alloc new svm bo. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28	drm/amdgpu: Fix the incomplete product number	Roy Sun
	The comments say that the product number is a 16-digit HEX string so the buffer needs to be at least 17 characters to hold the NUL terminator. Expand the buffer size to 20 to avoid the alignment issues. The comment:Product number should only be 16 characters. Any more,and something could be wrong. Cap it at 16 to be safe Signed-off-by: Roy Sun <Roy.Sun@amd.com> Reviewed-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28	drm/amdgpu: use adev_to_drm for consistency	Guchun Chen
	Keep code consistency when accessing drm_device from amdgpu driver. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28	drm/amdgpu/dc/dce: fix repeated words in comments	wangjianli
	Delete the redundant word 'in'. Signed-off-by: wangjianli <wangjianli@cdjrlc.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28	Documentation: KUnit: Fix example with compilation error	Maíra Canal
	The Parameterized Testing example contains a compilation error, as the signature for the description helper function is void()(const struct sha1_test_case , char ), and the struct is non-const. This is warned by Clang: error: initialization of ‘void ()(struct sha1_test_case , char )’ from incompatible pointer type ‘void ()(const struct sha1_test_case , char )’ [-Werror=incompatible-pointer-types] 33 \| KUNIT_ARRAY_PARAM(sha1, cases, case_to_desc); \| ^~~~~~~~~~~~ ../include/kunit/test.h:1339:70: note: in definition of macro ‘KUNIT_ARRAY_PARAM’ 1339 \| void (__get_desc)(typeof(__next), char *) = get_desc; \ Signed-off-by: Maíra Canal <mairacanal@riseup.net> Reviewed-by: Daniel Latypov <dlatypov@google.com> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-07-28	Merge tag 'net-5.19-final' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth and netfilter, no known blockers for the release. Current release - regressions: - wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop(), fix taking the lock before its initialized - Bluetooth: mgmt: fix double free on error path Current release - new code bugs: - eth: ice: fix tunnel checksum offload with fragmented traffic Previous releases - regressions: - tcp: md5: fix IPv4-mapped support after refactoring, don't take the pure v6 path - Revert "tcp: change pingpong threshold to 3", improving detection of interactive sessions - mld: fix netdev refcount leak in mld_{query \| report}_work() due to a race - Bluetooth: - always set event mask on suspend, avoid early wake ups - L2CAP: fix use-after-free caused by l2cap_chan_put - bridge: do not send empty IFLA_AF_SPEC attribute Previous releases - always broken: - ping6: fix memleak in ipv6_renew_options() - sctp: prevent null-deref caused by over-eager error paths - virtio-net: fix the race between refill work and close, resulting in NAPI scheduled after close and a BUG() - macsec: - fix three netlink parsing bugs - avoid breaking the device state on invalid change requests - fix a memleak in another error path Misc: - dt-bindings: net: ethernet-controller: rework 'fixed-link' schema - two more batches of sysctl data race adornment" * tag 'net-5.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (67 commits) stmmac: dwmac-mediatek: fix resource leak in probe ipv6/addrconf: fix a null-ptr-deref bug for ip6_ptr net: ping6: Fix memleak in ipv6_renew_options(). net/funeth: Fix fun_xdp_tx() and XDP packet reclaim sctp: leave the err path free in sctp_stream_init to sctp_stream_free sfc: disable softirqs for ptp TX ptp: ocp: Select CRC16 in the Kconfig. tcp: md5: fix IPv4-mapped support virtio-net: fix the race between refill work and close mptcp: Do not return EINPROGRESS when subflow creation succeeds Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put Bluetooth: Always set event mask on suspend Bluetooth: mgmt: Fix double free on error path wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop() ice: do not setup vlan for loopback VSI ice: check (DD \| EOF) bits on Rx descriptor rather than (EOP \| RS) ice: Fix VSIs unable to share unicast MAC ice: Fix tunnel checksum offload with fragmented traffic ice: Fix max VLANs available for VF netfilter: nft_queue: only allow supported familes and hooks ...
2022-07-28	ice: allow toggling loopback mode via ndo_set_features callback	Maciej Fijalkowski
	Add support for NETIF_F_LOOPBACK. This feature can be set via: $ ethtool -K eth0 loopback <on\|off> Feature can be useful for local data path tests. Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28	ice: compress branches in ice_set_features()	Maciej Fijalkowski
	Instead of rather verbose comparison of current netdev->features bits vs the incoming ones from user, let us compress them by a helper features set that will be the result of netdev->features XOR features. This way, current, extensive branches: if (features & NETIF_F_BIT && !(netdev->features & NETIF_F_BIT)) set_feature(true); else if (!(features & NETIF_F_BIT) && netdev->features & NETIF_F_BIT) set_feature(false); can become: netdev_features_t changed = netdev->features ^ features; if (changed & NETIF_F_BIT) set_feature(!!(features & NETIF_F_BIT)); This is nothing new as currently several other drivers use this approach, which I find much more convenient. Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28	ice: Fix promiscuous mode not turning off	Michal Wilczynski
	When trust is turned off for the VF, the expectation is that promiscuous and allmulticast filters are removed. Currently default VSI filter is not getting cleared in this flow. Example: ip link set enp236s0f0 vf 0 trust on ip link set enp236s0f0v0 promisc on ip link set enp236s0f0 vf 0 trust off /* promiscuous mode is still enabled on VF0 */ Remove switch filters for both cases. This commit fixes above behavior by removing default VSI filters and allmulticast filters when vf-true-promisc-support is OFF. Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28	ice: Introduce enabling promiscuous mode on multiple VF's	Michal Wilczynski
	In current implementation default VSI switch filter is only able to forward traffic to a single VSI. This limits promiscuous mode with private flag 'vf-true-promisc-support' to a single VF. Enabling it on the second VF won't work. Also allmulticast support doesn't seem to be properly implemented when vf-true-promisc-support is true. Use standard ice_add_rule_internal() function that already implements forwarding to multiple VSI's instead of constructing AQ call manually. Add switch filter for allmulticast mode when vf-true-promisc-support is enabled. The same filter is added regardless of the flag - it doesn't matter for this case. Remove unnecessary fields in switch structure. From now on book keeping will be done by ice_add_rule_internal(). Refactor unnecessarily passed function arguments. To test: 1) Create 2 VM's, and two VF's. Attach VF's to VM's. 2) Enable promiscuous mode on both of them and check if traffic is seen on both of them. Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28	tools/power turbostat: version 2022.07.28	Len Brown
	update version number Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28	tools/power turbostat: do not decode ACC for ICX and SPR	Artem Bityutskiy
	The ACC (automatic C-state conversion) feature was available on Sky Lake and Cascade Lake Xeons (SKX and CLX), but it is not available on Ice Lake and Sapphire Rapids Xeons (ICX and SPR). Therefore, stop decoding it for ICX and SPR. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28	tools/power turbostat: fix SPR PC6 limits	Artem Bityutskiy
	Sapphire Rapids Xeon (SPR) supports 2 flavors of PC6 - PC6N (non-retention) and PC6R (retention). Before this patch we used ICX package C-state limits, which was wrong, because ICX has only one PC6 flavor. With this patch, we use SKX PC6 limits for SPR, because they are the same. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28	tools/power turbostat: cleanup 'automatic_cstate_conversion_probe()'	Artem Bityutskiy
	The 'automatic_cstate_conversion_probe()' function has a too long 'if' statement, convert it to a 'switch' statement in order to improve code readability a bit. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28	tools/power turbostat: separate SPR from ICX	Artem Bityutskiy
	Before this patch, SPR platform was considered identical to ICX platform. This patch separates SPR support from ICX. This patch is a preparation for adding SPR-specific package C-state limits support. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Reviewed-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28	tools/power turbosstat: fix comment	Jiang Jian
	remove duplicate "the" in comment Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com> Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28	tools/power turbostat: Support RAPTORLAKE P	George D Sworo
	Add initial support for Raptorlake model Signed-off-by: George D Sworo <george.d.sworo@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28	tools/power turbostat: add support for ALDERLAKE_N	Zhang Rui
	Add support for ALDERLAKE_N platform. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28	tools/power turbostat: dump secondary Turbo-Ratio-Limit	Len Brown
	Intel Performance Hybrid processors have a 2nd MSR describing the turbo limits enforced on the Ecores. Note, TRL and Secondary-TRL are usually R/O information, but on overclock-capable parts, they can be written. Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28	tools/power turbostat: simplify dump_turbo_ratio_limits()	Len Brown
	code cleanup only. no functional change. Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28	tools/power turbostat: dump CPUID.7.EDX.Hybrid	Len Brown
	CPUID leaf 7 EDX now tells us if the processor has hybrid CPUs Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28	tools/power turbostat: update turbostat.8	Len Brown
	Update turbostat.8 to reflect new uncore frequency output (UncMHz) Also, refresh examples. Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28	tools/power turbostat: Show uncore frequency	Len Brown
	When CONFIG_INTEL_UNCORE_FREQ_CONTROL is effective, (Linux 5.9 and later), print the current (and default) min and max uncore frequency limits. When that driver provides the current uncore frequency (Linux 5.18 and later), print a UncMHz column reflecting the current uncore frequency. Note that UncMHz is an instantaneous sample, not an average. eg. $ sudo ./turbostat -S --show frequency ... Uncore Frequency pkg0 die0: 800 - 3900 MHz (800 - 3900 MHz) ... Avg_MHz Busy% Bzy_MHz TSC_MHz UncMHz 28 0.70 4049 3095 3900 Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28	tools/power turbostat: Fix file pointer leak	Colin Ian King
	Currently if a fscanf fails then an early return leaks an open file pointer. Fix this by fclosing the file before the return. Detected using static analysis with cppcheck: tools/power/x86/turbostat/turbostat.c:2039:3: error: Resource leak: fp [resourceLeak] Fixes: eae97e053fe3 ("tools/power turbostat: Support thermal throttle count print") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Acked-by: Chen Yu <yu.c.chen@intel.com> Reviewed-by: Tom Rix <trix@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28	tools/power turbostat: replace strncmp with single character compare	Colin Ian King
	Using strncmp for a single character comparison is overly complicated, just use a simpler single character comparison instead. Also stops static analyzers (such as cppcheck) from complaining about strncmp on non-null terminated strings. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28	tools/power turbostat: print the kernel boot commandline	Chen Yu
	It would be handy to have cmdline in turbostat output. For example, according to the turbostat output, there are no C-states requested. In this case the user is very curious if something like intel_idle.max_cstate=0 was used, or may be idle=none too. It is also curious whether things like intel_pstate=nohwp were used. Print the boot command line accordingly: turbostat version 21.05.04 - Len Brown <lenb@kernel.org> Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.16.0+ root=UUID= b42359ed-1e05-42eb-8757-6bf2a1c19070 ro quiet splash vt.handoff=7 Suggested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28	tools/power turbostat: Introduce support for RaptorLake	Zhang Rui
	RaptorLake is compatible with AlderLake. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28	igb: convert .adjfreq to .adjfine	Jacob Keller
	The 82576 PTP implementation still uses .adjfreq instead of using the newer .adjfine. This implementation uses a pre-simplified calculation since the base increment value for the 82576 is just 16 * 2^19. Converting this into scaled_ppm is tricky, and makes the intent a bit less clear. Simply convert to the normal flow of multiplying the base increment value by the scaled_ppm and then dividing by 1000000ULL << 16. This can be implemented using mul_u64_u64_div_u64 which can avoid the possible overflow that might occur for large adjustments. Use of .adjfine can improve the precision of small adjustments and gets us one driver closer to removing the old implementation from the kernel entirely. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28	KVM, x86/mmu: Fix the comment around kvm_tdp_mmu_zap_leafs()	Kai Huang
	Now kvm_tdp_mmu_zap_leafs() only zaps leaf SPTEs but not any non-root pages within that GFN range anymore, so the comment around it isn't right. Fix it by shifting the comment from tdp_mmu_zap_leafs() instead of duplicating it, as tdp_mmu_zap_leafs() is static and is only called by kvm_tdp_mmu_zap_leafs(). Opportunistically tweak the blurb about SPTEs being cleared to (a) say "zapped" instead of "cleared" because "cleared" will be wrong if/when KVM allows a non-zero value for non-present SPTE (i.e. for Intel TDX), and (b) to clarify that a flush is needed if and only if a SPTE has been zapped since MMU lock was last acquired. Fixes: f47e5bbbc92f ("KVM: x86/mmu: Zap only TDP MMU leafs in zap range and mmu_notifier unmap") Suggested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Message-Id: <20220728030452.484261-1-kai.huang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-07-28	KVM: SVM: Dump Virtual Machine Save Area (VMSA) to klog	Jarkko Sakkinen
	As Virtual Machine Save Area (VMSA) is essential in troubleshooting attestation, dump it to the klog with the KERN_DEBUG level of priority. Cc: Jarkko Sakkinen <jarkko@kernel.org> Suggested-by: Harald Hoyer <harald@profian.com> Signed-off-by: Jarkko Sakkinen <jarkko@profian.com> Message-Id: <20220728050919.24113-1-jarkko@profian.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-07-28	ixgbe: convert .adjfreq to .adjfine	Jacob Keller
	Convert the ixgbe PTP frequency adjustment implementations from .adjfreq to .adjfine. This allows using the scaled parts per million adjustment from the PTP core and results in a more precise adjustment for small corrections. To avoid overflow, use mul_u64_u64_div_u64 to perform the calculation. On X86 platforms, this will use instructions that perform the operations with 128bit intermediate values. For other architectures, the implementation will limit the loss of precision as much as possible. This change slightly improves the precision of frequency adjustments for all ixgbe based devices, and gets us one driver closer to being able to remove the older .adjfreq implementation from the kernel. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28	i40e: convert .adjfreq to .adjfine	Jacob Keller
	The i40e driver currently implements the .adjfreq handler for frequency adjustments. This takes the adjustment parameter in parts per billion. The PTP core supports .adjfine which provides an adjustment in scaled parts per million. This has a higher resolution and can result in more precise adjustments for small corrections. Convert the existing .adjfreq implementation to the newer .adjfine implementation. This is trivial since it just requires changing the divisor from 1000000000ULL to (1000000ULL << 16) in the mul_u64_u64_div_u64 call. This improves the precision of the adjustments and gets us one driver closer to removing the old .adjfreq support from the kernel. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28	i40e: use mul_u64_u64_div_u64 for PTP frequency calculation	Jacob Keller
	The i40e device has a different clock rate depending on the current link speed. This requires using a different increment rate for the PTP clock registers. For slower link speeds, the base increment value is larger. Directly multiplying the larger increment value by the parts per billion adjustment might overflow. To avoid this, the i40e implementation defaults to using the lower increment value and then multiplying the adjustment afterwards. This causes a loss of precision for lower link speeds. We can fix this by using mul_u64_u64_div_u64 instead of performing the multiplications using standard C operations. On X86, this will use special instructions that perform the multiplication and division with 128bit intermediate values. For other architectures, the fallback implementation will limit the loss of precision for large values. Small adjustments don't overflow anyways and won't lose precision at all. This allows first multiplying the base increment value and then performing the adjustment calculation, since we no longer fear overflowing. It also makes it easier to convert to the even more precise .adjfine implementation in a following change. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28	e1000e: convert .adjfreq to .adjfine	Jacob Keller
	The PTP implementation for the e1000e driver uses the older .adjfreq method. This method takes an adjustment in parts per billion. The newer .adjfine implementation uses scaled_ppm. The use of scaled_ppm allows for finer grained adjustments and is preferred over using the older implementation. Make use of mul_u64_u64_div_u64 in order to handle possible overflow of the multiplication used to calculate the desired adjustment to the hardware increment value. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28	e1000e: remove unnecessary range check in e1000e_phc_adjfreq	Jacob Keller
	The e1000e_phc_adjfreq function validates that the input delta is within the maximum range. This is already handled by the core PTP code and this is a duplicate and thus unnecessary check. It also complicates refactoring to use the newer .adjfine implementation, where the input is no longer specified in parts per billion. Remove the range validation check. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28	ice: implement adjfine with mul_u64_u64_div_u64	Jacob Keller
	The PTP frequency adjustment code needs to determine an appropriate adjustment given an input scaled_ppm adjustment. We calculate the adjustment to the register by multiplying the base (nominal) increment value by the scaled_ppm and then dividing by the scaled one million value. For very large adjustments, this might overflow. To avoid this, both the scaled_ppm and divisor values are downshifted. We can avoid that on X86 architectures by using mul_u64_u64_div_u64. This helper function will perform the multiplication and division with 128bit intermediate values. We know that scaled_ppm is never larger than the divisor so this operation will never result in an overflow. This improves the accuracy of the calculations for large adjustment values on X86. It is likely an improvement on other architectures as well because the default implementation of mul_u64_u64_div_u64 is smarter than the original approach taken in the ice code. Additionally, this implementation is easier to read, using fewer local variables and lines of code to implement. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28	KVM: x86/mmu: Treat NX as a valid SPTE bit for NPT	Sean Christopherson
	Treat the NX bit as valid when using NPT, as KVM will set the NX bit when the NX huge page mitigation is enabled (mindblowing) and trigger the WARN that fires on reserved SPTE bits being set. KVM has required NX support for SVM since commit b26a71a1a5b9 ("KVM: SVM: Refuse to load kvm_amd if NX support is not available") for exactly this reason, but apparently it never occurred to anyone to actually test NPT with the mitigation enabled. ------------[ cut here ]------------ spte = 0x800000018a600ee7, level = 2, rsvd bits = 0x800f0000001fe000 WARNING: CPU: 152 PID: 15966 at arch/x86/kvm/mmu/spte.c:215 make_spte+0x327/0x340 [kvm] Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 10.48.0 01/27/2022 RIP: 0010:make_spte+0x327/0x340 [kvm] Call Trace: <TASK> tdp_mmu_map_handle_target_level+0xc3/0x230 [kvm] kvm_tdp_mmu_map+0x343/0x3b0 [kvm] direct_page_fault+0x1ae/0x2a0 [kvm] kvm_tdp_page_fault+0x7d/0x90 [kvm] kvm_mmu_page_fault+0xfb/0x2e0 [kvm] npf_interception+0x55/0x90 [kvm_amd] svm_invoke_exit_handler+0x31/0xf0 [kvm_amd] svm_handle_exit+0xf6/0x1d0 [kvm_amd] vcpu_enter_guest+0xb6d/0xee0 [kvm] ? kvm_pmu_trigger_event+0x6d/0x230 [kvm] vcpu_run+0x65/0x2c0 [kvm] kvm_arch_vcpu_ioctl_run+0x355/0x610 [kvm] kvm_vcpu_ioctl+0x551/0x610 [kvm] __se_sys_ioctl+0x77/0xc0 __x64_sys_ioctl+0x1d/0x20 do_syscall_64+0x44/0xa0 entry_SYSCALL_64_after_hwframe+0x46/0xb0 </TASK> ---[ end trace 0000000000000000 ]--- Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220723013029.1753623-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-07-28	KVM: x86: Do not block APIC write for non ICR registers	Suravee Suthikulpanit
	The commit 5413bcba7ed5 ("KVM: x86: Add support for vICR APIC-write VM-Exits in x2APIC mode") introduces logic to prevent APIC write for offset other than ICR in kvm_apic_write_nodecode() function. This breaks x2AVIC support, which requires KVM to trap and emulate x2APIC MSR writes. Therefore, removes the warning and modify to logic to allow MSR write. Fixes: 5413bcba7ed5 ("KVM: x86: Add support for vICR APIC-write VM-Exits in x2APIC mode") Cc: Zeng Guang <guang.zeng@intel.com> Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Message-Id: <20220725053356.4275-1-suravee.suthikulpanit@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-07-28	KVM: SVM: Do not virtualize MSR accesses for APIC LVTT register	Suravee Suthikulpanit
	AMD does not support APIC TSC-deadline timer mode. AVIC hardware will generate GP fault when guest kernel writes 1 to bits [18] of the APIC LVTT register (offset 0x32) to set the timer mode. (Note: bit 18 is reserved on AMD system). Therefore, always intercept and let KVM emulate the MSR accesses. Fixes: f3d7c8aa6882 ("KVM: SVM: Fix x2APIC MSRs interception") Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Message-Id: <20220725033428.3699-1-suravee.suthikulpanit@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-07-28	stmmac: dwmac-mediatek: fix resource leak in probe	Dan Carpenter
	If mediatek_dwmac_clks_config() fails, then call stmmac_remove_config_dt() before returning. Otherwise it is a resource leak. Fixes: fa4b3ca60e80 ("stmmac: dwmac-mediatek: fix clock issue") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/YuJ4aZyMUlG6yGGa@kili Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28	ipv6/addrconf: fix a null-ptr-deref bug for ip6_ptr	Ziyang Xuan
	Change net device's MTU to smaller than IPV6_MIN_MTU or unregister device while matching route. That may trigger null-ptr-deref bug for ip6_ptr probability as following. ========================================================= BUG: KASAN: null-ptr-deref in find_match.part.0+0x70/0x134 Read of size 4 at addr 0000000000000308 by task ping6/263 CPU: 2 PID: 263 Comm: ping6 Not tainted 5.19.0-rc7+ #14 Call trace: dump_backtrace+0x1a8/0x230 show_stack+0x20/0x70 dump_stack_lvl+0x68/0x84 print_report+0xc4/0x120 kasan_report+0x84/0x120 __asan_load4+0x94/0xd0 find_match.part.0+0x70/0x134 __find_rr_leaf+0x408/0x470 fib6_table_lookup+0x264/0x540 ip6_pol_route+0xf4/0x260 ip6_pol_route_output+0x58/0x70 fib6_rule_lookup+0x1a8/0x330 ip6_route_output_flags_noref+0xd8/0x1a0 ip6_route_output_flags+0x58/0x160 ip6_dst_lookup_tail+0x5b4/0x85c ip6_dst_lookup_flow+0x98/0x120 rawv6_sendmsg+0x49c/0xc70 inet_sendmsg+0x68/0x94 Reproducer as following: Firstly, prepare conditions: $ip netns add ns1 $ip netns add ns2 $ip link add veth1 type veth peer name veth2 $ip link set veth1 netns ns1 $ip link set veth2 netns ns2 $ip netns exec ns1 ip -6 addr add 2001:0db8:0:f101::1/64 dev veth1 $ip netns exec ns2 ip -6 addr add 2001:0db8:0:f101::2/64 dev veth2 $ip netns exec ns1 ifconfig veth1 up $ip netns exec ns2 ifconfig veth2 up $ip netns exec ns1 ip -6 route add 2000::/64 dev veth1 metric 1 $ip netns exec ns2 ip -6 route add 2001::/64 dev veth2 metric 1 Secondly, execute the following two commands in two ssh windows respectively: $ip netns exec ns1 sh $while true; do ip -6 addr add 2001:0db8:0:f101::1/64 dev veth1; ip -6 route add 2000::/64 dev veth1 metric 1; ping6 2000::2; done $ip netns exec ns1 sh $while true; do ip link set veth1 mtu 1000; ip link set veth1 mtu 1500; sleep 5; done It is because ip6_ptr has been assigned to NULL in addrconf_ifdown() firstly, then ip6_ignore_linkdown() accesses ip6_ptr directly without NULL check. cpu0 cpu1 fib6_table_lookup __find_rr_leaf addrconf_notify [ NETDEV_CHANGEMTU ] addrconf_ifdown RCU_INIT_POINTER(dev->ip6_ptr, NULL) find_match ip6_ignore_linkdown So we can add NULL check for ip6_ptr before using in ip6_ignore_linkdown() to fix the null-ptr-deref bug. Fixes: dcd1f572954f ("net/ipv6: Remove fib6_idev") Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220728013307.656257-1-william.xuanziyang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28	net: ping6: Fix memleak in ipv6_renew_options().	Kuniyuki Iwashima
	When we close ping6 sockets, some resources are left unfreed because pingv6_prot is missing sk->sk_prot->destroy(). As reported by syzbot [0], just three syscalls leak 96 bytes and easily cause OOM. struct ipv6_sr_hdr hdr; char data[24] = {0}; int fd; hdr = (struct ipv6_sr_hdr )data; hdr->hdrlen = 2; hdr->type = IPV6_SRCRT_TYPE_4; fd = socket(AF_INET6, SOCK_DGRAM, NEXTHDR_ICMP); setsockopt(fd, IPPROTO_IPV6, IPV6_RTHDR, data, 24); close(fd); To fix memory leaks, let's add a destroy function. Note the socket() syscall checks if the GID is within the range of net.ipv4.ping_group_range. The default value is [1, 0] so that no GID meets the condition (1 <= GID <= 0). Thus, the local DoS does not succeed until we change the default value. However, at least Ubuntu/Fedora/RHEL loosen it. $ cat /usr/lib/sysctl.d/50-default.conf ... -net.ipv4.ping_group_range = 0 2147483647 Also, there could be another path reported with these options, and some of them require CAP_NET_RAW. setsockopt IPV6_ADDRFORM (inet6_sk(sk)->pktoptions) IPV6_RECVPATHMTU (inet6_sk(sk)->rxpmtu) IPV6_HOPOPTS (inet6_sk(sk)->opt) IPV6_RTHDRDSTOPTS (inet6_sk(sk)->opt) IPV6_RTHDR (inet6_sk(sk)->opt) IPV6_DSTOPTS (inet6_sk(sk)->opt) IPV6_2292PKTOPTIONS (inet6_sk(sk)->opt) getsockopt IPV6_FLOWLABEL_MGR (inet6_sk(sk)->ipv6_fl_list) For the record, I left a different splat with syzbot's one. unreferenced object 0xffff888006270c60 (size 96): comm "repro2", pid 231, jiffies 4294696626 (age 13.118s) hex dump (first 32 bytes): 01 00 00 00 44 00 00 00 00 00 00 00 00 00 00 00 ....D........... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000f6bc7ea9>] sock_kmalloc (net/core/sock.c:2564 net/core/sock.c:2554) [<000000006d699550>] do_ipv6_setsockopt.constprop.0 (net/ipv6/ipv6_sockglue.c:715) [<00000000c3c3b1f5>] ipv6_setsockopt (net/ipv6/ipv6_sockglue.c:1024) [<000000007096a025>] __sys_setsockopt (net/socket.c:2254) [<000000003a8ff47b>] __x64_sys_setsockopt (net/socket.c:2265 net/socket.c:2262 net/socket.c:2262) [<000000007c409dcb>] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) [<00000000e939c4a9>] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) [0]: https://syzkaller.appspot.com/bug?extid=a8430774139ec3ab7176 Fixes: 6d0bfe226116 ("net: ipv6: Add IPv6 support to the ping socket.") Reported-by: syzbot+a8430774139ec3ab7176@syzkaller.appspotmail.com Reported-by: Ayushman Dutta <ayudutta@amazon.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220728012220.46918-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28	cgroup: Skip subtree root in cgroup_update_dfl_csses()	Waiman Long
	The cgroup_update_dfl_csses() function updates css associations when a cgroup's subtree_control file is modified. Any changes made to a cgroup's subtree_control file, however, will only affect its descendants but not the cgroup itself. So there is no point in migrating csses associated with that cgroup. We can skip them instead. Signed-off-by: Waiman Long <longman@redhat.com> Reviewed-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2022-07-28	KVM: selftests: Verify VMX MSRs can be restored to KVM-supported values	Sean Christopherson
	Verify that KVM allows toggling VMX MSR bits to be "more" restrictive, and also allows restoring each MSR to KVM's original, less restrictive value. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220607213604.3346000-16-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>