Age | Commit message (Collapse) | Author |
|
netif_device_attach() will unpause the queues so we can't call
it before __alx_open(). This went undetected until
commit b0999223f224 ("alx: add ability to allocate and free
alx_napi structures") but now if stack tries to xmit immediately
on resume before __alx_open() we'll crash on the NAPI being null:
BUG: kernel NULL pointer dereference, address: 0000000000000198
CPU: 0 PID: 12 Comm: ksoftirqd/0 Tainted: G OE 5.10.0-3-amd64 #1 Debian 5.10.13-1
Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77-D3H, BIOS F15 11/14/2013
RIP: 0010:alx_start_xmit+0x34/0x650 [alx]
Code: 41 56 41 55 41 54 55 53 48 83 ec 20 0f b7 57 7c 8b 8e b0
0b 00 00 39 ca 72 06 89 d0 31 d2 f7 f1 89 d2 48 8b 84 df
RSP: 0018:ffffb09240083d28 EFLAGS: 00010297
RAX: 0000000000000000 RBX: ffffa04d80ae7800 RCX: 0000000000000004
RDX: 0000000000000000 RSI: ffffa04d80afa000 RDI: ffffa04e92e92a00
RBP: 0000000000000042 R08: 0000000000000100 R09: ffffa04ea3146700
R10: 0000000000000014 R11: 0000000000000000 R12: ffffa04e92e92100
R13: 0000000000000001 R14: ffffa04e92e92a00 R15: ffffa04e92e92a00
FS: 0000000000000000(0000) GS:ffffa0508f600000(0000) knlGS:0000000000000000
i915 0000:00:02.0: vblank wait timed out on crtc 0
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000198 CR3: 000000004460a001 CR4: 00000000001706f0
Call Trace:
dev_hard_start_xmit+0xc7/0x1e0
sch_direct_xmit+0x10f/0x310
Cc: <stable@vger.kernel.org> # 4.9+
Fixes: bc2bebe8de8e ("alx: remove WoL support")
Reported-by: Zbynek Michl <zbynek.michl@gmail.com>
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=983595
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tested-by: Zbynek Michl <zbynek.michl@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Trim all 4 bytes of the received FCS; not just 2 of them. Leaving 2
bytes of the FCS on the frame breaks DSA tailing tag drivers.
Fixes: a8db76d40e4d ("lan743x: boost performance on cpu archs w/o dma cache snooping")
Signed-off-by: George McCollister <george.mccollister@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
"Fix DM verity target's optional Forward Error Correction (FEC) for
Reed-Solomon roots that are unaligned to block size"
* tag 'for-5.12/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm verity: fix FEC for RS roots unaligned to block size
dm bufio: subtract the number of initial sectors in dm_bufio_get_device_size
|
|
When using jumbo packets and overrunning rx queue with napi enabled,
the following sequence is observed in gfar_add_rx_frag:
| lstatus | | skb |
t | lstatus, size, flags | first | len, data_len, *ptr |
---+--------------------------------------+-------+-----------------------+
13 | 18002348, 9032, INTERRUPT LAST | 0 | 9600, 8000, f554c12e |
12 | 10000640, 1600, INTERRUPT | 0 | 8000, 6400, f554c12e |
11 | 10000640, 1600, INTERRUPT | 0 | 6400, 4800, f554c12e |
10 | 10000640, 1600, INTERRUPT | 0 | 4800, 3200, f554c12e |
09 | 10000640, 1600, INTERRUPT | 0 | 3200, 1600, f554c12e |
08 | 14000640, 1600, INTERRUPT FIRST | 0 | 1600, 0, f554c12e |
07 | 14000640, 1600, INTERRUPT FIRST | 1 | 0, 0, f554c12e |
06 | 1c000080, 128, INTERRUPT LAST FIRST | 1 | 0, 0, abf3bd6e |
05 | 18002348, 9032, INTERRUPT LAST | 0 | 8000, 6400, c5a57780 |
04 | 10000640, 1600, INTERRUPT | 0 | 6400, 4800, c5a57780 |
03 | 10000640, 1600, INTERRUPT | 0 | 4800, 3200, c5a57780 |
02 | 10000640, 1600, INTERRUPT | 0 | 3200, 1600, c5a57780 |
01 | 10000640, 1600, INTERRUPT | 0 | 1600, 0, c5a57780 |
00 | 14000640, 1600, INTERRUPT FIRST | 1 | 0, 0, c5a57780 |
So at t=7 a new packets is started but not finished, probably due to rx
overrun - but rx overrun is not indicated in the flags. Instead a new
packets starts at t=8. This results in skb->len to exceed size for the LAST
fragment at t=13 and thus a negative fragment size added to the skb.
This then crashes:
kernel BUG at include/linux/skbuff.h:2277!
Oops: Exception in kernel mode, sig: 5 [#1]
...
NIP [c04689f4] skb_pull+0x2c/0x48
LR [c03f62ac] gfar_clean_rx_ring+0x2e4/0x844
Call Trace:
[ec4bfd38] [c06a84c4] _raw_spin_unlock_irqrestore+0x60/0x7c (unreliable)
[ec4bfda8] [c03f6a44] gfar_poll_rx_sq+0x48/0xe4
[ec4bfdc8] [c048d504] __napi_poll+0x54/0x26c
[ec4bfdf8] [c048d908] net_rx_action+0x138/0x2c0
[ec4bfe68] [c06a8f34] __do_softirq+0x3a4/0x4fc
[ec4bfed8] [c0040150] run_ksoftirqd+0x58/0x70
[ec4bfee8] [c0066ecc] smpboot_thread_fn+0x184/0x1cc
[ec4bff08] [c0062718] kthread+0x140/0x144
[ec4bff38] [c0012350] ret_from_kernel_thread+0x14/0x1c
This patch fixes this by checking for computed LAST fragment size, so a
negative sized fragment is never added.
In order to prevent the newer rx frame from getting corrupted, the FIRST
flag is checked to discard the incomplete older frame.
Signed-off-by: Michael Braun <michael-dev@fami-braun.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
RXMAC_BC_FRM_CNT_COUNT added to mp->rx_bcasts twice in a row
in niu_xmac_interrupt(). Remove the second addition.
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
"len > sp->mtu" checked twice in a row in sp_encaps().
Remove the second check.
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The (0xBAF70000 & 0x00FFF000) << 6 should be (0xf70 << 18).
Fixes: 561535b0f239 ("r8169: fix OCP access on RTL8117")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Acked-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ibmvnic_remove locks multiple spinlocks while disabling interrupts:
spin_lock_irqsave(&adapter->state_lock, flags);
spin_lock_irqsave(&adapter->rwi_lock, flags);
As reported by coccinelle, the second _irqsave() overwrites the value
saved in 'flags' by the first _irqsave(), therefore when the second
_irqrestore() comes,the value in 'flags' is not valid,the value saved
by the first _irqsave() has been lost.
This likely leads to IRQs remaining disabled. So remove the second
_irqsave():
spin_lock_irqsave(&adapter->state_lock, flags);
spin_lock(&adapter->rwi_lock);
Generated by: ./scripts/coccinelle/locks/flags.cocci
./drivers/net/ethernet/ibm/ibmvnic.c:5413:1-18:
ERROR: nested lock+irqsave that reuses flags from line 5404.
Fixes: 4a41c421f367 ("ibmvnic: serialize access to work queue on remove")
Signed-off-by: Junlin Yang <yangjunlin@yulong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull block fixes from Jens Axboe:
- NVMe fixes:
- more device quirks (Julian Einwag, Zoltán Böszörményi, Pascal
Terjan)
- fix a hwmon error return (Daniel Wagner)
- fix the keep alive timeout initialization (Martin George)
- ensure the model_number can't be changed on a used subsystem
(Max Gurtovoy)
- rsxx missing -EFAULT on copy_to_user() failure (Dan)
- rsxx remove unused linux.h include (Tian)
- kill unused RQF_SORTED (Jean)
- updated outdated BFQ comments (Joseph)
- revert work-around commit for bd_size_lock, since we removed the
offending user in this merge window (Damien)
* tag 'block-5.12-2021-03-05' of git://git.kernel.dk/linux-block:
nvmet: model_number must be immutable once set
nvme-fabrics: fix kato initialization
nvme-hwmon: Return error code when registration fails
nvme-pci: add quirks for Lexar 256GB SSD
nvme-pci: mark Kingston SKC2000 as not supporting the deepest power state
nvme-pci: mark Seagate Nytro XM1440 as QUIRK_NO_NS_DESC_LIST.
rsxx: Return -EFAULT if copy_to_user() fails
block/bfq: update comments and default value in docs for fifo_expire
rsxx: remove unused including <linux/version.h>
block: Drop leftover references to RQF_SORTED
block: revert "block: fix bd_size_lock use"
|
|
Issue seen when enumerating multiple Intel mGbE interfaces in EHL.
[ 6.898141] intel-eth-pci 0000:00:1d.2: enabling device (0000 -> 0002)
[ 6.900971] intel-eth-pci 0000:00:1d.2: Fail to register stmmac-clk
[ 6.906434] intel-eth-pci 0000:00:1d.2: User ID: 0x51, Synopsys ID: 0x52
We fix it by making the clock name to be unique following the format
of stmmac-pci_name(pci_dev) so that we can differentiate the clock for
these Intel mGbE interfaces in EHL platform as follow:
/sys/kernel/debug/clk/stmmac-0000:00:1d.1
/sys/kernel/debug/clk/stmmac-0000:00:1d.2
/sys/kernel/debug/clk/stmmac-0000:00:1e.4
Fixes: 58da0cfa6cf1 ("net: stmmac: create dwmac-intel.c to contain all Intel platform")
Signed-off-by: Wong Vee Khee <vee.khee.wong@intel.com>
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Co-developed-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For Intel mGbE controller, MAC VLAN filter delete operation will time-out
if serdes power-down sequence happened first during driver remove() with
below message.
[82294.764958] intel-eth-pci 0000:00:1e.4 eth2: stmmac_dvr_remove: removing driver
[82294.778677] intel-eth-pci 0000:00:1e.4 eth2: Timeout accessing MAC_VLAN_Tag_Filter
[82294.779997] intel-eth-pci 0000:00:1e.4 eth2: failed to kill vid 0081/0
[82294.947053] intel-eth-pci 0000:00:1d.2 eth1: stmmac_dvr_remove: removing driver
[82295.002091] intel-eth-pci 0000:00:1d.1 eth0: stmmac_dvr_remove: removing driver
Therefore, we delay the serdes power-down to be after unregister_netdev()
which triggers the VLAN filter delete.
Fixes: b9663b7ca6ff ("net: stmmac: Enable SERDES power up/down sequence")
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When iavf_process_config() fails, no error return code of
iavf_init_get_resources() is assigned.
To fix this bug, err is assigned with the return value of
iavf_process_config(), and then err is checked.
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When bdx_read_mac() fails, no error return code of bdx_probe()
is assigned.
To fix this bug, err is assigned with -EFAULT as error return code.
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fixes a bug that the moderation config will not be
applied when calling mlx4_en_reset_config. For example, when
turning on rx timestamping, mlx4_en_reset_config() will be called,
causing the NIC to forget previous moderation config.
This fix is in phase with a previous fix:
commit 79c54b6bbf06 ("net/mlx4_en: Fix TX moderation info loss
after set_ringparam is called")
Tested: Before this patch, on a host with NIC using mlx4, run
netserver and stream TCP to the host at full utilization.
$ sar -I SUM 1
INTR intr/s
14:03:56 sum 48758.00
After rx hwtstamp is enabled:
$ sar -I SUM 1
14:10:38 sum 317771.00
We see the moderation is not working properly and issued 7x more
interrupts.
After the patch, and turned on rx hwtstamp, the rate of interrupts
is as expected:
$ sar -I SUM 1
14:52:11 sum 49332.00
Fixes: 79c54b6bbf06 ("net/mlx4_en: Fix TX moderation info loss after set_ringparam is called")
Signed-off-by: Kevin(Yudong) Yang <yyd@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
CC: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix the usage of device links in the runtime PM core code and
update the DTPM (Dynamic Thermal Power Management) feature added
recently.
Specifics:
- Make the runtime PM core code avoid attempting to suspend supplier
devices before updating the PM-runtime status of a consumer to
'suspended' (Rafael Wysocki).
- Fix DTPM (Dynamic Thermal Power Management) root node
initialization and label that feature as EXPERIMENTAL in Kconfig
(Daniel Lezcano)"
* tag 'pm-5.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
powercap/drivers/dtpm: Add the experimental label to the option description
powercap/drivers/dtpm: Fix root node initialization
PM: runtime: Update device status before letting suppliers suspend
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
- Fix a sleeping-while-atomic issue in the AMD IOMMU code
- Disable lazy IOTLB flush for untrusted devices in the Intel VT-d
driver
- Fix status code definitions for Intel VT-d
- Fix IO Page Fault issue in Tegra IOMMU driver
* tag 'iommu-fixes-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/vt-d: Fix status code for Allocate/Free PASID command
iommu: Don't use lazy flush for untrusted device
iommu/tegra-smmu: Fix mc errors on tegra124-nyan
iommu/amd: Fix sleeping in atomic in increase_address_space()
|
|
After fixing nested FPU contexts caused by 41401ac67791 we're still seeing
complaints about spurious kernel_fpu_end(). As it turns out this was
already fixed for dcn20 in commit f41ed88cbd ("drm/amdgpu/display:
use GFP_ATOMIC in dcn20_validate_bandwidth_internal") but never moved
forward to dcn21.
Signed-off-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Commit 41401ac67791 added FPU wrappers to dcn21_validate_bandwidth(),
which was correct. Unfortunately a nested function alredy contained
DC_FP_START()/DC_FP_END() calls, which results in nested FPU context
enter/exit and complaints by kernel_fpu_begin_mask().
This can be observed e.g. with 5.10.20, which backported 41401ac67791
and now emits the following warning on boot:
WARNING: CPU: 6 PID: 858 at arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask+0xa5/0xc0
Call Trace:
dcn21_calculate_wm+0x47/0xa90 [amdgpu]
dcn21_validate_bandwidth_fp+0x15d/0x2b0 [amdgpu]
dcn21_validate_bandwidth+0x29/0x40 [amdgpu]
dc_validate_global_state+0x3c7/0x4c0 [amdgpu]
The warning is emitted due to the additional DC_FP_START/END calls in
patch_bounding_box(), which is inlined into dcn21_calculate_wm(),
its only caller. Removing the calls brings the code in line with
dcn20 and makes the warning disappear.
Fixes: 41401ac67791 ("drm/amd/display: Add FPU wrappers to dcn21_validate_bandwidth()")
Signed-off-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
SISLANDS_SMC_SWSTATE
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].
Refactor the code according to the use of a flexible-array member in
struct SISLANDS_SMC_SWSTATE, instead of a one-element array, and use
the struct_size() helper to calculate the size for the allocation.
Also, this helps with the ongoing efforts to enable -Warray-bounds by
fixing the following warnings:
drivers/gpu/drm/radeon/si_dpm.c: In function ‘si_convert_power_state_to_smc’:
drivers/gpu/drm/radeon/si_dpm.c:2350:20: warning: array subscript 1 is above array bounds of ‘SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’ {aka ‘struct SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’} [-Warray-bounds]
2350 | smc_state->levels[i].dpm2.MaxPS = (u8)((SISLANDS_DPM2_MAX_PULSE_SKIP * (max_sclk - min_sclk)) / max_sclk);
| ~~~~~~~~~~~~~~~~~^~~
drivers/gpu/drm/radeon/si_dpm.c:2351:20: warning: array subscript 1 is above array bounds of ‘SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’ {aka ‘struct SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’} [-Warray-bounds]
2351 | smc_state->levels[i].dpm2.NearTDPDec = SISLANDS_DPM2_NEAR_TDP_DEC;
| ~~~~~~~~~~~~~~~~~^~~
drivers/gpu/drm/radeon/si_dpm.c:2352:20: warning: array subscript 1 is above array bounds of ‘SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’ {aka ‘struct SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’} [-Warray-bounds]
2352 | smc_state->levels[i].dpm2.AboveSafeInc = SISLANDS_DPM2_ABOVE_SAFE_INC;
| ~~~~~~~~~~~~~~~~~^~~
drivers/gpu/drm/radeon/si_dpm.c:2353:20: warning: array subscript 1 is above array bounds of ‘SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’ {aka ‘struct SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’} [-Warray-bounds]
2353 | smc_state->levels[i].dpm2.BelowSafeInc = SISLANDS_DPM2_BELOW_SAFE_INC;
| ~~~~~~~~~~~~~~~~~^~~
drivers/gpu/drm/radeon/si_dpm.c:2354:20: warning: array subscript 1 is above array bounds of ‘SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’ {aka ‘struct SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’} [-Warray-bounds]
2354 | smc_state->levels[i].dpm2.PwrEfficiencyRatio = cpu_to_be16(pwr_efficiency_ratio);
| ~~~~~~~~~~~~~~~~~^~~
drivers/gpu/drm/radeon/si_dpm.c:5105:20: warning: array subscript 1 is above array bounds of ‘SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’ {aka ‘struct SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’} [-Warray-bounds]
5105 | smc_state->levels[i + 1].aT = cpu_to_be32(a_t);
| ~~~~~~~~~~~~~~~~~^~~~~~~
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9/process/deprecated.html#zero-length-and-one-element-arrays
Link: https://github.com/KSPP/linux/issues/79
Link: https://github.com/KSPP/linux/issues/109
Build-tested-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/603f9a8f.aDLrpMFzzSApzVYQ%25lkp@intel.com/
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The continue statement in a for-loop is redudant and can be removed.
Clean up the code to address this.
Addresses-Coverity: ("Continue as no effect")
Fixes: b6f91fc183f7 ("drm/amdgpu/display: buffer INTERRUPT_LOW_IRQ_CONTEXT interrupt work")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The variable status is being initialized with a value that is never read
and it is being updated later with a new value. The initialization is
redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fix the following coccicheck warnings:
./drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:956:52-57: WARNING:
conversion to bool not needed here.
./drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:8311:16-21: WARNING:
conversion to bool not needed here.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
There seem devices that don't work with the aux channel backlight
control. For allowing such users to test with the other backlight
control method, provide a new module option, aux_backlight, to specify
enabling or disabling the aux backport support explicitly. As
default, the aux support is detected by the hardware capability.
v2: make the backlight option generic in case we add future
backlight types (Alex)
BugLink: https://bugzilla.opensuse.org/show_bug.cgi?id=1180749
BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1438
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Need to fetch it via aux.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
It just spams the logs.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Avoid the extra wrapper function.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We set up the parameters, but never called the atom table.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This has been stable for a while.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Trap handler is set per-process per-device and is unrelated
to queue management.
Move implementation closer to TMA setup code.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Existing, buggy user mode breaks when SRAM ECC is correctly reported as
"enabled". To avoid breaking existing user mode, deprecate that bit and
leave it as 0. Define a new bit to report the actual SRAM ECC mode that
new, correct user mode can use in the future.
Fixes: 7ec177bdcfc1 ("drm/amdkfd: fix set kfd node ras properties value")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If have memory leak, maybe it will have issue in
ttm_bo_force_list_clean-> ttm_mem_evict_first.
Set adev->gart.ptr to null to avoid to call
amdgpu_gmc_set_pte_pde to cause ptr issue pointer when
calling amdgpu_gart_unbind in amdgpu_bo_fini which is after gart_fini.
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
When unloading driver after killing some applications, it will hit sdma
flush tlb job timeout which is called by ttm_bo_delay_delete. So
to avoid the job submit after fence driver fini, call ttm_bo_lock_delayed_workqueue
before fence driver fini. And also put drm_sched_fini before waiting fence.
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
when try to shutdown guest vm in sriov mode, virt data
exchange is not fini. After vram lost, trying to write
vram could hang cpu.
[How]
add fini virt data exchange in ip_suspend
Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Jack Zhang <Jack.Zhang1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add VM_HOLE/DOORBELL_INVALID_BE/POLL_TIMEOUT/SRBMWRITE
interrupt info printing.
Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
SDMA 4_x asics share the same MGCG/MGLS setting.
Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why & How]
To read back crc by sending command READ_ROI_CRC to
PSP TA to ask it to read out crc of crc window.
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why & How]
To have crc window being unchanged, we have dmcu to keep monitoring crc
window registers. In order not to have driver and dmcu change crc
registers at the same time, have work of changing crc window to be done
by dmcu fw.
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why & How]
Add additional MCP_SCP commands for starting/stopping updaing crc
window at DMCU
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
To support feature that calculates CRTC CRC value on specific
region (crc window).
[How]
1. Use debugfs to specify crtc crc window
2. Use vline0 IRQ to write crtc crc window
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why & How]
Find out that referring to crtc_state->crc_src is not thread safe.
Move crc_src from dm_crtc_state to dm_irq_params to fix this.
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The register mmOTG1_OTG_BLANK_CONTROL was missing BASE_IDX value.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why & How]
We use DMCUB outbox0 interrupt to log DMCUB trace buffer events
as Linux kernel traces, so need to add some irq source related
defination in the header files;
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Leo (Hanghong) Ma <hanghong.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Remove unnecessary comments, enable restore mode using
'|=' operator, fixes the alignment to improve the code
readability.
v2: Move all restoration flag check to bitwise '&' operator
Signed-off-by: Arunpravin <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
sienna cichlid needs one vf mode which allows vf to set and get
clock status from guest vm. So now expose the required interface
and allow some smu request on VF mode. Also since this asic blocked
direct MMIO access, use KIQ to send SMU request under sriov vf.
OD use same command as getting pp table which is not allowed for
sienna cichlid, so remove OD feature under sriov vf.
Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Monk Liu<monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Update the gpu_metrics interface implementations to use the latest
upgraded data structures.
V2: fit the data type change of energy_accumulator
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
To make sure they are naturally aligned. Also updating the
data type for link_speed/width for future PCIE5 support.
V2: define new structures with minor version bumped
V3: update data type of energy_accumulator as 64bit and
drop unnecessary padding members
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Prevent that an IO request is build during device shutdown initiated by
a driver unbind. This request will never be able to be processed or
canceled and will hang forever. This will lead also to a hanging unbind.
Fix by checking not only if the device is in READY state but also check
that there is no device offline initiated before building a new IO request.
Fixes: e443343e509a ("s390/dasd: blk-mq conversion")
Cc: <stable@vger.kernel.org> # v4.14+
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Tested-by: Bjoern Walk <bwalk@linux.ibm.com>
Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
In case of an unbind of the DASD device driver the function
dasd_generic_remove() is called which shuts down the device.
Among others this functions removes the int_handler from the cdev.
During shutdown the device cancels all outstanding IO requests and waits
for completion of the clear request.
Unfortunately the clear interrupt will never be received when there is no
interrupt handler connected.
Fix by moving the int_handler removal after the call to the state machine
where no request or interrupt is outstanding.
Cc: stable@vger.kernel.org
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Tested-by: Bjoern Walk <bwalk@linux.ibm.com>
Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
In rxe_comp.c in rxe_completer() the function free_pkt() did not clear skb
which triggered a warning at 'done:' and could possibly at 'exit:'. The
WARN_ONCE() calls are not actually needed. The call to free_pkt() is
moved to the end to clearly show that all skbs are freed.
Fixes: 899aba891cab ("RDMA/rxe: Fix FIXME in rxe_udp_encap_recv()")
Link: https://lore.kernel.org/r/20210304192048.2958-1-rpearson@hpe.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
rxe_rcv_mcast_pkt() dropped a reference to ib_device when no error
occurred causing an underflow on the reference counter. This code is
cleaned up to be clearer and easier to read.
Fixes: 899aba891cab ("RDMA/rxe: Fix FIXME in rxe_udp_encap_recv()")
Link: https://lore.kernel.org/r/20210304192048.2958-1-rpearson@hpe.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|