Age | Commit message (Collapse) | Author |
|
The hns driver wrap around the consumer index of AEQ and CEQ when they
reach to two times of queue entries number for owner mechanism, actually,
it is unnecessary to wrap around since the hardware itself will mask it
before use.
Link: https://lore.kernel.org/r/1612517974-31867-12-git-send-email-liweihang@huawei.com
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
All fields of WQE will be rewrote, so the memset is unnecessary. And when
SQ is working in OWNER mode, the pipeline may prefetch the WQEs beyond PI,
the memset operation may flip the owner bit too early, then the pipeline
may get a wrong WQ.
Link: https://lore.kernel.org/r/1612517974-31867-11-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Use macros instead of magic numbers to represent shift of dma_handle_wqe,
dma_handle_idx and UDP destination port number of RoCEv2.
Link: https://lore.kernel.org/r/1612517974-31867-10-git-send-email-liweihang@huawei.com
Signed-off-by: Xinhao Liu <liuxinhao5@hisilicon.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
hns_roce_device.h is not specific to hardware, some definitions are only
used for HIP06, they should be moved into hns_roce_hw_v1.h.
Link: https://lore.kernel.org/r/1612517974-31867-9-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Currently, the driver updates doorbell looks like this:
post()
{
wqe.field = 0x111;
wmb();
update_wq_db();
}
update_wq_db()
{
db.field = 0x222;
__raw_writeq(db, db_reg);
}
writeq() is a better choice than __raw_writeq() because it calls dma_wmb()
to barrier in ARM64, and dma_wmb() is better than wmb() for ROCEE device.
This patch removes all wmb() before updating doorbell of SQ/RQ/CQ/SRQ by
replacing __raw_writeq() with writeq() to improve performence. The new
process looks like this:
post()
{
wqe.field = 0x111;
update_wq_db();
}
update_wq_db()
{
db.field = 0x222;
writeq(db, db_reg);
}
Link: https://lore.kernel.org/r/1612517974-31867-8-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Since HIP09 does not require this function, it should be masked.
Link: https://lore.kernel.org/r/1612517974-31867-7-git-send-email-liweihang@huawei.com
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
This feature should only be enabled by querying capability from firmware.
Fixes: ba6bb7e97421 ("RDMA/hns: Add interfaces to get pf capabilities from firmware")
Link: https://lore.kernel.org/r/1612517974-31867-5-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Add the mapped page count checking flow to avoid invalid page size when
creating MTR.
Fixes: 38389eaa4db1 ("RDMA/hns: Add mtr support for mixed multihop addressing")
Link: https://lore.kernel.org/r/1612517974-31867-4-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
This bit should be in type of enum ib_sig_type, or there will be a sparse
warning.
Fixes: bfe860351e31 ("RDMA/hns: Fix cast from or to restricted __le32 for driver")
Link: https://lore.kernel.org/r/1612517974-31867-3-git-send-email-liweihang@huawei.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
ULP usually set IB(V)_QP_AV when trying to modify QP to RTR if they want
to record sgid index into QPC. For UD QPs, it is useless because it will
be included in WQE. For RC QPs, it will be filled in
hns_roce_set_path(). So sgid index shouldn't be filled by default. Then
hns_get_gid_index() is moved to hns_roce_hw_v1.c because it is only called
in it.
Fixes: 926a01dc000d ("RDMA/hns: Add QP operations support for hip08 SoC")
Link: https://lore.kernel.org/r/1612517974-31867-2-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Direct wqe is a mechanism to fill wqe directly into the hardware. In the
case of light load, the wqe will be filled into pcie bar space of the
hardware, this will reduce one memory access operation and therefore
reduce the latency.
Link: https://lore.kernel.org/r/1611997513-27107-1-git-send-email-liweihang@huawei.com
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
mlx5-updates-2021-02-04
Vlad Buslov says:
=================
Implement support for VF tunneling
Abstract
Currently, mlx5 only supports configuration with tunnel endpoint IP address on
uplink representor. Remove implicit and explicit assumptions of tunnel always
being terminated on uplink and implement necessary infrastructure for
configuring tunnels on VF representors and updating rules on such tunnels
according to routing changes.
SW TC model
From TC perspective VF tunnel configuration requires two rules in both
directions:
TX rules
1. Rule that redirects packets from UL to VF rep that has the tunnel
endpoint IP address:
$ tc -s filter show dev enp8s0f0 ingress
filter protocol ip pref 4 flower chain 0
filter protocol ip pref 4 flower chain 0 handle 0x1
dst_mac 16:c9:a0:2d:69:2c
src_mac 0c:42:a1:58:ab:e4
eth_type ipv4
ip_flags nofrag
in_hw in_hw_count 1
action order 1: mirred (Egress Redirect to device enp8s0f0_0) stolen
index 3 ref 1 bind 1 installed 377 sec used 0 sec
Action statistics:
Sent 114096 bytes 952 pkt (dropped 0, overlimits 0 requeues 0)
Sent software 0 bytes 0 pkt
Sent hardware 114096 bytes 952 pkt
backlog 0b 0p requeues 0
cookie 878fa48d8c423fc08c3b6ca599b50a97
no_percpu
used_hw_stats delayed
2. Rule that decapsulates the tunneled flow and redirects to destination VF
representor:
$ tc -s filter show dev vxlan_sys_4789 ingress
filter protocol ip pref 4 flower chain 0
filter protocol ip pref 4 flower chain 0 handle 0x1
dst_mac ca:2e:a7:3f:f5:0f
src_mac 0a:40:bd:30:89:99
eth_type ipv4
enc_dst_ip 7.7.7.5
enc_src_ip 7.7.7.1
enc_key_id 98
enc_dst_port 4789
enc_tos 0
ip_flags nofrag
in_hw in_hw_count 1
action order 1: tunnel_key unset pipe
index 2 ref 1 bind 1 installed 434 sec used 434 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
used_hw_stats delayed
action order 2: mirred (Egress Redirect to device enp8s0f0_1) stolen
index 4 ref 1 bind 1 installed 434 sec used 0 sec
Action statistics:
Sent 129936 bytes 1082 pkt (dropped 0, overlimits 0 requeues 0)
Sent software 0 bytes 0 pkt
Sent hardware 129936 bytes 1082 pkt
backlog 0b 0p requeues 0
cookie ac17cf398c4c69e4a5b2f7aabd1b88ff
no_percpu
used_hw_stats delayed
RX rules
1. Rule that encapsulates the tunneled flow and redirects packets from
source VF rep to tunnel device:
$ tc -s filter show dev enp8s0f0_1 ingress
filter protocol ip pref 4 flower chain 0
filter protocol ip pref 4 flower chain 0 handle 0x1
dst_mac 0a:40:bd:30:89:99
src_mac ca:2e:a7:3f:f5:0f
eth_type ipv4
ip_tos 0/0x3
ip_flags nofrag
in_hw in_hw_count 1
action order 1: tunnel_key set
src_ip 7.7.7.5
dst_ip 7.7.7.1
key_id 98
dst_port 4789
nocsum
ttl 64 pipe
index 1 ref 1 bind 1 installed 411 sec used 411 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
no_percpu
used_hw_stats delayed
action order 2: mirred (Egress Redirect to device vxlan_sys_4789) stolen
index 1 ref 1 bind 1 installed 411 sec used 0 sec
Action statistics:
Sent 5615833 bytes 4028 pkt (dropped 0, overlimits 0 requeues 0)
Sent software 0 bytes 0 pkt
Sent hardware 5615833 bytes 4028 pkt
backlog 0b 0p requeues 0
cookie bb406d45d343bf7ade9690ae80c7cba4
no_percpu
used_hw_stats delayed
2. Rule that redirects from tunnel device to UL rep:
$ tc -s filter show dev vxlan_sys_4789 ingress
filter protocol ip pref 4 flower chain 0
filter protocol ip pref 4 flower chain 0 handle 0x1
dst_mac ca:2e:a7:3f:f5:0f
src_mac 0a:40:bd:30:89:99
eth_type ipv4
enc_dst_ip 7.7.7.5
enc_src_ip 7.7.7.1
enc_key_id 98
enc_dst_port 4789
enc_tos 0
ip_flags nofrag
in_hw in_hw_count 1
action order 1: tunnel_key unset pipe
index 2 ref 1 bind 1 installed 434 sec used 434 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
used_hw_stats delayed
action order 2: mirred (Egress Redirect to device enp8s0f0_1) stolen
index 4 ref 1 bind 1 installed 434 sec used 0 sec
Action statistics:
Sent 129936 bytes 1082 pkt (dropped 0, overlimits 0 requeues 0)
Sent software 0 bytes 0 pkt
Sent hardware 129936 bytes 1082 pkt
backlog 0b 0p requeues 0
cookie ac17cf398c4c69e4a5b2f7aabd1b88ff
no_percpu
used_hw_stats delayed
HW offloads model
For hardware offload the goal is to mach packet on both rules without exposing
it to software on tunnel endpoint VF. In order to achieve this for tx, TC
implementation marks encap rules with tunnel endpoint on mlx5 VF of same eswitch
with MLX5_ESW_DEST_CHAIN_WITH_SRC_PORT_CHANGE flag and adds header modification
rule to overwrite packet source port to the value of tunnel VF. Eswitch code is
modified to recirculate such packets after source port value is changed, which
allows second tx rules to match.
For rx path indirect table infrastructure is used to allow fully processing VF
tunnel traffic in hardware. To implement such pipeline driver needs to program
the hardware after matching on UL rule to overwrite source vport from UL to
tunnel VF and recirculate the packet to the root table to allow matching on the
rule installed on tunnel VF. For this, indirect table matches all encapsulated
traffic by tunnel parameters and all other IP traffic is sent to tunnel VF by
the miss rule. Such configuration will cause packet to appear on VF representor
instead of VF itself if packet has been matches by indirect table rule based on
tunnel parameters but missed on second rule (after recirculation). Handle such
case by marking packets processed by indirect table with special 0xFFF value in
reg_c1 and extending slow table with additional flow group that matches on
reg_c0 (source port value set by indirect tables) and reg_c1 (special 0xFFF
mark). When creating offloads fdb tables, install one rule per VF vport to match
on recirculated miss packets and redirect them to appropriate VF vport.
Routing events
In order to support routing changes and migration of tunnel device between
different endpoint VFs, implement routing infrastructure and update it with FIB
events. Routing entry table is introduced to mlx5 TC. Every rx and tx VF tunnel
rule is attached to a routing entry, which is shared for rules of same tunnel.
On FIB event the work is scheduled to delete/recreate all rules of affected
tunnel.
Note: only vxlan tunnel type is supported by this series.
=================
|
|
The size of tx_valid_cpus was calculated under the assumption that the
numa nodes identifiers are continuous, which is not the case in all archs
as this could lead to the following panic when trying to access an invalid
tx_valid_cpus index, avoid the following panic by using nr_node_ids
instead of num_online_nodes() to allocate the tx_valid_cpus size.
Kernel attempted to read user page (8) - exploit attempt? (uid: 0)
BUG: Kernel NULL pointer dereference on read at 0x00000008
Faulting instruction address: 0xc0080000081b4a90
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in: siw(+) rfkill rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm sunrpc ib_umad rdma_cm ib_cm iw_cm i40iw ib_uverbs ib_core i40e ses enclosure scsi_transport_sas ipmi_powernv ibmpowernv at24 ofpart ipmi_devintf regmap_i2c ipmi_msghandler powernv_flash uio_pdrv_genirq uio mtd opal_prd zram ip_tables xfs libcrc32c sd_mod t10_pi ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm_ttm_helper ttm drm vmx_crypto aacraid drm_panel_orientation_quirks dm_mod
CPU: 40 PID: 3279 Comm: modprobe Tainted: G W X --------- --- 5.11.0-0.rc4.129.eln108.ppc64le #2
NIP: c0080000081b4a90 LR: c0080000081b4a2c CTR: c0000000007ce1c0
REGS: c000000027fa77b0 TRAP: 0300 Tainted: G W X --------- --- (5.11.0-0.rc4.129.eln108.ppc64le)
MSR: 9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE> CR: 44224882 XER: 00000000
CFAR: c0000000007ce200 DAR: 0000000000000008 DSISR: 40000000 IRQMASK: 0
GPR00: c0080000081b4a2c c000000027fa7a50 c0080000081c3900 0000000000000040
GPR04: c000000002023080 c000000012e1c300 000020072ad70000 0000000000000001
GPR08: c000000001726068 0000000000000008 0000000000000008 c0080000081b5758
GPR12: c0000000007ce1c0 c0000007fffc3000 00000001590b1e40 0000000000000000
GPR16: 0000000000000000 0000000000000001 000000011ad68fc8 00007fffcc09c5c8
GPR20: 0000000000000008 0000000000000000 00000001590b2850 00000001590b1d30
GPR24: 0000000000043d68 000000011ad67a80 000000011ad67a80 0000000000100000
GPR28: c000000012e1c300 c0000000020271c8 0000000000000001 c0080000081bf608
NIP [c0080000081b4a90] siw_init_cpulist+0x194/0x214 [siw]
LR [c0080000081b4a2c] siw_init_cpulist+0x130/0x214 [siw]
Call Trace:
[c000000027fa7a50] [c0080000081b4a2c] siw_init_cpulist+0x130/0x214 [siw] (unreliable)
[c000000027fa7a90] [c0080000081b4e68] siw_init_module+0x40/0x2a0 [siw]
[c000000027fa7b30] [c0000000000124f4] do_one_initcall+0x84/0x2e0
[c000000027fa7c00] [c000000000267ffc] do_init_module+0x7c/0x350
[c000000027fa7c90] [c00000000026a180] __do_sys_init_module+0x210/0x250
[c000000027fa7db0] [c0000000000387e4] system_call_exception+0x134/0x230
[c000000027fa7e10] [c00000000000d660] system_call_common+0xf0/0x27c
Instruction dump:
40810044 3d420000 e8bf0000 e88a82d0 3d420000 e90a82c8 792a1f24 7cc4302a
7d2642aa 79291f24 7d25482a 7d295214 <7d4048a8> 7d4a3b78 7d4049ad 40c2fff4
Fixes: bdcf26bf9b3a ("rdma/siw: network and RDMA core interface")
Link: https://lore.kernel.org/r/20210201112922.141085-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
It is likely that this is a leftover from T3 driver heritage. cxgb4 uses
the PCI core VPD access code that handles detection of VPD capabilities.
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The post_recv only supports QP types of RC, GSI and UD.
Link: https://lore.kernel.org/r/1611997090-48820-13-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The SRQ in the hns driver consists of the following four parts:
* wqe buf: the buffer to store WQE.
* wqe_idx buf: the cqe of SRQ may be not generated in the order of wqe, so
the wqe_idx corresponding to the idle WQE needs to be pushed into the
index queue which is a FIFO, then it instructs the hardware to obtain
the corresponding WQE.
* bitmap: bitmap is used to generate and release wqe_idx. When the user
has a new WR, the driver finds the idx of the idle wqe in bitmap. When
the CQE of wqe is generated, the driver will release the idx.
* wr_id buf: wr_id buf is used to store the user's wr_id, then return it
to the user when poll_cq verb is invoked.
The process of post SRQ recv is refactored to make preceding code clearer.
Link: https://lore.kernel.org/r/1611997090-48820-12-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The HIP09 requires the driver to clear the unused data segments in wqe
buffer to make the hns ROCEE stop reading the remaining invalid sges for
RQ.
Link: https://lore.kernel.org/r/1611997090-48820-11-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Refactor post recv flow by removing unnecessary checking and removing
duplicated code.
Link: https://lore.kernel.org/r/1611997090-48820-10-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Use new register operation interfaces to simplify the process of write SRQ
Context.
Link: https://lore.kernel.org/r/1611997090-48820-9-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Reduce parameter numbers of write_srqc() and move some related code into
it from alloc_srqc().
Link: https://lore.kernel.org/r/1611997090-48820-8-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Split the SRQ creation process into multiple steps and encapsulate them
into functions.
Link: https://lore.kernel.org/r/1611997090-48820-7-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Each SRQs contain an reserved WQE, it is inappropriate and should be
removed.
Fixes: c7bcb13442e1 ("RDMA/hns: Add SRQ support for hip08 kernel mode")
Link: https://lore.kernel.org/r/1611997090-48820-6-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
When an error occurs, the qp_table must be cleared, regardless of whether
the SRQ feature is enabled.
Fixes: 5c1f167af112 ("RDMA/hns: Init SRQ table for hip08")
Link: https://lore.kernel.org/r/1611997090-48820-5-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
According to the IB Specification, srq_limit shouldn't be configured
during SRQ creation. If a user set srq_limit at this time, the driver
should forced it to zero, or the result of creating SRQ will conflict with
the result of querying SRQ.
Fixes: c7bcb13442e1 ("RDMA/hns: Add SRQ support for hip08 kernel mode")
Link: https://lore.kernel.org/r/1611997090-48820-4-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
If a user posts WR by wr_list, the head pointer of idx_queue won't be
updated until all wqes are filled, so the judgment of whether head equals
to tail will get a wrong result. Fix above issue and move the head and
tail pointer from the srq structure into the idx_queue structure. After
idx_queue is filled with wqe idx, the head pointer of it will increase.
Fixes: c7bcb13442e1 ("RDMA/hns: Add SRQ support for hip08 kernel mode")
Link: https://lore.kernel.org/r/1611997090-48820-3-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The RQ/SRQ of HIP08 needs one special sge to stop receive reliably. So the
driver needs to allocate at least one SGE when creating RQ/SRQ and ensure
that at least one SGE is filled with the special value during post_recv.
Besides, the kernel driver should only do this for kernel ULP. For
userspace ULP, the userspace driver will allocate the reserved SGE in
buffer, and the kernel driver just needs to pin the corresponding size of
memory based on the userspace driver's requirements.
Link: https://lore.kernel.org/r/1611997090-48820-2-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The variable r is defined at the beginning and initialized
to 0 until the function returns r, and the variable r is
not reassigned.Therefore, we do not need to define the
variable r, just return 0 directly at the end of the function.
Signed-off-by: wengjianfeng <wengjianfeng@yulong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The BXT/GLK DPLL can't generate certain frequencies. We already
reject the 233-240MHz range on both. But on GLK the DPLL max
frequency was bumped from 300MHz to 594MHz, so now we get to
also worry about the 446-480MHz range (double the original
problem range). Reject any frequency within the higher
problematic range as well.
Cc: stable@vger.kernel.org
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3000
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210203093044.30532-1-ville.syrjala@linux.intel.com
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
(cherry picked from commit 41751b3e5c1ac656a86f8d45a8891115281b729e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Flush; invalidate; change registers; invalidate; flush.
Will this finally work on every device? Or will Baytrail complain again?
On the positive side, we immediately see the benefit of having hsw-gt1 in
CI.
Fixes: ace44e13e577 ("drm/i915/gt: Clear CACHE_MODE prior to clearing residuals")
Testcase: igt/gem_render_tiled_blits # hsw-gt1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210125220247.31701-1-chris@chris-wilson.co.uk
(cherry picked from commit d30bbd62b1bfd9e0a33c3583c5a9e5d66f60cbd7)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Enabling atomic operations in L3 leads to unrecoverable GPU hangs, as
the machine stops responding milliseconds after receipt of the reset
request [GDRT]. By disabling the cached atomics, the hang do not occur
and we presume the GPU would reset normally for similar hangs.
Sadly this is a shotgun approach, but since the impact is critical it is
better to err on the safe side and work back from there.
Reported-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110998
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Jason Ekstrand <jason@jlesktrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210125220152.24070-1-chris@chris-wilson.co.uk
Cc: stable@vger.kernel.org
(cherry picked from commit b267c7ae0ad5b437b068f46919b17f85000154b4)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Push the hibernate pm routines next to the suspend pm routines in
gem/i915_gem_pm.c. This has the side-effect of putting the wbinvd()
abusers next to each other.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Fixes: 30d2bfd09383 ("drm/i915/gem: Almagamate clflushes on freeze")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210123145543.10533-1-chris@chris-wilson.co.uk
(cherry picked from commit 6d8f02207420e76db693a00ccb44792474e297fc)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
This code will Oops when it tries to i915_gem_object_free(obj) because
"obj" is an error pointer.
Fixes: 97d553963250 ("drm/i915/region: convert object_create into object_init")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/YA6FkPn5S4ZDUGxq@mwanda
(cherry picked from commit ad8db423a30f0ac39a5483dfd726058135ff2bd2)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Smatch found an uninitialized variable bug in this code:
drivers/gpu/drm/i915/gvt/cmd_parser.c:3191 intel_gvt_update_reg_whitelist()
error: uninitialized symbol 'ret'.
The first thing that Smatch complains about is that "ret" isn't set if
we don't enter the "for_each_engine(engine, &dev_priv->gt, id) {" loop.
Presumably we always have at least one engine so that's a false
positive.
But it's definitely a bug to not set "ret" if i915_gem_object_pin_map()
fails.
Let's fix the bug and silence the false positive.
Fixes: 493f30cd086e ("drm/i915/gvt: parse init context to update cmd accessible reg whitelist")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/YA6F3oF8mRaNQWjb@mwanda
(cherry picked from commit 784f70e17e6bc423a04fb6524634a76f68ab1192)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Let's not encourage everybody to build i915's debug code, and certainly
not the build robots who need to scrutinise the production build. Since
CI will complain if the debug build is broken, having the other build
bots focus on the builds we don't cover ourselves should improve the
build coverage.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 4f86975f539d ("drm/i915: Add DEBUG_GEM to the recommended CI config")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210122091058.5145-1-chris@chris-wilson.co.uk
(cherry picked from commit c442f658299d59b327a4bf21457ec8ece936f133)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Support for this machine was just removed, so drop the now unused clk
driver, too.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20210114151630.128830-3-u.kleine-koenig@pengutronix.de
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
rxe_udp_encap_recv() drops the reference to rxe->ib_dev taken by
rxe_get_dev_from_net() which should be held until each received skb is
freed. This patch moves the calls to ib_device_put() to each place a
received skb is freed. It also takes references to the ib_device for each
cloned skb created to process received multicast packets.
Fixes: 4c173f596b3f ("RDMA/rxe: Use ib_device_get_by_netdev() instead of open coding")
Link: https://lore.kernel.org/r/20210128233318.2591-1-rpearson@hpe.com
Signed-off-by: Bob Pearson <rpearson@hpe.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
This GDSC enables (or cuts!) power to the Multimedia Subsystem IOMMU
(mmss smmu), which has bootloader pre-set secure contexts.
In the event of a complete power loss, the secure contexts will be
reset and the hypervisor will crash the SoC.
To prevent this, and get a working multimedia subsystem, set this
GDSC as always on.
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org>
Link: https://lore.kernel.org/r/20210114221059.483390-10-angelogioacchino.delregno@somainline.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Hardware clock gating is supported on some of the clocks declared in
there: ignoring that it does exist may lead to unstabilities on some
firmwares.
Add the HWCG registers where applicable to stop potential crashes.
This was verified on a smartphone shipped with a recent MSM8998
firmware, which will experience random crashes without this change.
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org>
Link: https://lore.kernel.org/r/20210114221059.483390-9-angelogioacchino.delregno@somainline.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
All of the GPLLs in the MSM8998 Global Clock Controller are Fabia PLLs
and not generic alphas: this was producing bad effects over the entire
clock tree of MSM8998, where any GPLL child clock was declaring a false
clock rate, due to their parent also showing the same.
The issue resides in the calculation of the clock rate for the specific
Alpha PLL type, where Fabia has a different register layout; switching
the MSM8998 GPLLs to the correct Alpha Fabia PLL type fixes the rate
(calculation) reading. While at it, also make these PLLs fixed since
their rate is supposed to *never* be changed while the system runs, as
this would surely crash the entire SoC.
Now all the children of all the PLLs are also complying with their
specified clock table and system stability is improved.
Fixes: b5f5f525c547 ("clk: qcom: Add MSM8998 Global Clock Control (GCC) driver")
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org>
Link: https://lore.kernel.org/r/20210114221059.483390-7-angelogioacchino.delregno@somainline.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
The GPU IOMMU depends on this clock and the hypervisor will crash
the SoC if this clock gets disabled because the secure contexts
that have been set on this IOMMU by the bootloader will become
unaccessible (or they get reset).
Mark this clock as critical to avoid this issue when the Adreno
GPU is enabled.
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org>
Link: https://lore.kernel.org/r/20210114221059.483390-6-angelogioacchino.delregno@somainline.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
To achieve CPR-Hardened functionality this clock must be on: add it
in order to be able to get it managed by the CPR3 driver.
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org>
Link: https://lore.kernel.org/r/20210114221059.483390-5-angelogioacchino.delregno@somainline.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
This clock enables the GPLL0 output to the multimedia subsystem
clock controller.
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org>
Link: https://lore.kernel.org/r/20210114221059.483390-3-angelogioacchino.delregno@somainline.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
This commit adds a regulator supply hook to mmcx-reg missing from
- mvs0c_gdsc
- mvs1c_gdsc
- mvs0_gdsc
- mvs1_gdsc
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Tested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20210204150120.1521959-5-bryan.odonoghue@linaro.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
This patch adds the missing video_cc_mvs0_clk entry to
videocc-sm8250 replicating in upstream the explicit entry for this clock in
downstream.
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Tested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20210204150120.1521959-4-bryan.odonoghue@linaro.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
This patch adds the missing video_cc_mvs0_div_clk_src entry to
videocc-sm8250 replicating in upstream the explicit entry for this clock in
downstream.
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Tested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20210204150120.1521959-3-bryan.odonoghue@linaro.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
This adds Global Clock controller (GCC) driver for SM8350 SoC
Signed-off-by: Vivek Aknurwar <viveka@codeaurora.org>
Signed-off-by: Jeevan Shriram <jshriram@codeaurora.org>
[vkoul: rebase and tidy up for upstream]
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210127070811.152690-6-vkoul@kernel.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Lucid 5LPE is a slightly different Lucid PLL with different offsets and
porgramming sequence so add support for these
Signed-off-by: Vivek Aknurwar <viveka@codeaurora.org>
Signed-off-by: Jeevan Shriram <jshriram@codeaurora.org>
[vkoul: rebase and tidy up for upstream]
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210127070811.152690-4-vkoul@kernel.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Trion 5LPE set rate uses code similar to alpha_pll_trion_set_rate() but
with different registers. Modularize these by moving out latch and latch
ack bits so that we can reuse the function.
Suggested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Link: https://lore.kernel.org/r/20210127070811.152690-3-vkoul@kernel.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Driver uses regval variable for holding register values, replace with a
shorter one val
Suggested-by: Stephen Boyd <sboyd@kernel.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Link: https://lore.kernel.org/r/20210127070811.152690-2-vkoul@kernel.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Add clocks, resets and some of the GDSC provided by the global clock
controller found in the Qualcomm SC8180x platform.
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210126043155.1847823-2-bjorn.andersson@linaro.org
[sboyd@kernel.org: Drop F macro as it's already defined]
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|