linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2021-02-08	RDMA/hns: Remove unnecessary wrap around for EQ's consumer index	Yixian Liu
	The hns driver wrap around the consumer index of AEQ and CEQ when they reach to two times of queue entries number for owner mechanism, actually, it is unnecessary to wrap around since the hardware itself will mask it before use. Link: https://lore.kernel.org/r/1612517974-31867-12-git-send-email-liweihang@huawei.com Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	RDMA/hns: Avoid unnecessary memset on WQEs in post_send	Lang Cheng
	All fields of WQE will be rewrote, so the memset is unnecessary. And when SQ is working in OWNER mode, the pipeline may prefetch the WQEs beyond PI, the memset operation may flip the owner bit too early, then the pipeline may get a wrong WQ. Link: https://lore.kernel.org/r/1612517974-31867-11-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	RDMA/hns: Remove some magic numbers	Xinhao Liu
	Use macros instead of magic numbers to represent shift of dma_handle_wqe, dma_handle_idx and UDP destination port number of RoCEv2. Link: https://lore.kernel.org/r/1612517974-31867-10-git-send-email-liweihang@huawei.com Signed-off-by: Xinhao Liu <liuxinhao5@hisilicon.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	RDMA/hns: Move HIP06 related definitions into hns_roce_hw_v1.h	Lang Cheng
	hns_roce_device.h is not specific to hardware, some definitions are only used for HIP06, they should be moved into hns_roce_hw_v1.h. Link: https://lore.kernel.org/r/1612517974-31867-9-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	RDMA/hns: Replace wmb&__raw_writeq with writeq	Lang Cheng
	Currently, the driver updates doorbell looks like this: post() { wqe.field = 0x111; wmb(); update_wq_db(); } update_wq_db() { db.field = 0x222; __raw_writeq(db, db_reg); } writeq() is a better choice than __raw_writeq() because it calls dma_wmb() to barrier in ARM64, and dma_wmb() is better than wmb() for ROCEE device. This patch removes all wmb() before updating doorbell of SQ/RQ/CQ/SRQ by replacing __raw_writeq() with writeq() to improve performence. The new process looks like this: post() { wqe.field = 0x111; update_wq_db(); } update_wq_db() { db.field = 0x222; writeq(db, db_reg); } Link: https://lore.kernel.org/r/1612517974-31867-8-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	RDMA/hns: Skip qp_flow_control_init() for HIP09	Yixing Liu
	Since HIP09 does not require this function, it should be masked. Link: https://lore.kernel.org/r/1612517974-31867-7-git-send-email-liweihang@huawei.com Signed-off-by: Yixing Liu <liuyixing1@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	RDMA/hns: Disable RQ inline by default	Lijun Ou
	This feature should only be enabled by querying capability from firmware. Fixes: ba6bb7e97421 ("RDMA/hns: Add interfaces to get pf capabilities from firmware") Link: https://lore.kernel.org/r/1612517974-31867-5-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	RDMA/hns: Add mapped page count checking for MTR	Xi Wang
	Add the mapped page count checking flow to avoid invalid page size when creating MTR. Fixes: 38389eaa4db1 ("RDMA/hns: Add mtr support for mixed multihop addressing") Link: https://lore.kernel.org/r/1612517974-31867-4-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	RDMA/hns: Fix type of sq_signal_bits	Weihang Li
	This bit should be in type of enum ib_sig_type, or there will be a sparse warning. Fixes: bfe860351e31 ("RDMA/hns: Fix cast from or to restricted __le32 for driver") Link: https://lore.kernel.org/r/1612517974-31867-3-git-send-email-liweihang@huawei.com Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	RDMA/hns: Avoid filling sgid index when modifying QP to RTR	Weihang Li
	ULP usually set IB(V)_QP_AV when trying to modify QP to RTR if they want to record sgid index into QPC. For UD QPs, it is useless because it will be included in WQE. For RC QPs, it will be filled in hns_roce_set_path(). So sgid index shouldn't be filled by default. Then hns_get_gid_index() is moved to hns_roce_hw_v1.c because it is only called in it. Fixes: 926a01dc000d ("RDMA/hns: Add QP operations support for hip08 SoC") Link: https://lore.kernel.org/r/1612517974-31867-2-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	RDMA/hns: Add support of direct wqe	Yixing Liu
	Direct wqe is a mechanism to fill wqe directly into the hardware. In the case of light load, the wqe will be filled into pcie bar space of the hardware, this will reduce one memory access operation and therefore reduce the latency. Link: https://lore.kernel.org/r/1611997513-27107-1-git-send-email-liweihang@huawei.com Signed-off-by: Yixing Liu <liuyixing1@huawei.com> Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	Merge tag 'mlx5-updates-2021-02-04' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux mlx5-updates-2021-02-04 Vlad Buslov says: ================= Implement support for VF tunneling Abstract Currently, mlx5 only supports configuration with tunnel endpoint IP address on uplink representor. Remove implicit and explicit assumptions of tunnel always being terminated on uplink and implement necessary infrastructure for configuring tunnels on VF representors and updating rules on such tunnels according to routing changes. SW TC model From TC perspective VF tunnel configuration requires two rules in both directions: TX rules 1. Rule that redirects packets from UL to VF rep that has the tunnel endpoint IP address: $ tc -s filter show dev enp8s0f0 ingress filter protocol ip pref 4 flower chain 0 filter protocol ip pref 4 flower chain 0 handle 0x1 dst_mac 16:c9:a0:2d:69:2c src_mac 0c:42:a1:58:ab:e4 eth_type ipv4 ip_flags nofrag in_hw in_hw_count 1 action order 1: mirred (Egress Redirect to device enp8s0f0_0) stolen index 3 ref 1 bind 1 installed 377 sec used 0 sec Action statistics: Sent 114096 bytes 952 pkt (dropped 0, overlimits 0 requeues 0) Sent software 0 bytes 0 pkt Sent hardware 114096 bytes 952 pkt backlog 0b 0p requeues 0 cookie 878fa48d8c423fc08c3b6ca599b50a97 no_percpu used_hw_stats delayed 2. Rule that decapsulates the tunneled flow and redirects to destination VF representor: $ tc -s filter show dev vxlan_sys_4789 ingress filter protocol ip pref 4 flower chain 0 filter protocol ip pref 4 flower chain 0 handle 0x1 dst_mac ca:2e:a7:3f:f5:0f src_mac 0a:40:bd:30:89:99 eth_type ipv4 enc_dst_ip 7.7.7.5 enc_src_ip 7.7.7.1 enc_key_id 98 enc_dst_port 4789 enc_tos 0 ip_flags nofrag in_hw in_hw_count 1 action order 1: tunnel_key unset pipe index 2 ref 1 bind 1 installed 434 sec used 434 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 used_hw_stats delayed action order 2: mirred (Egress Redirect to device enp8s0f0_1) stolen index 4 ref 1 bind 1 installed 434 sec used 0 sec Action statistics: Sent 129936 bytes 1082 pkt (dropped 0, overlimits 0 requeues 0) Sent software 0 bytes 0 pkt Sent hardware 129936 bytes 1082 pkt backlog 0b 0p requeues 0 cookie ac17cf398c4c69e4a5b2f7aabd1b88ff no_percpu used_hw_stats delayed RX rules 1. Rule that encapsulates the tunneled flow and redirects packets from source VF rep to tunnel device: $ tc -s filter show dev enp8s0f0_1 ingress filter protocol ip pref 4 flower chain 0 filter protocol ip pref 4 flower chain 0 handle 0x1 dst_mac 0a:40:bd:30:89:99 src_mac ca:2e:a7:3f:f5:0f eth_type ipv4 ip_tos 0/0x3 ip_flags nofrag in_hw in_hw_count 1 action order 1: tunnel_key set src_ip 7.7.7.5 dst_ip 7.7.7.1 key_id 98 dst_port 4789 nocsum ttl 64 pipe index 1 ref 1 bind 1 installed 411 sec used 411 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 no_percpu used_hw_stats delayed action order 2: mirred (Egress Redirect to device vxlan_sys_4789) stolen index 1 ref 1 bind 1 installed 411 sec used 0 sec Action statistics: Sent 5615833 bytes 4028 pkt (dropped 0, overlimits 0 requeues 0) Sent software 0 bytes 0 pkt Sent hardware 5615833 bytes 4028 pkt backlog 0b 0p requeues 0 cookie bb406d45d343bf7ade9690ae80c7cba4 no_percpu used_hw_stats delayed 2. Rule that redirects from tunnel device to UL rep: $ tc -s filter show dev vxlan_sys_4789 ingress filter protocol ip pref 4 flower chain 0 filter protocol ip pref 4 flower chain 0 handle 0x1 dst_mac ca:2e:a7:3f:f5:0f src_mac 0a:40:bd:30:89:99 eth_type ipv4 enc_dst_ip 7.7.7.5 enc_src_ip 7.7.7.1 enc_key_id 98 enc_dst_port 4789 enc_tos 0 ip_flags nofrag in_hw in_hw_count 1 action order 1: tunnel_key unset pipe index 2 ref 1 bind 1 installed 434 sec used 434 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 used_hw_stats delayed action order 2: mirred (Egress Redirect to device enp8s0f0_1) stolen index 4 ref 1 bind 1 installed 434 sec used 0 sec Action statistics: Sent 129936 bytes 1082 pkt (dropped 0, overlimits 0 requeues 0) Sent software 0 bytes 0 pkt Sent hardware 129936 bytes 1082 pkt backlog 0b 0p requeues 0 cookie ac17cf398c4c69e4a5b2f7aabd1b88ff no_percpu used_hw_stats delayed HW offloads model For hardware offload the goal is to mach packet on both rules without exposing it to software on tunnel endpoint VF. In order to achieve this for tx, TC implementation marks encap rules with tunnel endpoint on mlx5 VF of same eswitch with MLX5_ESW_DEST_CHAIN_WITH_SRC_PORT_CHANGE flag and adds header modification rule to overwrite packet source port to the value of tunnel VF. Eswitch code is modified to recirculate such packets after source port value is changed, which allows second tx rules to match. For rx path indirect table infrastructure is used to allow fully processing VF tunnel traffic in hardware. To implement such pipeline driver needs to program the hardware after matching on UL rule to overwrite source vport from UL to tunnel VF and recirculate the packet to the root table to allow matching on the rule installed on tunnel VF. For this, indirect table matches all encapsulated traffic by tunnel parameters and all other IP traffic is sent to tunnel VF by the miss rule. Such configuration will cause packet to appear on VF representor instead of VF itself if packet has been matches by indirect table rule based on tunnel parameters but missed on second rule (after recirculation). Handle such case by marking packets processed by indirect table with special 0xFFF value in reg_c1 and extending slow table with additional flow group that matches on reg_c0 (source port value set by indirect tables) and reg_c1 (special 0xFFF mark). When creating offloads fdb tables, install one rule per VF vport to match on recirculated miss packets and redirect them to appropriate VF vport. Routing events In order to support routing changes and migration of tunnel device between different endpoint VFs, implement routing infrastructure and update it with FIB events. Routing entry table is introduced to mlx5 TC. Every rx and tx VF tunnel rule is attached to a routing entry, which is shared for rules of same tunnel. On FIB event the work is scheduled to delete/recreate all rules of affected tunnel. Note: only vxlan tunnel type is supported by this series. =================
2021-02-08	RDMA/siw: Fix calculation of tx_valid_cpus size	Kamal Heib
	The size of tx_valid_cpus was calculated under the assumption that the numa nodes identifiers are continuous, which is not the case in all archs as this could lead to the following panic when trying to access an invalid tx_valid_cpus index, avoid the following panic by using nr_node_ids instead of num_online_nodes() to allocate the tx_valid_cpus size. Kernel attempted to read user page (8) - exploit attempt? (uid: 0) BUG: Kernel NULL pointer dereference on read at 0x00000008 Faulting instruction address: 0xc0080000081b4a90 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: siw(+) rfkill rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm sunrpc ib_umad rdma_cm ib_cm iw_cm i40iw ib_uverbs ib_core i40e ses enclosure scsi_transport_sas ipmi_powernv ibmpowernv at24 ofpart ipmi_devintf regmap_i2c ipmi_msghandler powernv_flash uio_pdrv_genirq uio mtd opal_prd zram ip_tables xfs libcrc32c sd_mod t10_pi ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm_ttm_helper ttm drm vmx_crypto aacraid drm_panel_orientation_quirks dm_mod CPU: 40 PID: 3279 Comm: modprobe Tainted: G W X --------- --- 5.11.0-0.rc4.129.eln108.ppc64le #2 NIP: c0080000081b4a90 LR: c0080000081b4a2c CTR: c0000000007ce1c0 REGS: c000000027fa77b0 TRAP: 0300 Tainted: G W X --------- --- (5.11.0-0.rc4.129.eln108.ppc64le) MSR: 9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE> CR: 44224882 XER: 00000000 CFAR: c0000000007ce200 DAR: 0000000000000008 DSISR: 40000000 IRQMASK: 0 GPR00: c0080000081b4a2c c000000027fa7a50 c0080000081c3900 0000000000000040 GPR04: c000000002023080 c000000012e1c300 000020072ad70000 0000000000000001 GPR08: c000000001726068 0000000000000008 0000000000000008 c0080000081b5758 GPR12: c0000000007ce1c0 c0000007fffc3000 00000001590b1e40 0000000000000000 GPR16: 0000000000000000 0000000000000001 000000011ad68fc8 00007fffcc09c5c8 GPR20: 0000000000000008 0000000000000000 00000001590b2850 00000001590b1d30 GPR24: 0000000000043d68 000000011ad67a80 000000011ad67a80 0000000000100000 GPR28: c000000012e1c300 c0000000020271c8 0000000000000001 c0080000081bf608 NIP [c0080000081b4a90] siw_init_cpulist+0x194/0x214 [siw] LR [c0080000081b4a2c] siw_init_cpulist+0x130/0x214 [siw] Call Trace: [c000000027fa7a50] [c0080000081b4a2c] siw_init_cpulist+0x130/0x214 [siw] (unreliable) [c000000027fa7a90] [c0080000081b4e68] siw_init_module+0x40/0x2a0 [siw] [c000000027fa7b30] [c0000000000124f4] do_one_initcall+0x84/0x2e0 [c000000027fa7c00] [c000000000267ffc] do_init_module+0x7c/0x350 [c000000027fa7c90] [c00000000026a180] __do_sys_init_module+0x210/0x250 [c000000027fa7db0] [c0000000000387e4] system_call_exception+0x134/0x230 [c000000027fa7e10] [c00000000000d660] system_call_common+0xf0/0x27c Instruction dump: 40810044 3d420000 e8bf0000 e88a82d0 3d420000 e90a82c8 792a1f24 7cc4302a 7d2642aa 79291f24 7d25482a 7d295214 <7d4048a8> 7d4a3b78 7d4049ad 40c2fff4 Fixes: bdcf26bf9b3a ("rdma/siw: network and RDMA core interface") Link: https://lore.kernel.org/r/20210201112922.141085-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com> Tested-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	cxgb4: remove unused vpd_cap_addr	Heiner Kallweit
	It is likely that this is a leftover from T3 driver heritage. cxgb4 uses the PCI core VPD access code that handles detection of VPD capabilities. Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-08	RDMA/hns: Add verification of QP type when post_recv	Wenpeng Liang
	The post_recv only supports QP types of RC, GSI and UD. Link: https://lore.kernel.org/r/1611997090-48820-13-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	RDMA/hns: Refactor hns_roce_v2_post_srq_recv()	Wenpeng Liang
	The SRQ in the hns driver consists of the following four parts: * wqe buf: the buffer to store WQE. * wqe_idx buf: the cqe of SRQ may be not generated in the order of wqe, so the wqe_idx corresponding to the idle WQE needs to be pushed into the index queue which is a FIFO, then it instructs the hardware to obtain the corresponding WQE. * bitmap: bitmap is used to generate and release wqe_idx. When the user has a new WR, the driver finds the idx of the idle wqe in bitmap. When the CQE of wqe is generated, the driver will release the idx. * wr_id buf: wr_id buf is used to store the user's wr_id, then return it to the user when poll_cq verb is invoked. The process of post SRQ recv is refactored to make preceding code clearer. Link: https://lore.kernel.org/r/1611997090-48820-12-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	RDMA/hns: Clear remaining unused sges when post_recv	Xi Wang
	The HIP09 requires the driver to clear the unused data segments in wqe buffer to make the hns ROCEE stop reading the remaining invalid sges for RQ. Link: https://lore.kernel.org/r/1611997090-48820-11-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	RDMA/hns: Refactor post recv flow	Xi Wang
	Refactor post recv flow by removing unnecessary checking and removing duplicated code. Link: https://lore.kernel.org/r/1611997090-48820-10-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	RDMA/hns: Use new interfaces to write SRQC	Lang Cheng
	Use new register operation interfaces to simplify the process of write SRQ Context. Link: https://lore.kernel.org/r/1611997090-48820-9-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	RDMA/hns: Refactor code about SRQ Context	Wenpeng Liang
	Reduce parameter numbers of write_srqc() and move some related code into it from alloc_srqc(). Link: https://lore.kernel.org/r/1611997090-48820-8-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	RDMA/hns: Refactor hns_roce_create_srq()	Wenpeng Liang
	Split the SRQ creation process into multiple steps and encapsulate them into functions. Link: https://lore.kernel.org/r/1611997090-48820-7-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	RDMA/hns: Remove the reserved WQE of SRQ	Wenpeng Liang
	Each SRQs contain an reserved WQE, it is inappropriate and should be removed. Fixes: c7bcb13442e1 ("RDMA/hns: Add SRQ support for hip08 kernel mode") Link: https://lore.kernel.org/r/1611997090-48820-6-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	RDMA/hns: Fixed wrong judgments in the goto branch	Wenpeng Liang
	When an error occurs, the qp_table must be cleared, regardless of whether the SRQ feature is enabled. Fixes: 5c1f167af112 ("RDMA/hns: Init SRQ table for hip08") Link: https://lore.kernel.org/r/1611997090-48820-5-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	RDMA/hns: Force srq_limit to 0 when creating SRQ	Wenpeng Liang
	According to the IB Specification, srq_limit shouldn't be configured during SRQ creation. If a user set srq_limit at this time, the driver should forced it to zero, or the result of creating SRQ will conflict with the result of querying SRQ. Fixes: c7bcb13442e1 ("RDMA/hns: Add SRQ support for hip08 kernel mode") Link: https://lore.kernel.org/r/1611997090-48820-4-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	RDMA/hns: Bugfix for checking whether the srq is full when post wr	Wenpeng Liang
	If a user posts WR by wr_list, the head pointer of idx_queue won't be updated until all wqes are filled, so the judgment of whether head equals to tail will get a wrong result. Fix above issue and move the head and tail pointer from the srq structure into the idx_queue structure. After idx_queue is filled with wqe idx, the head pointer of it will increase. Fixes: c7bcb13442e1 ("RDMA/hns: Add SRQ support for hip08 kernel mode") Link: https://lore.kernel.org/r/1611997090-48820-3-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	RDMA/hns: Allocate one more recv SGE for HIP08	Lang Cheng
	The RQ/SRQ of HIP08 needs one special sge to stop receive reliably. So the driver needs to allocate at least one SGE when creating RQ/SRQ and ensure that at least one SGE is filled with the special value during post_recv. Besides, the kernel driver should only do this for kernel ULP. For userspace ULP, the userspace driver will allocate the reserved SGE in buffer, and the kernel driver just needs to pin the corresponding size of memory based on the userspace driver's requirements. Link: https://lore.kernel.org/r/1611997090-48820-2-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	nfc: st-nci: Remove unnecessary variable	wengjianfeng
	The variable r is defined at the beginning and initialized to 0 until the function returns r, and the variable r is not reassigned.Therefore, we do not need to define the variable r, just return 0 directly at the end of the function. Signed-off-by: wengjianfeng <wengjianfeng@yulong.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-08	drm/i915: Reject 446-480MHz HDMI clock on GLK	Ville Syrjälä
	The BXT/GLK DPLL can't generate certain frequencies. We already reject the 233-240MHz range on both. But on GLK the DPLL max frequency was bumped from 300MHz to 594MHz, so now we get to also worry about the 446-480MHz range (double the original problem range). Reject any frequency within the higher problematic range as well. Cc: stable@vger.kernel.org Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3000 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210203093044.30532-1-ville.syrjala@linux.intel.com Reviewed-by: Mika Kahola <mika.kahola@intel.com> (cherry picked from commit 41751b3e5c1ac656a86f8d45a8891115281b729e) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-02-08	drm/i915/gt: Flush before changing register state	Chris Wilson
	Flush; invalidate; change registers; invalidate; flush. Will this finally work on every device? Or will Baytrail complain again? On the positive side, we immediately see the benefit of having hsw-gt1 in CI. Fixes: ace44e13e577 ("drm/i915/gt: Clear CACHE_MODE prior to clearing residuals") Testcase: igt/gem_render_tiled_blits # hsw-gt1 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210125220247.31701-1-chris@chris-wilson.co.uk (cherry picked from commit d30bbd62b1bfd9e0a33c3583c5a9e5d66f60cbd7) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-02-08	drm/i915: Disable atomics in L3 for gen9	Chris Wilson
	Enabling atomic operations in L3 leads to unrecoverable GPU hangs, as the machine stops responding milliseconds after receipt of the reset request [GDRT]. By disabling the cached atomics, the hang do not occur and we presume the GPU would reset normally for similar hangs. Sadly this is a shotgun approach, but since the impact is critical it is better to err on the safe side and work back from there. Reported-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110998 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Jason Ekstrand <jason@jlesktrand.net> Link: https://patchwork.freedesktop.org/patch/msgid/20210125220152.24070-1-chris@chris-wilson.co.uk Cc: stable@vger.kernel.org (cherry picked from commit b267c7ae0ad5b437b068f46919b17f85000154b4) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-02-08	drm/i915/gem: Move freeze/freeze_late next to suspend/suspend_late	Chris Wilson
	Push the hibernate pm routines next to the suspend pm routines in gem/i915_gem_pm.c. This has the side-effect of putting the wbinvd() abusers next to each other. Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: 30d2bfd09383 ("drm/i915/gem: Almagamate clflushes on freeze") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210123145543.10533-1-chris@chris-wilson.co.uk (cherry picked from commit 6d8f02207420e76db693a00ccb44792474e297fc) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-02-08	drm/i915/gem: Fix oops in error handling code	Dan Carpenter
	This code will Oops when it tries to i915_gem_object_free(obj) because "obj" is an error pointer. Fixes: 97d553963250 ("drm/i915/region: convert object_create into object_init") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/YA6FkPn5S4ZDUGxq@mwanda (cherry picked from commit ad8db423a30f0ac39a5483dfd726058135ff2bd2) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-02-08	drm/i915/gvt: fix uninitialized return in intel_gvt_update_reg_whitelist()	Dan Carpenter
	Smatch found an uninitialized variable bug in this code: drivers/gpu/drm/i915/gvt/cmd_parser.c:3191 intel_gvt_update_reg_whitelist() error: uninitialized symbol 'ret'. The first thing that Smatch complains about is that "ret" isn't set if we don't enter the "for_each_engine(engine, &dev_priv->gt, id) {" loop. Presumably we always have at least one engine so that's a false positive. But it's definitely a bug to not set "ret" if i915_gem_object_pin_map() fails. Let's fix the bug and silence the false positive. Fixes: 493f30cd086e ("drm/i915/gvt: parse init context to update cmd accessible reg whitelist") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/YA6F3oF8mRaNQWjb@mwanda (cherry picked from commit 784f70e17e6bc423a04fb6524634a76f68ab1192) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-02-08	drm/i915: Restrict DRM_I915_DEBUG to developer builds	Chris Wilson
	Let's not encourage everybody to build i915's debug code, and certainly not the build robots who need to scrutinise the production build. Since CI will complain if the debug build is broken, having the other build bots focus on the builds we don't cover ourselves should improve the build coverage. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: 4f86975f539d ("drm/i915: Add DEBUG_GEM to the recommended CI config") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210122091058.5145-1-chris@chris-wilson.co.uk (cherry picked from commit c442f658299d59b327a4bf21457ec8ece936f133) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-02-08	clk: Drop unused efm32gg driver	Uwe Kleine-König
	Support for this machine was just removed, so drop the now unused clk driver, too. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20210114151630.128830-3-u.kleine-koenig@pengutronix.de Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-02-08	RDMA/rxe: Fix FIXME in rxe_udp_encap_recv()	Bob Pearson
	rxe_udp_encap_recv() drops the reference to rxe->ib_dev taken by rxe_get_dev_from_net() which should be held until each received skb is freed. This patch moves the calls to ib_device_put() to each place a received skb is freed. It also takes references to the ib_device for each cloned skb created to process received multicast packets. Fixes: 4c173f596b3f ("RDMA/rxe: Use ib_device_get_by_netdev() instead of open coding") Link: https://lore.kernel.org/r/20210128233318.2591-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08	clk: qcom: mmcc-msm8998: Set bimc_smmu_gdsc always on	AngeloGioacchino Del Regno
	This GDSC enables (or cuts!) power to the Multimedia Subsystem IOMMU (mmss smmu), which has bootloader pre-set secure contexts. In the event of a complete power loss, the secure contexts will be reset and the hypervisor will crash the SoC. To prevent this, and get a working multimedia subsystem, set this GDSC as always on. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org> Link: https://lore.kernel.org/r/20210114221059.483390-10-angelogioacchino.delregno@somainline.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-02-08	clk: qcom: mmcc-msm8998: Add hardware clockgating registers to some clks	AngeloGioacchino Del Regno
	Hardware clock gating is supported on some of the clocks declared in there: ignoring that it does exist may lead to unstabilities on some firmwares. Add the HWCG registers where applicable to stop potential crashes. This was verified on a smartphone shipped with a recent MSM8998 firmware, which will experience random crashes without this change. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org> Link: https://lore.kernel.org/r/20210114221059.483390-9-angelogioacchino.delregno@somainline.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-02-08	clk: qcom: gcc-msm8998: Fix Alpha PLL type for all GPLLs	AngeloGioacchino Del Regno
	All of the GPLLs in the MSM8998 Global Clock Controller are Fabia PLLs and not generic alphas: this was producing bad effects over the entire clock tree of MSM8998, where any GPLL child clock was declaring a false clock rate, due to their parent also showing the same. The issue resides in the calculation of the clock rate for the specific Alpha PLL type, where Fabia has a different register layout; switching the MSM8998 GPLLs to the correct Alpha Fabia PLL type fixes the rate (calculation) reading. While at it, also make these PLLs fixed since their rate is supposed to never be changed while the system runs, as this would surely crash the entire SoC. Now all the children of all the PLLs are also complying with their specified clock table and system stability is improved. Fixes: b5f5f525c547 ("clk: qcom: Add MSM8998 Global Clock Control (GCC) driver") Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org> Link: https://lore.kernel.org/r/20210114221059.483390-7-angelogioacchino.delregno@somainline.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-02-08	clk: qcom: gcc-msm8998: Mark gpu_cfg_ahb_clk as critical	AngeloGioacchino Del Regno
	The GPU IOMMU depends on this clock and the hypervisor will crash the SoC if this clock gets disabled because the secure contexts that have been set on this IOMMU by the bootloader will become unaccessible (or they get reset). Mark this clock as critical to avoid this issue when the Adreno GPU is enabled. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org> Link: https://lore.kernel.org/r/20210114221059.483390-6-angelogioacchino.delregno@somainline.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-02-08	clk: qcom: gcc-msm8998: Add missing hmss_gpll0_clk_src clock	AngeloGioacchino Del Regno
	To achieve CPR-Hardened functionality this clock must be on: add it in order to be able to get it managed by the CPR3 driver. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org> Link: https://lore.kernel.org/r/20210114221059.483390-5-angelogioacchino.delregno@somainline.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-02-08	clk: qcom: gcc-msm8998: Wire up gcc_mmss_gpll0 clock	AngeloGioacchino Del Regno
	This clock enables the GPLL0 output to the multimedia subsystem clock controller. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org> Link: https://lore.kernel.org/r/20210114221059.483390-3-angelogioacchino.delregno@somainline.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-02-08	clk: qcom: videocc: Add gdsc mmcx-reg supply hook	Bryan O'Donoghue
	This commit adds a regulator supply hook to mmcx-reg missing from - mvs0c_gdsc - mvs1c_gdsc - mvs0_gdsc - mvs1_gdsc Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Tested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20210204150120.1521959-5-bryan.odonoghue@linaro.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-02-08	clk: qcom: videocc: Add sm8250 VIDEO_CC_MVS0_CLK	Bryan O'Donoghue
	This patch adds the missing video_cc_mvs0_clk entry to videocc-sm8250 replicating in upstream the explicit entry for this clock in downstream. Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Tested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20210204150120.1521959-4-bryan.odonoghue@linaro.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-02-08	clk: qcom: videocc: Add sm8250 VIDEO_CC_MVS0_DIV_CLK_SRC	Bryan O'Donoghue
	This patch adds the missing video_cc_mvs0_div_clk_src entry to videocc-sm8250 replicating in upstream the explicit entry for this clock in downstream. Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Tested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20210204150120.1521959-3-bryan.odonoghue@linaro.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-02-08	clk: qcom: gcc: Add clock driver for SM8350	Vivek Aknurwar
	This adds Global Clock controller (GCC) driver for SM8350 SoC Signed-off-by: Vivek Aknurwar <viveka@codeaurora.org> Signed-off-by: Jeevan Shriram <jshriram@codeaurora.org> [vkoul: rebase and tidy up for upstream] Signed-off-by: Vinod Koul <vkoul@kernel.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20210127070811.152690-6-vkoul@kernel.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-02-08	clk: qcom: clk-alpha-pll: Add support for Lucid 5LPE PLL	Vivek Aknurwar
	Lucid 5LPE is a slightly different Lucid PLL with different offsets and porgramming sequence so add support for these Signed-off-by: Vivek Aknurwar <viveka@codeaurora.org> Signed-off-by: Jeevan Shriram <jshriram@codeaurora.org> [vkoul: rebase and tidy up for upstream] Signed-off-by: Vinod Koul <vkoul@kernel.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20210127070811.152690-4-vkoul@kernel.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-02-08	clk: qcom: clk-alpha-pll: modularize alpha_pll_trion_set_rate()	Vinod Koul
	Trion 5LPE set rate uses code similar to alpha_pll_trion_set_rate() but with different registers. Modularize these by moving out latch and latch ack bits so that we can reuse the function. Suggested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Vinod Koul <vkoul@kernel.org> Link: https://lore.kernel.org/r/20210127070811.152690-3-vkoul@kernel.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-02-08	clk: qcom: clk-alpha-pll: replace regval with val	Vinod Koul
	Driver uses regval variable for holding register values, replace with a shorter one val Suggested-by: Stephen Boyd <sboyd@kernel.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Vinod Koul <vkoul@kernel.org> Link: https://lore.kernel.org/r/20210127070811.152690-2-vkoul@kernel.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-02-08	clk: qcom: gcc: Add global clock controller driver for SC8180x	Bjorn Andersson
	Add clocks, resets and some of the GDSC provided by the global clock controller found in the Qualcomm SC8180x platform. Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20210126043155.1847823-2-bjorn.andersson@linaro.org [sboyd@kernel.org: Drop F macro as it's already defined] Signed-off-by: Stephen Boyd <sboyd@kernel.org>