summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h
AgeCommit message (Collapse)Author
2019-08-29net: hns3: implement .process_hw_error for hns3 clientWeihang Li
When hardware or IMP get specified error it may need the client to take some special operations. This patch implements the hns3 client's process_hw_errorx. Signed-off-by: Weihang Li <liweihang@hisilicon.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Reviewed-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-26net: hns3: add check to number of buffer descriptorsWeihang Li
This patch adds check to number of bds before we allocate memory for them. If we get an invalid bd num in some cases, it will cause a memory overflow. Signed-off-by: Weihang Li <liweihang@hisilicon.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-14net: hns3: some changes of MSI-X bits in PPU(RCB)Weihang Li
This patch modifies print message of rx_q_search_miss from error to dfx to prevent misleading users, because this interrupt may occur if we receive packets during initialization of HNS3 driver. Otherwise, this patch masks 28th bit of PPU_MPF_ABNORMAL_SRC2 which is now meaningless. Signed-off-by: Weihang Li <liweihang@hisilicon.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-14net: hns3: process H/W errors occurred before HNS dev initializationShiju Jose
Presently the HNS driver enables the HNS H/W error interrupts after the dev initialization is completed. However some exceptions such as NCSI errors can occur when the network port driver is not loaded and those errors required reporting to the BMC. Therefore the firmware enabled all the HNS ras error interrupts before the driver is loaded. And in some cases, there will be some H/W errors remained unclear before reboot. Thus the HNS driver needs to process and recover those hw errors occurred before HNS driver is initialized. This patch adds processing of the HNS hw errors(RAS and MSI-X) which occurred before the driver initialization. For RAS, because they are enabled by firmware, so we can detect specific bits, then log and clear them. But for MSI-X which can not be enabled before open vector0 irq, we can't detect the specific error bits, so we just write 1 to all interrupt source registers to clear. Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Weihang Li <liweihang@hisilicon.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-09net: hns3: trigger VF reset if a VF has an over_8bd_nfe_errWeihang Li
We trigger PF reset when a RAS error of NIC named over_8bd_nfe_err occurred before. But it is possible that a VF causes that error, it's reasonable to trigger VF reset instead of PF reset in this case. This patch add detection of vf_id if a over_8bd_nfe_err occurs, if vf_id is 0, we trigger PF reset. Otherwise, we will trigger VF reset on the VF with error. Signed-off-by: Weihang Li <liweihang@hisilicon.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-09net: hns3: log detail error info of ROCEE ECC and AXI errorsXiaofei Tan
This patch logs detail error info of ROCEE ECC and AXI errors for debug purpose, and remove unnecessary reset for ROCEE overflow errors. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-03net: hns3: delay and separate enabling of NIC and ROCE HW errorsWeihang Li
All RAS and MSI-X should be enabled just in the final stage of HNS3 initialization. It means that they should be enabled in hclge_init_xxx_client_instance instead of hclge_ae_dev(). Especially MSI-X, if it is enabled before opening vector0 IRQ, there are some chances that a MSI-X error will cause failure on initialization of NIC client instane. So this patch delays enabling of HW errors. Otherwise, we also separate enabling of ROCE RAS from NIC, because it's not reasonable to enable ROCE RAS if we even don't have a ROCE driver. Signed-off-by: Weihang Li <liweihang@hisilicon.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-03net: hns3: add handling of two bits in MAC tunnel interruptsWeihang Li
LINK_UP and LINK_DOWN are two bits of MAC tunnel interrupts, but previous HNS3 driver didn't handle them. If they were enabled, value of these two bits will change during link down and link up, which will cause HNS3 driver keep receiving IRQ but can't handle them. This patch adds handling of these two bits of interrupts, we will record and clear them as what we do to other MAC tunnel interrupts. Signed-off-by: Weihang Li <liweihang@hisilicon.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-19net: hns3: Add handling of MAC tunnel interruptionWeihang Li
MAC tnl interruptions are different from other type of RAS and MSI-X errors, because some bits, such as OVF/LR/RF will occur during link up and down. The drivers should clear status of all MAC tnl interruption bits but shouldn't print any message that would mislead the users. In case that link down and re-up in a short time because of some reasons, we record when they occurred, and users can query them by debugfs. Signed-off-by: Weihang Li <liweihang@hisilicon.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-14net: hns3: set dividual reset level for all RAS and MSI-X errorsWeihang Li
According to hardware description, reset level that should be triggered are not consistent in a module. For example, in SSU common errors, the first two bits has no need to do reset, but the other bits need global reset. This patch sets separate reset level for all RAS and MSI-X interrupts by adding a reset_lvel field in struct hclge_hw_error, and fixes some incorrect reset level. Signed-off-by: Weihang Li <liweihang@hisilicon.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-21net: hns3: enable 8~11th bit of mac common msi-x errorWeihang Li
These bits are enabled now and have been test. Signed-off-by: Weihang Li <liweihang@hisilicon.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-21net: hns3: some bugfix of ppu(rcb) ras errorsWeihang Li
The 3rd and 4th of PPU(RCB) PF Abnormal is RAS errors instead of MSI-X like other bits. This patch adds process of handling and logging this two bits. Otherwise, this patch modifies print message of 28th and 29th bit of PPU MPF Abnormal errors, which keep same with other errors now. Fixes: f69b10b317f9 ("net: hns3: handle hw errors of PPU(RCB)") Signed-off-by: Weihang Li <liweihang@hisilicon.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: add handling of RDMA RAS errorsShiju Jose
This patch handles the RDMA RAS errors. 1. Enable RAS interrupt, print error detail info and clear error status. 2. Do CORE reset to recovery when these non-fatal errors happened. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: handle hw errors of SSUShiju Jose
This patch enables and handles hw errors of the Storage Switch Unit(SSU). Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: handle hw errors of PPU(RCB)Shiju Jose
This patch enables and handles hw RAS and MSIx errors of PPU(RCB). Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: add handling of hw errors of MACShiju Jose
This patch adds enable and handling of hw errors of the MAC block. Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: add handling of hw errors reported through MSIXSalil Mehta
This patch adds handling for HNS3 hardware errors(non-standard) which are reported through MSIX interrupts and not through PCIe AER channel. These MSIX reported hardware errors are handled using common misc. interrupt handler. Hardware error related registers cannot be cleared in context to the interrupt received as they require *heavy* access to hardware using IMP(Integrated Mangement Processor) commands. Hence, we defer the clearing of such error events till later time. Since, we have defered exact identification of errors we will have to defer the level of receovery/reset which might be required. Hence, a new reset type UNKNOWN reset has been introduced which effectively defers the assertion of the reset till we get hold of kind of errors at later time. Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: add handling of hw ras errors using new set of commandsShiju Jose
1. This patch adds handling of hw ras errors using new set of common commands. 2. Updated the error message tables to match the register's name and error status returned by the commands. Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: rename process_hw_error functionShiju Jose
This patch renames process_hw_error function to handle_hw_ras_error function to match the purpose of the function. This is because hw errors reported through ras and msix interrupts will be handled separately. Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: re-enable error interrupts on hw resetShiju Jose
This patch adds calling hclge_hw_error_set_state function to re-enable the error interrupts those will be disabled on the hw reset. Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: rename enable error interrupt functionsShiju Jose
This patch - renames the enable error interrupt functions. The reason is that these functions are used for both enable and disable error interrupts. - removes redundant logs from the enable error interrupt functions. Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: remove existing process error functions and reorder hw_blk tableShiju Jose
1.The command interface for queryng and clearing hw errors is changed, which requires the new process error functions to be added. This patch removes all the current process error functions and associated definitions. The new functions to handle ras errors would be added in this patch set. 2. Fixed order issue of the hw_blk table. Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-22net: hns3: Add enable and process hw errors of TM schedulerShiju Jose
This patch enables and process hw errors of TM scheduler and QCN(Quantized Congestion Control). Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-22net: hns3: Add enable and process hw errors from PPPShiju Jose
This patch enables and process hw errors from the PPP(Programmable Packet Process) block. Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-22net: hns3: Add enable and process hw errors from IGU, EGU and NCSIShiju Jose
This patch adds enable and processing of hw errors from IGU(Ingress Unit), EGU(Egress Unit) and NCSI(Network Controller Sideband Interface). Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-22net: hns3: Add enable and process common ecc errorsShiju Jose
This patch adds enable and processing of ecc errors from common HNS blocks, CMDQ(Command Queue), IMP(Integrated Management Processor) and TQP(Task Queue Pair). Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-22net: hns3: Add support to enable and disable hw errorsShiju Jose
This patch adds functions to enable and disable hw errors. Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-22net: hns3: Add PCIe AER callback error_detectedShiju Jose
Set of hw errors occurred in the HNS3 are reported to the hns3 driver through PCIe AER and RAS.The error info will be processed and appropriately recovered. This patch adds error_detected callback and error processing. Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>