From 4fe204977096e900cb91a3298b05c794ac24f540 Mon Sep 17 00:00:00 2001 From: Niklas Schnelle Date: Wed, 7 Jul 2021 10:42:43 +0200 Subject: s390/pci: refresh function handle in iomap The function handle of a PCI function is updated when disabling or enabling it as well as when the function's availability changes or it enters the error state. Until now this only occurred either while there is no struct pci_dev associated with the function yet or the function became unavailable. This meant that leaving a stale function handle in the iomap either didn't happen because there was no iomap yet or it lead to errors on PCI access but so would the correct disabled function handle. In the future a CLP Set PCI Function Disable/Enable cycle during PCI device recovery may be done while the device is bound to a driver. In this case we must update the iomap associated with the now-stale function handle to ensure that the resulting zPCI instruction references an accurate function handle. Since the function handle is accessed by the PCI accessor helpers without locking use READ_ONCE()/WRITE_ONCE() to mark this access and prevent compiler optimizations that would move the load/store. With that infrastructure in place let's also properly update the function handle in the existing cases. This makes sure that in the future debugging of a zPCI function access through the handle will show an up to date handle reducing the chance of confusion. Also it makes sure we have one single place where a zPCI function handle is updated after initialization. Reviewed-by: Pierre Morel Reviewed-by: Matthew Rosato Signed-off-by: Niklas Schnelle Signed-off-by: Vasily Gorbik --- arch/s390/include/asm/pci.h | 1 + 1 file changed, 1 insertion(+) (limited to 'arch/s390/include/asm') diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h index 6b3c366af78e..35adc0cd0e6a 100644 --- a/arch/s390/include/asm/pci.h +++ b/arch/s390/include/asm/pci.h @@ -213,6 +213,7 @@ bool zpci_is_device_configured(struct zpci_dev *zdev); int zpci_register_ioat(struct zpci_dev *, u8, u64, u64, u64); int zpci_unregister_ioat(struct zpci_dev *, u8); void zpci_remove_reserved_devices(void); +void zpci_update_fh(struct zpci_dev *zdev, u32 fh); /* CLP */ int clp_setup_writeback_mio(void); -- cgit From da995d538d3a17610d89fea0f5813cf7921b3c2c Mon Sep 17 00:00:00 2001 From: Niklas Schnelle Date: Thu, 1 Jul 2021 15:49:11 +0200 Subject: s390/pci: implement reset_slot for hotplug slot This is done by adding a zpci_hot_reset_device() call which does a low level reset of the PCI function without changing its higher level function state. This way it can be used while the zPCI function is bound to a driver and with DMA tables being controlled either through the IOMMU or DMA APIs which is prohibited when using zpci_disable_device() as that drop existing DMA translations. As this reset, unlike a normal FLR, also calls zpci_clear_irq() we need to implement arch_restore_msi_irqs() and make sure we re-enable IRQs for the PCI function if they were previously disabled. Reviewed-by: Pierre Morel Reviewed-by: Matthew Rosato Signed-off-by: Niklas Schnelle Signed-off-by: Vasily Gorbik --- arch/s390/include/asm/pci.h | 1 + 1 file changed, 1 insertion(+) (limited to 'arch/s390/include/asm') diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h index 35adc0cd0e6a..a47068a73969 100644 --- a/arch/s390/include/asm/pci.h +++ b/arch/s390/include/asm/pci.h @@ -210,6 +210,7 @@ int zpci_deconfigure_device(struct zpci_dev *zdev); void zpci_device_reserved(struct zpci_dev *zdev); bool zpci_is_device_configured(struct zpci_dev *zdev); +int zpci_hot_reset_device(struct zpci_dev *zdev); int zpci_register_ioat(struct zpci_dev *, u8, u64, u64, u64); int zpci_unregister_ioat(struct zpci_dev *, u8); void zpci_remove_reserved_devices(void); -- cgit From 4cdf2f4e24ff0d345fc36ef6d6aec059333a261e Mon Sep 17 00:00:00 2001 From: Niklas Schnelle Date: Wed, 7 Jul 2021 11:00:01 +0200 Subject: s390/pci: implement minimal PCI error recovery When the platform detects an error on a PCI function or a service action has been performed it is put in the error state and an error event notification is provided to the OS. Currently we treat all error event notifications the same and simply set pdev->error_state = pci_channel_io_perm_failure requiring user intervention such as use of the recover attribute to get the device usable again. Despite requiring a manual step this also has the disadvantage that the device is completely torn down and recreated resulting in higher level devices such as a block or network device being recreated. In case of a block device this also means that it may need to be removed and added to a software raid even if that could otherwise survive with a temporary degradation. This is of course not ideal more so since an error notification with PEC 0x3A indicates that the platform already performed error recovery successfully or that the error state was caused by a service action that is now finished. At least in this case we can assume that the error state can be reset and the function made usable again. So as not to have the disadvantage of a full tear down and recreation we need to coordinate this recovery with the driver. Thankfully there is already a well defined recovery flow for this described in Documentation/PCI/pci-error-recovery.rst. The implementation of this is somewhat straight forward and simplified by the fact that our recovery flow is defined per PCI function. As a reset we use the newly introduced zpci_hot_reset_device() which also takes the PCI function out of the error state. Reviewed-by: Pierre Morel Acked-by: Matthew Rosato Signed-off-by: Niklas Schnelle Signed-off-by: Vasily Gorbik --- arch/s390/include/asm/pci.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) (limited to 'arch/s390/include/asm') diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h index a47068a73969..90824be5ce9a 100644 --- a/arch/s390/include/asm/pci.h +++ b/arch/s390/include/asm/pci.h @@ -296,8 +296,10 @@ void zpci_debug_exit(void); void zpci_debug_init_device(struct zpci_dev *, const char *); void zpci_debug_exit_device(struct zpci_dev *); -/* Error reporting */ +/* Error handling */ int zpci_report_error(struct pci_dev *, struct zpci_report_error_header *); +int zpci_clear_error_state(struct zpci_dev *zdev); +int zpci_reset_load_store_blocked(struct zpci_dev *zdev); #ifdef CONFIG_NUMA -- cgit