linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2018-03-31	Merge branch 'net_rwsem-fixes'	David S. Miller
	Kirill Tkhai says: ==================== net_rwsem fixes there is wext_netdev_notifier_call()->wireless_nlevent_flush() netdevice notifier, which takes net_rwsem, so we can't take net_rwsem in {,un}register_netdevice_notifier(). Since {,un}register_netdevice_notifier() is executed under pernet_ops_rwsem, net_namespace_list can't change, while we holding it, so there is no need net_rwsem in these functions [1/2]. The same is in [2/2]. We make callers of __rtnl_link_unregister() take pernet_ops_rwsem, and close the race with setup_net() and cleanup_net(), so __rtnl_link_unregister() does not need it. This also fixes the problem of that __rtnl_link_unregister() does not see initializing and exiting nets. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	net: Do not take net_rwsem in __rtnl_link_unregister()	Kirill Tkhai
	This function calls call_netdevice_notifier(), which also may take net_rwsem. So, we can't use net_rwsem here. This patch makes callers of this functions take pernet_ops_rwsem, like register_netdevice_notifier() does. This will protect the modifications of net_namespace_list, and allows notifiers to take it (they won't have to care about context). Since __rtnl_link_unregister() is used on module load and unload (which are not frequent operations), this looks for me better, than make all call_netdevice_notifier() always executing in "protected net_namespace_list" context. Also, this fixes the problem we had a deal in 328fbe747ad4 "Close race between {un, }register_netdevice_notifier and ...", and guarantees __rtnl_link_unregister() does not skip exitting net. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	net: Remove net_rwsem from {, un}register_netdevice_notifier()	Kirill Tkhai
	These functions take net_rwsem, while wireless_nlevent_flush() also takes it. But down_read() can't be taken recursive, because of rw_semaphore design, which prevents it to be occupied by only readers forever. Since we take pernet_ops_rwsem in {,un}register_netdevice_notifier(), net list can't change, so these down_read()/up_read() can be removed. Fixes: f0b07bb151b0 "net: Introduce net_rwsem to protect net_namespace_list" Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	net: hns3: remove unnecessary pci_set_drvdata() and devm_kfree()	Wei Yongjun
	There is no need for explicit calls of devm_kfree(), as the allocated memory will be freed during driver's detach. The driver core clears the driver data to NULL after device_release. Thus, it is not needed to manually clear the device driver data to NULL. So remove the unnecessary pci_set_drvdata() and devm_kfree(). Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	netdevsim: Change nsim_devlink_setup to return error to caller	David Ahern
	Change nsim_devlink_setup to return any error back to the caller and update nsim_init to handle it. Requested-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	Merge branch 'tipc-slim-down-name-table'	David S. Miller
	Jon Maloy says: ==================== tipc: slim down name table We clean up and improve the name binding table: - Replace the memory consuming 'sub_sequence/service range' array with an RB tree. - Introduce support for overlapping service sequences/ranges v2: #1: Fixed a missing initialization reported by David Miller #4: Obsoleted and replaced a few more macros to get a consistent terminology in the API. #5: Added new commit to fix a potential string overflow bug (it is still only in net-next) reported by Arnd Bergmann ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	tipc: avoid possible string overflow	Jon Maloy
	gcc points out that the combined length of the fixed-length inputs to l->name is larger than the destination buffer size: net/tipc/link.c: In function 'tipc_link_create': net/tipc/link.c:465:26: error: '%s' directive writing up to 32 bytes into a region of size between 26 and 58 [-Werror=format-overflow=] sprintf(l->name, "%s:%s-%s:unknown", self_str, if_name, peer_str); net/tipc/link.c:465:2: note: 'sprintf' output 11 or more bytes (assuming 75) into a destination of size 60 sprintf(l->name, "%s:%s-%s:unknown", self_str, if_name, peer_str); A detailed analysis reveals that the theoretical maximum length of a link name is: max self_str + 1 + max if_name + 1 + max peer_str + 1 + max if_name = 16 + 1 + 15 + 1 + 16 + 1 + 15 = 65 Since we also need space for a trailing zero we now set MAX_LINK_NAME to 68. Just to be on the safe side we also replace the sprintf() call with snprintf(). Fixes: 25b0b9c4e835 ("tipc: handle collisions of 32-bit node address hash values") Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	tipc: tipc: rename address types in user api	Jon Maloy
	The three address type structs in the user API have names that in reality reflect the specific, non-Linux environment where they were originally created. We now give them more intuitive names, in accordance with how TIPC is described in the current documentation. struct tipc_portid -> struct tipc_socket_addr struct tipc_name -> struct tipc_service_addr struct tipc_name_seq -> struct tipc_service_range To avoid confusion, we also update some commmets and macro names to match the new terminology. For compatibility, we add macros that map all old names to the new ones. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	tipc: permit overlapping service ranges in name table	Jon Maloy
	With the new RB tree structure for service ranges it becomes possible to solve an old problem; - we can now allow overlapping service ranges in the table. When inserting a new service range to the tree, we use 'lower' as primary key, and when necessary 'upper' as secondary key. Since there may now be multiple service ranges matching an indicated 'lower' value, we must also add the 'upper' value to the functions used for removing publications, so that the correct, corresponding range item can be found. These changes guarantee that a well-formed publication/withdrawal item from a peer node never will be rejected, and make it possible to eliminate the problematic backlog functionality we currently have for handling such cases. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	tipc: refactor name table translate function	Jon Maloy
	The function tipc_nametbl_translate() function is ugly and hard to follow. This can be improved somewhat by introducing a stack variable for holding the publication list to be used and re-ordering the if- clauses for selection of algorithm. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	tipc: replace name table service range array with rb tree	Jon Maloy
	The current design of the binding table has an unnecessary memory consuming and complex data structure. It aggregates the service range items into an array, which is expanded by a factor two every time it becomes too small to hold a new item. Furthermore, the arrays never shrink when the number of ranges diminishes. We now replace this array with an RB tree that is holding the range items as tree nodes, each range directly holding a list of bindings. This, along with a few name changes, improves both readability and volume of the code, as well as reducing memory consumption and hopefully improving cache hit rate. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	Merge branch 'bridge-mtu'	David S. Miller
	Nikolay Aleksandrov says: ==================== net: bridge: MTU handling changes As previously discussed the recent changes break some setups and could lead to packet drops. Thus the first patch reverts the behaviour for the bridge to follow the minimum MTU but also keeps the ability to set the MTU to the maximum (out of all ports) if vlan filtering is enabled. Patch 02 is the bigger change in behaviour - we've always had trouble when configuring bridges and their MTU which is auto tuning on port events (add/del/changemtu), which means config software needs to chase it and fix it after each such event, after patch 02 we allow the user to configure any MTU (ETH_MIN/MAX limited) but once that is done the bridge stops auto tuning and relies on the user to keep the MTU correct. This should be compatible with cases that don't touch the MTU (or set it to the same value), while allowing to configure the MTU and not worry about it changing afterwards. The patches are intentionally split like this, so that if they get accepted and there are any complaints patch 02 can be reverted. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	net: bridge: disable bridge MTU auto tuning if it was set manually	Nikolay Aleksandrov
	As Roopa noted today the biggest source of problems when configuring bridge and ports is that the bridge MTU keeps changing automatically on port events (add/del/changemtu). That leads to inconsistent behaviour and network config software needs to chase the MTU and fix it on each such event. Let's improve on that situation and allow for the user to set any MTU within ETH_MIN/MAX limits, but once manually configured it is the user's responsibility to keep it correct afterwards. In case the MTU isn't manually set - the behaviour reverts to the previous and the bridge follows the minimum MTU. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	net: bridge: set min MTU on port events and allow user to set max	Nikolay Aleksandrov
	Recently the bridge was changed to automatically set maximum MTU on port events (add/del/changemtu) when vlan filtering is enabled, but that actually changes behaviour in a way which breaks some setups and can lead to packet drops. In order to still allow that maximum to be set while being compatible, we add the ability for the user to tune the bridge MTU up to the maximum when vlan filtering is enabled, but that has to be done explicitly and all port events (add/del/changemtu) lead to resetting that MTU to the minimum as before. Suggested-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	Merge branch 'thunderx-DMAC-filtering'	David S. Miller
	Vadim Lomovtsev says: ==================== net: thunderx: implement DMAC filtering support By default CN88XX BGX accepts all incoming multicast and broadcast packets and filtering is disabled. The nic driver doesn't provide an ability to change such behaviour. This series is to implement DMAC filtering management for CN88XX nic driver allowing user to enable/disable filtering and configure specific MAC addresses to filter traffic. Changes from v1: build issues: - update code in order to address compiler warnings; checkpatch.pl reported issues: - update code in order to fit 80 symbols length; - update commit descriptions in order to fit 80 symbols length; ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	net: thunderx: add ndo_set_rx_mode callback implementation for VF	Vadim Lomovtsev
	The ndo_set_rx_mode() is called from atomic context which causes messages response timeouts while VF to PF communication via MSIx. To get rid of that we're copy passed mc list, parse flags and queue handling of kernel request to ordered workqueue. Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	net: thunderx: add workqueue control structures for handle ndo_set_rx_mode ↵	Vadim Lomovtsev
	request The kernel calls ndo_set_rx_mode() callback from atomic context which causes messaging timeouts between VF and PF (as they’re implemented via MSIx). So in order to handle ndo_set_rx_mode() we need to get rid of it. This commit implements necessary workqueue related structures to let VF queue kernel request processing in non-atomic context later. Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	net: thunderx: add XCAST messages handlers for PF	Vadim Lomovtsev
	This commit is to add message handling for ndo_set_rx_mode() callback at PF side. Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	net: thunderx: add new messages for handle ndo_set_rx_mode callback	Vadim Lomovtsev
	The kernel calls ndo_set_rx_mode() callback supplying it will all necessary info, such as device state flags, multicast mac addresses list and so on. Since we have only 128 bits to communicate with PF we need to initiate several requests to PF with small/short operation each based on input data. So this commit implements following PF messages codes along with new data structures for them: NIC_MBOX_MSG_RESET_XCAST to flush all filters configured for this particular network interface (VF) NIC_MBOX_MSG_ADD_MCAST to add new MAC address to DMAC filter registers for this particular network interface (VF) NIC_MBOX_MSG_SET_XCAST to apply filtering configuration to filter control register Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	net: thunderx: add multicast filter management support	Vadim Lomovtsev
	The ThunderX NIC could be partitioned to up to 128 VFs and thus represented to system. Each VF is mapped to pair BGX:LMAC, and each of VF is configured by kernel individually. Eventually the bunch of VFs could be mapped onto same pair BGX:LMAC and thus could cause several multicast filtering configuration requests to LMAC with the same MAC addresses. This commit is to add ThunderX NIC BGX filtering manipulation routines. Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	net: thunderx: add MAC address filter tracking for LMAC	Vadim Lomovtsev
	The ThunderX NIC has two Ethernet Interfaces (BGX) each of them could has up to four Logical MACs configured. Each of BGX has 32 filters to be configured for filtering ingress packets. The number of filters available to particular LMAC is from 8 (if we have four LMACs configured per BGX) up to 32 (in case of only one LMAC is configured per BGX). At the same time the NIC could present up to 128 VFs to OS as network interfaces, each of them kernel will configure with set of MAC addresses for filtering. So to prevent dupes in BGX filter registers from different network interfaces it is required to cache and track all filter configuration requests prior to applying them onto BGX filter registers. This commit is to update LMAC structures with control fields to allocate/releasing filters tracking list along with implementing dmac array allocate/release per LMAC. Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	net: thunderx: move filter register related macro into proper place	Vadim Lomovtsev
	The ThunderX NIC has set of registers which allows to configure filter policy for ingress packets. There are three possible regimes of filtering multicasts, broadcasts and unicasts: accept all, reject all and accept filter allowed only. Current implementation has enum with all of them and two generic macro for enabling filtering et all (CAM_ACCEPT) and enabling/disabling broadcast packets, which also should be corrected in order to represent register bits properly. All these values are private for driver and there is no need to ‘publish’ them via header file. This commit is to move filtering register manipulation values from header file into source with explicit assignment of exact register values to them to be used while register configuring. Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	Merge branch 'meson8b'	David S. Miller
	Martin Blumenstingl says: ==================== Meson8m2 support for dwmac-meson8b The Meson8m2 SoC is an updated version of the Meson8 SoC. Some of the peripherals are shared with Meson8b (for example the watchdog registers and the internal temperature sensor calibration procedure). Meson8m2 also seems to include the same Gigabit MAC register layout as Meson8b. The registers in the Amlogic dwmac "glue" seem identical between Meson8b and Meson8m2. Manual testing seems to confirm this. To be extra-safe a new compatible string is added because there's no (public) documentation on the Meson8m2 SoC. This will allow us to implement any SoC-specific variations later on (if needed). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	net: stmmac: dwmac-meson8b: Add support for the Meson8m2 SoC	Martin Blumenstingl
	The Meson8m2 SoC uses a similar (potentially even identical) register layout as the Meson8b and GXBB SoCs for the dwmac glue. Add a new compatible string and update the module description to indicate support for these SoCs. Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	dt-bindings: net: meson-dwmac: add support for the Meson8m2 SoC	Martin Blumenstingl
	The Meson8m2 SoC uses a similar (potentially even identical) register layout for the dwmac glue as Meson8b and GXBB. Unfortunately there is no documentation available. Testing shows that both, RMII and RGMII PHYs are working if they are configured as on Meson8b. Add a new compatible string to the documentation so differences (if there are any) between Meson8m2 and the other SoCs can be taken care of within the driver. Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31	PCI/DPC: Rename from pcie-dpc.c to dpc.c	Bjorn Helgaas
	Rename pcie-dpc.c to dpc.c. The path "drivers/pci/pcie/pcie-dpc.c" has more occurrences of "pci" than necessary. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-03-31	PCI/IOV: Add missing prototypes for powerpc pcibios interfaces	Mathieu Malaterre
	Add missing prototypes for: resource_size_t pcibios_default_alignment(void); int pcibios_sriov_enable(struct pci_dev pdev, u16 num_vfs); int pcibios_sriov_disable(struct pci_dev pdev); This fixes the following warnings treated as errors when using W=1: arch/powerpc/kernel/pci-common.c:236:17: error: no previous prototype for ‘pcibios_default_alignment’ [-Werror=missing-prototypes] arch/powerpc/kernel/pci-common.c:253:5: error: no previous prototype for ‘pcibios_sriov_enable’ [-Werror=missing-prototypes] arch/powerpc/kernel/pci-common.c:261:5: error: no previous prototype for ‘pcibios_sriov_disable’ [-Werror=missing-prototypes] Also, commit 978d2d683123 ("PCI: Add pcibios_iov_resource_alignment() interface") added a new function but the prototype was located in the main header instead of the CONFIG_PCI_IOV specific section. Move this function next to the newly added ones. Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
2018-03-31	PCI/IOV: Use VF0 cached config registers for other VFs	KarimAllah Ahmed
	Cache some config data from VF0 and use it for all other VFs instead of reading it from the config space of each VF. We assume these items are the same across all associated VFs: Revision ID Class Code Subsystem Vendor ID Subsystem ID This is an optimization when enabling SR-IOV on a device with many VFs. Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> [bhelgaas: changelog, simplify comments, remove unused "device", test CONFIG_PCI_IOV instead of CONFIG_PCI_ATS, rename functions] Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
2018-03-31	Merge branch 'perf-urgent-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Two fixlets" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/hwbp: Simplify the perf-hwbp code, fix documentation perf/x86/intel: Fix linear IP of PEBS real_ip on Haswell and later CPUs
2018-03-31	Merge branch 'x86-urgent-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Two UV platform fixes, and a kbuild fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/platform/UV: Fix critical UV MMR address error x86/platform/uv/BAU: Add APIC idt entry x86/purgatory: Avoid creating stray .<pid>.d files, remove -MD from KBUILD_CFLAGS
2018-03-31	Merge branch 'x86-pti-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 PTI fixes from Ingo Molnar: "Two fixes: a relatively simple objtool fix that makes Clang built kernels work with ORC debug info, plus an alternatives macro fix" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/alternatives: Fixup alternative_call_2 objtool: Add Clang support
2018-04-01	powerpc/64s: Remove POWER4 support	Nicholas Piggin
	POWER4 has been broken since at least the change 49d09bf2a6 ("powerpc/64s: Optimise MSR handling in exception handling"), which requires mtmsrd L=1 support. This was introduced in ISA v2.01, and POWER4 supports ISA v2.00. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-01	powerpc: Remove unused CPU_FTR_ARCH_201	Nicholas Piggin
	The last usage was removed in c17b98cf6028 ("KVM: PPC: Book3S HV: Remove code for PPC970 processors") (Dec 2014). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-01	powerpc/64s: Fix POWER9 DD2.2 and above in DT CPU features	Nicholas Piggin
	The CPU_FTR_POWER9_DD2_1 flag is intended to be set for DD2.1 and above (which is what the cputable setup does). Fix DT CPU features quirk setup to match. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Merge with upstream changes] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-01	powerpc/64s: Set assembler machine type to POWER4	Nicholas Piggin
	Rather than override the machine type in .S code (which can hide wrong or ambiguous code generation for the target), set the type to power4 for all assembly. This also means we need to be careful not to build power4-only code when we're not building for Book3S, such as the "power7" versions of copyuser/page/memcpy. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fix Book3E build, don't build the "power7" variants for non-Book3S] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-01	powerpc/64s: Explicitly add vector features to CPU_FTRS_POSSIBLE	Nicholas Piggin
	ALTIVEC and VSX features are not added by to default to the POWERx CPU feature sets because they are intended to be enabled by firmware. Currently they end up in CPU_FTRS_POSSIBLE due to their inclusion in other the set for other CPUs, eg. PPC970. But they should be added individually to the CPU_FTRS_POSSIBLE set, because if we reduce the set of CPUs that are built-for they may disappear from the possible mask. It already contains CPU_FTR_VSX, so add ALTIVEC. The _COMP features should be used because they won't be present if compiled out. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Add detail to change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-01	powerpc/64s: Add all POWER9 features to CPU_FTRS_ALWAYS	Nicholas Piggin
	It's not a bug to have features missing in CPU_FTR_ALWAYS, but it is a missed opportunity for optimisation. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-01	powerpc/boot: Remove duplicate typedefs from libfdt_env.h	Mark Greer
	When building a uImage or zImage using ppc6xx_defconfig and some other defconfigs, the following error occurs with GCC 4.5.1: /arch/powerpc/boot/libfdt_env.h:10:13: error: redefinition of typedef 'uint32_t' /arch/powerpc/boot/types.h:21:13: note: previous declaration of 'uint32_t' was here /arch/powerpc/boot/libfdt_env.h:11:13: error: redefinition of typedef 'uint64_t' /arch/powerpc/boot/types.h:22:13: note: previous declaration of 'uint64_t' was here The problem is that commit 656ad58ef19e (powerpc/boot: Add OPAL console to epapr wrappers) adds typedefs for uint32_t and uint64_t to type.h but doesn't remove the pre-existing (and now duplicate) typedefs from libfdt_env.h. Fix the error by removing the duplicate typedefs from libfdt_env.h Signed-off-by: Mark Greer <mgreer@animalcreek.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-01	powerpc/64s/idle: avoid sync for KVM state when waking from idle	Nicholas Piggin
	When waking from a CPU idle instruction (e.g., nap or stop), the sync for ordering the KVM secondary thread state can be avoided if there wakeup is coming from a kernel context rather than KVM context. This improves performance for ping-pong benchmark with the stop0 idle state by 0.46% for 2 threads in the same core, and 1.02% for different cores. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-01	powerpc/64s/idle: POWER9 implement a separate idle stop function for hotplug	Nicholas Piggin
	Implement a new function to invoke stop, power9_offline_stop, which is like power9_idle_stop but used by the cpu hotplug code. Move KVM secondary state manipulation code to the offline case. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-01	powerpc/64s: sreset panic if there is no debugger or crash dump handlers	Nicholas Piggin
	system_reset_exception does most of its own crash handling now, invoking the debugger or crash dumps if they are registered. If not, then it goes through to die() to print stack traces, and then is supposed to panic (according to comments). However after die() prints oopses, it does its own handling which doesn't allow system_reset_exception to panic (e.g., it may just kill the current process). This patch causes sreset exceptions to return from die after it prints messages but before acting. This also stops die from invoking the debugger on 0x100 crashes. system_reset_exception similarly calls the debugger. It had been thought this was harmless (because if the debugger was disabled, neither call would fire, and if it was enabled the first call would return). However in some cases like xmon 'X' command, the debugger returns 0, which currently causes it to be entered again (first in system_reset_exception, then in die), which is confusing. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-01	powerpc/64s: return more carefully from sreset NMI	Nicholas Piggin
	System Reset, being an NMI, must return more carefully than other interrupts. It has traditionally returned via the nromal return from exception path, but that has a number of problems. - r13 does not get restored if returning to kernel. This is for interrupts which may cause a context switch, which sreset will never do. Interrupting OPAL (which uses a different r13) is one place where this causes breakage. - It may cause several other problems returning to kernel with preempt or TIF_EMULATE_STACK_STORE if it hits at the wrong time. It's safer just to have a simple restore and return, like machine check which is the other NMI. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-01	powerpc/eeh: Fix race with driver un/bind	Michael Neuling
	The current EEH callbacks can race with a driver unbind. This can result in a backtraces like this: EEH: Frozen PHB#0-PE#1fc detected EEH: PE location: S000009, PHB location: N/A CPU: 2 PID: 2312 Comm: kworker/u258:3 Not tainted 4.15.6-openpower1 #2 Workqueue: nvme-wq nvme_reset_work [nvme] Call Trace: dump_stack+0x9c/0xd0 (unreliable) eeh_dev_check_failure+0x420/0x470 eeh_check_failure+0xa0/0xa4 nvme_reset_work+0x138/0x1414 [nvme] process_one_work+0x1ec/0x328 worker_thread+0x2e4/0x3a8 kthread+0x14c/0x154 ret_from_kernel_thread+0x5c/0xc8 nvme nvme1: Removing after probe failure status: -19 <snip> cpu 0x23: Vector: 300 (Data Access) at [c000000ff50f3800] pc: c0080000089a0eb0: nvme_error_detected+0x4c/0x90 [nvme] lr: c000000000026564: eeh_report_error+0xe0/0x110 sp: c000000ff50f3a80 msr: 9000000000009033 dar: 400 dsisr: 40000000 current = 0xc000000ff507c000 paca = 0xc00000000fdc9d80 softe: 0 irq_happened: 0x01 pid = 782, comm = eehd Linux version 4.15.6-openpower1 (smc@smc-desktop) (gcc version 6.4.0 (Buildroot 2017.11.2-00008-g4b6188e)) #2 SM P Tue Feb 27 12:33:27 PST 2018 enter ? for help eeh_report_error+0xe0/0x110 eeh_pe_dev_traverse+0xc0/0xdc eeh_handle_normal_event+0x184/0x4c4 eeh_handle_event+0x30/0x288 eeh_event_handler+0x124/0x170 kthread+0x14c/0x154 ret_from_kernel_thread+0x5c/0xc8 The first part is an EEH (on boot), the second half is the resulting crash. nvme probe starts the nvme_reset_work() worker thread. This worker thread starts touching the device which see a device error (EEH) and hence queues up an event in the powerpc EEH worker thread. nvme_reset_work() then continues and runs nvme_remove_dead_ctrl_work() which results in unbinding the driver from the device and hence releases all resources. At the same time, the EEH worker thread starts doing the EEH .error_detected() driver callback, which no longer works since the resources have been freed. This fixes the problem in the same way the generic PCIe AER code (in drivers/pci/pcie/aer/aerdrv_core.c) does. It makes the EEH code hold the device_lock() while performing the driver EEH callbacks and associated code. This ensures either the callbacks are no longer register, or if they are registered the driver will not be removed from underneath us. This has been broken forever. The EEH call backs were first introduced in 2005 (in 77bd7415610) but it's not clear if a lock was needed back then. Fixes: 77bd74156101 ("[PATCH] powerpc: PCI Error Recovery: PPC64 core recovery routines") Cc: stable@vger.kernel.org # v2.6.16+ Signed-off-by: Michael Neuling <mikey@neuling.org> Reviewed-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-01	powerpc/kexec_file: Fix error code when trying to load kdump kernel	Thiago Jung Bauermann
	kexec_file_load() on powerpc doesn't support kdump kernels yet, so it returns -ENOTSUPP in that case. I've recently learned that this errno is internal to the kernel and isn't supposed to be exposed to userspace. Therefore, change to -EOPNOTSUPP which is defined in an uapi header. This does indeed make kexec-tools happier. Before the patch, on ppc64le: # ~bauermann/src/kexec-tools/build/sbin/kexec -s -p /boot/vmlinuz kexec_file_load failed: Unknown error 524 After the patch: # ~bauermann/src/kexec-tools/build/sbin/kexec -s -p /boot/vmlinuz kexec_file_load failed: Operation not supported Fixes: a0458284f062 ("powerpc: Add support code for kexec_file_load()") Cc: stable@vger.kernel.org # v4.10+ Reported-by: Dave Young <dyoung@redhat.com> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Reviewed-by: Simon Horman <horms@verge.net.au> Reviewed-by: Dave Young <dyoung@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-01	powerpc/mm/32: Remove the reserved memory hack	Jonathan Neuschäfer
	This hack, introduced in commit c5df7f775148 ("powerpc: allow ioremap within reserved memory regions") is now unnecessary. Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-01	powerpc/wii: Don't rely on the reserved memory hack	Jonathan Neuschäfer
	Because the two memory blocks (usually called MEM1 and MEM2) are not merged anymore, __request_region in kernel/resource.c will correctly allow reserving regions in the physical address space between MEM1 and MEM2, where many important peripherals are (GPIO, MMC, USB, ...). A previous change to __ioremap_caller in arch/powerpc/mm/pgtable_32.c ensures that multiple memblocks are properly considered in ioremap; this makes it unnecessary to set __allow_ioremap_reserved. Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-01	powerpc/mm/32: Use page_is_ram to check for RAM	Jonathan Neuschäfer
	On systems where there is MMIO space between different blocks of RAM in the physical address space, __ioremap_caller did not allow mapping these MMIO areas, because they were below the end RAM and thus considered RAM as well. Use the memblock-based page_is_ram function, which returns false for such MMIO holes. v2: Keep the check for p < virt_to_phys(high_memory). On 32-bit systems with high memory (memory above physical address 4GiB), the high memory is expected to be available though ioremap. The high_memory variable marks the end of low memory; comparing against it means that only ioremap requests for low RAM will be denied. Reported by Michael Ellerman. Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-01	powerpc/mm: Use memblock API for PPC32 page_is_ram	Jonathan Neuschäfer
	To support accurate checking for different blocks of memory on PPC32, use the same memblock-based approach that's already used on PPC64 also on PPC32. Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-01	powerpc/mm: Simplify page_is_ram by using memblock_is_memory	Jonathan Neuschäfer
	Instead of open-coding the search in page_is_ram, call memblock_is_memory. Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-01	powerpc/wii.dts: Add drive slot LED	Jonathan Neuschäfer
	The Wii has a blue LED in the disk drive slot, which is controlled via a GPIO line. Add this LED to wii.dts, and mark it as a panic-indicator. Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>