summaryrefslogtreecommitdiff
path: root/arch/powerpc/platforms/powernv
AgeCommit message (Collapse)Author
2014-08-14Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull more powerpc updates from Ben Herrenschmidt: "Here are some more powerpc bits for 3.17, essentially fixes. The biggest series, also aimed at -stable, is from Aneesh and is the result of weeks and weeks of debugging to find out why the heck or THP implementation was occasionally triggering multi-hit errors in our level 1 TLB. It ended up being a combination of issues including subtleties as to how we should invalidate those special 'MPSS' pages we use to allow the use of 16M pages inside 4K/64K "base page size" segments (you really have to love our MMU !) Another interesting one in the "OMG" category is the series from Michael adding memory barriers to spin_is_locked(). That's also the result of many days of debugging to figure out why the semaphore code would occasionally crash in ways that made no sense. It ended up being some creative lock stacking that was defeated by the fact that our locks allow a load inside the locked section to be re-ordered with the load of the lock value itself (I'm still of two mind about whether to kill that once and for all by putting a heavier barrier back into our lock implementation...). The fixes come with a long explanation in the cset comments, feel free to read it if you feel like having a headache today" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (25 commits) powerpc/thp: Add tracepoints to track hugepage invalidate powerpc/mm: Use read barrier when creating real_pte powerpc/thp: Use ACCESS_ONCE when loading pmdp powerpc/thp: Invalidate with vpn in loop powerpc/thp: Handle combo pages in invalidate powerpc/thp: Invalidate old 64K based hash page mapping before insert of 4k pte powerpc/thp: Don't recompute vsid and ssize in loop on invalidate powerpc/thp: Add write barrier after updating the valid bit powerpc: reorder per-cpu NUMA information's initialization powerpc/perf/hv-24x7: Use kmem_cache_free powerpc/pseries/hvcserver: Fix endian issue in hvcs_get_partner_info powerpc: Hard disable interrupts in xmon powerpc: remove duplicate definition of TEXASR_FS powerpc/pseries: Avoid deadlock on removing ddw powerpc/pseries: Failure on removing device node powerpc/boot: Use correct zlib types for comparison powerpc/powernv: Interface to register/unregister opal dump region printk: Add function to return log buffer address and size powerpc: Add POWER8 features to CPU_FTRS_POSSIBLE/ALWAYS powerpc/ppc476: Disable BTAC ...
2014-08-13powerpc/powernv: Interface to register/unregister opal dump regionVasant Hegde
PowerNV platform is capable of capturing host memory region when system crashes (because of host/firmware). We have new OPAL API to register/ unregister memory region to be captured when system crashes. This patch adds support for new API. Also during boot time we register kernel log buffer and unregister before doing kexec. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13powerpc/powernv: Fix IOMMU group lostGavin Shan
When we take full hotplug to recover from EEH errors, PCI buses could be involved. For the case, the child devices of involved PCI buses can't be attached to IOMMU group properly, which is caused by commit 3f28c5a ("powerpc/powernv: Reduce multi-hit of iommu_add_device()"). When adding the PCI devices of the newly created PCI buses to the system, the IOMMU group is expected to be added in (C). (A) fails to bind the IOMMU group because bus->is_added is false. (B) fails because the device doesn't have binding IOMMU table yet. bus->is_added is set to true at end of (C) and pdev->is_added is set to true at (D). pcibios_add_pci_devices() pci_scan_bridge() pci_scan_child_bus() pci_scan_slot() pci_scan_single_device() pci_scan_device() pci_device_add() pcibios_add_device() A: Ignore device_add() B: Ignore pcibios_fixup_bus() pcibios_setup_bus_devices() pcibios_setup_device() C: Hit pcibios_finish_adding_to_bus() pci_bus_add_devices() pci_bus_add_device() D: Add device If the parent PCI bus isn't involved in hotplug, the IOMMU group is expected to be bound in (B). (A) should fail as the sysfs entries aren't populated. The patch fixes the issue by reverting commit 3f28c5a and remove WARN_ON() in iommu_add_device() to allow calling the function even the specified device already has associated IOMMU group. Cc: <stable@vger.kernel.org> # 3.16+ Reported-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Wei Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-10Merge tag 'modules-next-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module updates from Rusty Russell: "This finally applies the stricter sysfs perms checking we pulled out before last merge window. A few stragglers are fixed (thanks linux-next!)" * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: arch/powerpc/platforms/powernv/opal-dump.c: fix world-writable sysfs files arch/powerpc/platforms/powernv/opal-elog.c: fix world-writable sysfs files drivers/video/fbdev/s3c2410fb.c: don't make debug world-writable. ARM: avoid ARM binutils leaking ELF local symbols scripts: modpost: Remove numeric suffix pattern matching scripts: modpost: fix compilation warning sysfs: disallow world-writable files. module: return bool from within_module*() module: add within_module() function modules: Fix build error in moduleloader.h
2014-08-07arch/powerpc/platforms/powernv/opal-dump.c: fix world-writable sysfs filesRusty Russell
If you don't have a store function, you're not writable anyway! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-08-07arch/powerpc/platforms/powernv/opal-elog.c: fix world-writable sysfs filesRusty Russell
If you don't have a store function, you're not writable anyway! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2014-08-05powerpc/powernv: Invoke opal call to handle hmi.Mahesh Salgaonkar
When we hit the HMI in Linux, invoke opal call to handle/recover from HMI errors in real mode and then in virtual mode during check_irq_replay() invoke opal_poll_events()/opal_do_notifier() to retrieve HMI event from OPAL and act accordingly. Now that we are ready to handle HMI interrupt directly in linux, remove the HMI interrupt registration with firmware. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/book3s: Add basic infrastructure to handle HMI in Linux.Mahesh Salgaonkar
Handle Hypervisor Maintenance Interrupt (HMI) in Linux. This patch implements basic infrastructure to handle HMI in Linux host. The design is to invoke opal handle hmi in real mode for recovery and set irq_pending when we hit HMI. During check_irq_replay pull opal hmi event and print hmi info on console. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/powernv: Handle compound PE in config accessorsGavin Shan
The PCI config accessors check for PE frozen state and clear it if EEH isn't functional. The patch handles compound PE in config accessors if PHB supports it. For consistency, all PEs will be put into frozen state if any one in compound group gets frozen by hardware. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/powernv: Handle compound PE for EEHGavin Shan
The patch handles compound PE for EEH backend. If one specific PE in compound group has been frozen, we enforces to freeze all PEs in the group. If we're enable DMA or MMIO for one PE in compound group, DMA or MMIO of all PEs in the group will be enabled. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/powernv: Handle compound PEGavin Shan
The patch introduces 3 PHB callbacks: compound PE state retrieval, force freezing and unfreezing compound PE. The PCI config accessors and PowerNV EEH backend can use them in subsequent patches. We don't export the capability of compound PE to EEH core, which helps avoiding more complexity to EEH core. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/powernv: Split ioda_eeh_get_state()Gavin Shan
Function ioda_eeh_get_state() is used to fetch EEH state for PHB or PE. We're going to support compound PE and the function becomes more complicated with that. The patch splits the function into two functions for PHB and PE cases separately to improve readability. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/powernv: Allow to freeze PEGavin Shan
The patch synchronizes header file with firmware to have new OPAL API opal_pci_eeh_freeze_set(), which is used to freeze the specified PE in order to support "compound" PE. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/powernv: Enable M64 aperatus for PHB3Guo Chao
This patch enables M64 aperatus for PHB3. We already had platform hook (ppc_md.pcibios_window_alignment) to affect the PCI resource assignment done in PCI core so that each PE's M32 resource was built on basis of M32 segment size. Similarly, we're using that for M64 assignment on basis of M64 segment size. * We're using last M64 BAR to cover M64 aperatus, and it's shared by all 256 PEs. * We don't support P7IOC yet. However, some function callbacks are added to (struct pnv_phb) so that we can reuse them on P7IOC in future. * PE, corresponding to PCI bus with large M64 BAR device attached, might span multiple M64 segments. We introduce "compound" PE to cover the case. The compound PE is a list of PEs and the master PE is used as before. The slave PEs are just for MMIO isolation. Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/eeh: Aux PE data for error logGavin Shan
The patch allows PE (struct eeh_pe) instance to have auxillary data, whose size is configurable on basis of platform. For PowerNV, the auxillary data will be used to cache PHB diag-data for that PE (frozen PE or fenced PHB). In turn, we can retrieve the diag-data at any later points. It's useful for the case of VFIO PCI devices where the error log should be cached, and then be retrieved by the guest at later point. Also, it can avoid PHB diag-data overwritting if another frozen PE reported and the previous diag-data isn't fetched by guest. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/eeh: Make diag-data not endian dependentGavin Shan
It's followup of commit ddf0322a ("powerpc/powernv: Fix endianness problems in EEH"). The patch helps to get non-endian-dependent diag-data. Cc: Guo Chao <yan@linux.vnet.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/eeh: Replace pr_warning() with pr_warn()Gavin Shan
pr_warn() is equal to pr_warning(), but the former is a bit more formal according to commit fc62f2f ("kernel.h: add pr_warn for symmetry to dev_warn, netdev_warn"). The patch replaces pr_warning() with pr_warn(). Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/eeh: Selectively enable IO for error logGavin Shan
According to the experiment I did, PCI config access is blocked on P7IOC frozen PE by hardware, but PHB3 doesn't do that. That means we always get 0xFF's while dumping PCI config space of the frozen PE on P7IOC. We don't have the problem on PHB3. So we have to enable I/O prioir to collecting error log. Otherwise, meaningless 0xFF's are always returned. The patch fixes it by EEH flag (EEH_ENABLE_IO_FOR_LOG), which is selectively set to indicate the case for: P7IOC on PowerNV platform, pSeries platform. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/eeh: Refactor EEH flag accessorsGavin Shan
There are multiple global EEH flags. Almost each flag has its own accessor, which doesn't make sense. The patch refactors EEH flag accessors so that they look unified: eeh_add_flag(): Add EEH flag eeh_clear_flag(): Clear EEH flag eeh_has_flag(): Check if one specific flag has been set eeh_enabled(): Check if EEH functionality has been enabled Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/powernv: Fix IOMMU table for VFIO devGavin Shan
On PHB3, PCI devices can bypass IOMMU for DMA access. If we pass through one PCI device, whose hose driver ever enable the bypass mode, pdev->dev.archdata.dma_data.iommu_table_base isn't IOMMU table. However, EEH needs access the IOMMU table when the device is owned by guest. The patch fixes pdev->dev.archdata.dma_data.iommu_table when passing through the device to guest in pnv_pci_ioda2_set_bypass(). Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/powernv: Update dev->dma_mask in pci_set_dma_mask() pathBrian W Hart
powerpc defines various machine-specific routines for handling pci_set_dma_mask(). The routines for machine "PowerNV" may neglect to set dev->dma_mask. This could confuse anyone (e.g. drivers) that consult dev->dma_mask to find the current mask. Set the dma_mask in the PowerNV leaf routine. Signed-off-by: Brian W. Hart <hartb@linux.vnet.ibm.com> CC: <stable@vger.kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/eeh: sysfs entries lostMike Qiu
The sysfs entries are lost because of commit 2213fb1 ("powerpc/eeh: Skip eeh sysfs when eeh is disabled"). That commit added condition to create sysfs entries with EEH_ENABLED, which isn't populated when trying to create sysfs entries on PowerNV platform during system boot time. The patch fixes the issue by: * Reoder EEH initialization functions so that they're same on PowerNV/pSeries. * Cache PE's primary bus by PowerNV platform instead of EEH core to avoid kernel crash caused by the function reorder. Another benefit with this is to avoid one eeh_probe_mode_dev() in EEH core. Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/eeh: Avoid event on passed PEGavin Shan
We must not handle EEH error on devices which are passed to somebody else. Instead, we expect that the frozen device owner detects an EEH error and recovers from it. This avoids EEH error handling on passed through devices so the device owner gets a chance to handle them. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Alexander Graf <agraf@suse.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-28powerpc/powernv: Switch powernv drivers to use machine_xxx_initcall()Michael Ellerman
A lot of the code in platforms/powernv is using non-machine initcalls. That means if a kernel built with powernv support runs on another platform, for example pseries, the initcalls will still run. That is usually OK, because the initcalls will check for something in the device tree or elsewhere before doing anything, so on other platforms they will usually just return. But it's fishy for powernv code to be running on other platforms, so switch them all to be machine initcalls. If we want any of them to run on other platforms in future they should move to sysdev. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-28Merge branch 'merge' into nextBenjamin Herrenschmidt
Bring in some important fixes from the 3.16 branch
2014-07-28powerpc/powernv: Change BUG_ON to WARN_ON in elog codeVasant Hegde
We can continue to read the error log (up to MAX size) even if we get the elog size more than MAX size. Hence change BUG_ON to WARN_ON. Also updated error message. Reported-by: Gopesh Kumar Chaudhary <gopchaud@in.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Acked-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com> Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-11powernv: Add OPAL tracepointsAnton Blanchard
Knowing how long we spend in firmware calls is an important part of minimising OS jitter. This patch adds tracepoints to each OPAL call. If tracepoints are enabled we branch out to a common routine that calls an entry and exit tracepoint. This allows us to write tools that monitor the frequency and duration of OPAL calls, eg: name count total(ms) min(ms) max(ms) avg(ms) period(ms) OPAL_HANDLE_INTERRUPT 5 0.199 0.037 0.042 0.040 12547.545 OPAL_POLL_EVENTS 204 2.590 0.012 0.036 0.013 2264.899 OPAL_PCI_MSI_EOI 2830 3.066 0.001 0.005 0.001 81.166 We use jump labels if configured, which means we only add a single nop instruction to every OPAL call when the tracepoints are disabled. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-11powerpc/powernv: Add a page size parameter to pnv_pci_setup_iommu_table()Alexey Kardashevskiy
Since a TCE page size can be other than 4K, make it configurable for P5IOC2 and IODA PHBs. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-11powerpc/powernv: Use it_page_shift in TCE buildAlexey Kardashevskiy
This makes use of iommu_table::it_page_shift instead of TCE_SHIFT and TCE_RPN_SHIFT hardcoded values. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-11powerpc/powernv: Use it_page_shift for TCE invalidationAlexey Kardashevskiy
This fixes IODA1/2 to use it_page_shift as it may be bigger than 4K. This changes involved constant values to use "ull" modifier. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-25powerpc/powernv: Remove OPAL v1 takeoverMichael Ellerman
In commit 27f4488872d9 "Add OPAL takeover from PowerVM" we added support for "takeover" on OPAL v1 machines. This was a mode of operation where we would boot under pHyp, and query for the presence of OPAL. If detected we would then do a special sequence to take over the machine, and the kernel would end up running in hypervisor mode. OPAL v1 was never a supported product, and was never shipped outside IBM. As far as we know no one is still using it. Newer versions of OPAL do not use the takeover mechanism. Although the query for OPAL should be harmless on machines with newer OPAL, we have seen a machine where it causes a crash in Open Firmware. The code in early_init_devtree() to copy boot_command_line into cmd_line was added in commit 817c21ad9a1f "Get kernel command line accross OPAL takeover", and AFAIK is only used by takeover, so should also be removed. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/eeh: Dump PE location codeGavin Shan
As Ben suggested, it's meaningful to dump PE's location code for site engineers when hitting EEH errors. The patch introduces function eeh_pe_loc_get() to retireve the location code from dev-tree so that we can output it when hitting EEH errors. If primary PE bus is root bus, the PHB's dev-node would be tried prior to root port's dev-node. Otherwise, the upstream bridge's dev-node of the primary PE bus will be check for the location code directly. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv: Enable POWER8 doorbell IPIsMichael Neuling
This patch enables POWER8 doorbell IPIs on powernv. Since doorbells can only IPI within a core, we test to see when we can use doorbells and if not we fall back to XICS. This also enables hypervisor doorbells to wakeup us up from nap/sleep via the LPCR PECEDH bit. Based on tests by Anton, the best case IPI latency between two threads dropped from 894ns to 512ns. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv: Fix killed EEH eventGavin Shan
On PowerNV platform, EEH errors are reported by IO accessors or poller driven by interrupt. After the PE is isolated, we won't produce EEH event for the PE. The current implementation has possibility of EEH event lost in this way: The interrupt handler queues one "special" event, which drives the poller. EEH thread doesn't pick the special event yet. IO accessors kicks in, the frozen PE is marked as "isolated" and EEH event is queued to the list. EEH thread runs because of special event and purge all existing EEH events. However, we never produce an other EEH event for the frozen PE. Eventually, the PE is marked as "isolated" and we don't have EEH event to recover it. The patch fixes the issue to keep EEH events for PEs that have been marked as "isolated" with the help of additional "force" help to eeh_remove_event(). Reported-by: Rolf Brudeseth <rolfb@us.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv: Don't escalate non-existing frozen PEGavin Shan
Commit cb5b242c ("powerpc/eeh: Escalate error on non-existing PE") escalates the frozen state on non-existing PE to fenced PHB. It was to improve kdump reliability. After that, commit 361f2a2a ("powrpc/powernv: Reset PHB in kdump kernel") was introduced to issue complete reset on all PHBs to increase the reliability of kdump kernel. Commit cb5b242c becomes unuseful and it would be reverted. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/eeh: Report frozen parent PE prior to child PEGavin Shan
When we have the corner case of frozen parent and child PE at the same time, we have to handle the frozen parent PE prior to the child. Without clearning the frozen state on parent PE, the child PE can't be recovered successfully. The patch searches the EEH PE hierarchy tree and returns the toppest frozen PE to be handled. It ensures the frozen parent PE will be handled prior to child PE. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv: Reduce panic timeout from 180s to 10sAnton Blanchard
We've already dropped the default pseries timeout to 10s, do the same for powernv. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv: Fix reading of OPAL msglogJoel Stanley
memory_return_from_buffer returns a signed value, so ret should be ssize_t. Fixes the following issue reported by David Binderman: [linux-3.15/arch/powerpc/platforms/powernv/opal-msglog.c:65]: (style) Checking if unsigned variable 'ret' is less than zero. [linux-3.15/arch/powerpc/platforms/powernv/opal-msglog.c:82]: (style) Checking if unsigned variable 'ret' is less than zero. Local variable "ret" is of type size_t. This is always unsigned, so it is pointless to check if it is less than zero. https://bugzilla.kernel.org/show_bug.cgi?id=77551 Fixing this exposes a real bug for the case where the entire count bytes is successfully read from the POS_WRAP case. The second memory_read_from_buffer will return EINVAL, causing the entire read to return EINVAL to userspace, despite the data being copied correctly. The fix is to test for the case where the data has been read and return early. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv: Fix endianness problems in EEHGuo Chao
EEH information fetched from OPAL need fix before using in LE environment. To be included in sparse's endian check, declare them as __beXX and access them by accessors. Cc: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powernv: Fix permissions on sysparam sysfs entriesAnton Blanchard
Everyone can write to these files, which is not what we want. Cc: stable@vger.kernel.org # 3.15 Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv : Disable subcore for UP configsShreyas B. Prabhu
Build throws following errors when CONFIG_SMP=n arch/powerpc/platforms/powernv/subcore.c: In function ‘cpu_update_split_mode’: arch/powerpc/platforms/powernv/subcore.c:274:15: error: ‘setup_max_cpus’ undeclared (first use in this function) arch/powerpc/platforms/powernv/subcore.c:285:5: error: lvalue required as left operand of assignment 'setup_max_cpus' variable is relevant only on SMP, so there is no point working around it for UP. Furthermore, subcore itself is relevant only on SMP and hence the better solution is to exclude subcore.o and subcore-asm.o for UP builds. Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv: Include asm/smp.h to fix UP build failureShreyas B. Prabhu
Build throws following errors when CONFIG_SMP=n arch/powerpc/platforms/powernv/setup.c: In function ‘pnv_kexec_wait_secondaries_down’: arch/powerpc/platforms/powernv/setup.c:179:4: error: implicit declaration of function ‘get_hard_smp_processor_id’ rc = opal_query_cpu_status(get_hard_smp_processor_id(i), The usage of get_hard_smp_processor_id() needs the declaration from <asm/smp.h>. The file setup.c includes <linux/sched.h>, which in-turn includes <linux/smp.h>. However, <linux/smp.h> includes <asm/smp.h> only on SMP configs and hence UP builds fail. Fix this by directly including <asm/smp.h> in setup.c unconditionally. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-10Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc updates from Ben Herrenschmidt: "Here is the bulk of the powerpc changes for this merge window. It got a bit delayed in part because I wasn't paying attention, and in part because I discovered I had a core PCI change without a PCI maintainer ack in it. Bjorn eventually agreed it was ok to merge it though we'll probably improve it later and I didn't want to rebase to add his ack. There is going to be a bit more next week, essentially fixes that I still want to sort through and test. The biggest item this time is the support to build the ppc64 LE kernel with our new v2 ABI. We previously supported v2 userspace but the kernel itself was a tougher nut to crack. This is now sorted mostly thanks to Anton and Rusty. We also have a fairly big series from Cedric that add support for 64-bit LE zImage boot wrapper. This was made harder by the fact that traditionally our zImage wrapper was always 32-bit, but our new LE toolchains don't really support 32-bit anymore (it's somewhat there but not really "supported") so we didn't want to rely on it. This meant more churn that just endian fixes. This brings some more LE bits as well, such as the ability to run in LE mode without a hypervisor (ie. under OPAL firmware) by doing the right OPAL call to reinitialize the CPU to take HV interrupts in the right mode and the usual pile of endian fixes. There's another series from Gavin adding EEH improvements (one day we *will* have a release with less than 20 EEH patches, I promise!). Another highlight is the support for the "Split core" functionality on P8 by Michael. This allows a P8 core to be split into "sub cores" of 4 threads which allows the subcores to run different guests under KVM (the HW still doesn't support a partition per thread). And then the usual misc bits and fixes ..." [ Further delayed by gmail deciding that BenH is a dirty spammer. Google knows. ] * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (155 commits) powerpc/powernv: Add missing include to LPC code selftests/powerpc: Test the THP bug we fixed in the previous commit powerpc/mm: Check paca psize is up to date for huge mappings powerpc/powernv: Pass buffer size to OPAL validate flash call powerpc/pseries: hcall functions are exported to modules, need _GLOBAL_TOC() powerpc: Exported functions __clear_user and copy_page use r2 so need _GLOBAL_TOC() powerpc/powernv: Set memory_block_size_bytes to 256MB powerpc: Allow ppc_md platform hook to override memory_block_size_bytes powerpc/powernv: Fix endian issues in memory error handling code powerpc/eeh: Skip eeh sysfs when eeh is disabled powerpc: 64bit sendfile is capped at 2GB powerpc/powernv: Provide debugfs access to the LPC bus via OPAL powerpc/serial: Use saner flags when creating legacy ports powerpc: Add cpu family documentation powerpc/xmon: Fix up xmon format strings powerpc/powernv: Add calls to support little endian host powerpc: Document sysfs DSCR interface powerpc: Fix regression of per-CPU DSCR setting powerpc: Split __SYSFS_SPRSETUP macro arch: powerpc/fadump: Cleaning up inconsistent NULL checks ...
2014-06-07powerpc/powernv: Add missing include to LPC codeBenjamin Herrenschmidt
kbuild bot spotted that one: arch/powerpc/platforms/powernv/opal-lpc.c: In function 'opal_lpc_init_debugfs': >> arch/powerpc/platforms/powernv/opal-lpc.c:319:35: error: 'powerpc_debugfs_root' undeclared (first use in this function) root = debugfs_create_dir("lpc", powerpc_debugfs_root); ^ We neet to include the definition explicitely. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-05powerpc/powernv: Pass buffer size to OPAL validate flash callVasant Hegde
We pass actual buffer size to opal_validate_flash() OPAL API call and in return it contains output buffer size. Commit cc146d1d (Fix little endian issues) missed to set the size param before making OPAL call. So firmware image validation fails. This patch sets size variable before making OPAL call. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Tested-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-05powerpc/powernv: Set memory_block_size_bytes to 256MBAnton Blanchard
powerpc sets a low SECTION_SIZE_BITS to accomodate small pseries boxes. We default to 16MB memory blocks, and boxes with a lot of memory end up with enormous numbers of sysfs memory nodes. Set a more reasonable default for powernv of 256MB. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-05powerpc/powernv: Fix endian issues in memory error handling codeAnton Blanchard
struct OpalMemoryErrorData is passed to us from firmware, so we have to byteswap it. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-05powerpc/powernv: Provide debugfs access to the LPC bus via OPALBenjamin Herrenschmidt
This provides debugfs files to access the LPC bus on Power8 non-virtualized using the appropriate OPAL firmware calls. The usage is simple: one file per space (IO, MEM and FW), lseek to the address and read/write the data. IO and MEM always generate series of byte accesses. FW can generate word and dword accesses if aligned properly. Based on an original patch from Rob Lippert and reworked. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-05powerpc/powernv: Add calls to support little endian hostBenjamin Herrenschmidt
When running as a powernv "host" system on P8, we need to switch the endianness of interrupt handlers. This does it via the appropriate call to the OPAL firmware which may result in just switching HID0:HILE but depending on the processor version might need to do a few more things. This call must be done early before any other processor has been brought out of firmware. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-04Merge tag 'devicetree-for-3.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux into next Pull DeviceTree updates from Rob Herring: - Another round of clean-up of FDT related code in architecture code. This removes knowledge of internal FDT details from most architectures except powerpc. - Conversion of kernel's custom FDT parsing code to use libfdt. - DT based initialization for generic serial earlycon. The introduction of generic serial earlycon support went in through the tty tree. - Improve the platform device naming for DT probed devices to ensure unique naming and use parent names instead of a global index. - Fix a race condition in of_update_property. - Unify the various linker section OF match tables and fix several function prototype errors. - Update platform_get_irq_byname to work in deferred probe cases. - 2 binding doc updates * tag 'devicetree-for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (58 commits) of: handle NULL node in next_child iterators of/irq: provide more wrappers for !CONFIG_OF devicetree: bindings: Document micrel vendor prefix dt: bindings: dwc2: fix required value for the phy-names property of_pci_irq: kill useless variable in of_irq_parse_pci() of/irq: do irq resolution in platform_get_irq_byname() of: Add a testcase for of_find_node_by_path() of: Make of_find_node_by_path() handle /aliases of: Create unlocked version of for_each_child_of_node() lib: add glibc style strchrnul() variant of: Handle memory@0 node on PPC32 only pci/of: Remove dead code of: fix race between search and remove in of_update_property() of: Use NULL for pointers of: Stop naming platform_device using dcr address of: Ensure unique names without sacrificing determinism tty/serial: pl011: add DT based earlycon support of/fdt: add FDT serial scanning for earlycon of/fdt: add FDT address translation support serial: earlycon: add DT support ...