summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2016-11-30net: phy: broadcom: Add support code for reading PHY countersFlorian Fainelli
Broadcom PHYs expose a number of PHY error counters: receive errors, false carrier sense, SerDes BER count, local and remote receive errors. Add support code to allow retrieving these error counters. Since the Broadcom PHY library code is used by several drivers, make it possible for them to specify the storage for the software copy of the statistics. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30tcp: SOF_TIMESTAMPING_OPT_STATS option for SO_TIMESTAMPINGFrancis Yan
This patch exports the sender chronograph stats via the socket SO_TIMESTAMPING channel. Currently we can instrument how long a particular application unit of data was queued in TCP by tracking SOF_TIMESTAMPING_TX_SOFTWARE and SOF_TIMESTAMPING_TX_SCHED. Having these sender chronograph stats exported simultaneously along with these timestamps allow further breaking down the various sender limitation. For example, a video server can tell if a particular chunk of video on a connection takes a long time to deliver because TCP was experiencing small receive window. It is not possible to tell before this patch without packet traces. To prepare these stats, the user needs to set SOF_TIMESTAMPING_OPT_STATS and SOF_TIMESTAMPING_OPT_TSONLY flags while requesting other SOF_TIMESTAMPING TX timestamps. When the timestamps are available in the error queue, the stats are returned in a separate control message of type SCM_TIMESTAMPING_OPT_STATS, in a list of TLVs (struct nlattr) of types: TCP_NLA_BUSY_TIME, TCP_NLA_RWND_LIMITED, TCP_NLA_SNDBUF_LIMITED. Unit is microsecond. Signed-off-by: Francis Yan <francisyyan@gmail.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30tcp: instrument tcp sender limits chronographsFrancis Yan
This patch implements the skeleton of the TCP chronograph instrumentation on sender side limits: 1) idle (unspec) 2) busy sending data other than 3-4 below 3) rwnd-limited 4) sndbuf-limited The limits are enumerated 'tcp_chrono'. Since a connection in theory can idle forever, we do not track the actual length of this uninteresting idle period. For the rest we track how long the sender spends in each limit. At any point during the life time of a connection, the sender must be in one of the four states. If there are multiple conditions worthy of tracking in a chronograph then the highest priority enum takes precedence over the other conditions. So that if something "more interesting" starts happening, stop the previous chrono and start a new one. The time unit is jiffy(u32) in order to save space in tcp_sock. This implies application must sample the stats no longer than every 49 days of 1ms jiffy. Signed-off-by: Francis Yan <francisyyan@gmail.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30Merge branch 'for-joerg/arm-smmu/updates' of ↵Joerg Roedel
git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/smmu
2016-11-30kexec_file: Factor out kexec_locate_mem_hole from kexec_add_buffer.Thiago Jung Bauermann
kexec_locate_mem_hole will be used by the PowerPC kexec_file_load implementation to find free memory for the purgatory stack. Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-30kexec_file: Change kexec_add_buffer to take kexec_buf as argument.Thiago Jung Bauermann
This is done to simplify the kexec_add_buffer argument list. Adapt all callers to set up a kexec_buf to pass to kexec_add_buffer. In addition, change the type of kexec_buf.buffer from char * to void *. There is no particular reason for it to be a char *, and the change allows us to get rid of 3 existing casts to char * in the code. Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-30kexec_file: Allow arch-specific memory walking for kexec_add_bufferThiago Jung Bauermann
Allow architectures to specify a different memory walking function for kexec_add_buffer. x86 uses iomem to track reserved memory ranges, but PowerPC uses the memblock subsystem. Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-30Merge tag 'drm-qemu-20161121' of git://git.kraxel.org/linux into drm-nextDave Airlie
drm/virtio: fix busid in a different way, allocate more vbufs. drm/qxl: various bugfixes and cleanups, * tag 'drm-qemu-20161121' of git://git.kraxel.org/linux: (224 commits) drm/virtio: allocate some extra bufs qxl: Allow resolution which are not multiple of 8 qxl: Don't notify userspace when monitors config is unchanged qxl: Remove qxl_bo_init() return value qxl: Call qxl_gem_{init, fini} qxl: Add missing '\n' to qxl_io_log() call qxl: Remove unused prototype qxl: Mark some internal functions as static Revert "drm: virtio: reinstate drm_virtio_set_busid()" drm/virtio: fix busid regression drm: re-export drm_dev_set_unique Linux 4.9-rc5 gp8psk: Fix DVB frontend attach gp8psk: fix gp8psk_usb_in_op() logic dvb-usb: move data_mutex to struct dvb_usb_device iio: maxim_thermocouple: detect invalid storage size in read() aoe: fix crash in page count manipulation lightnvm: invalid offset calculation for lba_shift Kbuild: enable -Wmaybe-uninitialized warnings by default pcmcia: fix return value of soc_pcmcia_regulator_set ...
2016-11-29of_mdio: add helper to deregister fixed-link PHYsJohan Hovold
Add helper to deregister fixed-link PHYs registered using of_phy_register_fixed_link(). Convert the two drivers that care to deregister their fixed-link PHYs to use the new helper, but note that most drivers currently fail to do so. Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30iomap: implement direct I/OChristoph Hellwig
This adds a full fledget direct I/O implementation using the iomap interface. Full fledged in this case means all features are supported: AIO, vectored I/O, any iov_iter type including kernel pointers, bvecs and pipes, support for hole filling and async apending writes. It does not mean supporting all the warts of the old generic code. We expect i_rwsem to be held over the duration of the call, and we expect to maintain i_dio_count ourselves, and we pass on any kinds of mapping to the file system for now. The algorithm used is very simple: We use iomap_apply to iterate over the range of the I/O, and then we use the new bio_iov_iter_get_pages helper to lock down the user range for the size of the extent. bio_iov_iter_get_pages can currently lock down twice as many pages as the old direct I/O code did, which means that we will have a better batch factor for everything but overwrites of badly fragmented files. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kent Overstreet <kent.overstreet@gmail.com> Tested-by: Jens Axboe <axboe@fb.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-30locking/lockdep: Provide a type check for lock_is_heldPeter Zijlstra
Christoph requested lockdep_assert_held() variants that distinguish between held-for-read or held-for-write. Provide: int lock_is_held_type(struct lockdep_map *lock, int read) which takes the same argument as lock_acquire(.read) and matches it to the held_lock instance. Use of this function should be gated by the debug_locks variable. When that is 0 the return value of the lock_is_held_type() function is undefined. This is done to allow both negative and positive tests for holding locks. By default we provide (positive) lockdep_assert_held{,_exclusive,_read}() macros. Requested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Jens Axboe <axboe@fb.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-30dmaengine: DW DMAC: add multi-block property to device treeEugeniy Paltsev
Several versions of DW DMAC have multi block transfers hardware support. Hardware support of multi block transfers is disabled by default if we use DT to configure DMAC and software emulation of multi block transfers used instead. Add multi-block property, so it is possible to enable hardware multi block transfers (if present) via DT. Switch from per device is_nollp variable to multi_block array to be able enable/disable multi block transfers separately per channel. Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-11-30dmaengine: dma_slave_config: add support for slave port windowPeter Ujfalusi
Some slave devices uses address window instead of single register for read and/or write of data. With the src/dst_port_window_size the address window can be specified and the DMAengine driver should use this information to correctly set up the transfer to loop within the provided window. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-11-30Merge branches 'thermal-core', 'thermal-intel', 'thermal-soc-fixes' and ↵Zhang Rui
'thermal-reorg' into next
2016-11-30block: add bio_iov_iter_get_pages()Kent Overstreet
This is a helper that pins down a range from an iov_iter and adds it to a bio without requiring a separate memory allocation for the page array. It will be used for upcoming direct I/O implementations for block devices and iomap based file systems. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> [hch: ported to the iov_iter interface, renamed and added comments. All blame should be directed to me and all fame should go to Kent after this!] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com> (cherry picked from commit 9cd56d916aa481ce8f56d9c5302a6ed90c2e0b5f)
2016-11-30Merge branch 'xfs-4.10-misc-fixes-2' into iomap-4.10-directioDave Chinner
2016-11-29net: phy: add an option to disable EEE advertisementjbrunet
This patch adds an option to disable EEE advertisement in the generic PHY by providing a mask of prohibited modes corresponding to the value found in the MDIO_AN_EEE_ADV register. On some platforms, PHY Low power idle seems to be causing issues, even breaking the link some cases. The patch provides a convenient way for these platforms to disable EEE advertisement and work around the issue. Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Tested-by: Yegor Yefremov <yegorslists@googlemail.com> Tested-by: Andreas Färber <afaerber@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29fpga: Clarify how write_init works streaming modesJason Gunthorpe
This interface was designed for streaming, but write_init's buf argument has an unclear purpose. Define it to be the first bytes of the bitstream. Each driver gets to set how many bytes (at most) it wants to see. Short bitstreams will be passed through as-is, while long ones will be truncated. The intent is to allow drivers to peek at the header before the transfer actually starts. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Acked-by: Alan Tull <atull@opensource.altera.com>
2016-11-29driver core: class: add class_groups supportGreg Kroah-Hartman
struct class needs to have a set of default groups that are added, as adding individual attributes does not work well in the long run. So add support for that. Future patches will convert the existing usages of class_attrs to use class_groups and then class_attrs will go away. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-29lightnvm: transform target get/set bad blockJavier González
Since targets are given a virtual target device, it is necessary to translate all communication between targets and the backend device. Implement the translation layer for get/set bad block table. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-29lightnvm: use target nvm on target-specific ops.Javier González
On target-specific operations pass on nvm_tgt_dev instead of the generic nvm device. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-29lightnvm: introduce max_phys_sects helper functionJavier González
Target devices do not have access to the device driver operations. Introduce a helper function that exposes the max. number of physical sectors supported by the underlying device. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-29lightnvm: introduce helpers for generic ops in rrpcJavier González
Avoid calling media manager and device-specific operations directly from rrpc. Create helper functions on lightnvm's core instead. Signed-off-by: Javier González <javier@cnexlabs.com> Made it work with null_blk as well. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-29lightnvm: eliminate nvm_lun abstraction in mmJavier González
In order to naturally support multi-target instances on an Open-Channel SSD, targets should own the LUNs they get blocks from and manage provisioning internally. This is done in several steps. Since targets own the LUNs the are instantiated on top of and manage the free block list internally, there is no need for a LUN abstraction in the media manager. LUNs are intrinsically managed as in the physical layout (ch:0,lun:0, ..., ch:0,lun:n, ch:1,lun:0, ch:1,lun:n, ..., ch:m,lun:0, ch:m,lun:n) and given to the targets based on the target creation ioctl. This simplifies LUN management and clears the path for a partition manager to sit directly underneath LightNVM targets. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-29lightnvm: eliminate nvm_block abstraction on mmJavier González
In order to naturally support multi-target instances on an Open-Channel SSD, targets should own the LUNs they get blocks from and manage provisioning internally. This is done in several steps. A part of this transformation is that targets manage their blocks internally. This patch eliminates the nvm_block abstraction and moves block management to the target logic. The rrpc target is transformed. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-29lightnvm: remove debug lun statistics from gennvmJavier González
Since LUNs are managed internally on targets, the media manager has no access to the free LUN lists. Thus, debug functions that show LUN information on the device should not be implemented on the media manager, but rather on the target in itself. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-29lightnvm: remove get_lun operation on gennvmJavier González
Since LUNs are managed internally on the target, there is no need for the media manager to implement a get_lun operation. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-29lightnvm: move block provisioning to targetsJavier González
In order to naturally support multi-target instances on an Open-Channel SSD, targets should own the LUNs they get blocks from and manage provisioning internally. This is done in several steps. This patch moves the block provisioning inside of the target and removes the get/put block interface from the media manager. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-29lightnvm: manage lun partitions internally in mmJavier González
LUNs are exclusively owned by targets implementing a block device FTL. Doing this reservation requires at the moment a 2-way callback gennvm <-> target. The reason behind this is that LUNs were not assumed to always be exclusively owned by targets. However, this design decision goes against I/O determinism QoS (two targets would mix I/O on the same parallel unit in the device). This patch makes LUN reservation as part of the target creation on the media manager. This makes that LUNs are always exclusively owned by the target instantiated on top of them. LUN stripping and/or sharing should be implemented on the target itself or the layers on top. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-29lightnvm: remove gen_lun abstractionJavier González
The gen_lun abstraction in the generic media manager was conceived on the assumption that a single target would instantiated on top of it. This has complicated target design to implement multi-instances. Remove this abstraction and move its logic to nvm_lun, which manages physical lun geometry and operations. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-29lightnvm: make address conversion functions globalJavier González
Targets are assumed to used the same generic ppa format, where the address is partitioned on ch:lun:block:pg:pl:sec. Thus, make the function in charge of transforming the ppa address from a linear format to the generic one available to all targets. This function will be needed by the media manager in order to do target mapping translations when targets are divided on different physical partitions. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-29lightnvm: cleanup unused target operationsJavier González
Cleanup definition leftovers from old gennvm interface Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-29lightnvm: add ECC error codesJavier González
Add ECC error codes to enable the appropriate handling in the target. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-29lightnvm: export set bad block tableJavier González
Bad blocks should be managed by block owners. This would be either targets for data blocks or sysblk for system blocks. In order to support this, export two functions: One to mark a block as an specific type (e.g., bad block) and another to update the bad block table on the device. Move bad block management to rrpc. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-29lightnvm: enable to send hint to erase commandJavier González
Erases might be subject to host hints. An example is multi-plane programming to erase blocks in parallel. Enable targets to specify this hint. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-29nvme: lightnvm: attach lightnvm sysfs to nvme block deviceMatias Bjørling
Previously, LBA read and write were not supported in the lightnvm specification. Now that it supports it, lets use the traditional NVMe gendisk, and attach the lightnvm sysfs geometry export. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-29timekeeping: Add a fast and NMI safe boot clockJoel Fernandes
This boot clock can be used as a tracing clock and will account for suspend time. To keep it NMI safe since we're accessing from tracing, we're not using a separate timekeeper with updates to monotonic clock and boot offset protected with seqlocks. This has the following minor side effects: (1) Its possible that a timestamp be taken after the boot offset is updated but before the timekeeper is updated. If this happens, the new boot offset is added to the old timekeeping making the clock appear to update slightly earlier: CPU 0 CPU 1 timekeeping_inject_sleeptime64() __timekeeping_inject_sleeptime(tk, delta); timestamp(); timekeeping_update(tk, TK_CLEAR_NTP...); (2) On 32-bit systems, the 64-bit boot offset (tk->offs_boot) may be partially updated. Since the tk->offs_boot update is a rare event, this should be a rare occurrence which postprocessing should be able to handle. Signed-off-by: Joel Fernandes <joelaf@google.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1480372524-15181-6-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-29timekeeping/clocksource_cyc2ns: Document intended range limitationChris Metcalf
The "cycles" argument should not be an absolute clocksource cycle value, as the implementation's arithmetic will overflow relatively easily with wide (64 bit) clocksource counters. For performance, the implementation is simple and fast, since the function is intended for only relatively small delta values of clocksource cycles. [jstultz: Fixed up to merge against HEAD & commit message tweaks, also included rewording suggestion by Ingo] Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Link: http://lkml.kernel.org/r/1480372524-15181-4-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-29timekeeping: Ignore the bogus sleep time if pm_trace is enabledChen Yu
Power management suspend/resume tracing (ab)uses the RTC to store suspend/resume information persistently. As a consequence the RTC value is clobbered when timekeeping is resumed and tries to inject the sleep time. Commit a4f8f6667f09 ("timekeeping: Cap array access in timekeeping_debug") plugged a out of bounds array access in the timekeeping debug code which was caused by the clobbered RTC value, but we still use the clobbered RTC value for sleep time injection into kernel timekeeping, which will result in random adjustments depending on the stored "hash" value. To prevent this keep track of the RTC clobbering and ignore the invalid RTC timestamp at resume. If the system resumed successfully clear the flag, which marks the RTC as unusable, warn the user about the RTC clobber and recommend to adjust the RTC with 'ntpdate' or 'rdate'. [jstultz: Fixed up pr_warn formating, and implemented suggestions from Ingo] [ tglx: Rewrote changelog ] Originally-from: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Prarit Bhargava <prarit@redhat.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Xunlei Pang <xlpang@redhat.com> Cc: Len Brown <lenb@kernel.org> Link: http://lkml.kernel.org/r/1480372524-15181-3-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-29ACPI/IORT: Introduce iort_iommu_configureLorenzo Pieralisi
DT based systems have a generic kernel API to configure IOMMUs for devices (ie of_iommu_configure()). On ARM based ACPI systems, the of_iommu_configure() equivalent can be implemented atop ACPI IORT kernel API, with the corresponding functions to map device identifiers to IOMMUs and retrieve the corresponding IOMMU operations necessary for DMA operations set-up. By relying on the iommu_fwspec generic kernel infrastructure, implement the IORT based IOMMU configuration for ARM ACPI systems and hook it up in the ACPI kernel layer that implements DMA configuration for a device. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [ACPI core] Reviewed-by: Tomasz Nowicki <tn@semihalf.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Tomasz Nowicki <tn@semihalf.com> Cc: Hanjun Guo <hanjun.guo@linaro.org> Cc: Tomasz Nowicki <tn@semihalf.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-29iommu/arm-smmu: Add IORT configurationLorenzo Pieralisi
In ACPI based systems, in order to be able to create platform devices and initialize them for ARM SMMU components, the IORT kernel implementation requires a set of static functions to be used by the IORT kernel layer to configure platform devices for ARM SMMU components. Add static configuration functions to the IORT kernel layer for the ARM SMMU components, so that the ARM SMMU driver can initialize its respective platform device by relying on the IORT kernel infrastructure and by adding a corresponding ACPI device early probe section entry. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Tomasz Nowicki <tn@semihalf.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Tomasz Nowicki <tn@semihalf.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Joerg Roedel <joro@8bytes.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-29ACPI/IORT: Add node match functionLorenzo Pieralisi
Device drivers (eg ARM SMMU) need to know if a specific component is part of the IORT table, so that kernel data structures are not initialized at initcalls time if the respective component is not part of the IORT table. To this end, this patch adds a trivial function that allows detecting if a given IORT node type is present or not in the ACPI table, providing an ACPI IORT equivalent for of_find_matching_node(). Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Tomasz Nowicki <tn@semihalf.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Tomasz Nowicki <tn@semihalf.com> Acked-by: Hanjun Guo <hanjun.guo@linaro.org> Cc: Hanjun Guo <hanjun.guo@linaro.org> Cc: Tomasz Nowicki <tn@semihalf.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-29ACPI: Implement acpi_dma_configureLorenzo Pieralisi
On DT based systems, the of_dma_configure() API implements DMA configuration for a given device. On ACPI systems an API equivalent to of_dma_configure() is missing which implies that it is currently not possible to set-up DMA operations for devices through the ACPI generic kernel layer. This patch fills the gap by introducing acpi_dma_configure/deconfigure() calls that for now are just wrappers around arch_setup_dma_ops() and arch_teardown_dma_ops() and also updates ACPI and PCI core code to use the newly introduced acpi_dma_configure/acpi_dma_deconfigure functions. Since acpi_dma_configure() is used to configure DMA operations, the function initializes the dma/coherent_dma masks to sane default values if the current masks are uninitialized (also to keep the default values consistent with DT systems) to make sure the device has a complete default DMA set-up. The DMA range size passed to arch_setup_dma_ops() is sized according to the device coherent_dma_mask (starting at address 0x0), mirroring the DT probing path behaviour when a dma-ranges property is not provided for the device being probed; this changes the current arch_setup_dma_ops() call parameters in the ACPI probing case, but since arch_setup_dma_ops() is a NOP on all architectures but ARM/ARM64 this patch does not change the current kernel behaviour on them. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> [pci] Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Tomasz Nowicki <tn@semihalf.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Tomasz Nowicki <tn@semihalf.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Tomasz Nowicki <tn@semihalf.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-29iommu: Make of_iommu_set/get_ops() DT agnosticLorenzo Pieralisi
The of_iommu_{set/get}_ops() API is used to associate a device tree node with a specific set of IOMMU operations. The same kernel interface is required on systems booting with ACPI, where devices are not associated with a device tree node, therefore the interface requires generalization. The struct device fwnode member represents the fwnode token associated with the device and the struct it points at is firmware specific; regardless, it is initialized on both ACPI and DT systems and makes an ideal candidate to use it to associate a set of IOMMU operations to a given device, through its struct device.fwnode member pointer, paving the way for representing per-device iommu_ops (ie an iommu instance associated with a device). Convert the DT specific of_iommu_{set/get}_ops() interface to use struct device.fwnode as a look-up token, making the interface usable on ACPI systems and rename the data structures and the registration API so that they are made to represent their usage more clearly. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Tomasz Nowicki <tn@semihalf.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Tomasz Nowicki <tn@semihalf.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Hanjun Guo <hanjun.guo@linaro.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Joerg Roedel <joro@8bytes.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-29ACPI/IORT: Introduce linker section for IORT entries probingLorenzo Pieralisi
Since commit e647b532275b ("ACPI: Add early device probing infrastructure") the kernel has gained the infrastructure that allows adding linker script section entries to execute ACPI driver callbacks (ie probe routines) for all subsystems that register a table entry in the respective kernel section (eg clocksource, irqchip). Since ARM IOMMU devices data is described through IORT tables when booting with ACPI, the ARM IOMMU drivers must be made able to hook ACPI callback routines that are called to probe IORT entries and initialize the respective IOMMU devices. To avoid adding driver specific hooks into IORT table initialization code (breaking therefore code modularity - ie ACPI IORT code must be made aware of ARM SMMU drivers ACPI init callbacks), this patch adds code that allows ARM SMMU drivers to take advantage of the ACPI early probing infrastructure, so that they can add linker script section entries containing drivers callback to be executed on IORT tables detection. Since IORT nodes are differentiated by a type, the callback routines can easily parse the IORT table entries, check the IORT nodes and carry out some actions whenever the IORT node type associated with the driver specific callback is matched. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Reviewed-by: Tomasz Nowicki <tn@semihalf.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Tomasz Nowicki <tn@semihalf.com> Cc: Tomasz Nowicki <tn@semihalf.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-29ACPI: Add FWNODE_ACPI_STATIC fwnode typeLorenzo Pieralisi
On systems booting with a device tree, every struct device is associated with a struct device_node, that provides its DT firmware representation. The device node can be used in generic kernel contexts (eg IRQ translation, IOMMU streamid mapping), to retrieve the properties associated with the device and carry out kernel operations accordingly. Owing to the 1:1 relationship between the device and its device_node, the device_node can also be used as a look-up token for the device (eg looking up a device through its device_node), to retrieve the device in kernel paths where the device_node is available. On systems booting with ACPI, the same abstraction provided by the device_node is required to provide look-up functionality. The struct acpi_device, that represents firmware objects in the ACPI namespace already includes a struct fwnode_handle of type FWNODE_ACPI as their member; the same abstraction is missing though for devices that are instantiated out of static ACPI tables entries (eg ARM SMMU devices). Add a new fwnode_handle type to associate devices created out of static ACPI table entries to the respective firmware components and create a simple ACPI core layer interface to dynamically allocate and free the corresponding firmware nodes so that kernel subsystems can use it to instantiate the nodes and associate them with the respective devices. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Reviewed-by: Tomasz Nowicki <tn@semihalf.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Tomasz Nowicki <tn@semihalf.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-29Merge branch 'kvm-ppc-next' of ↵Radim Krčmář
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc PPC KVM update for 4.10: * Support for KVM guests on POWER9 using the hashed page table MMU. * Updates and improvements to the halt-polling support on PPC, from Suraj Jitindar Singh. * An optimization to speed up emulated MMIO, from Yongji Xie. * Various other minor cleanups.
2016-11-29sched/idle: Add support for tasks that inject idlePeter Zijlstra
Idle injection drivers such as Intel powerclamp and ACPI PAD drivers use realtime tasks to take control of CPU then inject idle. There are two issues with this approach: 1. Low efficiency: injected idle task is treated as busy so sched ticks do not stop during injected idle period, the result of these unwanted wakeups can be ~20% loss in power savings. 2. Idle accounting: injected idle time is presented to user as busy. This patch addresses the issues by introducing a new PF_IDLE flag which allows any given task to be treated as idle task while the flag is set. Therefore, idle injection tasks can run through the normal flow of NOHZ idle enter/exit to get the correct accounting as well as tick stop when possible. The implication is that idle task is then no longer limited to PID == 0. Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-29cpuidle: Allow enforcing deepest idle state selectionJacob Pan
When idle injection is used to cap power, we need to override the governor's choice of idle states. For this reason, make it possible the deepest idle state selection to be enforced by setting a flag on a given CPU to achieve the maximum potential power draw reduction. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-29irqchip/gic-v3-its: Change unsigned types for AArch32 compatibilityVladimir Murzin
Make sure that constants which are supposed to be applied on 64-bit data is actually unsigned long long, so they won't be truncated when used in 32-bit mode. Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>