summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2016-11-13iio:core: add a callback to allow drivers to provide _available attributesJonathan Cameron
A large number of attributes can only take a limited range of values. Currently in IIO this is handled by directly registering additional *_available attributes thus providing this information to userspace. It is desirable to provide this information via the core for much the same reason this was done for the actual channel information attributes in the first place. If it isn't there, then it can only really be accessed from userspace. Other in kernel IIO consumers have no access to what valid parameters are. Two forms are currently supported: * list of values in one particular IIO_VAL_* format. e.g. 1.300000 1.500000 1.730000 * range specification with a step size: e.g. [1.000000 0.500000 2.500000] equivalent to 1.000000 1.5000000 2.000000 2.500000 An addition set of masks are used to allow different sharing rules for the *_available attributes generated. This allows for example: in_accel_x_offset in_accel_y_offset in_accel_offset_available. We could have gone with having a specification for each and every info_mask element but that would have meant changing the existing userspace ABI. This approach does not. Signed-off-by: Jonathan Cameron <jic23@kernel.org> [forward ported, added some docs and fixed buffer overflows /peda] Acked-by: Daniel Baluta <daniel.baluta@intel.com> Signed-off-by: Peter Rosin <peda@axentia.se> Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2016-11-13x86/efi: Retrieve and assign Apple device propertiesLukas Wunner
Apple's EFI drivers supply device properties which are needed to support Macs optimally. They contain vital information which cannot be obtained any other way (e.g. Thunderbolt Device ROM). They're also used to convey the current device state so that OS drivers can pick up where EFI drivers left (e.g. GPU mode setting). There's an EFI driver dubbed "AAPL,PathProperties" which implements a per-device key/value store. Other EFI drivers populate it using a custom protocol. The macOS bootloader /System/Library/CoreServices/boot.efi retrieves the properties with the same protocol. The kernel extension AppleACPIPlatform.kext subsequently merges them into the I/O Kit registry (see ioreg(8)) where they can be queried by other kernel extensions and user space. This commit extends the efistub to retrieve the device properties before ExitBootServices is called. It assigns them to devices in an fs_initcall so that they can be queried with the API in <linux/property.h>. Note that the device properties will only be available if the kernel is booted with the efistub. Distros should adjust their installers to always use the efistub on Macs. grub with the "linux" directive will not work unless the functionality of this commit is duplicated in grub. (The "linuxefi" directive should work but is not included upstream as of this writing.) The custom protocol has GUID 91BD12FE-F6C3-44FB-A5B7-5122AB303AE0 and looks like this: typedef struct { unsigned long version; /* 0x10000 */ efi_status_t (*get) ( IN struct apple_properties_protocol *this, IN struct efi_dev_path *device, IN efi_char16_t *property_name, OUT void *buffer, IN OUT u32 *buffer_len); /* EFI_SUCCESS, EFI_NOT_FOUND, EFI_BUFFER_TOO_SMALL */ efi_status_t (*set) ( IN struct apple_properties_protocol *this, IN struct efi_dev_path *device, IN efi_char16_t *property_name, IN void *property_value, IN u32 property_value_len); /* allocates copies of property name and value */ /* EFI_SUCCESS, EFI_OUT_OF_RESOURCES */ efi_status_t (*del) ( IN struct apple_properties_protocol *this, IN struct efi_dev_path *device, IN efi_char16_t *property_name); /* EFI_SUCCESS, EFI_NOT_FOUND */ efi_status_t (*get_all) ( IN struct apple_properties_protocol *this, OUT void *buffer, IN OUT u32 *buffer_len); /* EFI_SUCCESS, EFI_BUFFER_TOO_SMALL */ } apple_properties_protocol; Thanks to Pedro Vilaça for this blog post which was helpful in reverse engineering Apple's EFI drivers and bootloader: https://reverse.put.as/2016/06/25/apple-efi-firmware-passwords-and-the-scbo-myth/ If someone at Apple is reading this, please note there's a memory leak in your implementation of the del() function as the property struct is freed but the name and value allocations are not. Neither the macOS bootloader nor Apple's EFI drivers check the protocol version, but we do to avoid breakage if it's ever changed. It's been the same since at least OS X 10.6 (2009). The get_all() function conveniently fills a buffer with all properties in marshalled form which can be passed to the kernel as a setup_data payload. The number of device properties is dynamic and can change between a first invocation of get_all() (to determine the buffer size) and a second invocation (to retrieve the actual buffer), hence the peculiar loop which does not finish until the buffer size settles. The macOS bootloader does the same. The setup_data payload is later on unmarshalled in an fs_initcall. The idea is that most buses instantiate devices in "subsys" initcall level and drivers are usually bound to these devices in "device" initcall level, so we assign the properties in-between, i.e. in "fs" initcall level. This assumes that devices to which properties pertain are instantiated from a "subsys" initcall or earlier. That should always be the case since on macOS, AppleACPIPlatformExpert::matchEFIDevicePath() only supports ACPI and PCI nodes and we've fully scanned those buses during "subsys" initcall level. The second assumption is that properties are only needed from a "device" initcall or later. Seems reasonable to me, but should this ever not work out, an alternative approach would be to store the property sets e.g. in a btree early during boot. Then whenever device_add() is called, an EFI Device Path would have to be constructed for the newly added device, and looked up in the btree. That way, the property set could be assigned to the device immediately on instantiation. And this would also work for devices instantiated in a deferred fashion. It seems like this approach would be more complicated and require more code. That doesn't seem justified without a specific use case. For comparison, the strategy on macOS is to assign properties to objects in the ACPI namespace (AppleACPIPlatformExpert::mergeEFIProperties()). That approach is definitely wrong as it fails for devices not present in the namespace: The NHI EFI driver supplies properties for attached Thunderbolt devices, yet on Macs with Thunderbolt 1 only one device level behind the host controller is described in the namespace. Consequently macOS cannot assign properties for chained devices. With Thunderbolt 2 they started to describe three device levels behind host controllers in the namespace but this grossly inflates the SSDT and still fails if the user daisy-chained more than three devices. We copy the property names and values from the setup_data payload to swappable virtual memory and afterwards make the payload available to the page allocator. This is just for the sake of good housekeeping, it wouldn't occupy a meaningful amount of physical memory (4444 bytes on my machine). Only the payload is freed, not the setup_data header since otherwise we'd break the list linkage and we cannot safely update the predecessor's ->next link because there's no locking for the list. The payload is currently not passed on to kexec'ed kernels, same for PCI ROMs retrieved by setup_efi_pci(). This can be added later if there is demand by amending setup_efi_state(). The payload can then no longer be made available to the page allocator of course. Tested-by: Lukas Wunner <lukas@wunner.de> [MacBookPro9,1] Tested-by: Pierre Moreau <pierre.morrow@free.fr> [MacBookPro11,3] Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Andreas Noever <andreas.noever@gmail.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pedro Vilaça <reverser@put.as> Cc: Peter Jones <pjones@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: grub-devel@gnu.org Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20161112213237.8804-9-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-13efi: Add device path parserLukas Wunner
We're about to extended the efistub to retrieve device properties from EFI on Apple Macs. The properties use EFI Device Paths to indicate the device they belong to. This commit adds a parser which, given an EFI Device Path, locates the corresponding struct device and returns a reference to it. Initially only ACPI and PCI Device Path nodes are supported, these are the only types needed for Apple device properties (the corresponding macOS function AppleACPIPlatformExpert::matchEFIDevicePath() does not support any others). Further node types can be added with little to moderate effort. Apple device properties is currently the only use case of this parser, but Peter Jones intends to use it to match up devices with the ConInDev/ConOutDev/ErrOutDev variables and add sysfs attributes to these devices to say the hardware supports using them as console. Thus, make this parser a separate component which can be selected with config option EFI_DEV_PATH_PARSER. It can in principle be compiled as a module if acpi_get_first_physical_node() and acpi_bus_type are exported (and efi_get_device_by_path() itself is exported). The dependency on CONFIG_ACPI is needed for acpi_match_device_ids(). It can be removed if an empty inline stub is added for that function. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Andreas Noever <andreas.noever@gmail.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Jones <pjones@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20161112213237.8804-7-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-13efi/arm*/libstub: Invoke EFI_RNG_PROTOCOL to seed the UEFI RNG tableArd Biesheuvel
Invoke the EFI_RNG_PROTOCOL protocol in the context of the stub and install the Linux-specific RNG seed UEFI config table. This will be picked up by the EFI routines in the core kernel to seed the kernel entropy pool. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20161112213237.8804-6-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-13efi: Add support for seeding the RNG from a UEFI config tableArd Biesheuvel
Specify a Linux specific UEFI configuration table that carries some random bits, and use the contents during early boot to seed the kernel's random number generator. This allows much strong random numbers to be generated early on. The entropy is fed to the kernel using add_device_randomness(), which is documented as being appropriate for being called very early. Since UEFI configuration tables may also be consumed by kexec'd kernels, register a reboot notifier that updates the seed in the table. Note that the config table could be generated by the EFI stub or by any other UEFI driver or application (e.g., GRUB), but the random seed table GUID and the associated functionality should be considered an internal kernel interface (unless it is promoted to ABI later on) Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20161112213237.8804-4-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-13net: phy: expose phy_aneg_done API for use by driversLendacky, Thomas
Make phy_aneg_done() available to drivers so that the result of the auto-negotiation initiated by phy_start_aneg() can be determined. Remove the local implementation of phy_aneg_done() from the Aeroflex driver and use the phy library version. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-12bpf: Fix bpf_redirect to an ipip/ip6tnl devMartin KaFai Lau
If the bpf program calls bpf_redirect(dev, 0) and dev is an ipip/ip6tnl, it currently includes the mac header. e.g. If dev is ipip, the end result is IP-EthHdr-IP instead of IP-IP. The fix is to pull the mac header. At ingress, skb_postpull_rcsum() is not needed because the ethhdr should have been pulled once already and then got pushed back just before calling the bpf_prog. At egress, this patch calls skb_postpull_rcsum(). If bpf_redirect(dev, BPF_F_INGRESS) is called, it also fails now because it calls dev_forward_skb() which eventually calls eth_type_trans(skb, dev). The eth_type_trans() will set skb->type = PACKET_OTHERHOST because the mac address does not match the redirecting dev->dev_addr. The PACKET_OTHERHOST will eventually cause the ip_rcv() errors out. To fix this, ____dev_forward_skb() is added. Joint work with Daniel Borkmann. Fixes: cfc7381b3002 ("ip_tunnel: add collect_md mode to IPIP tunnel") Fixes: 8d79266bc48c ("ip6_tunnel: add collect_md mode to IPv6 tunnels") Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-12bpf, mlx4: fix prog refcount in mlx4_en_try_alloc_resources error pathDaniel Borkmann
Commit 67f8b1dcb9ee ("net/mlx4_en: Refactor the XDP forwarding rings scheme") added a bug in that the prog's reference count is not dropped in the error path when mlx4_en_try_alloc_resources() is failing from mlx4_xdp_set(). We previously took bpf_prog_add(prog, priv->rx_ring_num - 1), that we need to release again. Earlier in the call path, dev_change_xdp_fd() itself holds a reference to the prog as well (hence the '- 1' in the bpf_prog_add()), so a simple atomic_sub() is safe to use here. When an error is propagated, then bpf_prog_put() is called eventually from dev_change_xdp_fd() Fixes: 67f8b1dcb9ee ("net/mlx4_en: Refactor the XDP forwarding rings scheme") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-11Merge tag 'acpi-4.9-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Fix a recent regression in the 8250_dw serial driver introduced by adding a quirk for the APM X-Gene SoC to it which uncovered an issue related to the handling of built-in device properties in the core ACPI device enumeration code (Heikki Krogerus)" * tag 'acpi-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / platform: Add support for build-in properties
2016-11-11Merge branch 'device-properties'Rafael J. Wysocki
* device-properties: ACPI / platform: Add support for build-in properties
2016-11-11block: move poll code to blk-mqJens Axboe
The poll code is blk-mq specific, let's move it to blk-mq.c. This is a prep patch for improving the polling code. Signed-off-by: Jens Axboe <axboe@fb.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2016-11-11pstore: Make spinlock per zone instead of globalJoel Fernandes
Currently pstore has a global spinlock for all zones. Since the zones are independent and modify different areas of memory, there's no need to have a global lock, so we should use a per-zone lock as introduced here. Also, when ramoops's ftrace use-case has a FTRACE_PER_CPU flag introduced later, which splits the ftrace memory area into a single zone per CPU, it will eliminate the need for locking. In preparation for this, make the locking optional. Signed-off-by: Joel Fernandes <joelaf@google.com> [kees: updated commit message] Signed-off-by: Kees Cook <keescook@chromium.org>
2016-11-11thread_info: include <current.h> for THREAD_INFO_IN_TASKMark Rutland
When CONFIG_THREAD_INFO_IN_TASK is selected, the current_thread_info() macro relies on current having been defined prior to its use. However, not all users of current_thread_info() include <asm/current.h>, and thus current is not guaranteed to be defined. When CONFIG_THREAD_INFO_IN_TASK is not selected, it's possible that get_current() / current are based upon current_thread_info(), and <asm/current.h> includes <asm/thread_info.h>. Thus always including <asm/current.h> would result in circular dependences on some platforms. To ensure both cases work, this patch includes <asm/current.h>, but only when CONFIG_THREAD_INFO_IN_TASK is selected. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-11thread_info: factor out restart_blockMark Rutland
Since commit f56141e3e2d9aabf ("all arches, signal: move restart_block to struct task_struct"), thread_info and restart_block have been logically distinct, yet struct restart_block is still defined in <linux/thread_info.h>. At least one architecture (erroneously) uses restart_block as part of its thread_info, and thus the definition of restart_block must come before the include of <asm/thread_info>. Subsequent patches in this series need to shuffle the order of includes and definitions in <linux/thread_info.h>, and will make this ordering fragile. This patch moves the definition of restart_block out to its own header. This serves as generic cleanup, logically separating thread_info and restart_block, and also makes it easier to avoid fragility. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-11Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "15 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: lib/stackdepot: export save/fetch stack for drivers mm: kmemleak: scan .data.ro_after_init memcg: prevent memcg caches to be both OFF_SLAB & OBJFREELIST_SLAB coredump: fix unfreezable coredumping task mm/filemap: don't allow partially uptodate page for pipes mm/hugetlb: fix huge page reservation leak in private mapping error paths ocfs2: fix not enough credit panic Revert "console: don't prefer first registered if DT specifies stdout-path" mm: hwpoison: fix thp split handling in memory_failure() swapfile: fix memory corruption via malformed swapfile mm/cma.c: check the max limit for cma allocation scripts/bloat-o-meter: fix SIGPIPE shmem: fix pageflags after swapping DMA32 object mm, frontswap: make sure allocated frontswap map is assigned mm: remove extra newline from allocation stall warning
2016-11-11Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS fixes from Al Viro: "Christoph's and Jan's aio fixes, fixup for generic_file_splice_read (removal of pointless detritus that actually breaks it when used for gfs2 ->splice_read()) and fixup for generic_file_read_iter() interaction with ITER_PIPE destinations." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: splice: remove detritus from generic_file_splice_read() mm/filemap: don't allow partially uptodate page for pipes aio: fix freeze protection of aio writes fs: remove aio_run_iocb fs: remove the never implemented aio_fsync file operation aio: hold an extra file reference over AIO read/write operations
2016-11-11Revert "console: don't prefer first registered if DT specifies stdout-path"Hans de Goede
This reverts commit 05fd007e4629 ("console: don't prefer first registered if DT specifies stdout-path"). The reverted commit changes existing behavior on which many ARM boards rely. Many ARM small-board-computers, like e.g. the Raspberry Pi have both a video output and a serial console. Depending on whether the user is using the device as a more regular computer; or as a headless device we need to have the console on either one or the other. Many users rely on the kernel behavior of the console being present on both outputs, before the reverted commit the console setup with no console= kernel arguments on an ARM board which sets stdout-path in dt would look like this: [root@localhost ~]# cat /proc/consoles ttyS0 -W- (EC p a) 4:64 tty0 -WU (E p ) 4:1 Where as after the reverted commit, it looks like this: [root@localhost ~]# cat /proc/consoles ttyS0 -W- (EC p a) 4:64 This commit reverts commit 05fd007e4629 ("console: don't prefer first registered if DT specifies stdout-path") restoring the original behavior. Fixes: 05fd007e4629 ("console: don't prefer first registered if DT specifies stdout-path") Link: http://lkml.kernel.org/r/20161104121135.4780-2-hdegoede@redhat.com Signed-off-by: Hans de Goede <hdegoede@redhat.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Rob Herring <robh+dt@kernel.org> Cc: Frank Rowand <frowand.list@gmail.com> Cc: Thorsten Leemhuis <regressions@leemhuis.info> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11mm, frontswap: make sure allocated frontswap map is assignedVlastimil Babka
Christian Borntraeger reports: With commit 8ea1d2a1985a ("mm, frontswap: convert frontswap_enabled to static key") kmemleak complains about a memory leak in swapon unreferenced object 0x3e09ba56000 (size 32112640): comm "swapon", pid 7852, jiffies 4294968787 (age 1490.770s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: __vmalloc_node_range+0x194/0x2d8 vzalloc+0x58/0x68 SyS_swapon+0xd60/0x12f8 system_call+0xd6/0x270 Turns out kmemleak is right. We now allocate the frontswap map depending on the kernel config (and no longer on the enablement) swapfile.c: [...] if (IS_ENABLED(CONFIG_FRONTSWAP)) frontswap_map = vzalloc(BITS_TO_LONGS(maxpages) * sizeof(long)); but later on this is passed along --> enable_swap_info(p, prio, swap_map, cluster_info, frontswap_map); and ignored if frontswap is disabled --> frontswap_init(p->type, frontswap_map); static inline void frontswap_init(unsigned type, unsigned long *map) { if (frontswap_enabled()) __frontswap_init(type, map); } Thing is, that frontswap map is never freed. The leakage is relatively not that bad, because swapon is an infrequent and privileged operation. However, if the first frontswap backend is registered after a swap type has been already enabled, it will WARN_ON in frontswap_register_ops() and frontswap will not be available for the swap type. Fix this by making sure the map is assigned by frontswap_init() as long as CONFIG_FRONTSWAP is enabled. Fixes: 8ea1d2a1985a ("mm, frontswap: convert frontswap_enabled to static key") Link: http://lkml.kernel.org/r/20161026134220.2566-1-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Juergen Gross <jgross@suse.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11Merge branch 'linus' into sched/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-11Merge branch 'linus' into locking/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-11cpufreq: stats: New sysfs attribute for clearing statisticsMarkus Mayer
Allow CPUfreq statistics to be cleared by writing anything to /sys/.../cpufreq/stats/reset. Signed-off-by: Markus Mayer <mmayer@broadcom.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-11Merge tag 'topic/drm-misc-2016-11-10' of ↵Dave Airlie
git://anongit.freedesktop.org/drm-intel into drm-next - better atomic state debugging from Rob - fence prep from gustavo - sumits flushed out his backlog of pending dma-buf/fence patches from various people - drm_mm leak debugging plus trying to appease Kconfig (Chris) - a few misc things all over * tag 'topic/drm-misc-2016-11-10' of git://anongit.freedesktop.org/drm-intel: (35 commits) drm: Make DRM_DEBUG_MM depend on STACKTRACE_SUPPORT drm/i915: Restrict DRM_DEBUG_MM automatic selection drm: Restrict stackdepot usage to builtin drm.ko drm/msm: module param to dump state on error irq drm/msm/mdp5: add atomic_print_state support drm/atomic: add debugfs file to dump out atomic state drm/atomic: add new drm_debug bit to dump atomic state drm: add helpers to go from plane state to drm_rect drm: add helper for printing to log or seq_file drm: helper macros to print composite types reservation: revert "wait only with non-zero timeout specified (v3)" v2 drm/ttm: fix ttm_bo_wait dma-buf/fence: revert "don't wait when specified timeout is zero" (v2) dma-buf/fence: make timeout handling in fence_default_wait consistent (v2) drm/amdgpu: add the interface of waiting multiple fences (v4) dma-buf: return index of the first signaled fence (v2) MAINTAINERS: update Sync File Framework files dma-buf/sw_sync: put fence reference from the fence creation dma-buf/sw_sync: mark sync_timeline_create() static drm: Add stackdepot include for DRM_DEBUG_MM ...
2016-11-10block: hook up writeback throttlingJens Axboe
Enable throttling of buffered writeback to make it a lot more smooth, and has way less impact on other system activity. Background writeback should be, by definition, background activity. The fact that we flush huge bundles of it at the time means that it potentially has heavy impacts on foreground workloads, which isn't ideal. We can't easily limit the sizes of writes that we do, since that would impact file system layout in the presence of delayed allocation. So just throttle back buffered writeback, unless someone is waiting for it. The algorithm for when to throttle takes its inspiration in the CoDel networking scheduling algorithm. Like CoDel, blk-wb monitors the minimum latencies of requests over a window of time. In that window of time, if the minimum latency of any request exceeds a given target, then a scale count is incremented and the queue depth is shrunk. The next monitoring window is shrunk accordingly. Unlike CoDel, if we hit a window that exhibits good behavior, then we simply increment the scale count and re-calculate the limits for that scale value. This prevents us from oscillating between a close-to-ideal value and max all the time, instead remaining in the windows where we get good behavior. Unlike CoDel, blk-wb allows the scale count to to negative. This happens if we primarily have writes going on. Unlike positive scale counts, this doesn't change the size of the monitoring window. When the heavy writers finish, blk-bw quickly snaps back to it's stable state of a zero scale count. The patch registers a sysfs entry, 'wb_lat_usec'. This sets the latency target to me met. It defaults to 2 msec for non-rotational storage, and 75 msec for rotational storage. Setting this value to '0' disables blk-wb. Generally, a user would not have to touch this setting. We don't enable WBT on devices that are managed with CFQ, and have a non-root block cgroup attached. If we have a proportional share setup on this particular disk, then the wbt throttling will interfere with that. We don't have a strong need for wbt for that case, since we will rely on CFQ doing that for us. Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-10block: add scalable completion tracking of requestsJens Axboe
For legacy block, we simply track them in the request queue. For blk-mq, we track them on a per-sw queue basis, which we can then sum up through the hardware queues and finally to a per device state. The stats are tracked in, roughly, 0.1s interval windows. Add sysfs files to display the stats. The feature is off by default, to avoid any extra overhead. In-kernel users of it can turn it on by setting QUEUE_FLAG_STATS in the queue flags. We currently don't turn it on if someone just reads any of the stats files, that is something we could add as well. Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-10libceph: initialize last_linger_id with a large integerIlya Dryomov
osdc->last_linger_id is a counter for lreq->linger_id, which is used for watch cookies. Starting with a large integer should ease the task of telling apart kernel and userspace clients. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-11-10nvme: introduce struct nvme_requestChristoph Hellwig
This adds a shared per-request structure for all NVMe I/O. This structure is embedded as the first member in all NVMe transport drivers request private data and allows to implement common functionality between the drivers. The first use is to replace the current abuse of the SCSI command passthrough fields in struct request for the NVMe command passthrough, but it will grow a field more fields to allow implementing things like common abort handlers in the future. The passthrough commands are handled by having a pointer to the SQE (struct nvme_command) in struct nvme_request, and the union of the possible result fields, which had to be turned from an anonymous into a named union for that purpose. This avoids having to pass a reference to a full CQE around and thus makes checking the result a lot more lightweight. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-10drivers: base: cacheinfo: fix x86 with CONFIG_OF enabledSudeep Holla
With CONFIG_OF enabled on x86, we get the following error on boot: " Failed to find cpu0 device node Unable to detect cache hierarchy from DT for CPU 0 " and the cacheinfo fails to get populated in the corresponding sysfs entries. This is because cache_setup_of_node looks for of_node for setting up the shared cpu_map without checking that it's already populated in the architecture specific callback. In order to indicate that the shared cpu_map is already populated, this patch introduces a boolean `cpu_map_populated` in struct cpu_cacheinfo that can be used by the generic code to skip cache_shared_cpu_map_setup. This patch also sets that boolean for x86. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10fpga: fpga-region: device tree control for FPGAAlan Tull
FPGA Regions support programming FPGA under control of the Device Tree. Signed-off-by: Alan Tull <atull@opensource.altera.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10fpga: add fpga bridge frameworkAlan Tull
This framework adds API functions for enabling/ disabling FPGA bridges under kernel control. This allows the Linux kernel to disable FPGA bridges during FPGA reprogramming and to enable FPGA bridges when FPGA reprogramming is done. This framework is be manufacturer-agnostic, allowing it to be used in interfaces that use the FPGA Manager Framework to reprogram FPGA's. The functions are: * of_fpga_bridge_get * fpga_bridge_put Get/put an exclusive reference to a FPGA bridge. * fpga_bridge_enable * fpga_bridge_disable Enable/Disable traffic through a bridge. * fpga_bridge_register * fpga_bridge_unregister Register/unregister a device-specific low level FPGA Bridge driver. Get an exclusive reference to a bridge and add it to a list: * fpga_bridge_get_to_list To enable/disable/put a set of bridges that are on a list: * fpga_bridges_enable * fpga_bridges_disable * fpga_bridges_put Signed-off-by: Alan Tull <atull@opensource.altera.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10fpga-mgr: add fpga image information structAlan Tull
This patch adds a minor change in the FPGA Manager API to hold information that is specific to an FPGA image file. This change is expected to bring little, if any, pain. The socfpga and zynq drivers are fixed up in this patch. An FPGA image file will have particulars that affect how the image is programmed to the FPGA. One example is that current 'flags' currently has one bit which shows whether the FPGA image was built for full reconfiguration or partial reconfiguration. Another example is timeout values for enabling or disabling the bridges in the FPGA. As the complexity of the FPGA design increases, the bridges in the FPGA may take longer times to enable or disable. This patch adds a new 'struct fpga_image_info', moves the current 'u32 flags' to it. Two other image-specific u32's are added for the bridge enable/disable timeouts. The FPGA Manager API functions are changed, replacing the 'u32 flag' parameter with a pointer to struct fpga_image_info. Subsequent patches fix the existing low level FPGA manager drivers. Signed-off-by: Alan Tull <atull@opensource.altera.com> Acked-by: Moritz Fischer <moritz.fischer@ettus.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10fpga: add method to get fpga manager from deviceAlan Tull
The intent is to provide a non-DT method of getting ahold of a FPGA manager to do some FPGA programming. This patch refactors of_fpga_mgr_get() to reuse most of it while adding a new method fpga_mgr_get() for getting a pointer to a fpga manager struct, given the device. Signed-off-by: Alan Tull <atull@opensource.altera.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10of/overlay: add of overlay notificationsAlan Tull
This patch add of overlay notifications. When DT overlays are being added, some drivers/subsystems need to see device tree overlays before the changes go into the live tree. This is distinct from reconfig notifiers that are post-apply or post-remove and which issue very granular notifications without providing access to the context of a whole overlay. The following 4 notificatons are issued: OF_OVERLAY_PRE_APPLY OF_OVERLAY_POST_APPLY OF_OVERLAY_PRE_REMOVE OF_OVERLAY_POST_REMOVE In the case of pre-apply notification, if the notifier returns error, the overlay will be rejected. This patch exports two functions for registering/unregistering notifications: of_overlay_notifier_register(struct notifier_block *nb) of_overlay_notifier_unregister(struct notifier_block *nb) The of_mutex is held during these notifications. The notification data includes pointers to the overlay target and the overlay: struct of_overlay_notify_data { struct device_node *overlay; struct device_node *target; }; Signed-off-by: Alan Tull <atull@opensource.altera.com> Acked-by: Rob Herring <robh@kernel.org> Acked-by: Moritz Fischer <moritz.fischer@ettus.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10netfilter: ipset: Count non-static extension memory for userspaceJozsef Kadlecsik
Non-static (i.e. comment) extension was not counted into the memory size. A new internal counter is introduced for this. In the case of the hash types the sizes of the arrays are counted there as well so that we can avoid to scan the whole set when just the header data is requested. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Add element count to all set types headerJozsef Kadlecsik
It is better to list the set elements for all set types, thus the header information is uniform. Element counts are therefore added to the bitmap and list types. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Regroup ip_set_put_extensions and add externJozsef Kadlecsik
Cleanup: group ip_set_put_extensions and ip_set_get_extensions together and add missing extern. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Split extensions into separate filesJozsef Kadlecsik
Cleanup to separate all extensions into individual files. Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>. Suggested-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Use kmalloc() in comment extension helperJozsef Kadlecsik
Allocate memory with kmalloc() rather than kzalloc(): the string is immediately initialized so it is unnecessary to zero out the allocated memory area. Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>. Suggested-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Improve skbinfo get/init helpersJozsef Kadlecsik
Use struct ip_set_skbinfo in struct ip_set_ext instead of open coded fields and assign structure members in get/init helpers instead of copying members one by one. Explicitly note that struct ip_set_skbinfo must be padded to prevent non-aligned access in the extension blob. Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>. Suggested-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Headers file cleanupJozsef Kadlecsik
Group counter helper functions together. Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>. Suggested-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Mark some helper args as const.Jozsef Kadlecsik
Mark some of the helpers arguments as const. Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>. Suggested-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Remove extra whitespaces in ip_set.hJozsef Kadlecsik
Remove unnecessary whitespaces. Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>. Suggested-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10base: soc: Provide a dummy implementation of soc_device_match()Geert Uytterhoeven
Provide a dummy implementation of soc_device_match(), to allow compiling drivers that may be used on SoCs both with and without CONFIG_SOC_BUS, and for compile testing. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2016-11-10base: soc: Introduce soc_device_match() interfaceArnd Bergmann
We keep running into cases where device drivers want to know the exact version of the a SoC they are currently running on. In the past, this has usually been done through a vendor specific API that can be called by a driver, or by directly accessing some kind of version register that is not part of the device itself but that belongs to a global register area of the chip. Common reasons for doing this include: - A machine is not using devicetree or similar for passing data about on-chip devices, but just announces their presence using boot-time platform devices, and the machine code itself does not care about the revision. - There is existing firmware or boot loaders with existing DT binaries with generic compatible strings that do not identify the particular revision of each device, but the driver knows which SoC revisions include which part. - A prerelease version of a chip has some quirks and we are using the same version of the bootloader and the DT blob on both the prerelease and the final version. An update of the DT binding seems inappropriate because that would involve maintaining multiple copies of the dts and/or bootloader. This patch introduces the soc_device_match() interface that is meant to work like of_match_node() but instead of identifying the version of a device, it identifies the SoC itself using a vendor-agnostic interface. Unlike of_match_node(), we do not do an exact string compare but instead use glob_match() to allow wildcards in strings. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-09remoteproc: Introduce subdevicesBjorn Andersson
A subdevice is an abstract entity that can be used to tie actions to the booting and shutting down of a remote processor. The subdevice object is expected to be embedded in concrete implementations, allowing for a variety of use cases to be implemented. Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2016-11-09ptp: Introduce a high resolution frequency adjustment method.Richard Cochran
The internal PTP Hardware Clock (PHC) interface limits the resolution for frequency adjustments to one part per billion. However, some hardware devices allow finer adjustment, and making use of the increased resolution improves synchronization measurably on such devices. This patch adds an alternative method that allows finer frequency tuning by passing the scaled ppm value to PHC drivers. This value comes from user space, and it has a resolution of about 0.015 ppb. We also deprecate the older method, anticipating its removal once existing drivers have been converted over. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Suggested-by: Ulrik De Bie <ulrik.debie-os@e2big.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09net: napi_hash_add() is no longer exportedEric Dumazet
There are no more users except from net/core/dev.c napi_hash_add() can now be static. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09ipv6: sr: add core files for SR HMAC supportDavid Lebrun
This patch adds the necessary functions to compute and check the HMAC signature of an SR-enabled packet. Two HMAC algorithms are supported: hmac(sha1) and hmac(sha256). In order to avoid dynamic memory allocation for each HMAC computation, a per-cpu ring buffer is allocated for this purpose. A new per-interface sysctl called seg6_require_hmac is added, allowing a user-defined policy for processing HMAC-signed SR-enabled packets. A value of -1 means that the HMAC field will always be ignored. A value of 0 means that if an HMAC field is present, its validity will be enforced (the packet is dropped is the signature is incorrect). Finally, a value of 1 means that any SR-enabled packet that does not contain an HMAC signature or whose signature is incorrect will be dropped. Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09ipv6: sr: add support for SRH encapsulation and injection with lwtunnelsDavid Lebrun
This patch creates a new type of interfaceless lightweight tunnel (SEG6), enabling the encapsulation and injection of SRH within locally emitted packets and forwarded packets. >From a configuration viewpoint, a seg6 tunnel would be configured as follows: ip -6 ro ad fc00::1/128 encap seg6 mode encap segs fc42::1,fc42::2,fc42::3 dev eth0 Any packet whose destination address is fc00::1 would thus be encapsulated within an outer IPv6 header containing the SRH with three segments, and would actually be routed to the first segment of the list. If `mode inline' was specified instead of `mode encap', then the SRH would be directly inserted after the IPv6 header without outer encapsulation. The inline mode is only available if CONFIG_IPV6_SEG6_INLINE is enabled. This feature was made configurable because direct header insertion may break several mechanisms such as PMTUD or IPSec AH. Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09ipv6: sr: add code base for control plane support of SR-IPv6David Lebrun
This patch adds the necessary hooks and structures to provide support for SR-IPv6 control plane, essentially the Generic Netlink commands that will be used for userspace control over the Segment Routing kernel structures. The genetlink commands provide control over two different structures: tunnel source and HMAC data. The tunnel source is the source address that will be used by default when encapsulating packets into an outer IPv6 header + SRH. If the tunnel source is set to :: then an address of the outgoing interface will be selected as the source. The HMAC commands currently just return ENOTSUPP and will be implemented in a future patch. Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09ipv6: implement dataplane support for rthdr type 4 (Segment Routing Header)David Lebrun
Implement minimal support for processing of SR-enabled packets as described in https://tools.ietf.org/html/draft-ietf-6man-segment-routing-header-02. This patch implements the following operations: - Intermediate segment endpoint: incrementation of active segment and rerouting. - Egress for SR-encapsulated packets: decapsulation of outer IPv6 header + SRH and routing of inner packet. - Cleanup flag support for SR-inlined packets: removal of SRH if we are the penultimate segment endpoint. A per-interface sysctl seg6_enabled is provided, to accept/deny SR-enabled packets. Default is deny. This patch does not provide support for HMAC-signed packets. Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>