path: root/include/linux
Age  Commit message  Author
2016-09-09  efi: Refactor efi_memmap_init_early() into arch-neutral code  (Matt Fleming)
Every EFI architecture apart from ia64 needs to set up the EFI memory map at efi.memmap, and the code for doing that is essentially the same across all implementations. Therefore, it makes sense to factor this out into the common code under drivers/firmware/efi/.

The only slight variation is the data structure out of which we pull the initial memory map information, such as physical address, memory descriptor size and version, etc. We can address this by passing a generic data structure (struct efi_memory_map_data) as the argument to efi_memmap_init_early() which contains the minimum info required for initialising the memory map.

In the process, this patch also fixes a few undesirable implementation differences:

 - ARM and arm64 were failing to clear the EFI_MEMMAP bit when unmapping the early EFI memory map. EFI_MEMMAP indicates whether the EFI memory map is mapped (not the regions contained within) and can be traversed. It's more correct to set the bit as soon as we memremap() the passed in EFI memmap.

 - Rename efi_unmmap_memmap() to efi_memmap_unmap() to adhere to the regular naming scheme.

This patch also uses a read-write mapping for the memory map instead of the read-only mapping currently used on ARM and arm64. x86 needs the ability to update the memory map in-place when assigning virtual addresses to regions (efi_map_region()) and tagging regions when reserving boot services (efi_reserve_boot_services()). There's no way for the generic fake_mem code to know which mapping to use without introducing some arch-specific constant/hook, so just use read-write, since read-only is of dubious value for the EFI memory map.

Tested-by: Dave Young <dyoung@redhat.com> [kexec/kdump]
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [arm]
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Peter Jones <pjones@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
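For illustration, a sketch of the minimal per-arch descriptor such a generic init routine needs. The field names below are assumptions, not quoted from the patch:

    /* Illustrative only: the kind of data efi_memmap_init_early() needs
     * from each architecture.  Field names are assumed, not copied. */
    struct efi_memory_map_data {
            phys_addr_t     phys_map;       /* physical address of the EFI memory map */
            unsigned long   size;           /* total size of the map in bytes */
            unsigned long   desc_version;   /* memory descriptor version */
            unsigned long   desc_size;      /* size of one memory descriptor */
    };

    int __init efi_memmap_init_early(struct efi_memory_map_data *data);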
2016-09-09  [media] media: platform: pxa_camera: make a standalone v4l2 device  (Robert Jarzmik)
This patch removes the soc_camera API dependency from pxa_camera. In the current state:
 - all previously supported captures work the same on pxa270
 - the s_crop() call was removed, as it was judged not to work (see what happens in soc_camera_s_crop() when get_crop() == NULL)
 - if the pixel clock is provided by the sensor, i.e. not MCLK, the dual stage change is not handled yet. => there is no in-tree user of this, so I'll leave it that way
 - the MCLK handling is not yet finished; it behaves as in the legacy way, i.e. activated at video device opening and closed at video device closing. In a subsequent patch pxa_camera_mclk_ops should be used, and the platform data MCLK ignored. It will be the sensor's duty to request the clock and enable it, which will end up in pxa_camera_mclk_ops.
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-09-09  drivers/perf: arm_pmu: expose a cpumask in sysfs  (Mark Rutland)
In systems with heterogeneous CPUs, there are multiple logical CPU PMUs, each of which covers a subset of CPUs in the system. In some cases userspace needs to know which CPUs a given logical PMU covers, so we'd like to expose a cpumask under sysfs, similar to what is done for uncore PMUs. Unfortunately, prior to commit 00e727bb389359c8 ("perf stat: Balance opening and reading events"), perf stat only correctly handled a cpumask holding a single CPU, and only when profiling in system-wide mode. In other cases, the presence of a cpumask file could cause perf stat to behave erratically. Thus, exposing a cpumask file would break older perf binaries in cases where they would otherwise work. To avoid this issue while still providing userspace with the information it needs, this patch exposes a differently-named file (cpus) under sysfs. New tools can look for this and operate correctly, while older tools will not be adversely affected by its presence. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09  drivers/perf: arm_pmu: add common attr group fields  (Mark Rutland)
In preparation for adding common attribute groups, add an array of attribute group pointers to arm_pmu, which will be used if the backend hasn't already set pmu::attr_groups. Subsequent patches will move backends over to using these, before adding common fields. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09  x86/pkeys: Default to a restrictive init PKRU  (Dave Hansen)
PKRU is the register that lets you disallow writes or all access to a given protection key. The XSAVE hardware defines an "init state" of 0 for PKRU: its most permissive state, allowing access/writes to everything. Since we start off all new processes with the init state, we start all processes off with the most permissive possible PKRU. This is unfortunate. If a thread is clone()'d [1] before a program has time to set PKRU to a restrictive value, that thread will be able to write to all data, no matter what pkey is set on it. This weakens any integrity guarantees that we want pkeys to provide. To fix this, we define a very restrictive PKRU to override the XSAVE-provided value when we create a new FPU context. We choose a value that only allows access to pkey 0, which is as restrictive as we can practically make it. This does not cause any practical problems with applications using protection keys because we require them to specify initial permissions for each key when it is allocated, which override the restrictive default. In the end, this ensures that threads which do not know how to manage their own pkey rights can not do damage to data which is pkey-protected. I would have thought this was a pretty contrived scenario, except that I heard a bug report from an MPX user who was creating threads in some very early code before main(). It may be crazy, but folks evidently _do_ it. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-arch@vger.kernel.org Cc: Dave Hansen <dave@sr71.net> Cc: mgorman@techsingularity.net Cc: arnd@arndb.de Cc: linux-api@vger.kernel.org Cc: linux-mm@kvack.org Cc: luto@kernel.org Cc: akpm@linux-foundation.org Cc: torvalds@linux-foundation.org Link: http://lkml.kernel.org/r/20160729163021.F3C25D4A@viggo.jf.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
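As a rough illustration (not taken from the patch), a PKRU value that leaves only pkey 0 usable can be built as below. PKRU holds two bits per key; the exact default the kernel chooses is not quoted in this log:

    /* Illustrative sketch: deny access to every protection key except
     * key 0.  Assumed bit layout: 2 bits per key, low bit of each pair
     * is access-disable, high bit is write-disable. */
    #define PKRU_AD_BIT 0x1
    #define PKRU_WD_BIT 0x2

    static unsigned int restrictive_init_pkru(void)
    {
            unsigned int pkru = 0;
            int key;

            for (key = 1; key < 16; key++)          /* leave key 0 fully usable */
                    pkru |= PKRU_AD_BIT << (2 * key);
            return pkru;                            /* 0x55555554 */
    }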
2016-09-09  generic syscalls: Wire up memory protection keys syscalls  (Dave Hansen)
These new syscalls are implemented as generic code, so enable them for architectures like arm64 which use the generic syscall table.

According to Arnd: even if the support is x86 specific for the foreseeable future, it may be good to reserve the numbers just in case. The other architecture specific syscall lists are usually left to the individual arch maintainers, but a lot of the newer architectures share this table.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: linux-arch@vger.kernel.org
Cc: Dave Hansen <dave@sr71.net>
Cc: mgorman@techsingularity.net
Cc: linux-api@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: luto@kernel.org
Cc: akpm@linux-foundation.org
Cc: torvalds@linux-foundation.org
Link: http://lkml.kernel.org/r/20160729163018.505A6875@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-09  x86/pkeys: Allocation/free syscalls  (Dave Hansen)
This patch adds two new system calls:

	int pkey_alloc(unsigned long flags, unsigned long init_access_rights)
	int pkey_free(int pkey);

These implement an "allocator" for the protection keys themselves, which can be thought of as analogous to the allocator that the kernel has for file descriptors. The kernel tracks which numbers are in use, and only allows operations on keys that are valid. A key which was not obtained by pkey_alloc() may not, for instance, be passed to pkey_mprotect().

These system calls are also very important given the kernel's use of pkeys to implement execute-only support. They help ensure that userspace can never assume that it has control of a key unless it first asks the kernel. The kernel does not promise to preserve PKRU (rights register) contents except for allocated pkeys.

The 'init_access_rights' argument to pkey_alloc() specifies the rights that will be established for the returned pkey. For instance:

	pkey = pkey_alloc(flags, PKEY_DENY_WRITE);

will allocate 'pkey', but also set the bits in PKRU[1] such that writing to 'pkey' is already denied.

The kernel does not prevent pkey_free() from successfully freeing in-use pkeys (those still assigned to a memory range by pkey_mprotect()). It would be expensive to implement the checks for this, so we instead say, "Just don't do it", since sane software will never do it anyway.

Any piece of userspace calling pkey_alloc() needs to be prepared for it to fail. Why? pkey_alloc() returns the same error code (ENOSPC) when there are no pkeys left and when pkeys are unsupported. They can be unsupported for a whole host of reasons, so apps must be prepared for this. Also, libraries or LD_PRELOADs might steal keys before an application gets access to them.

This allocation mechanism could be implemented in userspace. Even if we did it in userspace, we would still need additional user/kernel interfaces to tell userspace which keys are being used by the kernel internally (such as for execute-only mappings). Having the kernel provide this facility completely removes the need for these additional interfaces, or for having an implementation of this in userspace at all.

Note that we have to make changes to all of the architectures that do not use mman-common.h because we use the new PKEY_DENY_ACCESS/WRITE macros in arch-independent code.

1. PKRU is the Protection Key Rights User register. It is a usermode-accessible register that controls whether writes and/or access to each individual pkey is allowed or denied.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: linux-arch@vger.kernel.org
Cc: Dave Hansen <dave@sr71.net>
Cc: arnd@arndb.de
Cc: linux-api@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: luto@kernel.org
Cc: akpm@linux-foundation.org
Cc: torvalds@linux-foundation.org
Link: http://lkml.kernel.org/r/20160729163015.444FE75F@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
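A hedged userspace sketch of the allocate/protect/free flow described above. It assumes the pkey_alloc()/pkey_mprotect()/pkey_free() wrappers and the PKEY_DISABLE_WRITE spelling provided by later glibc headers (older systems would go through syscall()):

    /* Illustrative usage only; requires a kernel and libc with pkey support. */
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <sys/mman.h>

    int protect_region(void *ptr, size_t len)
    {
            int pkey = pkey_alloc(0, PKEY_DISABLE_WRITE);   /* writes via this key denied */
            if (pkey < 0) {
                    /* ENOSPC: no key left, or pkeys unsupported - be ready for it */
                    perror("pkey_alloc");
                    return -1;
            }
            if (pkey_mprotect(ptr, len, PROT_READ | PROT_WRITE, pkey) < 0) {
                    perror("pkey_mprotect");
                    pkey_free(pkey);
                    return -1;
            }
            return pkey;    /* caller releases the key with pkey_free() when done */
    }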
2016-09-09  x86/pkeys: Make mprotect_key() mask off additional vm_flags  (Dave Hansen)
Today, mprotect() takes 4 bits of data: PROT_READ/WRITE/EXEC/NONE. Three of those bits: READ/WRITE/EXEC get translated directly into vma->vm_flags by calc_vm_prot_bits(). If a bit is unset in mprotect()'s 'prot' argument then it must be cleared in vma->vm_flags during the mprotect() call.

We do this clearing today by first calculating the VMA flags we want set, then clearing the ones we do not want to inherit from the original VMA:

	vm_flags = calc_vm_prot_bits(prot, key);
	...
	newflags = vm_flags;
	newflags |= (vma->vm_flags & ~(VM_READ | VM_WRITE | VM_EXEC));

However, we *also* want to mask off the original VMA's vm_flags in which we store the protection key. To do that, this patch adds a new macro:

	ARCH_VM_PKEY_FLAGS

which allows the architecture to specify additional bits that it would like cleared. We use that to ensure that the VM_PKEY_BIT* bits get cleared.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: Dave Hansen <dave@sr71.net>
Cc: arnd@arndb.de
Cc: linux-api@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: luto@kernel.org
Cc: akpm@linux-foundation.org
Cc: torvalds@linux-foundation.org
Link: http://lkml.kernel.org/r/20160729163013.E48D6981@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-09  mm: Implement new pkey_mprotect() system call  (Dave Hansen)
pkey_mprotect() is just like mprotect, except it also takes a protection key as an argument. On systems that do not support protection keys, it still works, but requires that key=0. Otherwise it does exactly what mprotect does.

I expect it to get used like this, if you want to guarantee that any mapping you create can *never* be accessed without the right protection keys set up:

	int real_prot = PROT_READ|PROT_WRITE;
	pkey = pkey_alloc(0, PKEY_DENY_ACCESS);
	ptr = mmap(NULL, PAGE_SIZE, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
	ret = pkey_mprotect(ptr, PAGE_SIZE, real_prot, pkey);

This way, there is *no* window where the mapping is accessible since it was always either PROT_NONE or had a protection key set that denied all access.

We settled on 'unsigned long' for the type of the key here. We only need 4 bits on x86 today, but I figured that other architectures might need some more space.

Semantically, we have a bit of a problem if we combine this syscall with our previously-introduced execute-only support: what do we do when we mix execute-only pkey use with pkey_mprotect() use? For instance:

	pkey_mprotect(ptr, PAGE_SIZE, PROT_WRITE, 6); // set pkey=6
	mprotect(ptr, PAGE_SIZE, PROT_EXEC);          // set pkey=X_ONLY_PKEY?
	mprotect(ptr, PAGE_SIZE, PROT_WRITE);         // is pkey=6 again?

To solve that, we make the plain-mprotect()-initiated execute-only support only apply to VMAs that have the default protection key (0) set on them.

Proposed semantics:

1. protection key 0 is special and represents the default, "unassigned" protection key. It is always allocated.
2. mprotect() never affects a mapping's pkey_mprotect()-assigned protection key. A protection key of 0 (even if set explicitly) represents an unassigned protection key.
2a. mprotect(PROT_EXEC) on a mapping with an assigned protection key may or may not result in a mapping with execute-only properties. pkey_mprotect() plus pkey_set() on all threads should be used to _guarantee_ execute-only semantics if this is not a strong enough semantic.
3. mprotect(PROT_EXEC) may result in an "execute-only" mapping. The kernel will internally attempt to allocate and dedicate a protection key for the purpose of execute-only mappings. This may not be possible in cases where there are no free protection keys available. It can also happen, of course, in situations where there is no hardware support for protection keys.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: linux-arch@vger.kernel.org
Cc: Dave Hansen <dave@sr71.net>
Cc: arnd@arndb.de
Cc: linux-api@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: luto@kernel.org
Cc: akpm@linux-foundation.org
Cc: torvalds@linux-foundation.org
Link: http://lkml.kernel.org/r/20160729163012.3DDD36C4@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-09  brcmfmac: add support for bcm4339 chip with modalias sdio:c00v02D0d4339  (Arend Van Spriel)
The driver already supports the bcm4339 chipset, but only for the variant that shares the same modalias as the bcm4335, i.e. sdio:c00v02D0d4335. It turns out that there are also bcm4339 devices out there that have the more distinguishable modalias sdio:c00v02D0d4339.

Reported-by: Steve deRosier <derosier@gmail.com>
Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-09-09  add basic register-field manipulation macros  (Jakub Kicinski)
A common approach to accessing register fields is to define structures or sets of macros containing mask and shift pairs. Operations on the register are then performed as follows:

	field = (reg >> shift) & mask;

	reg &= ~(mask << shift);
	reg |= (field & mask) << shift;

Defining shift and mask separately is tedious. Ivo van Doorn came up with an idea of computing them at compilation time based on a single shifted mask (later refined by Felix), which can be used like this:

	#define REG_FIELD 0x000ff000

	field = FIELD_GET(REG_FIELD, reg);

	reg &= ~REG_FIELD;
	reg |= FIELD_PREP(REG_FIELD, field);

The FIELD_{GET,PREP} macros take care of finding out what the appropriate shift is, based on a compilation time ffs operation. GENMASK can be used to define registers (which is usually less error-prone and easier to match with datasheets).

This approach is the most convenient I've seen, so to limit code duplication let's move the macros to a global header file. Attempts to use static inlines instead of macros failed due to false positive triggering of BUILD_BUG_ON()s, especially with GCC < 6.0.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dinan Gunawardena <dinan.gunawardena@netronome.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
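A simplified, illustrative definition of such mask-only helpers (not the exact in-kernel macros, which add compile-time sanity checks on the mask and value):

    /* Simplified sketch: derive the shift from the mask at compile time.
     * The real kernel macros add BUILD_BUG_ON-style checks omitted here. */
    #define FIELD_SHIFT(mask)       (__builtin_ffsll(mask) - 1)
    #define FIELD_GET(mask, reg)    (((reg) & (mask)) >> FIELD_SHIFT(mask))
    #define FIELD_PREP(mask, val)   (((unsigned long long)(val) << FIELD_SHIFT(mask)) & (mask))

    /* Example with the register layout from the text: */
    #define REG_FIELD 0x000ff000
    /* field = FIELD_GET(REG_FIELD, reg);               extracts bits 12..19 */
    /* reg = (reg & ~REG_FIELD) | FIELD_PREP(REG_FIELD, field);              */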
2016-09-09  kbuild: allow archs to select link dead code/data elimination  (Nicholas Piggin)
Introduce the LD_DEAD_CODE_DATA_ELIMINATION option for architectures to select to build with -ffunction-sections and -fdata-sections, and link with --gc-sections. It requires some work (documented) to ensure all unreferenced entrypoints are live, and requires toolchain and build verification, so it is made a per-arch option for now.

On a random powerpc64le build, this yields a significant size saving; it boots and runs fine, but there is a lot I haven't tested as yet, so these savings may be reduced if there are bugs in the link.

	    text     data     bss       dec  filename
	11169741  1180744 1923176  14273661  vmlinux
	10445269  1004127 1919707  13369103  vmlinux.dce

~700K text, ~170K data, 6% removed from kernel image size.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michal Marek <mmarek@suse.com>
2016-09-08  rpmsg: Allow callback to return errors  (Bjorn Andersson)
Some rpmsg backends support holding on to and redelivering messages upon failed handling of them, so provide a way for the callback to report an error and allow the backends to handle this.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2016-09-08  rpmsg: Move virtio specifics from public header  (Bjorn Andersson)
Move virtio rpmsg implementation details from the public header file to the virtio rpmsg implementation. Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2016-09-08  rpmsg: virtio: Hide vrp pointer from the public API  (Bjorn Andersson)
Create a container struct virtio_rpmsg_channel around the rpmsg_channel to keep virtio backend information separate from the rpmsg and public API. This makes the public structures independent of virtio.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2016-09-08  rpmsg: Hide rpmsg indirection tables  (Bjorn Andersson)
Move the device and endpoint indirection tables to the rpmsg internal header file, to hide them from the public API. Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2016-09-08  rpmsg: Move endpoint related interface to rpmsg core  (Bjorn Andersson)
Move the rpmsg_send() and rpmsg_destroy_ept() interface to the rpmsg core, so that we eventually can hide the rpmsg_endpoint ops from the public API. Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2016-09-08  rpmsg: Indirection table for rpmsg_endpoint operations  (Bjorn Andersson)
Add an indirection table for rpmsg_endpoint related operations and move the virtio implementation behind it; this finishes off the decoupling of the virtio implementation from the public API.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2016-09-08  rpmsg: Introduce indirection table for rpmsg_device operations  (Bjorn Andersson)
To allow for multiple backend implementations, add an indirection table for rpmsg_device related operations and move the virtio implementation behind this table.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2016-09-08  rpmsg: Clean up rpmsg device vs channel naming  (Bjorn Andersson)
The struct representing an rpmsg device is called rpmsg_channel and the variable name used throughout is rpdev. With the communication happening on endpoints, it's clearer to just call this a "device" in the public API.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2016-09-08  rpmsg: Make rpmsg_create_ept() take channel_info struct  (Bjorn Andersson)
As we introduce support for additional rpmsg backends, some of these only support point-to-point "links" represented by a name. By making rpmsg_create_ept() take a channel_info struct, we allow these backends to be passed either a source address, a destination address or a name identifier.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
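A hedged sketch of the kind of channel_info structure this describes; the field and constant names are assumptions, not quoted from the patch:

    /* Illustrative only: a channel descriptor carrying a name and/or
     * addresses, so each backend can use whichever part it needs. */
    #define RPMSG_NAME_SIZE 32              /* assumed size constant */

    struct rpmsg_channel_info {
            char name[RPMSG_NAME_SIZE];     /* service name for name-based backends */
            u32 src;                        /* local address, or an "any" value */
            u32 dst;                        /* remote address, or an "any" value */
    };

    struct rpmsg_endpoint *rpmsg_create_ept(struct rpmsg_device *rpdev,
                                            rpmsg_rx_cb_t cb, void *priv,
                                            struct rpmsg_channel_info chinfo);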
2016-09-08  rpmsg: rpmsg_send() operations take rpmsg_endpoint  (Bjorn Andersson)
The rpmsg_send() operations have been taking an rpmsg_device, but this forces users of secondary rpmsg_endpoints to use the rpmsg_sendto() interface - by extracting source and destination from the given data structures. If we instead pass the rpmsg_endpoint to these functions, a service can use rpmsg_sendto() to respond to messages, even on secondary endpoints. In addition, this allows us to support operations on multiple channels in future backends that do not support off-channel operations.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
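For illustration, the endpoint-centric prototypes this change points toward might look like the following; the exact signatures are not spelled out in this log, so treat them as a sketch:

    /* Hedged sketch: send primitives keyed on the endpoint rather than the
     * rpmsg device, so secondary endpoints can reply directly. */
    int rpmsg_send(struct rpmsg_endpoint *ept, void *data, int len);
    int rpmsg_sendto(struct rpmsg_endpoint *ept, void *data, int len, u32 dst);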
2016-09-08  hwmon: (core) Add basic pwm attribute support to new API  (Guenter Roeck)
Add basic pwm attribute support (no auto attributes) to new API. Reviewed-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-09-08  hwmon: (core) Add fan attribute support to new API  (Guenter Roeck)
Acked-by: Punit Agrawal <punit.agrawal@arm.com> Reviewed-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-09-08  hwmon: (core) Add energy and humidity attribute support to new API  (Guenter Roeck)
Acked-by: Punit Agrawal <punit.agrawal@arm.com> Reviewed-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-09-08  hwmon: (core) Add power attribute support to new API  (Guenter Roeck)
Acked-by: Punit Agrawal <punit.agrawal@arm.com> Reviewed-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-09-08  hwmon: (core) Add current attribute support to new API  (Guenter Roeck)
Acked-by: Punit Agrawal <punit.agrawal@arm.com> Reviewed-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-09-08  hwmon: (core) Add voltage attribute support to new API  (Guenter Roeck)
Acked-by: Punit Agrawal <punit.agrawal@arm.com> Reviewed-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-09-08  hwmon: (core) New hwmon registration API  (Guenter Roeck)
Up to now, each hwmon driver has to implement its own sysfs attributes. This requires a lot of template code, and distracts from the driver's core function to read and write chip registers. To be able to reduce driver complexity, move sensor attribute handling and thermal zone registration into hwmon core. By using the new API, driver code and data size is typically reduced by 20-70%, depending on driver complexity and the number of sysfs attributes supported. With this patch, the new API only supports thermal sensors. Support for other sensor types will be added with subsequent patches. Acked-by: Punit Agrawal <punit.agrawal@arm.com> Reviewed-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
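A hedged sketch of the shape such an info-based registration typically takes; the identifiers follow the hwmon API as it later appeared in mainline, not text quoted from this log, so treat the details as assumptions:

    /* Illustrative only: the driver describes its sensors as data and
     * supplies read callbacks; the hwmon core creates the sysfs files. */
    #include <linux/hwmon.h>

    static int mychip_read(struct device *dev, enum hwmon_sensor_types type,
                           u32 attr, int channel, long *val)
    {
            if (type == hwmon_temp && attr == hwmon_temp_input)
                    return mychip_read_temp(dev, channel, val);   /* hypothetical helper */
            return -EOPNOTSUPP;
    }

    static const struct hwmon_ops mychip_hwmon_ops = {
            .is_visible = mychip_is_visible,                      /* hypothetical helper */
            .read       = mychip_read,
    };

    static const u32 mychip_temp_config[] = {
            HWMON_T_INPUT | HWMON_T_MAX,
            0
    };

    static const struct hwmon_channel_info mychip_temp = {
            .type   = hwmon_temp,
            .config = mychip_temp_config,
    };

    static const struct hwmon_channel_info *mychip_info[] = {
            &mychip_temp,
            NULL
    };

    static const struct hwmon_chip_info mychip_chip_info = {
            .ops  = &mychip_hwmon_ops,
            .info = mychip_info,
    };

    /* hwmon_dev = devm_hwmon_device_register_with_info(dev, "mychip",
     *                     drvdata, &mychip_chip_info, NULL); */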
2016-09-08  tcp: use an RB tree for ooo receive queue  (Yaogong Wang)
Over the years, TCP BDP has increased by several orders of magnitude, and some people are considering reaching the 2 Gbytes limit. Even with the current window scale limit of 14, ~1 Gbytes maps to ~740,000 MSS.

In presence of packet losses (or reorders), TCP stores incoming packets into an out of order queue, and the number of skbs sitting there waiting for the missing packets to be received can be in the 10^5 range. Most packets are appended to the tail of this queue, and when packets can finally be transferred to the receive queue, we scan the queue from its head. However, in presence of heavy losses, we might have to find an arbitrary point in this queue, involving a linear scan for every incoming packet, throwing away cpu caches.

This patch converts it to an RB tree, to get bounded latencies.

Yaogong wrote a preliminary patch about 2 years ago. Eric did the rebase, added the ofo_last_skb cache, polishing and tests.

Tested with network dropping between 1 and 10 % packets, with good success (about 30 % increase of throughput in stress tests).

Next step would be to also use an RB tree for the write queue at sender side ;)

Signed-off-by: Yaogong Wang <wygivan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-By: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-08  vlan: Check for vlan ethernet types for 802.1Q or 802.1ad  (Eric Garver)
This is to simplify using double tagged vlans. This function allows all valid vlan ethertypes to be checked in a single function call. Also replace some instances that check for both ETH_P_8021Q and ETH_P_8021AD. Patch based on one originally by Thomas F Herbert. Signed-off-by: Thomas F Herbert <thomasfherbert@gmail.com> Signed-off-by: Eric Garver <e@erig.me> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
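The log does not name the helper it adds, but a minimal sketch of a single check covering both VLAN ethertypes could look like this:

    /* Illustrative sketch: one predicate for both 802.1Q and 802.1ad
     * ethertypes (the actual helper name in the patch is not quoted here). */
    #include <linux/types.h>
    #include <linux/if_ether.h>

    static inline bool ethertype_is_vlan(__be16 ethertype)
    {
            return ethertype == htons(ETH_P_8021Q) ||
                   ethertype == htons(ETH_P_8021AD);
    }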
2016-09-08  net/mlx5e: Move an_disable_cap bit to a new position  (Bodong Wang)
The previous an_disable_cap position, bit 31, is deprecated and should not be used by the driver with newer firmware. New firmware will advertise the same capability in bit 29. The old capability didn't allow setting more than one protocol for a specific speed when autoneg is off, while newer firmware will allow this, and it is indicated in the new capability location.

Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-08  net: inet: diag: expose the socket mark to privileged processes.  (Lorenzo Colitti)
This adds the capability for a process that has CAP_NET_ADMIN on a socket to see the socket mark in socket dumps. Commit a52e95abf772 ("net: diag: allow socket bytecode filters to match socket marks") recently gave privileged processes the ability to filter socket dumps based on mark. This patch is complementary: it ensures that the mark is also passed to userspace in the socket's netlink attributes. It is useful for tools like ss which display information about sockets. Tested: https://android-review.googlesource.com/270210 Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-08  pstore/pmsg: drop bounce buffer  (Mark Salyzyn)
Removing a bounce buffer copy operation in the pmsg driver path is always better. We also gain in overall performance by not requesting a vmalloc on every write, as this can cause precious RT tasks, such as user facing media operations, to stall while memory is being reclaimed.

Added a write_buf_user to the pstore functions, a backup platform write_buf_user that uses the small buffer that is part of the instance, and implemented a ramoops write_buf_user that only supports PSTORE_TYPE_PMSG.

Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2016-09-08  pstore/ram: Set pstore flags dynamically  (Namhyung Kim)
The ramoops can be configured to enable each pstore type by setting their size. In that case, it'd be better not to register disabled types in the first place. Cc: Anton Vorontsov <anton@enomsg.org> Cc: Colin Cross <ccross@android.com> Cc: Kees Cook <keescook@chromium.org> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2016-09-08  pstore: Split pstore fragile flags  (Namhyung Kim)
This patch adds new PSTORE_FLAGS for each pstore type so that they can be enabled separately. This is a preparation for ongoing virtio-pstore work to support those types flexibly. The PSTORE_FLAGS_FRAGILE is changed to PSTORE_FLAGS_DMESG to preserve the original behavior. Cc: Anton Vorontsov <anton@enomsg.org> Cc: Colin Cross <ccross@android.com> Cc: Kees Cook <keescook@chromium.org> Cc: Tony Luck <tony.luck@intel.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: linux-acpi@vger.kernel.org Cc: linux-efi@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> [kees: retained "FRAGILE" for now to make merges easier] Signed-off-by: Kees Cook <keescook@chromium.org>
2016-09-08  Merge tag 'kvm-s390-next-4.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD  (Paolo Bonzini)
KVM: s390: features and fixes for 4.9

 - lazy enablement of runtime instrumentation
 - up to 255 CPUs for nested guests
 - rework of machine check delivery
 - cleanups/fixes
2016-09-08  Merge branch 'linus' into timers/core, to refresh the branch  (Ingo Molnar)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-08  Drivers: hv: utils: Support TimeSync version 4.0 protocol samples.  (Alex Ng)
This enables support for more accurate TimeSync v4 samples when hosted under Windows Server 2016 and newer hosts. The new time samples include a "vmreferencetime" field that represents the guest's TSC value when the host generated its time sample. This value lets the guest calculate the latency in receiving the time sample. The latency is added to the sample host time prior to updating the clock. Signed-off-by: Alex Ng <alexng@messages.microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-07  qed*: Add support for the ethtool get_regs operation  (Tomer Tayar)
Signed-off-by: Tomer Tayar <Tomer.Tayar@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-07  qed: Add support for debug data collection  (Tomer Tayar)
This patch adds the support for dumping and formatting the HW/FW debug data. Signed-off-by: Tomer Tayar <Tomer.Tayar@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-07  usercopy: force check_object_size() inline  (Kees Cook)
Just for good measure, make sure that check_object_size() is always inlined too, as already done for copy_*_user() and __copy_*_user(). Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2016-09-07  Drivers: hv: vmbus: suppress some "hv_vmbus: Unknown GUID" warnings  (Dexuan Cui)
Some VMBus devices are not needed by Linux guest[1][2], and, VMBus channels of Hyper-V Sockets don't really mean usual synthetic devices, so let's suppress the warnings for them. [1] https://support.microsoft.com/en-us/kb/2925727 [2] https://msdn.microsoft.com/en-us/library/jj980180(v=winembedded.81).aspx Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-07  jump_labels: Allow array initialisers  (Catalin Marinas)
The static key API is currently designed around single variable definitions. There are cases where an array of static keys is desirable, so extend the API to allow this rather than using the internal static key implementation directly. Cc: Jason Baron <jbaron@akamai.com> Cc: Jonathan Corbet <corbet@lwn.net> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Suggested-by: Dave P Martin <Dave.Martin@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-07  netfilter: gre: Use consistent GRE and PPTP header structures instead of the ones defined by netfilter  (Gao Feng)
There are two existing structures which define the GRE and PPTP headers. Use these two structures instead of the ones defined by netfilter, to keep consistent with the rest of the code.

Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-09-07  netfilter: gre: Use consistent GRE_* macros instead of ones defined by netfilter  (Gao Feng)
There are already GRE_* macros in the kernel, so it is unnecessary to define these macros in netfilter. Also remove some useless macros.

Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-09-07  Revert "Drivers: hv: ring_buffer: count on wrap around mappings in get_next_pkt_raw()"  (K. Y. Srinivasan)
To deal with the merge conflict between the net-next and char-misc trees, revert commit bb08d431a914 from the char-misc tree. This commit can be rebased and applied once net-next picks up the char-misc changes.

Here is the commit log of the reverted patch:

  "With wrap around mappings in place we can always provide drivers with direct links to packets on the ring buffer, even when they wrap around. Do the required updates to get_next_pkt_raw()/put_pkt_raw()"

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-06  Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next  (David S. Miller)
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for your net-next tree. Most relevant updates are the removal of per-conntrack timers to use a workqueue/garbage collection approach instead from Florian Westphal, the hash and numgen expression for nf_tables from Laura Garcia, updates on nf_tables hash set to honor the NLM_F_EXCL flag, removal of ip_conntrack sysctl and many other incremental updates on our Netfilter codebase.

More specifically, they are:

1) Retrieve only 4 bytes to fetch ports in case of non-linear skb transport area in dccp, sctp, tcp, udp and udplite protocol conntrackers, from Gao Feng.

2) Missing whitespace on error message in physdev match, from Hangbin Liu.

3) Skip redundant IPv4 checksum calculation in nf_dup_ipv4, from Liping Zhang.

4) Add nf_ct_expires() helper function and use it, from Florian Westphal.

5) Replace opencoded nf_ct_kill() call in IPVS conntrack support, also from Florian.

6) Rename nf_tables set implementation to nft_set_{name}.c

7) Introduce the hash expression to allow arbitrary hashing of selector concatenations, from Laura Garcia Liebana.

8) Remove ip_conntrack sysctl backward compatibility code, this code has been around for long time already, and we have two interfaces to do this already: nf_conntrack sysctl and ctnetlink.

9) Use nf_conntrack_get_ht() helper function whenever possible, instead of opencoding fetch of hashtable pointer and size, patch from Liping Zhang.

10) Add quota expression for nf_tables.

11) Add number generator expression for nf_tables, this supports incremental and random generators that can be combined with maps, very useful for load balancing purpose, again from Laura Garcia Liebana.

12) Fix a typo in a debug message in FTP conntrack helper, from Colin Ian King.

13) Introduce a nft_chain_parse_hook() helper function to parse chain hook configuration, this is used by a follow up patch to perform better chain update validation.

14) Add rhashtable_lookup_get_insert_key() to rhashtable and use it from the nft_set_hash implementation to honor the NLM_F_EXCL flag.

15) Missing nulls check in nf_conntrack from nf_conntrack_tuple_taken(), patch from Florian Westphal.

16) Don't use the DYING bit to know if the conntrack event has been already delivered, instead a state variable to track event re-delivery states, also from Florian.

17) Remove the per-conntrack timer, use the workqueue approach that was discussed during the NFWS, from Florian Westphal.

18) Use the netlink conntrack table dump path to kill stale entries, again from Florian.

19) Add a garbage collector to get rid of stale conntracks, from Florian.

20) Reschedule garbage collector if eviction rate is high.

21) Get rid of the __nf_ct_kill_acct() helper.

22) Use ARPHRD_ETHER instead of hardcoded 1 from ARP logger.

23) Make nf_log_set() interface assertive on unsupported families.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-06  usercopy: fold builtin_const check into inline function  (Kees Cook)
Instead of having each caller of check_object_size() need to remember to check for a const size parameter, move the check into check_object_size() itself. This actually matches the original implementation in PaX, though this commit cleans up the now-redundant builtin_const() calls in the various architectures. Signed-off-by: Kees Cook <keescook@chromium.org>
2016-09-06  PCI/AER: Add bus flag to skip source ID matching  (Jon Derrick)
Allow root port buses to choose to skip source id matching when finding the faulting device. Certain root port devices may return an incorrect source ID and recommend to scan child device registers for AER notifications. Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>