summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-04-29Merge branch 'x86/urgent' into x86/asm, to refresh the treeIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-29netfilter: conntrack: init all_locks to avoid debug warningFlorian Westphal
Else we get 'BUG: spinlock bad magic on CPU#' on resize when spin lock debugging is enabled. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-04-29pinctrl: at91-pio4: fix pull-up/down logicLudovic Desroches
The default configuration of a pin is often with a value in the pull-up/down field at chip reset. So, even if the internal logic of the controller prevents writing a configuration with pull-up and pull-down at the same time, we must ensure explicitly this condition before writing the register. This was leading to a pull-down condition not taken into account for instance. Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com> Fixes: 776180848b57 ("pinctrl: introduce driver for Atmel PIO4 controller") Cc: stable@vger.kernel.org #v4.4 and later Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-04-29efi: Remove unnecessary (and buggy) .memmap initialization from the Xen EFI ↵Ingo Molnar
driver So the following commit: 884f4f66ffd6 ("efi: Remove global 'memmap' EFI memory map") ... triggered the following build warning on x86 64-bit allyesconfig: drivers/xen/efi.c:290:47: warning: missing braces around initializer [-Wmissing-braces] It's this initialization in drivers/xen/efi.c: static const struct efi efi_xen __initconst = { ... .memmap = NULL, /* Not used under Xen. */ ... which was forgotten about, as .memmap now is an embedded struct: struct efi_memory_map memmap; We can remove this initialization - it's an EFI core internal data structure plus it's not used in the Xen driver anyway. Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: ard.biesheuvel@linaro.org Cc: bp@alien8.de Cc: linux-tip-commits@vger.kernel.org Cc: tony.luck@intel.com Link: http://lkml.kernel.org/r/20160429083128.GA4925@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-29x86/boot: Correctly bounds-check relocationsYinghai Lu
Relocation handling performs bounds checking on the resulting calculated addresses. The existing code uses output_len (VO size plus relocs size) as the max address. This is not right since the max_addr check should stop at the end of VO and exclude bss, brk, etc, which follows. The valid range should be VO [_text, __bss_start] in the loaded physical address space. This patch adds an export for __bss_start in voffset.h and uses it to set the correct limit for max_addr. Signed-off-by: Yinghai Lu <yinghai@kernel.org> [ Rewrote the changelog. ] Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Young <dyoung@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: lasse.collin@tukaani.org Link: http://lkml.kernel.org/r/1461888548-32439-7-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-29x86/KASLR: Clean up unused code from old 'run_size' and rename it to ↵Yinghai Lu
'kernel_total_size' Since 'run_size' is now calculated in misc.c, the old script and associated argument passing is no longer needed. This patch removes them, and renames 'run_size' to the more descriptive 'kernel_total_size'. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Baoquan He <bhe@redhat.com> [ Rewrote the changelog, renamed 'run_size' to 'kernel_total_size' ] Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Young <dyoung@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Junjie Mao <eternal.n08@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: lasse.collin@tukaani.org Link: http://lkml.kernel.org/r/1461888548-32439-6-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-29x86/boot: Fix "run_size" calculationYinghai Lu
Currently, the "run_size" variable holds the total kernel size (size of code plus brk and bss) and is calculated via the shell script arch/x86/tools/calc_run_size.sh. It gets the file offset and mem size of the .bss and .brk sections from the vmlinux, and adds them as follows: run_size = $(( $offsetA + $sizeA + $sizeB )) However, this is not correct (it is too large). To illustrate, here's a walk-through of the script's calculation, compared to the correct way to find it. First, offsetA is found as the starting address of the first .bss or .brk section seen in the ELF file. The sizeA and sizeB values are the respective section sizes. [bhe@x1 linux]$ objdump -h vmlinux vmlinux: file format elf64-x86-64 Sections: Idx Name Size VMA LMA File off Algn 27 .bss 00170000 ffffffff81ec8000 0000000001ec8000 012c8000 2**12 ALLOC 28 .brk 00027000 ffffffff82038000 0000000002038000 012c8000 2**0 ALLOC Here, offsetA is 0x012c8000, with sizeA at 0x00170000 and sizeB at 0x00027000. The resulting run_size is 0x145f000: 0x012c8000 + 0x00170000 + 0x00027000 = 0x145f000 However, if we instead examine the ELF LOAD program headers, we see a different picture. [bhe@x1 linux]$ readelf -l vmlinux Elf file type is EXEC (Executable file) Entry point 0x1000000 There are 5 program headers, starting at offset 64 Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align LOAD 0x0000000000200000 0xffffffff81000000 0x0000000001000000 0x0000000000b5e000 0x0000000000b5e000 R E 200000 LOAD 0x0000000000e00000 0xffffffff81c00000 0x0000000001c00000 0x0000000000145000 0x0000000000145000 RW 200000 LOAD 0x0000000001000000 0x0000000000000000 0x0000000001d45000 0x0000000000018158 0x0000000000018158 RW 200000 LOAD 0x000000000115e000 0xffffffff81d5e000 0x0000000001d5e000 0x000000000016a000 0x0000000000301000 RWE 200000 NOTE 0x000000000099bcac 0xffffffff8179bcac 0x000000000179bcac 0x00000000000001bc 0x00000000000001bc 4 Section to Segment mapping: Segment Sections... 00 .text .notes __ex_table .rodata __bug_table .pci_fixup .tracedata __ksymtab __ksymtab_gpl __ksymtab_strings __init_rodata __param __modver 01 .data .vvar 02 .data..percpu 03 .init.text .init.data .x86_cpu_dev.init .parainstructions .altinstructions .altinstr_replacement .iommu_table .apicdrivers .exit.text .smp_locks .bss .brk 04 .notes As mentioned, run_size needs to be the size of the running kernel including .bss and .brk. We can see from the Section/Segment mapping above that .bss and .brk are included in segment 03 (which corresponds to the final LOAD program header). To find the run_size, we calculate the end of the LOAD segment from its PhysAddr start (0x0000000001d5e000) and its MemSiz (0x0000000000301000), minus the physical load address of the kernel (the first LOAD segment's PhysAddr: 0x0000000001000000). The resulting run_size is 0x105f000: 0x0000000001d5e000 + 0x0000000000301000 - 0x0000000001000000 = 0x105f000 So, from this we can see that the existing run_size calculation is 0x400000 too high. And, as it turns out, the correct run_size is actually equal to VO_end - VO_text, which is certainly easier to calculate. _end: 0xffffffff8205f000 _text:0xffffffff81000000 0xffffffff8205f000 - 0xffffffff81000000 = 0x105f000 As a result, run_size is a simple constant, so we don't need to pass it around; we already have voffset.h for such things. We can share voffset.h between misc.c and header.S instead of getting run_size in other ways. This patch moves voffset.h creation code to boot/compressed/Makefile, and switches misc.c to use the VO_end - VO_text calculation for run_size. Dependence before: boot/header.S ==> boot/voffset.h ==> vmlinux boot/header.S ==> compressed/vmlinux ==> compressed/misc.c Dependence after: boot/header.S ==> compressed/vmlinux ==> compressed/misc.c ==> boot/voffset.h ==> vmlinux Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Baoquan He <bhe@redhat.com> [ Rewrote the changelog. ] Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Young <dyoung@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Junjie Mao <eternal.n08@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: lasse.collin@tukaani.org Fixes: e6023367d779 ("x86, kaslr: Prevent .bss from overlaping initrd") Link: http://lkml.kernel.org/r/1461888548-32439-5-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-29x86/boot: Calculate decompression size during boot not buildYinghai Lu
Currently z_extract_offset is calculated in boot/compressed/mkpiggy.c. This doesn't work well because mkpiggy.c doesn't know the details of the decompressor in use. As a result, it can only make an estimation, which has risks: - output + output_len (VO) could be much bigger than input + input_len (ZO). In this case, the decompressed kernel plus relocs could overwrite the decompression code while it is running. - The head code of ZO could be bigger than z_extract_offset. In this case an overwrite could happen when the head code is running to move ZO to the end of buffer. Though currently the size of the head code is very small it's still a potential risk. Since there is no rule to limit the size of the head code of ZO, it runs the risk of suddenly becoming a (hard to find) bug. Instead, this moves the z_extract_offset calculation into header.S, and makes adjustments to be sure that the above two cases can never happen, and further corrects the comments describing the calculations. Since we have (in the previous patch) made ZO always be located against the end of decompression buffer, z_extract_offset is only used here to calculate an appropriate buffer size (INIT_SIZE), and is not longer used elsewhere. As such, it can be removed from voffset.h. Additionally clean up #if/#else #define to improve readability. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Baoquan He <bhe@redhat.com> [ Rewrote the changelog and comments. ] Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Young <dyoung@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: lasse.collin@tukaani.org Link: http://lkml.kernel.org/r/1461888548-32439-4-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-29x86/boot: Move compressed kernel to the end of the decompression bufferYinghai Lu
This change makes later calculations about where the kernel is located easier to reason about. To better understand this change, we must first clarify what 'VO' and 'ZO' are. These values were introduced in commits by hpa: 77d1a4999502 ("x86, boot: make symbols from the main vmlinux available") 37ba7ab5e33c ("x86, boot: make kernel_alignment adjustable; new bzImage fields") Specifically: All names prefixed with 'VO_': - relate to the uncompressed kernel image - the size of the VO image is: VO__end-VO__text ("VO_INIT_SIZE" define) All names prefixed with 'ZO_': - relate to the bootable compressed kernel image (boot/compressed/vmlinux), which is composed of the following memory areas: - head text - compressed kernel (VO image and relocs table) - decompressor code - the size of the ZO image is: ZO__end - ZO_startup_32 ("ZO_INIT_SIZE" define, though see below) The 'INIT_SIZE' value is used to find the larger of the two image sizes: #define ZO_INIT_SIZE (ZO__end - ZO_startup_32 + ZO_z_extract_offset) #define VO_INIT_SIZE (VO__end - VO__text) #if ZO_INIT_SIZE > VO_INIT_SIZE # define INIT_SIZE ZO_INIT_SIZE #else # define INIT_SIZE VO_INIT_SIZE #endif The current code uses extract_offset to decide where to position the copied ZO (i.e. ZO starts at extract_offset). (This is why ZO_INIT_SIZE currently includes the extract_offset.) Why does z_extract_offset exist? It's needed because we are trying to minimize the amount of RAM used for the whole act of creating an uncompressed, executable, properly relocation-linked kernel image in system memory. We do this so that kernels can be booted on even very small systems. To achieve the goal of minimal memory consumption we have implemented an in-place decompression strategy: instead of cleanly separating the VO and ZO images and also allocating some memory for the decompression code's runtime needs, we instead create this elaborate layout of memory buffers where the output (decompressed) stream, as it progresses, overlaps with and destroys the input (compressed) stream. This can only be done safely if the ZO image is placed to the end of the VO range, plus a certain amount of safety distance to make sure that when the last bytes of the VO range are decompressed, the compressed stream pointer is safely beyond the end of the VO range. z_extract_offset is calculated in arch/x86/boot/compressed/mkpiggy.c during the build process, at a point when we know the exact compressed and uncompressed size of the kernel images and can calculate this safe minimum offset value. (Note that the mkpiggy.c calculation is not perfect, because we don't know the decompressor used at that stage, so the z_extract_offset calculation is necessarily imprecise and is mostly based on gzip internals - we'll improve that in the next patch.) When INIT_SIZE is bigger than VO_INIT_SIZE (uncommon but possible), the copied ZO occupies the memory from extract_offset to the end of decompression buffer. It overlaps with the soon-to-be-uncompressed kernel like this: |-----compressed kernel image------| V V 0 extract_offset +INIT_SIZE |-----------|---------------|-------------------------|--------| | | | | VO__text startup_32 of ZO VO__end ZO__end ^ ^ |-------uncompressed kernel image---------| When INIT_SIZE is equal to VO_INIT_SIZE (likely) there's still space left from end of ZO to the end of decompressing buffer, like below. |-compressed kernel image-| V V 0 extract_offset +INIT_SIZE |-----------|---------------|-------------------------|--------| | | | | VO__text startup_32 of ZO ZO__end VO__end ^ ^ |------------uncompressed kernel image-------------| To simplify calculations and avoid special cases, it is cleaner to always place the compressed kernel image in memory so that ZO__end is at the end of the decompression buffer, instead of placing t at the start of extract_offset as is currently done. This patch adds BP_init_size (which is the INIT_SIZE as passed in from the boot_params) into asm-offsets.c to make it visible to the assembly code. Then when moving the ZO, it calculates the starting position of the copied ZO (via BP_init_size and the ZO run size) so that the VO__end will be at the end of the decompression buffer. To make the position calculation safe, the end of ZO is page aligned (and a comment is added to the existing VO alignment for good measure). Signed-off-by: Yinghai Lu <yinghai@kernel.org> [ Rewrote changelog and comments. ] Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Young <dyoung@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: lasse.collin@tukaani.org Link: http://lkml.kernel.org/r/1461888548-32439-3-git-send-email-keescook@chromium.org [ Rewrote the changelog some more. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-29x86/KASLR: Handle kernel relocations above 2G correctlyBaoquan He
When processing the relocation table, the offset used to calculate the relocation is an 'int'. This is sufficient for calculating the physical address of the relocs entry on 32-bit systems and on 64-bit systems when the relocation is under 2G. To handle relocations above 2G (seen in situations like kexec, netboot, etc), this offset needs to be calculated using a 'long' to avoid wrapping and miscalculating the relocation. Signed-off-by: Baoquan He <bhe@redhat.com> [ Rewrote the changelog. ] Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Young <dyoung@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: lasse.collin@tukaani.org Link: http://lkml.kernel.org/r/1461888548-32439-2-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-29Merge branch 'drm-fixes-4.6' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie
into drm-fixes A few fixes for 4.6. - revert amdgpu PX commit that was previously reverted on the radeon side - cleaned up version of the NI+ MC update display fix for radeon - TTM kref fix * 'drm-fixes-4.6' of git://people.freedesktop.org/~agd5f/linux: drm/amdgpu: disable vm interrupts with vm_fault_stop=2 drm/amdgpu: print a message if ATPX dGPU power control is missing Revert "drm/amdgpu: disable runtime pm on PX laptops without dGPU power control" drm/radeon: fix vertical bars appear on monitor (v2) drm/ttm: fix kref count mess in ttm_bo_move_to_lru_tail
2016-04-29Merge branch 'drm-vmwgfx-fixes' of ↵Dave Airlie
git://people.freedesktop.org/~syeh/repos_linux into drm-fixes three misc vmwgfx fixes * 'drm-vmwgfx-fixes' of git://people.freedesktop.org/~syeh/repos_linux: drm/vmwgfx: Fix order of operation drm/vmwgfx: use vmw_cmd_dx_cid_check for query commands. drm/vmwgfx: Enable SVGA_3D_CMD_DX_SET_PREDICATION
2016-04-28Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Two boot crash fixes and an IRQ handling crash fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Handle zero vector gracefully in clear_vector_irq() Revert "x86/mm/32: Set NX in __supported_pte_mask before enabling paging" xen/qspinlock: Don't kick CPU if IRQ is not initialized
2016-04-28Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "x86 PMU driver fixes plus a core code race fix" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Fix incorrect lbr_sel_mask value perf/x86/intel/pt: Don't die on VMXON perf/core: Fix perf_event_open() vs. execve() race perf/x86/amd: Set the size of event map array to PERF_COUNT_HW_MAX perf/core: Make sysctl_perf_cpu_time_max_percent conform to documentation perf/x86/intel/rapl: Add missing Haswell model perf/x86/intel: Add model number for Skylake Server to perf
2016-04-28Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "Two lockdep fixes" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: lockdep: Fix lock_chain::base size locking/lockdep: Fix ->irq_context calculation
2016-04-28Merge branch 'efi-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fix from Ingo Molnar: "This fixes a bug in the efivars code" * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi: Fix out-of-bounds read in variable_matches()
2016-04-28Merge tag 'media/v4.6-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "Some regression fixes: - videobuf2 core: avoid the risk of going past buffer on multi-planes and fix rw mode - fix support for 4K formats at V4L2 core - fix a trouble at davinci_fpe, caused by a bad patch - usbvision: revert a patch with a partial fixup. The fixup patch was merged already, and this one has some issues" * tag 'media/v4.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] vb2-memops: Fix over allocation of frame vectors [media] media: vb2: Fix regression on poll() for RW mode [media] v4l2-dv-timings.h: fix polarity for 4k formats [media] davinci_vpfe: Revert "staging: media: davinci_vpfe: remove,unnecessary ret variable" [media] usbvision: revert commit 588afcc1 [media] videobuf2-v4l2: Verify planes array in buffer dequeueing [media] videobuf2-core: Check user space planes array in dqbuf
2016-04-28Merge tag 'sound-4.6-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Usually we get a big collection of fixes for ASoC once during rc. And this is it. At this time, most of fixes are about Intel Skylake ASoC driver, which is a new and still on-going development. Along with it, a slight large LOC is seen in legacy HD-audio driver, but it's merely a code move to the upper layer. Other than that, the rest are small or trivial fixes to various drivers, in addition to an ASoC dapm debugfs code fix" * tag 'sound-4.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (24 commits) ALSA: hda - Update BCLK also at hotplug for i915 HSW/BDW ALSA: hda - Add dock support for ThinkPad X260 ASoC: wm5102: Free compressed IRQ in CODEC remove ASoC: arizona: Free speaker thermal IRQs in CODEC remove ASoC: Intel: Skylake: Fix ibs/obs calc for non-integral sampling rates ASoC: Intel: sst: fix a loop timeout in sst_hsw_stream_reset() ASoC: Intel: Skylake: Fix to turn OFF codec power when entering S3 ASoC: hdac_hdmi: Fix codec power state in S3 during playback ASoC: hdac_hdmi: Fix to use dev_pm ops instead soc pm ASoC: wm8962: Correct typo when setting DSPCLK rate ASoC: nau8825: Fix jack detection across suspend ASoC: Intel: Skylake: Fix DSP resource de-allocation ASoC: Intel: Skylake: Fix for unloading module only when it is loaded ASoC: Intel: Skylake: Fix kbuild dependency ASoC: dapm: Make sure we have a card when displaying component widgets ASoC: rt5640: Correct the digital interface data select ASoC: Intel: Skylake: remove call to pci_dev_put ASoC: Intel: Skylake: Call i915 exit last ASoC: Intel: Skylake: Unmap the address last ASoC: Intel: Skylake: Freeup properly on skl_dsp_free ...
2016-04-28Documentation/sysctl/vm.txt: update numa_zonelist_order descriptionXishi Qiu
Commit 3193913ce62c ("mm: page_alloc: default node-ordering on 64-bit NUMA, zone-ordering on 32-bit") changes the default value of numa_zonelist_order. Update the document. Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Rik van Riel <riel@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28lib/stackdepot.c: allow the stack trace hash to be zeroAlexander Potapenko
Do not bail out from depot_save_stack() if the stack trace has zero hash. Initially depot_save_stack() silently dropped stack traces with zero hashes, however there's actually no point in reserving this zero value. Reported-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28rapidio: fix potential NULL pointer dereferenceVladimir Zapolskiy
The change fixes improper check for a returned error value by class_create() function, which on error returns ERR_PTR() value, thus the original check always results in a dead code on error path. Signed-off-by: Vladimir Zapolskiy <vz@mleia.com> Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28mm/memory-failure: fix race with compound page split/mergeKonstantin Khlebnikov
get_hwpoison_page() must recheck relation between head and tail pages. n-horiguchi said: without this recheck, the race causes kernel to pin an irrelevant page, and finally makes kernel crash for refcount mismatch. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28ocfs2/dlm: return zero if deref_done message is successfully handledxuejiufei
dlm_deref_lockres_done_handler() should return zero if the message is successfully handled. Fixes: 60d663cb5273 ("ocfs2/dlm: add DEREF_DONE message"). Signed-off-by: xuejiufei <xuejiufei@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28Ananth has movedAnanth N Mavinakayanahalli
The current ID is going away soon... update email address Signed-off-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28kcov: don't profile branches in kcovAndrey Ryabinin
Profiling 'if' statements in __sanitizer_cov_trace_pc() leads to unbound recursion and crash: __sanitizer_cov_trace_pc() -> ftrace_likely_update -> __sanitizer_cov_trace_pc() ... Define DISABLE_BRANCH_PROFILING to disable this tracer. Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28kcov: don't trace the code coverage codeJames Morse
Kcov causes the compiler to add a call to __sanitizer_cov_trace_pc() in every basic block. Ftrace patches in a call to _mcount() to each function it has annotated. Letting these mechanisms annotate each other is a bad thing. Break the loop by adding 'notrace' to __sanitizer_cov_trace_pc() so that ftrace won't try to patch this code. This patch lets arm64 with KCOV and STACK_TRACER boot. Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Dmitry Vyukov <dvyukov@google.com> Cc: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28mm: wake kcompactd before kswapd's short sleepVlastimil Babka
When kswapd goes to sleep it checks if the node is balanced and at first it sleeps only for HZ/10 time, then rechecks if the node is still balanced and nobody has woken it during the initial sleep. Only then it goes fully sleep until an allocation slowpath wakes it up again. For higher-order allocations, waking up kcompactd is done only before the full sleep. This turns out to be an issue in case another high-order allocation fails during the initial sleep. It will wake kswapd up, however kswapd considers the zone balanced from the order-0 perspective, and will just quickly try to sleep again. So if there's a longer stream of high-order allocations hitting the slowpath and waking up kswapd, it might never actually wake up kcompactd, which may be considered a regression from kswapd-based compaction. In the worst case, it might be that a single allocation that cannot direct reclaim/compact itself is waking kswapd in the retry loop and preventing kcompactd from being woken up and unblocking it. This patch makes sure kcompactd is woken up in such situations by simply moving the wakeup before the short initial sleep. More efficient solution would be to wake kcompactd immediately instead of kswapd if the node is already order-0 balanced, but in that case we should also move reset_isolation_suitable() call to kcompactd so it's not adding to the allocator's latency. Since it's late in the 4.6 cycle, let's go with the simpler change for now. Fixes: accf62422b3a ("mm, kswapd: replace kswapd compaction with waking up kcompactd") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: David Rientjes <rientjes@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28.mailmap: add Frank RowandFrank Rowand
Set current email address to replace obsolete email addresses. Signed-off-by: Frank Rowand <frank.rowand@sonymobile.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28mm/hwpoison: fix wrong num_poisoned_pages accountingMinchan Kim
Currently, migration code increses num_poisoned_pages on *failed* migration page as well as successfully migrated one at the trial of memory-failure. It will make the stat wrong. As well, it marks the page as PG_HWPoison even if the migration trial failed. It would mean we cannot recover the corrupted page using memory-failure facility. This patches fixes it. Signed-off-by: Minchan Kim <minchan@kernel.org> Reported-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28mm: call swap_slot_free_notify() with page lock heldMinchan Kim
Kyeongdon reported below error which is BUG_ON(!PageSwapCache(page)) in page_swap_info. The reason is that page_endio in rw_page unlocks the page if read I/O is completed so we need to hold a PG_lock again to check PageSwapCache. Otherwise, the page can be removed from swapcache. Kernel BUG at c00f9040 [verbose debug info unavailable] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM Modules linked in: CPU: 4 PID: 13446 Comm: RenderThread Tainted: G W 3.10.84-g9f14aec-dirty #73 task: c3b73200 ti: dd192000 task.ti: dd192000 PC is at page_swap_info+0x10/0x2c LR is at swap_slot_free_notify+0x18/0x6c pc : [<c00f9040>] lr : [<c00f5560>] psr: 400f0113 sp : dd193d78 ip : c2deb1e4 fp : da015180 r10: 00000000 r9 : 000200da r8 : c120fe08 r7 : 00000000 r6 : 00000000 r5 : c249a6c0 r4 : = c249a6c0 r3 : 00000000 r2 : 40080009 r1 : 200f0113 r0 : = c249a6c0 ..<snip> .. Call Trace: page_swap_info+0x10/0x2c swap_slot_free_notify+0x18/0x6c swap_readpage+0x90/0x11c read_swap_cache_async+0x134/0x1ac swapin_readahead+0x70/0xb0 handle_pte_fault+0x320/0x6fc handle_mm_fault+0xc0/0xf0 do_page_fault+0x11c/0x36c do_DataAbort+0x34/0x118 Fixes: 3f2b1a04f44933f2 ("zram: revive swap_slot_free_notify") Signed-off-by: Minchan Kim <minchan@kernel.org> Tested-by: Kyeongdon Kim <kyeongdon.kim@lge.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28mm: vmscan: reclaim highmem zone if buffer_heads is over limitMinchan Kim
We have been reclaimed highmem zone if buffer_heads is over limit but commit 6b4f7799c6a5 ("mm: vmscan: invoke slab shrinkers from shrink_zone()") changed the behavior so it doesn't reclaim highmem zone although buffer_heads is over the limit. This patch restores the logic. Fixes: 6b4f7799c6a5 ("mm: vmscan: invoke slab shrinkers from shrink_zone()") Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28numa: fix /proc/<pid>/numa_maps for THPGerald Schaefer
In gather_pte_stats() a THP pmd is cast into a pte, which is wrong because the layouts may differ depending on the architecture. On s390 this will lead to inaccurate numa_maps accounting in /proc because of misguided pte_present() and pte_dirty() checks on the fake pte. On other architectures pte_present() and pte_dirty() may work by chance, but there may be an issue with direct-access (dax) mappings w/o underlying struct pages when HAVE_PTE_SPECIAL is set and THP is available. In vm_normal_page() the fake pte will be checked with pte_special() and because there is no "special" bit in a pmd, this will always return false and the VM_PFNMAP | VM_MIXEDMAP checking will be skipped. On dax mappings w/o struct pages, an invalid struct page pointer would then be returned that can crash the kernel. This patch fixes the numa_maps THP handling by introducing new "_pmd" variants of the can_gather_numa_stats() and vm_normal_page() functions. Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> [4.3+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28mm/huge_memory: replace VM_NO_THP VM_BUG_ON with actual VMA checkKonstantin Khlebnikov
Khugepaged detects own VMAs by checking vm_file and vm_ops but this way it cannot distinguish private /dev/zero mappings from other special mappings like /dev/hpet which has no vm_ops and popultes PTEs in mmap. This fixes false-positive VM_BUG_ON and prevents installing THP where they are not expected. Link: http://lkml.kernel.org/r/CACT4Y+ZmuZMV5CjSFOeXviwQdABAgT7T+StKfTqan9YDtgEi5g@mail.gmail.com Fixes: 78f11a255749 ("mm: thp: fix /dev/zero MAP_PRIVATE and vm_flags cleanups") Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28mailmap: fix Krzysztof Kozlowski's misspelled nameKrzysztof Kozlowski
Patchwork introduced a garbled Polish character in commit 1e3012d0fdc5 ("crypto: s5p-sss - Use memcpy_toio for iomem annotated memory") so fix the mail mapping. Additionally prefer to use kernel.org account for personal work, instead of my gmail address. Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28thp: keep huge zero page pinned until tlb flushKirill A. Shutemov
Andrea has found[1] a race condition on MMU-gather based TLB flush vs split_huge_page() or shrinker which frees huge zero under us (patch 1/2 and 2/2 respectively). With new THP refcounting, we don't need patch 1/2: mmu_gather keeps the page pinned until flush is complete and the pin prevents the page from being split under us. We still need patch 2/2. This is simplified version of Andrea's patch. We don't need fancy encoding. [1] http://lkml.kernel.org/r/1447938052-22165-1-git-send-email-aarcange@redhat.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: Andrea Arcangeli <aarcange@redhat.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28mm: exclude HugeTLB pages from THP page_mapped() logicSteve Capper
HugeTLB pages cannot be split, so we use the compound_mapcount to track rmaps. Currently page_mapped() will check the compound_mapcount, but will also go through the constituent pages of a THP compound page and query the individual _mapcount's too. Unfortunately, page_mapped() does not distinguish between HugeTLB and THP compound pages and assumes that a compound page always needs to have HPAGE_PMD_NR pages querying. For most cases when dealing with HugeTLB this is just inefficient, but for scenarios where the HugeTLB page size is less than the pmd block size (e.g. when using contiguous bit on ARM) this can lead to crashes. This patch adjusts the page_mapped function such that we skip the unnecessary THP reference checks for HugeTLB pages. Fixes: e1534ae95004 ("mm: differentiate page_mapped() from page_mapcount() for compound pages") Signed-off-by: Steve Capper <steve.capper@arm.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28kexec: export OFFSET(page.compound_head) to find out compound tail pageAtsushi Kumagai
PageAnon() always look at head page to check PAGE_MAPPING_ANON and tail page's page->mapping has just a poisoned data since commit 1c290f642101 ("mm: sanitize page->mapping for tail pages"). If makedumpfile checks page->mapping of a compound tail page to distinguish anonymous page as usual, it must fail in newer kernel. So it's necessary to export OFFSET(page.compound_head) to avoid checking compound tail pages. The problem is that unnecessary hugepages won't be removed from a dump file in kernels 4.5.x and later. This means that extra disk space would be consumed. It's a problem, but not critical. Signed-off-by: Atsushi Kumagai <ats-kumagai@wm.jp.nec.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28kexec: update VMCOREINFO for compound_order/dtorAtsushi Kumagai
makedumpfile refers page.lru.next to get the order of compound pages for page filtering. However, now the order is stored in page.compound_order, hence VMCOREINFO should be updated to export the offset of page.compound_order. The fact is, page.compound_order was introduced already in kernel 4.0, but the offset of it was the same as page.lru.next until kernel 4.3, so this was not actual problem. The above can be said also for page.lru.prev and page.compound_dtor, it's necessary to detect hugetlbfs pages. Further, the content was changed from direct address to the ID which means dtor. The problem is that unnecessary hugepages won't be removed from a dump file in kernels 4.4.x and later. This means that extra disk space would be consumed. It's a problem, but not critical. Signed-off-by: Atsushi Kumagai <ats-kumagai@wm.jp.nec.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fixes from Sage Weil: "There is a lifecycle fix in the auth code, a fix for a narrow race condition on map, and a helpful message in the log when there is a feature mismatch (which happens frequently now that the default server-side options have changed)" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: rbd: report unsupported features to syslog rbd: fix rbd map vs notify races libceph: make authorizer destruction independent of ceph_auth_client
2016-04-28Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "Three more bug fixes for 4.6 - Due to a race in the dynamic page table code a multi-threaded program can cause a translation specification exception. With panic_on_oops a user space program can crash the system. - An information leak with the /dev/sclp device. - A use after free in the s390 PCI code" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/sclp_ctl: fix potential information leak with /dev/sclp s390/mm: fix asce_bits handling with dynamic pagetable levels s390/pci: fix use after free in dma_init
2016-04-28RDMA/nes: don't leak skb if carrier downFlorian Westphal
Alternatively one could free the skb, OTOH I don't think this test is useful so just remove it. Cc: <linux-rdma@vger.kernel.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-04-28Merge branch 'bpf-fixes'David S. Miller
Alexei Starovoitov says: ==================== bpf: fix several bugs First two patches address bugs found by Jann Horn. Last patch is a minor samples fix spotted during the testing. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28samples/bpf: fix trace_output exampleAlexei Starovoitov
llvm cannot always recognize memset as builtin function and optimize it away, so just delete it. It was a leftover from testing of bpf_perf_event_output() with large data structures. Fixes: 39111695b1b8 ("samples: bpf: add bpf_perf_event_output example") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28bpf: fix check_map_func_compatibility logicAlexei Starovoitov
The commit 35578d798400 ("bpf: Implement function bpf_perf_event_read() that get the selected hardware PMU conuter") introduced clever way to check bpf_helper<->map_type compatibility. Later on commit a43eec304259 ("bpf: introduce bpf_perf_event_output() helper") adjusted the logic and inadvertently broke it. Get rid of the clever bool compare and go back to two-way check from map and from helper perspective. Fixes: a43eec304259 ("bpf: introduce bpf_perf_event_output() helper") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28bpf: fix refcnt overflowAlexei Starovoitov
On a system with >32Gbyte of phyiscal memory and infinite RLIMIT_MEMLOCK, the malicious application may overflow 32-bit bpf program refcnt. It's also possible to overflow map refcnt on 1Tb system. Impose 32k hard limit which means that the same bpf program or map cannot be shared by more than 32k processes. Fixes: 1be7f75d1668 ("bpf: enable non-root eBPF programs") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28Merge branch 'cpsw-phy-handle-fixes'David S. Miller
David Rivshin says: ==================== drivers: net: cpsw: phy-handle fixes This series fixes a number of related issues around using phy-handle properties in cpsw emac nodes. Patch 1 fixes a bug if more than one slave is used, and either slave uses the phy-handle property in the devicetree. Patch 2 fixes a NULL pointer dereference which can occur if a phy-handle property is used and of_phy_connect() return NULL, such as with a bad devicetree. Patch 3 fixes an issue where the phy-mode property would be ignored if a phy-handle property was used. This also fixes a bogus error message that would be emitted. Patch 4 fixes makes the binding documentation more explicit that exactly one PHY property should be used, and also marks phy_id as deprecated. Patch 5 cleans up the fixed-link case to work like the now-fixed phy-handle case. I have tested on the following hardware configurations: - (EVMSK) dual emac, phy_id property in both slaves - (EVMSK) dual emac, phy-handle property in both slaves - (EVMSK) a bad phy-handle property pointing to &mmc1 - (EVMSK) phy_id property with incorrect PHY address - (BeagleBoneBlack) single emac, phy_id property - (custom) single emac, fixed-link subnode Andrew Goodbody reported testing v2 on a board that doesn't use dual_emac mode, but with 2 PHYs using phy-handle properties [1]. Nicolas Chauvet reported testing v2 on an HP t410 (dm8148). Markus Brunner reported testing v1 on the following [2]: - emac0 with phy_id and emac1 with fixed phy - emac0 with phy-handle and emac1 with fixed phy - emac0 with fixed phy and emac1 with fixed phy [1] https://lkml.org/lkml/2016/4/22/537 [2] http://www.spinics.net/lists/netdev/msg357890.html ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28drivers: net: cpsw: use of_phy_connect() in fixed-link caseDavid Rivshin
If a fixed-link DT subnode is used, the phy_device was looked up so that a PHY ID string could be constructed and passed to phy_connect(). This is not necessary, as the device_node can be passed directly to of_phy_connect() instead. This reuses the same codepath as if the phy-handle DT property was used. Signed-off-by: David Rivshin <drivshin@allworx.com> Tested-by: Nicolas Chauvet <kwizart@gmail.com> Tested-by: Andrew Goodbody <andrew.goodbody@cambrionix.com> Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28dt: cpsw: phy-handle, phy_id, and fixed-link are mutually exclusiveDavid Rivshin
The phy-handle, phy_id, and fixed-link properties are mutually exclusive, and only one need be specified. Make this clear in the binding doc. Also mark the phy_id property as deprecated, as phy-handle should be used instead. Signed-off-by: David Rivshin <drivshin@allworx.com> Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28drivers: net: cpsw: don't ignore phy-mode if phy-handle is usedDavid Rivshin
The phy-mode emac property was only being processed in the phy_id or fixed-link cases. However if phy-handle was specified instead, an error message would complain about the lack of phy_id or fixed-link, and then jump past the of_get_phy_mode(). This would result in the PHY mode defaulting to MII, regardless of what the devicetree specified. Fixes: 9e42f715264f ("drivers: net: cpsw: add phy-handle parsing") Signed-off-by: David Rivshin <drivshin@allworx.com> Tested-by: Nicolas Chauvet <kwizart@gmail.com> Tested-by: Andrew Goodbody <andrew.goodbody@cambrionix.com> Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28drivers: net: cpsw: fix segfault in case of bad phy-handleDavid Rivshin
If an emac node has a phy-handle property that points to something which is not a phy, then a segmentation fault will occur when the interface is brought up. This is because while phy_connect() will return ERR_PTR() on failure, of_phy_connect() will return NULL. The common error check uses IS_ERR(), and so missed when of_phy_connect() fails. The NULL pointer is then dereferenced. Also, the common error message referenced slave->data->phy_id, which would be empty in the case of phy-handle. Instead, use the name of the device_node as a useful identifier. And in the phy_id case add the error code for completeness. Fixes: 9e42f715264f ("drivers: net: cpsw: add phy-handle parsing") Signed-off-by: David Rivshin <drivshin@allworx.com> Signed-off-by: David S. Miller <davem@davemloft.net>