summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-10-28perf ebpf: Add the libbpf glueWang Nan
The 'bpf-loader.[ch]' files are introduced in this patch. Which will be the interface between perf and libbpf. bpf__prepare_load() resides in bpf-loader.c. Following patches will enrich these two files. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1444826502-49291-3-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-28perf tools: Make perf depend on libbpfWang Nan
By adding libbpf into perf's Makefile, this patch enables perf to build libbpf if libelf is found and neither NO_LIBELF nor NO_LIBBPF is set. The newly introduced code is similar to how libapi and libtraceevent are wired into Makefile.perf. MANIFEST is also updated for 'make perf-*-src-pkg'. Append make_no_libbpf to tools/perf/tests/make. The 'bpf' feature check is appended into default FEATURE_TESTS and FEATURE_DISPLAY, so perf will check the API version of bpf in /path/to/kernel/include/uapi/linux/bpf.h. Which should not fail except when we are trying to port this code to an old kernel. Error messages are also updated to notify users about the lack of BPF support in 'perf record' if libelf is missing or the BPF API check failed. tools/lib/bpf is added to TAG_FOLDERS to allow us to navigate libbpf files when working on perf using tools/perf/tags. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1444826502-49291-2-git-send-email-wangnan0@huawei.com [ Document NO_LIBBPF in Makefile.perf, noted by Jiri Olsa ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-28clocksource/drivers/sh_mtu2: Fix multiple shutdown call issueMagnus Damm
On the r7s72100 Genmai board the MTU2 driver currently triggers a common clock framework WARN_ON(enable_count) when disabling the clock due to the MTU2 driver after recent callback rework may call ->set_state_shutdown() multiple times. A similar issue was spotted for the TMU driver and fixed in: 452b132 clocksource/drivers/sh_tmu: Fix traceback spotted in -next On r7s72100 Genmai v4.3-rc7 built with shmobile_defconfig spits out the following during boot: sh_mtu2 fcff0000.timer: ch0: used for clock events ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1 at drivers/clk/clk.c:675 clk_core_disable+0x2c/0x6c() CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.3.0-rc7 #1 Hardware name: Generic R7S72100 (Flattened Device Tree) Backtrace: [<c00133d4>] (dump_backtrace) from [<c0013570>] (show_stack+0x18/0x1c) [<c0013558>] (show_stack) from [<c01c7aac>] (dump_stack+0x74/0x90) [<c01c7a38>] (dump_stack) from [<c00272fc>] (warn_slowpath_common+0x88/0xb4) [<c0027274>] (warn_slowpath_common) from [<c0027400>] (warn_slowpath_null+0x24/0x2c) [<c00273dc>] (warn_slowpath_null) from [<c03a9320>] (clk_core_disable+0x2c/0x6c) [<c03a92f4>] (clk_core_disable) from [<c03aa0a0>] (clk_disable+0x40/0x4c) [<c03aa060>] (clk_disable) from [<c0395d2c>] (sh_mtu2_disable+0x24/0x50) [<c0395d08>] (sh_mtu2_disable) from [<c0395d6c>] (sh_mtu2_clock_event_shutdown+0x14/0x1c) [<c0395d58>] (sh_mtu2_clock_event_shutdown) from [<c007d7d0>] (clockevents_switch_state+0xc8/0x114) [<c007d708>] (clockevents_switch_state) from [<c007d834>] (clockevents_shutdown+0x18/0x28) [<c007d81c>] (clockevents_shutdown) from [<c007dd58>] (clockevents_exchange_device+0x70/0x78) [<c007dce8>] (clockevents_exchange_device) from [<c007e578>] (tick_check_new_device+0x88/0xe0) [<c007e4f0>] (tick_check_new_device) from [<c007daf0>] (clockevents_register_device+0xac/0x120) [<c007da44>] (clockevents_register_device) from [<c0395be8>] (sh_mtu2_probe+0x230/0x350) [<c03959b8>] (sh_mtu2_probe) from [<c028b6f0>] (platform_drv_probe+0x50/0x98) Reported-by: Chris Brandt <chris.brandt@renesas.com> Fixes: 19a9ffb ("clockevents/drivers/sh_mtu2: Migrate to new 'set-state' interface") Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Magnus Damm <damm+renesas@opensource.se> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>
2015-10-28ARC: mm: PAE40: tlbex.S: Explicitify the size of pte_tVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: mm: PAE40: switch to using phys_addr_t for physical addressesVineet Gupta
That way a single flip of phys_addr_t to 64 bit ensures all places dealing with physical addresses get correct data Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: mm: HIGHMEM: populate high memory from DTVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28perf symbols: Fix endless loop in dso__split_kallsyms_for_kcoreJiri Olsa
Currently we split symbols based on the map comparison, but symbols are stored within dso objects and maps could point into same dso objects (kernel maps). Hence we could end up changing rbtree we are currently iterating and mess it up. It's easily reproduced on s390x by running: $ perf record -a -- sleep 3 $ perf buildid-list -i perf.data --with-hits The fix is to compare dso objects instead. Reported-by: Michael Petlan <mpetlan@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20151026135130.GA26003@krava.brq.redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-28perf tools: Enable pre-event inherit setting by config termsWang Nan
This patch allows perf record setting event's attr.inherit bit by config terms like: # perf record -e cycles/no-inherit/ ... # perf record -e cycles/inherit/ ... So user can control inherit bit for each event separately. In following example, a.out fork()s in main then do some complex CPU intensive computations in both of its children. Basic result with and without inherit: # perf record -e cycles -e instructions ./a.out [ perf record: Woken up 9 times to write data ] [ perf record: Captured and wrote 2.205 MB perf.data (47920 samples) ] # perf report --stdio # ... # Samples: 23K of event 'cycles' # Event count (approx.): 23641752891 ... # Samples: 24K of event 'instructions' # Event count (approx.): 30428312415 # perf record -i -e cycles -e instructions ./a.out [ perf record: Woken up 5 times to write data ] [ perf record: Captured and wrote 1.111 MB perf.data (24019 samples) ] ... # Samples: 12K of event 'cycles' # Event count (approx.): 11699501775 ... # Samples: 12K of event 'instructions' # Event count (approx.): 15058023559 Cancel inherit for one event when globally enable: # perf record -e cycles/no-inherit/ -e instructions ./a.out [ perf record: Woken up 7 times to write data ] [ perf record: Captured and wrote 1.660 MB perf.data (36004 samples) ] ... # Samples: 12K of event 'cycles/no-inherit/' # Event count (approx.): 11895759282 ... # Samples: 24K of event 'instructions' # Event count (approx.): 30668000441 Enable inherit for one event when globally disable: # perf record -i -e cycles/inherit/ -e instructions ./a.out [ perf record: Woken up 7 times to write data ] [ perf record: Captured and wrote 1.654 MB perf.data (35868 samples) ] ... # Samples: 23K of event 'cycles/inherit/' # Event count (approx.): 23285400229 ... # Samples: 11K of event 'instructions' # Event count (approx.): 14969050259 Committer note: One can check if the bit was set, in addition to seeing the result in the perf.data file size as above by doing one of: # perf record -e cycles -e instructions -a usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.911 MB perf.data (63 samples) ] # perf evlist -v cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1, exclude_guest: 1 # So, the inherit bit was set in both, now, if we disable it globally using --no-inherit: # perf record --no-inherit -e cycles -e instructions -a usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.910 MB perf.data (56 samples) ] # perf evlist -v cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1 No inherit bit set, then disabling it and setting just on the cycles event: # perf record --no-inherit -e cycles/inherit/ -e instructions -a usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.909 MB perf.data (48 samples) ] # perf evlist -v cycles/inherit/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1 # We can see it as well in by using a more verbose level of debug messages in the tool that sets up the perf_event_attr, 'perf record' in this case: [root@zoo ~]# perf record -vv --no-inherit -e cycles/inherit/ -e instructions -a usleep 1 ------------------------------------------------------------ perf_event_attr: size 112 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|ID|CPU|PERIOD read_format ID disabled 1 inherit 1 mmap 1 comm 1 freq 1 task 1 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 ------------------------------------------------------------ perf_event_attr: size 112 config 0x1 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|ID|CPU|PERIOD read_format ID disabled 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 <SNIP> Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Li Zefan <lizefan@huawei.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1446029705-199659-2-git-send-email-wangnan0@huawei.com [ s/u64/bool/ for the perf_evsel_config_term inherit field - jolsa] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-28ARC: mm: HIGHMEM: kmap API implementationVineet Gupta
Implement kmap* API for ARC. This enables - permanent kernel maps (pkmaps): :kmap() API - fixmap : kmap_atomic() We use a very simple/uniform approach for both (unlike some of the other arches). So fixmap doesn't use the customary compile time address stuff. The important semantic is sleep'ability (pkmap) vs. not (fixmap) which the API guarantees. Note that this patch only enables highmem for subsequent PAE40 support as there is no real highmem for ARC in pure 32-bit paradigm as explained below. ARC has 2:2 address split of the 32-bit address space with lower half being translated (virtual) while upper half unstranslated (0x8000_0000 to 0xFFFF_FFFF). kernel itself is linked at base of unstranslated space (i.e. 0x8000_0000 onwards), which is mapped to say DDR 0x0 by external Bus Glue logic (outside the core). So kernel can potentially access 1.75G worth of memory directly w/o need for highmem. (the top 256M is taken by uncached peripheral space from 0xF000_0000 to 0xFFFF_FFFF) In PAE40, hardware can address memory beyond 4G (0x1_0000_0000) while the logical/virtual addresses remain 32-bits. Thus highmem is required for kernel proper to be able to access these pages for it's own purposes (user space is agnostic to this anyways). Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: mm: preps ahead of HIGHMEM support #2Vineet Gupta
Explicit'ify that all memory added so far is low memory Nothing semantical Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: mm: preps ahead of HIGHMEM supportVineet Gupta
Before we plug in highmem support, some of code needs to be ready for it - copy_user_highpage() needs to be using the kmap_atomic API - mk_pte() can't assume page_address() - do_page_fault() can't assume VMALLOC_END is end of kernel vaddr space Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: mm: use generic macros _BITUL()/_AC()Alexey Brodkin
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: mm: Improve Duplicate PD Fault handlerVineet Gupta
- Move the verbosity knob from .data to .bss by using inverted logic - No need to readout PD1 descriptor - clip the non pfn bits of PD0 to avoid clipping inside the loop Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28perf symbols: we can now read separate debug-info files based on a build IDDima Kogan
Recent GDB (at least on a vanilla Debian box) looks for debug information in /usr/lib/debug/.build-id/nn/nnnnnnn where nn/nnnnnn is the build-id of the stripped ELF binary. This is documented here: https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html This was not working in perf because we didn't read the build id until AFTER we searched for the separate debug information file. This patch reads the build ID and THEN does the search. Signed-off-by: Dima Kogan <dima@secretsauce.net> Link: http://lkml.kernel.org/r/87si6pfwz4.fsf@secretsauce.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-28perf symbols: Fix type error when reading a build-idDima Kogan
This was benign, but wrong. The build-id should live in a char[], not a char*[] Signed-off-by: Dima Kogan <dima@secretsauce.net> Link: http://lkml.kernel.org/r/87si6pfwz4.fsf@secretsauce.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-28fs/writeback, rcu: Don't use list_entry_rcu() for pointer offsetting in ↵Tejun Heo
bdi_split_work_to_wbs() bdi_split_work_to_wbs() uses list_for_each_entry_rcu_continue() to walk @bdi->wb_list. To set up the initial iteration condition, it uses list_entry_rcu() to calculate the entry pointer corresponding to the list head; however, this isn't an actual RCU dereference and using list_entry_rcu() for it ended up breaking a proposed list_entry_rcu() change because it was feeding an non-lvalue pointer into the macro. Don't use the RCU variant for simple pointer offsetting. Use list_entry() instead. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Darren Hart <dvhart@linux.intel.com> Cc: David Howells <dhowells@redhat.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Patrick Marlier <patrick.marlier@gmail.com> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: pranith kumar <bobby.prani@gmail.com> Link: http://lkml.kernel.org/r/20151027051939.GA19355@mtj.duckdns.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-28Merge branch 'linus' into core/rcu, to fix up a semantic conflictIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-28efi: Fix warning of int-to-pointer-cast on x86 32-bit buildsTaku Izumi
Commit: 0f96a99dab36 ("efi: Add "efi_fake_mem" boot option") introduced the following warning message: drivers/firmware/efi/fake_mem.c:186:20: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] new_memmap_phy was defined as a u64 value and cast to void*, causing a int-to-pointer-cast warning on x86 32-bit builds. However, since the void* type is inappropriate for a physical address, the definition of struct efi_memory_map::phys_map has been changed to phys_addr_t in the previous patch, and so the cast can be dropped entirely. This patch also changes the type of the "new_memmap_phy" variable from "u64" to "phys_addr_t" to align with the types of memblock_alloc() and struct efi_memory_map::phys_map. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com> [ Removed void* cast, updated commit log] Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kamezawa.hiroyu@jp.fujitsu.com Cc: linux-efi@vger.kernel.org Cc: matt.fleming@intel.com Link: http://lkml.kernel.org/r/1445593697-1342-2-git-send-email-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-28efi: Use correct type for struct efi_memory_map::phys_mapArd Biesheuvel
We have been getting away with using a void* for the physical address of the UEFI memory map, since, even on 32-bit platforms with 64-bit physical addresses, no truncation takes place if the memory map has been allocated by the firmware (which only uses 1:1 virtually addressable memory), which is usually the case. However, commit: 0f96a99dab36 ("efi: Add "efi_fake_mem" boot option") adds code that clones and modifies the UEFI memory map, and the clone may live above 4 GB on 32-bit platforms. This means our use of void* for struct efi_memory_map::phys_map has graduated from 'incorrect but working' to 'incorrect and broken', and we need to fix it. So redefine struct efi_memory_map::phys_map as phys_addr_t, and get rid of a bunch of casts that are now unneeded. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: izumi.taku@jp.fujitsu.com Cc: kamezawa.hiroyu@jp.fujitsu.com Cc: linux-efi@vger.kernel.org Cc: matt.fleming@intel.com Link: http://lkml.kernel.org/r/1445593697-1342-1-git-send-email-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-28MAINTAINERS: Add public mailing list for ARCVineet Gupta
Cc: <stable@vger.kernel.org> #3.9+ Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: Ensure DT mem base is same as what kernel is built withVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: boot: Non Master cpus only need to call EARLY_CPU_SETUP onceVineet Gupta
With prev fixes, all cores now start via common entry point @stext which already calls EARLY_CPU_SETUP for all cores - so no need to invoke it again Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARCv2: smp: [plat-*]: No need to explicitly call mcip_init_smp()Vineet Gupta
MCIP now registers it's own per cpu setup routine (for IPI IRQ request) using smp_ops.init_irq_cpu(). So no need for platforms to do that. This now completely decouples platforms from MCIP. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: smp: Introduce smp hook @init_irq_cpu called for all coresVineet Gupta
Note this is not part of platform owned static machine_desc, but more of device owned plat_smp_ops (rather misnamed) which a IPI provider or some such typically defines. This will help us seperate out the IPI registration from platform specific init_cpu_smp() into device specific init_irq_cpu() Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: smp: Rename platform hook @init_smp -> @init_cpu_smpVineet Gupta
This conveys better that it is called for each cpu Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARCv2: smp: [plat-*]: No need to explicitly call mcip_init_early_smp()Vineet Gupta
MCIP now registers it's own probe callback with smp_ops.init_early_smp() which is called by ARC common code, so no need for platforms to do that. This decouples the platforms and MCIP and helps confine MCIP details to it's own file. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: smp: Introduce smp hook @init_early_smp for Master coreVineet Gupta
This adds a platform agnostic early SMP init hook which is called on Master core before calling setup_processor() setup_arch() smp_init_cpus() smp_ops.init_early_smp() ... setup_processor() How this helps: - Used for one time init of certain SMP centric IP blocks, before calling setup_processor() which probes various bits of core, possibly including this block - Currently platforms need to call this IP block init from their init routines, which doesn't make sense as this is specific to ARC core and not platform and otherwise requires copy/paste in all (and hence a possible point of failure) e.g. MCIP init is called from 2 platforms currently (axs10x and sim) which will go away once we have this. This change only adds the hooks but they are empty for now. Next commit will populate them and remove the explicit init calls from platforms. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: remove @init_time, @init_irq platform callbacksVineet Gupta
These are not in use for ARC platforms. Moreover DT mechanims exist to probe them w/o explicit platform calls. - clocksource drivers can use CLOCKSOURCE_OF_DECLARE() - intc IRQCHIP_DECLARE() calls + cascading inside DT allows external intc to be probed automatically Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: smp: irqchip: handle IPI as percpu irq like timerVineet Gupta
The reason this was not done so far was lack of genuine IPI_IRQ for ARC700, as we don't have a SMP version of core yet (which might change soon thx to EZChip). Nevertheles to increase the build coverage, we need to allow CONFIG_SMP for ARC700 and still be able to run it on a UP platform (nsim or AXS101) with a UP Device Tree (SMP-on-UP) The build itself requires some define for IPI_IRQ and even a dummy value is fine since that code won't run anyways. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: boot: Support Halt-on-reset and Run-on-reset SMP booting modesVineet Gupta
For Run-on-reset, non masters need to spin wait. For Halt-on-reset they can jump to entry point directly. Also while at it, made reset vector handler as "the" entry point for kernel including host debugger based boot (which uses the ELF header entry point) Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28Merge tag 'powerpc-4.3-6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fix from Michael Ellerman: - powerpc/dma: dma_set_coherent_mask() should not be GPL only from Ben * tag 'powerpc-4.3-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/dma: dma_set_coherent_mask() should not be GPL only
2015-10-28powerpc/dma: dma_set_coherent_mask() should not be GPL onlyBenjamin Herrenschmidt
When turning this from inline to an exported function I was a bit over-eager and made it GPL only. This prevents the use of pretty much all non-GPL PCI driver which is a bit over the top. Let's bring it back in line with other architecture. Fixes: 817820b0226a ("powerpc/iommu: Support "hybrid" iommu/direct DMA ops for coherent_mask < dma_mask") Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-27Merge branch 'mlx4-fixes'David S. Miller
Or Gerlitz says: ==================== Mellanox mlx4 driver fixes for 4.3-rc7 Jack's fix is for a regression introduced in 4.3-rc1 Carol's fix addresses an issue which exists for while and turns to beat us hard on PPC, please queue for -stable. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27net/mlx4: Copy/set only sizeof struct mlx4_eqe bytesCarol L Soto
When doing memcpy/memset of EQEs, we should use sizeof struct mlx4_eqe as the base size and not caps.eqe_size which could be bigger. If caps.eqe_size is bigger than the struct mlx4_eqe then we corrupt data in the master context. When using a 64 byte stride, the memcpy copied over 63 bytes to the slave_eq structure. This resulted in copying over the entire eqe of interest, including its ownership bit -- and also 31 bytes of garbage into the next WQE in the slave EQ -- which did NOT include the ownership bit (and therefore had no impact). However, once the stride is increased to 128, we are overwriting the ownership bits of *three* eqes in the slave_eq struct. This results in an incorrect ownership bit for those eqes, which causes the eq to seem to be full. The issue therefore surfaced only once 128-byte EQEs started being used in SRIOV and (overarchitectures that have 128/256 byte cache-lines such as PPC) - e.g after commit 77507aa249ae "net/mlx4_core: Enable CQE/EQE stride support". Fixes: 08ff32352d6f ('mlx4: 64-byte CQE/EQE support') Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27net/mlx4_en: Explicitly set no vlan tags in WQE ctrl segment when no vlan is ↵Jack Morgenstein
present We do not set the ins_vlan field to zero when no vlan id is present in the packet. Since WQEs in the TX ring are not zeroed out between uses, this oversight could result in having vlan flags present in the WQE ctrl segment when no vlan is preset. Fixes: e38af4faf01d ('net/mlx4_en: Add support for hardware accelerated 802.1ad vlan') Reported-by: Gideon Naim <gideonn@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27vhost: fix performance on LE hostsMichael S. Tsirkin
commit 2751c9882b947292fcfb084c4f604e01724af804 ("vhost: cross-endian support for legacy devices") introduced a minor regression: even with cross-endian disabled, and even on LE host, vhost_is_little_endian is checking is_le flag so there's always a branch. To fix, simply check virtio_legacy_is_little_endian first. Cc: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27bpf: sample: define aarch64 specific registersYang Shi
Define aarch64 specific registers for building bpf samples correctly. Signed-off-by: Yang Shi <yang.shi@linaro.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27amd-xgbe: Fix race between access of desc and desc indexLendacky, Thomas
During Tx cleanup it's still possible for the descriptor data to be read ahead of the descriptor index. A memory barrier is required between the read of the descriptor index and the start of the Tx cleanup loop. This allows a change to a lighter-weight barrier in the Tx transmit routine just before updating the current descriptor index. Since the memory barrier does result in extra overhead on arm64, keep the previous change to not chase the current descriptor value. This prevents the execution of the barrier for each loop performed. Suggested-by: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27RDS-TCP: Recover correctly from pskb_pull()/pksb_trim() failure in ↵Sowmini Varadhan
rds_tcp_data_recv Either of pskb_pull() or pskb_trim() may fail under low memory conditions. If rds_tcp_data_recv() ignores such failures, the application will receive corrupted data because the skb has not been correctly carved to the RDS datagram size. Avoid this by handling pskb_pull/pskb_trim failure in the same manner as the skb_clone failure: bail out of rds_tcp_data_recv(), and retry via the deferred call to rds_send_worker() that gets set up on ENOMEM from rds_tcp_read_sock() Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27forcedeth: fix unilateral interrupt disabling in netpoll pathNeil Horman
Forcedeth currently uses disable_irq_lockdep and enable_irq_lockdep, which in some configurations simply calls local_irq_disable. This causes errant warnings in the netpoll path as in netpoll_send_skb_on_dev, where we disable irqs using local_irq_save, leading to the following warning: WARNING: at net/core/netpoll.c:352 netpoll_send_skb_on_dev+0x243/0x250() (Not tainted) Hardware name: netpoll_send_skb_on_dev(): eth0 enabled interrupts in poll (nv_start_xmit_optimized+0x0/0x860 [forcedeth]) Modules linked in: netconsole(+) configfs ipv6 iptable_filter ip_tables ppdev parport_pc parport sg microcode serio_raw edac_core edac_mce_amd k8temp snd_hda_codec_realtek snd_hda_codec_generic forcedeth snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc i2c_nforce2 i2c_core shpchp ext4 jbd2 mbcache sr_mod cdrom sd_mod crc_t10dif pata_amd ata_generic pata_acpi sata_nv dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan] Pid: 1940, comm: modprobe Not tainted 2.6.32-573.7.1.el6.x86_64.debug #1 Call Trace: [<ffffffff8107bbc1>] ? warn_slowpath_common+0x91/0xe0 [<ffffffff8107bcc6>] ? warn_slowpath_fmt+0x46/0x60 [<ffffffffa00fe5b0>] ? nv_start_xmit_optimized+0x0/0x860 [forcedeth] [<ffffffff814b3593>] ? netpoll_send_skb_on_dev+0x243/0x250 [<ffffffff814b37c9>] ? netpoll_send_udp+0x229/0x270 [<ffffffffa02e3299>] ? write_msg+0x39/0x110 [netconsole] [<ffffffffa02e331b>] ? write_msg+0xbb/0x110 [netconsole] [<ffffffff8107bd55>] ? __call_console_drivers+0x75/0x90 [<ffffffff8107bdba>] ? _call_console_drivers+0x4a/0x80 [<ffffffff8107c445>] ? release_console_sem+0xe5/0x250 [<ffffffff8107d200>] ? register_console+0x190/0x3e0 [<ffffffffa02e71a6>] ? init_netconsole+0x1a6/0x216 [netconsole] [<ffffffffa02e7000>] ? init_netconsole+0x0/0x216 [netconsole] [<ffffffff810020d0>] ? do_one_initcall+0xc0/0x280 [<ffffffff810d4933>] ? sys_init_module+0xe3/0x260 [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b ---[ end trace f349c7af88e6a6d5 ]--- console [netcon0] enabled netconsole: network logging started Fix it by modifying the forcedeth code to use disable_irq_nosync_lockdep_irqsavedisable_irq_nosync_lockdep_irqsave instead, which saves and restores irq state properly. This also saves us a little code in the process Tested by the reporter, with successful restuls Patch applies to the head of the net tree Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: "David S. Miller" <davem@davemloft.net> Reported-by: Vasily Averin <vvs@sw.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27openvswitch: Fix skb leak using IPv6 defragJoe Stringer
nf_ct_frag6_gather() makes a clone of each skb passed to it, and if the reassembly is successful, expects the caller to free all of the original skbs using nf_ct_frag6_consume_orig(). This call was previously missing, meaning that the original fragments were never freed (with the exception of the last fragment to arrive). Fix this by ensuring that all original fragments except for the last fragment are freed via nf_ct_frag6_consume_orig(). The last fragment will be morphed into the head, so it must not be freed yet. Furthermore, retain the ->next pointer for the head after skb_morph(). Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action") Reported-by: Florian Westphal <fw@strlen.de> Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27ipv6: Export nf_ct_frag6_consume_orig()Joe Stringer
This is needed in openvswitch to fix an skb leak in the next patch. Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27openvswitch: Fix double-free on ip_defrag() errorsJoe Stringer
If ip_defrag() returns an error other than -EINPROGRESS, then the skb is freed. When handle_fragments() passes this back up to do_execute_actions(), it will be freed again. Prevent this double free by never freeing the skb in do_execute_actions() for errors returned by ovs_ct_execute. Always free it in ovs_ct_execute() error paths instead. Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action") Reported-by: Florian Westphal <fw@strlen.de> Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27fib_trie: leaf_walk_rcu should not compute key if key is less than pn->keyAlexander Duyck
We were computing the child index in cases where the key value we were looking for was actually less than the base key of the tnode. As a result we were getting incorrect index values that would cause us to skip over some children. To fix this I have added a test that will force us to use child index 0 if the key we are looking for is less than the key of the current tnode. Fixes: 8be33e955cb9 ("fib_trie: Fib walk rcu should take a tnode and key instead of a trie and a leaf") Reported-by: Brian Rak <brak@gameservers.com> Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-28block: re-add discard_granularity and alignment checksMing Lin
In commit b49a087("block: remove split code in blkdev_issue_{discard,write_same}"), discard_granularity and alignment checks were removed. Ideally, with bio late splitting, the upper layers shouldn't need to depend on device's limits. Christoph reported a discard regression on the HGST Ultrastar SN100 NVMe device when mkfs.xfs. We have not found the root cause yet. This patch re-adds discard_granularity and alignment checks by reverting the related changes in commit b49a087. The good thing is now we can remove the 2G discard size cap and just use UINT_MAX to avoid bi_size overflow. Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-28Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: "Two fixes for ARM and one for clkdev: - Fix another build issue with vdsomunge on non-glibc systems - Fix a randconfig build error caused by an invalid configuration - Fix a clkdev problem causing the Nokia n700 to no longer boot" * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: clkdev: fix clk_add_alias() with a NULL alias device name ARM: 8445/1: fix vdsomunge not to depend on glibc specific byteswap.h ARM: make RiscPC depend on MMU
2015-10-28Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull blkcg fix from Jens Axboe: "One final fix that should go into 4.3. It's a simple 2x1 liner, fixing a blkcg accounting issue. It was using the wrong bio member to look at the sync and write bits..." * 'for-linus' of git://git.kernel.dk/linux-block: blkcg: fix incorrect read/write sync/async stat accounting
2015-10-28Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "This fixes a problem in the Crypto API that may cause spurious errors when signals are received by the process that made the orignal system call into the kernel" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: api - Only abort operations on fatal signal
2015-10-28Merge tag 'fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module preemption fix from Rusty Russell: "Turns out we should have always been disabling preemption here; someone finally caught it thanks to Peter Z's additional checks" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: module: Fix locking in symbol_put_addr()
2015-10-27perf tools: Search for more options when passing args to -hArnaldo Carvalho de Melo
Recently 'perf <tool> -h' was made aware of arguments and would show just the help for the arguments specified, but that required a strict form, i.e.: $ perf -h --tui worked, but: $ perf -h tui didn't. Make it support both cases and also look at the option help when neither matches, so that he following examples works: $ perf report -h interface Usage: perf report [<options>] --gtk Use the GTK2 interface --stdio Use the stdio interface --tui Use the TUI interface $ perf report -h stack Usage: perf report [<options>] -g, --call-graph <print_type,threshold[,print_limit],order, sort_key[,branch]> Display call graph (stack chain/backtrace): print_type: call graph printing style (graph|flat|fractal|none) threshold: minimum call graph inclusion threshold (<percent>) print_limit: maximum number of call graph entry (<number>) order: call graph order (caller|callee) sort_key: call graph sort key (function|address) branch: include last branch info to call graph (branch) Default: graph,0.5,caller,function --max-stack <n> Set the maximum stack depth when parsing the callchain, anything beyond the specified depth will be ignored. Default: 127 $ Suggested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Chandler Carruth <chandlerc@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-xzqvamzqv3cv0p6w3inhols3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>