summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-12-17ARC: rename smp operation init_irq_cpu() to init_per_cpu()Noam Camus
This will better reflect its description i.e. "any needed setup..." and not just do an "IPI request". Signed-off-by: Noam Camus <noamc@ezchip.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-17ARC: dw2 unwind: Ignore CIE version !=1 gracefully instead of bailingVineet Gupta
ARC dwarf unwinder only supports CIE version == 1 The boot time dwarf sanitizer (part of binary lookup table constructor) would simply bail if it saw CIE version == 3, rendering unwinder with a NULL lookup table. It seems libgcc linked with kernel does have such entries. With fallback linear search removed, and a NULL binary lookup table, unwinder fails to generate any stack trace. So allow graceful ignoring of unsupported CIE entries. This problem was initially seen in Alexey's setup (and not mine) as he was using buildroot built toolchain (libgcc) which doesn't get built with CFLAGS_FOR_TARGET="-gdwarf-2 which is my default Fixes STAR 9000985048: "kernel unwinder broken with stock tools" Fixes: 2e22502c080f ARC: dw2 unwind: Remove falllback linear search thru FDE entries Reported-by Alexey Brodkin <abrodkin@synopsys.com> Cc: <stable@vger.kernel.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-17ARC: dw2 unwind: Reinstante unwinding out of modulesVineet Gupta
The fix which removed linear searching of dwarf (because binary lookup data always exists) missed out on the fact that modules don't get the binary lookup tables info. This caused unwinding out of modules to stop working. So add binary lookup header setup (equivalent of eh_frame_hdr setup) to modules as well. While at it, confine the header setup to within unwinder code, reducing one API exposed out of unwinder code. Fixes: 2e22502c080f ARC: dw2 unwind: Remove falllback linear search thru FDE entries Cc: <stable@vger.kernel.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-17ARC: [plat-sim] unbork non default CONFIG_LINUX_LINK_BASEVineet Gupta
HIGHMEM support bumped the default memory size for nsim platform to 1G. Thus total memory ended at the very edge of start of peripherals address space. With linux link base shifted, memory started bleeding into peripheral space which caused early boot bad_page spew ! Fixes: 29e332261d2 ("ARC: mm: HIGHMEM: populate high memory from DT") Reported-by: Anton Kolesov <akolesov@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-16timekeeping: Cap adjustments so they don't exceed the maxadj valueJohn Stultz
Thus its been occasionally noted that users have seen confusing warnings like: Adjusting tsc more than 11% (5941981 vs 7759439) We try to limit the maximum total adjustment to 11% (10% tick adjustment + 0.5% frequency adjustment). But this is done by bounding the requested adjustment values, and the internal steering that is done by tracking the error from what was requested and what was applied, does not have any such limits. This is usually not problematic, but in some cases has a risk that an adjustment could cause the clocksource mult value to overflow, so its an indication things are outside of what is expected. It ends up most of the reports of this 11% warning are on systems using chrony, which utilizes the adjtimex() ADJ_TICK interface (which allows a +-10% adjustment). The original rational for ADJ_TICK unclear to me but my assumption it was originally added to allow broken systems to get a big constant correction at boot (see adjtimex userspace package for an example) which would allow the system to work w/ ntpd's 0.5% adjustment limit. Chrony uses ADJ_TICK to make very aggressive short term corrections (usually right at startup). Which push us close enough to the max bound that a few late ticks can cause the internal steering to push past the max adjust value (tripping the warning). Thus this patch adds some extra logic to enforce the max adjustment cap in the internal steering. Note: This has the potential to slow corrections when the ADJ_TICK value is furthest away from the default value. So it would be good to get some testing from folks using chrony, to make sure we don't cause any troubles there. Cc: Miroslav Lichvar <mlichvar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Tested-by: Miroslav Lichvar <mlichvar@redhat.com> Reported-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: John Stultz <john.stultz@linaro.org>
2015-12-16ntp: Fix second_overflow's input parameter type to be 64bitsDengChao
The function "second_overflow" uses "unsign long" as its input parameter type which will overflow after year 2106 on 32bit systems. Thus this patch replaces it with time64_t type. While the 64-bit division is expensive, "next_ntp_leap_sec" has been calculated already, so we can just re-use it in the TIME_INS/DEL cases, allowing one expensive division per leapsecond instead of re-doing the divsion once a second after the leap flag has been set. Signed-off-by: DengChao <chao.deng@linaro.org> [jstultz: Tweaked commit message] Signed-off-by: John Stultz <john.stultz@linaro.org>
2015-12-16ntp: Change time_reftime to time64_t and utilize 64bit __ktime_get_real_secondsDengChao
The type of static variant "time_reftime" and the call of get_seconds in ntp are both not y2038 safe. So change the type of time_reftime to time64_t and replace get_seconds with __ktime_get_real_seconds. The local variant "secs" in ntp_update_offset represents seconds between now and last ntp adjustment, it seems impossible that this time will last more than 68 years, so keep its type as "long". Reviewed-by: John Stultz <john.stultz@linaro.org> Signed-off-by: DengChao <chao.deng@linaro.org> [jstultz: Tweaked commit message] Signed-off-by: John Stultz <john.stultz@linaro.org>
2015-12-16timekeeping: Provide internal function __ktime_get_real_secondsDengChao
In order to fix Y2038 issues in the ntp code we will need replace get_seconds() with ktime_get_real_seconds() but as the ntp code uses the timekeeping lock which is also used by ktime_get_real_seconds(), we need a version without locking. Add a new function __ktime_get_real_seconds() in timekeeping to do this. Reviewed-by: John Stultz <john.stultz@linaro.org> Signed-off-by: DengChao <chao.deng@linaro.org> Signed-off-by: John Stultz <john.stultz@linaro.org>
2015-12-16perf tools: Remove 'perf' from subcmd function and variable namesJosh Poimboeuf
In preparation for moving exec_cmd.c and run-command.c out of perf and into a library, remove 'perf' from all the symbol names. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/bc3ee82b40b8f396b644fa49e0f7260ce442635b.1450193761.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-16perf tools: Remove subcmd dependencies on strbufJosh Poimboeuf
Introduce and use new astrcat() and astrcatf() functions which replace the strbuf functionality for subcmd. For now they duplicate strbuf's die-on-allocation-error policy. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/957d207e1254406fa11fc2e405e75a7e405aad8f.1450193761.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-16fou: clean up socket with kfree_rcuHannes Frederic Sowa
fou->udp_offloads is managed by RCU. As it is actually included inside the fou sockets, we cannot let the memory go out of scope before a grace period. We either can synchronize_rcu or switch over to kfree_rcu to manage the sockets. kfree_rcu seems appropriate as it is used by vxlan and geneve. Fixes: 23461551c00628c ("fou: Support for foo-over-udp RX path") Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-16net: Pass ndm_state to route netlink FDB notifications.Hubert Sokolowski
Before this change applications monitoring FDB notifications were not able to determine whether a new FDB entry is permament or not: bridge fdb add f1:f2:f3:f4:f5:f8 dev sw0p1 temp self bridge fdb add f1:f2:f3:f4:f5:f9 dev sw0p1 self bridge monitor fdb f1:f2:f3:f4:f5:f8 dev sw0p1 self permanent f1:f2:f3:f4:f5:f9 dev sw0p1 self permanent With this change ndm_state from the original netlink message is passed to the new netlink message sent as notification. bridge fdb add f1:f2:f3:f4:f5:f6 dev sw0p1 self bridge fdb add f1:f2:f3:f4:f5:f7 dev sw0p1 temp self bridge monitor fdb f1:f2:f3:f4:f5:f6 dev sw0p1 self permanent f1:f2:f3:f4:f5:f7 dev sw0p1 self static Signed-off-by: Hubert Sokolowski <hubert.sokolowski@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-16Merge tag 'mac80211-for-davem-2015-12-15' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Another set of fixes: * memory leak fixes (from Ola) * operating mode notification spec compliance fix (from Eyal) * copy rfkill names in case pointer becomes invalid (myself) * two hardware restart fixes (myself) * get rid of "limiting TX power" log spam (myself) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-1682xx: FCC: Fixing a bug causing to FCC port lock-upMartin Roth
The patch fixes FCC port lock-up, which occurs as a result of a bug during underrun/collision handling. Within the tx_startup() function in mac-fcc.c, the address of last BD is not calculated correctly. As a result of wrong calculation of the last BD address, the next transmitted BD may be set to an area out of the transmit BD ring. This actually causes to port lock-up and it is not recoverable. Signed-off-by: Martin Roth <martin.roth@motorolasolutions.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-16gianfar: Don't enable RX Filer if not supportedHamish Martin
After commit 15bf176db1fb ("gianfar: Don't enable the Filer w/o the Parser"), 'TSEC' model controllers (for example as seen on MPC8541E) always have 8 bytes stripped from the front of received frames. Only 'eTSEC' gianfar controllers have the RX Filer capability (amongst other enhancements). Previously this was treated as always enabled for both 'TSEC' and 'eTSEC' controllers. In commit 15bf176db1fb ("gianfar: Don't enable the Filer w/o the Parser") a subtle change was made to the setting of 'uses_rxfcb' to effectively always set it (since 'rx_filer_enable' was always true). This had the side-effect of always stripping 8 bytes from the front of received frames on 'TSEC' type controllers. We now only enable the RX Filer capability on controller types that support it, thereby avoiding the issue for 'TSEC' type controllers. Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz> Signed-off-by: Hamish Martin <hamish.martin@alliedtelesis.co.nz> Reviewed-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-16drm/amdgpu: fix user fence handlingChristian König
This fixes a random corruption under memory pressure. We need to fence the BO for the user fence as well, otherwise it might be swapped out and the GPU could write the fence value to an undesired location. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2015-12-16mtd: ubi: don't leak e if schedule_erase() failsSebastian Siewior
If __erase_worker() fails to erase the EB and schedule_erase() fails as well to do anything about it then we go RO. But that is not a reason to leak the e argument here. Therefore clean up e. Cc: <stable@vger.kernel.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
2015-12-16mtd: ubi: fixup error correction in do_sync_erase()Sebastian Siewior
Since fastmap we gained do_sync_erase(). This function can return an error and its error handling isn't obvious. First the memory allocation for struct ubi_work can fail and as such struct ubi_wl_entry is leaked. However if the memory allocation succeeds then the tail function takes care of the struct ubi_wl_entry. A free here could result in a double free. To make the error handling simpler, I split the tail function into one piece which does the work and another which frees the struct ubi_work which is passed as argument. As result do_sync_erase() can keep the struct on stack and we get rid of one error source. Cc: <stable@vger.kernel.org> Fixes: 8199b901a ("UBI: Add fastmap support to the WL sub-system") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
2015-12-16UBI: fix use of "VID" vs. "EC" in header self-checkBrian Norris
Looks like a typo, using UBI_EC_HDR_SIZE_CRC (note the "EC") to compute the CRC for the VID header. This shouldn't cause any functional change, as both structures are 64 bytes. Verified with: BUILD_BUG_ON(UBI_VID_HDR_SIZE_CRC != UBI_EC_HDR_SIZE_CRC); Reported here: http://lists.infradead.org/pipermail/linux-mtd/2013-September/048570.html Reported by: Bill Pringlemeir <bpringlemeir@gmail.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2015-12-16UBI: fix return error codeSudip Mukherjee
We are checking dfs_rootdir for error value or NULL. But in the conditional ternary operator we returned -ENODEV if dfs_rootdir contains an error value and returned PTR_ERR(dfs_rootdir) if dfs_rootdir is NULL. So in the case of dfs_rootdir being NULL we actually assigned 0 to err and returned it to the caller implying a success. Lets return -ENODEV when dfs_rootdir is NULL else return PTR_ERR(dfs_rootdir). Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: Richard Weinberger <richard@nod.at>
2015-12-16ftrace/scripts: Have recordmcount copy the object fileSteven Rostedt (Red Hat)
Russell King found that he had weird side effects when compiling the kernel with hard linked ccache. The reason was that recordmcount modified the kernel in place via mmap, and when a file gets modified twice by recordmcount, it will complain about it. To fix this issue, Russell wrote a patch that checked if the file was hard linked more than once and would unlink it if it was. Linus Torvalds was not happy with the fact that recordmcount does this in place modification. Instead of doing the unlink only if the file has two or more hard links, it does the unlink all the time. In otherwords, it always does a copy if it changed something. That is, it does the write out if a change was made. Cc: stable@vger.kernel.org # 2.6.37+ Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-12-16perf list: Robustify event printing routineArnaldo Carvalho de Melo
When a43eec304259 ("bpf: introduce bpf_perf_event_output() helper") added PERF_COUNT_SW_BPF_OUTPUT we ended up with a new entry in the event_symbols_sw array that wasn't initialized, thus set to NULL, fix print_symbol_events() to check for that case so that we don't crash if this happens again. (gdb) bt #0 __match_glob (ignore_space=false, pat=<optimized out>, str=<optimized out>) at util/string.c:198 #1 strglobmatch (str=<optimized out>, pat=pat@entry=0x7fffffffe61d "stall") at util/string.c:252 #2 0x00000000004993a5 in print_symbol_events (type=1, syms=0x872880 <event_symbols_sw+160>, max=11, name_only=false, event_glob=0x7fffffffe61d "stall") at util/parse-events.c:1615 #3 print_events (event_glob=event_glob@entry=0x7fffffffe61d "stall", name_only=false) at util/parse-events.c:1675 #4 0x000000000042c79e in cmd_list (argc=1, argv=0x7fffffffe390, prefix=<optimized out>) at builtin-list.c:68 #5 0x00000000004788a5 in run_builtin (p=p@entry=0x871758 <commands+120>, argc=argc@entry=2, argv=argv@entry=0x7fffffffe390) at perf.c:370 #6 0x0000000000420ab0 in handle_internal_command (argv=0x7fffffffe390, argc=2) at perf.c:429 #7 run_argv (argv=0x7fffffffe110, argcp=0x7fffffffe11c) at perf.c:473 #8 main (argc=2, argv=0x7fffffffe390) at perf.c:588 (gdb) p event_symbols_sw[PERF_COUNT_SW_BPF_OUTPUT] $4 = {symbol = 0x0, alias = 0x0} (gdb) A patch to robustify perf to not segfault when the next counter gets added in the kernel will follow this one. Reported-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-57wysblcjfrseb0zg5u7ek10@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-16perf list: Add support for PERF_COUNT_SW_BPF_OUTArnaldo Carvalho de Melo
When PERF_COUNT_SW_BPF_OUTPUT was added to the kernel we should've added it to tools/perf, where it is used just to list events. This ended up causing a segfault in commands like "perf list stall". Fix it by adding that new software counter. A patch to robustify perf to not segfault when the next counter gets added in the kernel will follow this one. Reported-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-uya354upi3eprsey6mi5962d@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-16dma-debug: Fix dma_debug_entry offset calculationDaniel Mentz
dma-debug uses struct dma_debug_entry to keep track of dma coherent memory allocation requests. The virtual address is converted into a pfn and an offset. Previously, the offset was calculated using an incorrect bit mask. As a result, we saw incorrect error messages from dma-debug like the following: "DMA-API: exceeded 7 overlapping mappings of cacheline 0x03e00000" Cacheline 0x03e00000 does not exist on our platform. Cc: <stable@vger.kernel.org> Fixes: 0abdd7a81b7e ("dma-debug: introduce debug_dma_assert_idle()") Signed-off-by: Daniel Mentz <danielmentz@google.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-12-16perf tools: Provide subcmd configuration at runtimeJosh Poimboeuf
Create init functions for exec_cmd.c and pager.c. This allows their configuration to be specified at runtime so they can be split out into a separate library which can be used by other programs. Their configuration is stored in a shared subcmd_config struct. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/21f5f6b38da72c985a8dcfa185700d03e7eecd1d.1450193761.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-16perf tools: Document the fact that parse_options*() may exitJosh Poimboeuf
Generally, calling exit() from a library is bad practice. Eventually these functions might be redesigned so that they don't exit. For now, just document the fact that they do. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/97b1af06cc3b18dd0f49e655d6d659eaa64ecde5.1450193761.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-16perf tools: Move strlcpy() from perf to tools/lib/string.cJosh Poimboeuf
strlcpy() will be needed by the subcmd library. Move it to the shared tools/lib/string.c file which can be used by other tools. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/71e2804b973bf39ad3d3b9be10f99f2ea630be46.1450193761.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-16Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: "Further ARM fixes: - Anson Huang noticed that we were corrupting a register we shouldn't be during suspend on some CPUs. - Shengjiu Wang spotted a bug in the 'swp' instruction emulation. - Will Deacon fixed a bug in the ASID allocator. - Laura Abbott fixed the kernel permission protection to apply to all threads running in the system. - I've fixed two bugs with the domain access control register handling, one to do with printing an appropriate value at oops time, and the other to further fix the uaccess_with_memcpy code" * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: ARM: 8475/1: SWP emulation: Restore original *data when failed ARM: 8471/1: need to save/restore arm register(r11) when it is corrupted ARM: fix uaccess_with_memcpy() with SW_DOMAIN_PAN ARM: report proper DACR value in oops dumps ARM: 8464/1: Update all mm structures with section adjustments ARM: 8465/1: mm: keep reserved ASIDs in sync with mm after multiple rollovers
2015-12-16tools build: Fix feature Makefile issues with 'O='Josh Poimboeuf
When building perf binaries outside the source tree with 'make O=<dir>', the auto-detected features get re-tested for every build, which is unnecessary and inconsistent with the behavior seen when building directly in the source tree. Another issue is that 'make O=<dir> clean' doesn't remove the feature files from the object tree. Fix these problems by looking for the binaries in the $(OUTPUT) directory. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/113bd01530e9761778c60a75a96c65fc59860f68.1450193761.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-16kvm/x86: Remove Hyper-V SynIC timer stoppingAndrey Smetanin
It's possible that guest send us Hyper-V EOM at the middle of Hyper-V SynIC timer running, so we start processing of Hyper-V SynIC timers in vcpu context and stop the Hyper-V SynIC timer unconditionally: host guest ------------------------------------------------------------------------------ start periodic stimer start periodic timer timer expires after 15ms send expiration message into guest restart periodic timer timer expires again after 15 ms msg slot is still not cleared so setup ->msg_pending (1) restart periodic timer process timer msg and clear slot ->msg_pending was set: send EOM into host received EOM kvm_make_request(KVM_REQ_HV_STIMER) kvm_hv_process_stimers(): ... stimer_stop() if (time_now >= stimer->exp_time) stimer_expiration(stimer); Because the timer was rearmed at (1), time_now < stimer->exp_time and stimer_expiration is not called. The timer then never fires. The patch fixes such situation by not stopping Hyper-V SynIC timer at all, because it's safe to restart it without stop in vcpu context and timer callback always returns HRTIMER_NORESTART. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16kvm: Dump guest rIP when the guest tried something unsupportedBorislav Petkov
It looks like this in action: kvm [5197]: vcpu0, guest rIP: 0xffffffff810187ba unhandled rdmsr: 0xc001102 and helps to pinpoint quickly where in the guest we did the unsupported thing. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16KVM: vmx: detect mismatched size in VMCS read/writePaolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- I am sending this as RFC because the error messages it produces are very ugly. Because of inlining, the original line is lost. The alternative is to change vmcs_read/write/checkXX into macros, but then you need to have a single huge BUILD_BUG_ON or BUILD_BUG_ON_MSG because multiple BUILD_BUG_ON* with the same __LINE__ are not supported well.
2015-12-16KVM: VMX: fix read/write sizes of VMCS fields in dump_vmcsPaolo Bonzini
This was not printing the high parts of several 64-bit fields on 32-bit kernels. Separate from the previous one to make the patches easier to review. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16KVM: VMX: fix read/write sizes of VMCS fieldsPaolo Bonzini
In theory this should have broken EPT on 32-bit kernels (due to reading the high part of natural-width field GUEST_CR3). Not sure if no one noticed or the processor behaves differently from the documentation. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16KVM: VMX: fix the writing POSTED_INTR_NVLi RongQing
POSTED_INTR_NV is 16bit, should not use 64bit write function [ 5311.676074] vmwrite error: reg 3 value 0 (err 12) [ 5311.680001] CPU: 49 PID: 4240 Comm: qemu-system-i38 Tainted: G I 4.1.13-WR8.0.0.0_standard #1 [ 5311.689343] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015 [ 5311.699550] 00000000 00000000 e69a7e1c c1950de1 00000000 e69a7e38 fafcff45 fafebd24 [ 5311.706924] 00000003 00000000 0000000c b6a06dfa e69a7e40 fafcff79 e69a7eb0 fafd5f57 [ 5311.714296] e69a7ec0 c1080600 00000000 00000001 c0e18018 000001be 00000000 00000b43 [ 5311.721651] Call Trace: [ 5311.722942] [<c1950de1>] dump_stack+0x4b/0x75 [ 5311.726467] [<fafcff45>] vmwrite_error+0x35/0x40 [kvm_intel] [ 5311.731444] [<fafcff79>] vmcs_writel+0x29/0x30 [kvm_intel] [ 5311.736228] [<fafd5f57>] vmx_create_vcpu+0x337/0xb90 [kvm_intel] [ 5311.741600] [<c1080600>] ? dequeue_task_fair+0x2e0/0xf60 [ 5311.746197] [<faf3b9ca>] kvm_arch_vcpu_create+0x3a/0x70 [kvm] [ 5311.751278] [<faf29e9d>] kvm_vm_ioctl+0x14d/0x640 [kvm] [ 5311.755771] [<c1129d44>] ? free_pages_prepare+0x1a4/0x2d0 [ 5311.760455] [<c13e2842>] ? debug_smp_processor_id+0x12/0x20 [ 5311.765333] [<c10793be>] ? sched_move_task+0xbe/0x170 [ 5311.769621] [<c11752b3>] ? kmem_cache_free+0x213/0x230 [ 5311.774016] [<faf29d50>] ? kvm_set_memory_region+0x60/0x60 [kvm] [ 5311.779379] [<c1199fa2>] do_vfs_ioctl+0x2e2/0x500 [ 5311.783285] [<c11752b3>] ? kmem_cache_free+0x213/0x230 [ 5311.787677] [<c104dc73>] ? __mmdrop+0x63/0xd0 [ 5311.791196] [<c104dc73>] ? __mmdrop+0x63/0xd0 [ 5311.794712] [<c104dc73>] ? __mmdrop+0x63/0xd0 [ 5311.798234] [<c11a2ed7>] ? __fget+0x57/0x90 [ 5311.801559] [<c11a2f72>] ? __fget_light+0x22/0x50 [ 5311.805464] [<c119a240>] SyS_ioctl+0x80/0x90 [ 5311.808885] [<c1957d30>] sysenter_do_call+0x12/0x12 [ 5312.059280] kvm: zapping shadow pages for mmio generation wraparound [ 5313.678415] kvm [4231]: vcpu0 disabled perfctr wrmsr: 0xc2 data 0xffff [ 5313.726518] kvm [4231]: vcpu0 unhandled rdmsr: 0x570 Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Cc: Yang Zhang <yang.z.zhang@Intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16kvm/x86: Hyper-V SynIC timersAndrey Smetanin
Per Hyper-V specification (and as required by Hyper-V-aware guests), SynIC provides 4 per-vCPU timers. Each timer is programmed via a pair of MSRs, and signals expiration by delivering a special format message to the configured SynIC message slot and triggering the corresponding synthetic interrupt. Note: as implemented by this patch, all periodic timers are "lazy" (i.e. if the vCPU wasn't scheduled for more than the timer period the timer events are lost), regardless of the corresponding configuration MSR. If deemed necessary, the "catch up" mode (the timer period is shortened until the timer catches up) will be implemented later. Changes v2: * Use remainder to calculate periodic timer expiration time Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: Vitaly Kuznetsov <vkuznets@redhat.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16kvm/x86: Hyper-V SynIC message slot pending clearing at SINT ackAndrey Smetanin
The SynIC message protocol mandates that the message slot is claimed by atomically setting message type to something other than HVMSG_NONE. If another message is to be delivered while the slot is still busy, message pending flag is asserted to indicate to the guest that the hypervisor wants to be notified when the slot is released. To make sure the protocol works regardless of where the message sources are (kernel or userspace), clear the pending flag on SINT ACK notification, and let the message sources compete for the slot again. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: Vitaly Kuznetsov <vkuznets@redhat.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16kvm/x86: Hyper-V internal helper to read MSR HV_X64_MSR_TIME_REF_COUNTAndrey Smetanin
This helper will be used also in Hyper-V SynIC timers implementation. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: Vitaly Kuznetsov <vkuznets@redhat.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16kvm/x86: Added Hyper-V vcpu_to_hv_vcpu()/hv_vcpu_to_vcpu() helpersAndrey Smetanin
Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: Vitaly Kuznetsov <vkuznets@redhat.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16kvm/x86: Rearrange func's declarations inside Hyper-V headerAndrey Smetanin
This rearrangement places functions declarations together according to their functionality, so future additions will be simplier. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: Vitaly Kuznetsov <vkuznets@redhat.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16drivers/hv: Move struct hv_timer_message_payload into UAPI Hyper-V x86 headerAndrey Smetanin
This struct is required for Hyper-V SynIC timers implementation inside KVM and for upcoming Hyper-V VMBus support by userspace(QEMU). So place it into Hyper-V UAPI header. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: Vitaly Kuznetsov <vkuznets@redhat.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16drivers/hv: Move struct hv_message into UAPI Hyper-V x86 headerAndrey Smetanin
This struct is required for Hyper-V SynIC timers implementation inside KVM and for upcoming Hyper-V VMBus support by userspace(QEMU). So place it into Hyper-V UAPI header. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Acked-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: Vitaly Kuznetsov <vkuznets@redhat.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16drivers/hv: Move HV_SYNIC_STIMER_COUNT into Hyper-V UAPI x86 headerAndrey Smetanin
This constant is required for Hyper-V SynIC timers MSR's support by userspace(QEMU). Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Acked-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: Vitaly Kuznetsov <vkuznets@redhat.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16drivers/hv: replace enum hv_message_type by u32Andrey Smetanin
enum hv_message_type inside struct hv_message, hv_post_message is not size portable. Replace enum by u32. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: Vitaly Kuznetsov <vkuznets@redhat.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16Merge tag 'kvm-s390-next-4.5-2' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390 features and fixes for 4.5 (kvm/next) Some small cleanups - use assignment instead of memcpy - use %pK for kernel pointers Changes regarding guest memory size - Fix an off-by-one error in our guest memory interface (we might use unnecessarily big page tables, e.g. 3 levels for a 2GB guest instead of 2 levels) - We now ask the machine about the max. supported guest address and limit accordingly.
2015-12-16nfsd: don't hold ls_mutex across a layout recallJeff Layton
We do need to serialize layout stateid morphing operations, but we currently hold the ls_mutex across a layout recall which is pretty ugly. It's also unnecessary -- once we've bumped the seqid and copied it, we don't need to serialize the rest of the CB_LAYOUTRECALL vs. anything else. Just drop the mutex once the copy is done. This was causing a "workqueue leaked lock or atomic" warning and an occasional deadlock. There's more work to be done here but this fixes the immediate regression. Fixes: cc8a55320b5f "nfsd: serialize layout stateid morphing operations" Cc: stable@vger.kernel.org Reported-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-12-16net: fix warnings in 'make htmldocs' by moving macro definition out of field ↵Hannes Frederic Sowa
declaration Docbook does not like the definition of macros inside a field declaration and adds a warning. Move the definition out. Fixes: 79462ad02e86180 ("net: add validation for the socket syscall protocol argument") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-16clocksource/drivers/h8300: Use ioread / iowriteYoshinori Sato
Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2015-12-16rhashtable: Fix walker list corruptionHerbert Xu
The commit ba7c95ea3870fe7b847466d39a049ab6f156aa2c ("rhashtable: Fix sleeping inside RCU critical section in walk_stop") introduced a new spinlock for the walker list. However, it did not convert all existing users of the list over to the new spin lock. Some continued to use the old mutext for this purpose. This obviously led to corruption of the list. The fix is to use the spin lock everywhere where we touch the list. This also allows us to do rcu_rad_lock before we take the lock in rhashtable_walk_start. With the old mutex this would've deadlocked but it's safe with the new spin lock. Fixes: ba7c95ea3870 ("rhashtable: Fix sleeping inside RCU...") Reported-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-16Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-mergeDavid S. Miller
Antonio Quartulli says: ==================== Included changes: - change my email in MAINTAINERS and Doc files - create and export list of single hop neighs per interface - protect CRC in the BLA code by means of its own lock - minor fixes and code cleanups ==================== Signed-off-by: David S. Miller <davem@davemloft.net>