summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-12-08powerpc/opal-irqchip: Fix double endian conversionAlistair Popple
The OPAL event calls return a mask of events that are active in big endian format. This is checked when unmasking the events in the irqchip by comparison with a cached value. The cached value was stored in big endian format but should've been converted to CPU endian first. This bug leads to OPAL event delivery being delayed or dropped on some systems. Symptoms may include a non-functional console. The bug is fixed by calling opal_handle_events(...) instead of duplicating code in opal_event_unmask(...). Fixes: 9f0fd0499d30 ("powerpc/powernv: Add a virtual irqchip for opal events") Cc: stable@vger.kernel.org # v4.2+ Reported-by: Douglas L Lehr <dllehr@us.ibm.com> Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-08Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-12-08Merge tag 'perf-urgent-for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: User visible fixes: - Fix showing the running kernel build id using: (Michael Petlan) $ perf buildid-list -k 03c2a89c595616188f02f0282762a75b47069bc0 - hists browser (report, top) symbol filter segfault fixes (Wang Nan) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-12-08Merge tag 'perf-core-for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - Fixes and improvements for supporting annotating ARM binaries, support ARM call and jump instructions, more work needed to have arch specific stuff separated into tools/perf/arch/*/annotate/ (Russell King) - Fix several 'perf test' entries broken by recent perf/core changes (Jiri Olsa) Infrastructure changes: - Consolidate perf_ev{list,sel}__{enable,disable}() calls (Jiri Olsa) - Pass correct string to dso__adjust_kmod_long_name() (Wang Nan) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-12-07IB core: Fix ib_sg_to_pages()Bart Van Assche
On 12/03/2015 01:18 AM, Christoph Hellwig wrote: > The patch looks good to me, but while we touch this area, how about > throwing in a few cosmetic fixes as well? How about the patch below ? In that version of the ib_sg_to_pages() fix these concerns have been addressed and additionally to more bugs have been fixed. ------------ [PATCH] IB core: Fix ib_sg_to_pages() Fix the code for detecting gaps. A gap occurs not only if the second or later scatterlist element is not aligned but also if any scatterlist element other than the last does not end at a page boundary. In the code for coalescing contiguous elements, ensure that mr->length is correct and that last_page_addr is up-to-date. Ensure that this function returns a negative error code instead of zero if the first set_page() call fails. Fixes: commit 4c67e2bfc8b7 ("IB/core: Introduce new fast registration API") Reported-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/srp: Fix srp_map_sg_fr()Bart Van Assche
After dma_map_sg() has been called the return value of that function must be used as the number of elements in the scatterlist instead of scsi_sg_count(). Fixes: commit f7f7aab1a5c0 ("IB/srp: Convert to new registration API") Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: stable <stable@vger.kernel.org> # v4.4+ Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/srp: Fix indirect data buffer rkey endiannessBart Van Assche
Detected by sparse. Fixes: commit 330179f2fa93 ("IB/srp: Register the indirect data buffer descriptor") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: stable <stable@vger.kernel.org> # v4.3+ Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/srp: Initialize dma_length in srp_map_idbChristoph Hellwig
Without this sg_dma_len will return 0 on architectures tha have the dma_length field. Fixes: commit f7f7aab1a5c0 ("IB/srp: Convert to new registration API") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/srp: Fix possible send queue overflowSagi Grimberg
When using work request based memory registration (fast_reg) we must reserve SQ entries for registration and invalidation in addition to send operations. Each IO consumes 3 SQ entries (registration, send, invalidation) so we need to allocate 3x larger send-queue instead of 2x. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> CC: Stable <stable@vger.kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/srp: Fix a memory leakBart Van Assche
If srp_connect_ch() returns a positive value then that is considered by its caller as a connection failure but this does not result in a scsi_host_put() call and additionally causes the srp_create_target() function to return a positive value while it should return a negative value. Avoid all this confusion and additionally fix a memory leak by ensuring that srp_connect_ch() always returns a value that is <= 0. This patch avoids that a rejected login triggers the following memory leak: unreferenced object 0xffff88021b24a220 (size 8): comm "srp_daemon", pid 56421, jiffies 4295006762 (age 4240.750s) hex dump (first 8 bytes): 68 6f 73 74 35 38 00 a5 host58.. backtrace: [<ffffffff8151014a>] kmemleak_alloc+0x7a/0xc0 [<ffffffff81165c1e>] __kmalloc_track_caller+0xfe/0x160 [<ffffffff81260d2b>] kvasprintf+0x5b/0x90 [<ffffffff81260e2d>] kvasprintf_const+0x8d/0xb0 [<ffffffff81254b0c>] kobject_set_name_vargs+0x3c/0xa0 [<ffffffff81337e3c>] dev_set_name+0x3c/0x40 [<ffffffff81355757>] scsi_host_alloc+0x327/0x4b0 [<ffffffffa03edc8e>] srp_create_target+0x4e/0x8a0 [ib_srp] [<ffffffff8133778b>] dev_attr_store+0x1b/0x20 [<ffffffff811f27fa>] sysfs_kf_write+0x4a/0x60 [<ffffffff811f1e8e>] kernfs_fop_write+0x14e/0x180 [<ffffffff81176eef>] __vfs_write+0x2f/0xf0 [<ffffffff811771e4>] vfs_write+0xa4/0x100 [<ffffffff81177c64>] SyS_write+0x54/0xc0 [<ffffffff8151b257>] entry_SYSCALL_64_fastpath+0x12/0x6f Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/sa: Put netlink request into the request list before sendingKaike Wan
It was found by Saurabh Sengar that the netlink code tried to allocate memory with GFP_KERNEL while holding a spinlock. While it is possible to fix the issue by replacing GFP_KERNEL with GFP_ATOMIC, it is better to get rid of the spinlock while sending the packet. However, in order to protect against a race condition that a quick response may be received before the request is put on the request list, we need to put the request on the list first. Signed-off-by: Kaike Wan <kaike.wan@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reported-by: Saurabh Sengar <saurabh.truth@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/iser: use sector_div instead of do_divArnd Bergmann
do_div is the wrong way to divide a sector_t, as it is less efficient when sector_t is 32-bit wide. With the upcoming do_div optimizations, the kernel starts warning about this: drivers/infiniband/ulp/iser/iser_verbs.c:1296:4: note: in expansion of macro 'do_div' include/asm-generic/div64.h:224:22: warning: passing argument 1 of '__div64_32' from incompatible pointer type This changes the code to use sector_div instead, which always produces optimal code. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/core: use RCU for uverbs id lookupMike Marciniszyn
The current implementation gets a spin_lock, and at any scale with qib and hfi1 post send, the lock contention grows exponentially with the number of QPs. idr_find() is RCU compatibile, so read doesn't need the lock. Change to use rcu_read_lock() and rcu_read_unlock() in __idr_get_uobj(). kfree_rcu() is used to insure a grace period between the idr removal and actual free. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-By: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/qib: Minor fixes to qib per SFF 8636Easwar Hariharan
Minor errors found via code inspection during future development. SFF 8636 defines bit position 2 to hold the status indication of QSFP memory paging. The mask used to test for the value was incorrect and is fixed in this patch. Additionally, the dump function had a mismatch between the field being printed out and the field used to source the data which was fixed. Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reported-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/core: Fix user mode post wr corruptionMike Marciniszyn
Commit e622f2f4ad21 ("IB: split struct ib_send_wr") introduced a regression for HCAs whose user mode post sends go through ib_uverbs_post_send(). The code didn't account for the fact that the first sge is offset by an operation dependent length. The allocation did, but the pointer to the destination sge list is computed without that knowledge. The sge list copy_from_user() then corrupts fields in the work request Store the operation dependent length in a local variable and compute the sge list copy_from_user() destination using that length. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/qib: Fix qib_mr structureIra Weiny
struct qib_mr requires the mr member be the last because struct qib_mregion contains a dynamic array at the end. The additions of members should have been placed before this structure as the comment noted. Failure to do so was causing random memory corruption. Reproducing this bug was easy to do by running the client and server of ib_write_bw -s 8 -n 5 on the same node. This BUG() was tripped in a slab debug kernel: kernel BUG at mm/slab.c:2572! Fixes: 38071a461f0a ("IB/qib: Support the new memory registration API") Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07perf annotate: ARM supportRussell King
Add basic support to parse ARM assembly. This: * enables perf to correctly show the disassembly, rather than chopping some constants off at the '#' (which is not a comment character on ARM). * allows perf to identify ARM instructions that branch to other parts within the same function, thereby properly annotating them. * allows perf to identify function calls, allowing called functions to be followed in the annotated view. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/n/tip-owp1uj0nmcgfrlppfyeetuyf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-07perf stat: Move enable_on_exec setup under earlier codeJiri Olsa
It's more readable this way and we can save one perf_evsel__is_group_leader condition in current code. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1449133606-14429-7-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-07perf stat: Create events as disabledJiri Olsa
Currently we have 2 kinds of stat counters based on when the event is enabled: 1) tracee command events, which are enable once the tracee executes exec syscall (enable_on_exec bit) 2) all other events which get alive within the perf_event_open syscall And 2) case could raise a problem in case we want additional filter to be attached for event. In this case we want the event to be enabled after it's configured with filter. Changing the behaviour of 2) events, so they all are created as disabled (disabled bit). Adding extra enable call to make them alive once they finish setup. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1449133606-14429-6-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-07perf stat: Use perf_evlist__enable in handle_initial_delayJiri Olsa
No need to mimic the behaviour of perf_evlist__enable, we can use it directly. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1449133606-14429-5-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-07perf evlist: Factor perf_evlist__(enable|disable) functionsJiri Olsa
Use perf_evsel__(enable|disable) functions in perf_evlist__(enable|disable) functions in order to centralize ioctl enable/disable calls. This way we eliminate 2 places calling directly ioctl. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1449133606-14429-4-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-07perf evsel: Introduce disable() methodJiri Olsa
Adding perf_evsel__disable function to have complement for perf_evsel__enable function. Both will be used in following patch to factor perf_evlist__(enable|disable). Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1449133606-14429-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-07perf evsel: Use event maps directly in perf_evsel__enableJiri Olsa
All events now share proper cpu and thread maps. There's no need to pass those maps from evlist, it's safe to use evsel maps for enabling event. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1449133606-14429-2-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-07perf test: Create kernel maps properly for hist entries testJiri Olsa
It fixes segfault within machine__exit, that's caused but not creating kernel maps for machine.. We're calling machine__destroy_kernel_maps in machine__exit since commit: ebe9729c8c31 perf machine: Fix to destroy kernel maps when machine exits Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/tip-k4snzv5t4dvdckggzwdzyljo@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-07perf test: Prevent using bpf-output event in round trip name testJiri Olsa
The bpf-output is added under software events, but is not parse-able within parse_events, which is what round trip test is expecting. Checking software events only until dummy event. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1449131658-1841-6-git-send-email-jolsa@kernel.org [ Make it a one liner by keeping __perf_evsel__name_array_test() around ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-07perf test: Fix cpus and thread maps reference in error pathJiri Olsa
In error path to try user space event, both cpus and threads map now owned by evlist and freed by perf_evlist__set_maps call. Getting reference to keep them alive. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1449131658-1841-5-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-07perf test: Use machine__new_host in mmap thread code reading testJiri Olsa
This is more straightforward than what we have now. It also fixes a segfault within machine__exit, that's caused by not creating kernel maps for machine.. We're calling machine__destroy_kernel_maps in machine__exit since commit: ebe9729c8c31 perf machine: Fix to destroy kernel maps when machine exits Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1449131658-1841-4-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-07perf test: Use machine__new_host in mmap thread lookup testJiri Olsa
This is more straightforward than what we have now. It also fixes a segfault within machine__exit, that's caused by not creating kernel maps for machine.. We're calling machine__destroy_kernel_maps in machine__exit since commit: ebe9729c8c31 perf machine: Fix to destroy kernel maps when machine exits Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1449131658-1841-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-07perf test: Use machine__new_host in dwarf unwind testJiri Olsa
This is more straightforward than what we have now. It also fixes a segfault within machine__exit, that's caused by not creating kernel maps for machine.. We're calling machine__destroy_kernel_maps in machine__exit since commit: ebe9729c8c31 perf machine: Fix to destroy kernel maps when machine exits Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1449131658-1841-2-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-07perf machine: Pass correct string to dso__adjust_kmod_long_nameWang Nan
There's a mistake in dso__adjust_kmod_long_name() that it use strdup() to dup the new long_name of a dso, but passes the original string to dso__set_long_name(). Which causes random crash during cleanup. Signed-off-by: Wang Nan <wangnan0@huawei.com> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Fixes: c03d5184f0e9 ("perf machine: Adjust dso->long_name for offline module") Link: http://lkml.kernel.org/r/1449455785-42020-1-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-07SUNRPC: Fix callback channelTrond Myklebust
The NFSv4.1 callback channel is currently broken because the receive message will keep shrinking because the backchannel receive buffer size never gets reset. The easiest solution to this problem is instead of changing the receive buffer, to rather adjust the copied request. Fixes: 38b7631fbe42 ("nfs4: limit callback decoding to received bytes") Cc: Benjamin Coddington <bcodding@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-07Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio fixes from Michael Tsirkin: "This includes some fixes and cleanups in virtio and vhost code. Most notably, shadowing the index fixes the excessive cacheline bouncing observed on AMD platforms" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio_ring: shadow available ring flags & index virtio: Do not drop __GFP_HIGH in alloc_indirect vhost: replace % with & on data path tools/virtio: fix byteswap logic tools/virtio: move list macro stubs virtio: fix memory leak of virtio ida cache layers vhost: relax log address alignment virtio-net: Stop doing DMA from the stack
2015-12-07Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Ext4 bug fixes for v4.4, including fixes for post-2038 time encodings, some endian conversion problems with ext4 encryption, potential memory leaks after truncate in data=journal mode, and an ocfs2 regression caused by a jbd2 performance improvement" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: jbd2: fix null committed data return in undo_access ext4: add "static" to ext4_seq_##name##_fops struct ext4: fix an endianness bug in ext4_encrypted_follow_link() ext4: fix an endianness bug in ext4_encrypted_zeroout() jbd2: Fix unreclaimed pages after truncate in data=journal mode ext4: Fix handling of extended tv_sec
2015-12-07arm64: update linker script to increased L1_CACHE_BYTES valueArd Biesheuvel
Bring the linker script in line with the recent increase of L1_CACHE_BYTES to 128. Replace the hardcoded value of 64 with the symbolic constant. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Mark Rutland <mark.rutland@arm.com> [catalin.marinas@arm.com: fix up RW_DATA_SECTION as well] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-12-07lightnvm: do not compile in debugging by defaultMatias Bjørling
The LightNVM module exposes a debug interface when CONFIG_NVM_DEBUG is set. This interfaces takes a string to configure media managers and targets. Make sure this interface is only exposed when chosen deliberately. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07lightnvm: prevent gennvm module unload on useMatias Bjørling
After the gennvm module has been initialized. It might be attached to one or several devices. In that case, the module is in use. Make sure that it can not be unloaded. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07lightnvm: fix media mgr registrationMatias Bjørling
This patch fixes two issues during media manager registration. 1. The ppa pool can be used at media manager registration. Allocate the ppa pool before that. 2. If a media manager can't be found, this should not lead to the device being unallocated. A media manager can be registered later, that can manage the device. Only warn if a media manager fails initialization. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07lightnvm: replace req queue with nvmdev for lldMatias Bjørling
In the case where a request queue is passed to the low lever lightnvm device drive integration, the device driver might pass its admin commands through another queue. Instead pass nvm_dev, and let the low level drive the appropriate queue. Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07lightnvm: comments on constantsMatias Bjørling
It is not obvious what NVM_IO_* and NVM_BLK_T_* are used for. Make sure to comment them appropriately as the other constants. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07lightnvm: check mm before useMatias Bjørling
The core can may issue I/Os before a media manager is registered with the lightnvm subsystem. Make sure that we don't call the media manager ->end_io prematurely with a null pointer. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07lightnvm: refactor spin_unlock in gennvm_get_blkWenwei Tao
The spin_unlock is duplicated multiple times. Jump to a single unlock to improve the code flow. Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07lightnvm: put blks when luns configure failedWenwei Tao
Put the allocated blocks back to the free list when the luns configure failed, to make these blocks useable to others. Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07lightnvm: use flags in rrpc_get_blkWenwei Tao
rrpc_get_blk use constant 0 as the input parameter of nvm_get_blk, this may result in getting gc block failed unexpectedly. Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07virtio_ring: shadow available ring flags & indexVenkatesh Srinivas
Improves cacheline transfer flow of available ring header. Virtqueues are implemented as a pair of rings, one producer->consumer avail ring and one consumer->producer used ring; preceding the avail ring in memory are two contiguous u16 fields -- avail->flags and avail->idx. A producer posts work by writing to avail->idx and a consumer reads avail->idx. The flags and idx fields only need to be written by a producer CPU and only read by a consumer CPU; when the producer and consumer are running on different CPUs and the virtio_ring code is structured to only have source writes/sink reads, we can continuously transfer the avail header cacheline between 'M' states between cores. This flow optimizes core -> core bandwidth on certain CPUs. (see: "Software Optimization Guide for AMD Family 15h Processors", Section 11.6; similar language appears in the 10h guide and should apply to CPUs w/ exclusive caches, using LLC as a transfer cache) Unfortunately the existing virtio_ring code issued reads to the avail->idx and read-modify-writes to avail->flags on the producer. This change shadows the flags and index fields in producer memory; the vring code now reads from the shadows and only ever writes to avail->flags and avail->idx, allowing the cacheline to transfer core -> core optimally. In a concurrent version of vring_bench, the time required for 10,000,000 buffer checkout/returns was reduced by ~2% (average across many runs) on an AMD Piledriver (15h) CPU: (w/o shadowing): Performance counter stats for './vring_bench': 5,451,082,016 L1-dcache-loads ... 2.221477739 seconds time elapsed (w/ shadowing): Performance counter stats for './vring_bench': 5,405,701,361 L1-dcache-loads ... 2.168405376 seconds time elapsed The further away (in a NUMA sense) virtio producers and consumers are from each other, the more we expect to benefit. Physical implementations of virtio devices and implementations of virtio where the consumer polls vring avail indexes (vhost) should also benefit. Signed-off-by: Venkatesh Srinivas <venkateshs@google.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-12-07virtio: Do not drop __GFP_HIGH in alloc_indirectMichal Hocko
b92b1b89a33c ("virtio: force vring descriptors to be allocated from lowmem") tried to exclude highmem pages for descriptors so it cleared __GFP_HIGHMEM from a given gfp mask. The patch also cleared __GFP_HIGH which doesn't make much sense for this fix because __GFP_HIGH only controls access to memory reserves and it doesn't have any influence on the zone selection. Some of the call paths use GFP_ATOMIC and dropping __GFP_HIGH will reduce their changes for success because the lack of access to memory reserves. Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Mel Gorman <mgorman@techsingularity.net>
2015-12-07vhost: replace % with & on data pathMichael S. Tsirkin
We know vring num is a power of 2, so use & to mask the high bits. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-12-07tools/virtio: fix byteswap logicMichael S. Tsirkin
commit cf561f0d2eb74574ad9985a2feab134267a9d298 ("virtio: introduce virtio_is_little_endian() helper") changed byteswap logic to skip feature bit checks for LE platforms, but didn't update tools/virtio, so vring_bench started failing. Update the copy under tools/virtio/ (TODO: find a way to avoid this code duplication). Cc: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-12-07tools/virtio: move list macro stubsMichael S. Tsirkin
Makes them more generally available. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-12-07virtio: fix memory leak of virtio ida cache layersSuman Anna
The virtio core uses a static ida named virtio_index_ida for assigning index numbers to virtio devices during registration. The ida core may allocate some internal idr cache layers and an ida bitmap upon any ida allocation, and all these layers are truely freed only upon the ida destruction. The virtio_index_ida is not destroyed at present, leading to a memory leak when using the virtio core as a module and atleast one virtio device is registered and unregistered. Fix this by invoking ida_destroy() in the virtio core module exit. Cc: stable@vger.kernel.org Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-12-07vhost: relax log address alignmentMichael S. Tsirkin
commit 5d9a07b0de512b77bf28d2401e5fe3351f00a240 ("vhost: relax used address alignment") fixed the alignment for the used virtual address, but not for the physical address used for logging. That's a mistake: alignment should clearly be the same for virtual and physical addresses, Cc: stable@vger.kernel.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com>