summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-10-09mm, sched/numa: Remove remaining traces of NUMA rate-limitingSrikar Dronamraju
Remove the leftover pglist_data::numabalancing_migrate_lock and its initialization, we stopped using this lock with: efaffc5e40ae ("mm, sched/numa: Remove rate-limiting of automatic NUMA balancing migration") [ mingo: Rewrote the changelog. ] Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Linux-MM <linux-mm@kvack.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1538824999-31230-1-git-send-email-srikar@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-09x86/intel_rdt: Fix out-of-bounds memory access in CBM testsReinette Chatre
While the DOC at the beginning of lib/bitmap.c explicitly states that "The number of valid bits in a given bitmap does _not_ need to be an exact multiple of BITS_PER_LONG.", some of the bitmap operations do indeed access BITS_PER_LONG portions of the provided bitmap no matter the size of the provided bitmap. For example, if bitmap_intersects() is provided with an 8 bit bitmap the operation will access BITS_PER_LONG bits from the provided bitmap. While the operation ensures that these extra bits do not affect the result, the memory is still accessed. The capacity bitmasks (CBMs) are typically stored in u32 since they can never exceed 32 bits. A few instances exist where a bitmap_* operation is performed on a CBM by simply pointing the bitmap operation to the stored u32 value. The consequence of this pattern is that some bitmap_* operations will access out-of-bounds memory when interacting with the provided CBM. This is confirmed with a KASAN test that reports: BUG: KASAN: stack-out-of-bounds in __bitmap_intersects+0xa2/0x100 and BUG: KASAN: stack-out-of-bounds in __bitmap_weight+0x58/0x90 Fix this by moving any CBM provided to a bitmap operation needing BITS_PER_LONG to an 'unsigned long' variable. [ tglx: Changed related function arguments to unsigned long and got rid of the _cbm extra step ] Fixes: 72d505056604 ("x86/intel_rdt: Add utilities to test pseudo-locked region possibility") Fixes: 49f7b4efa110 ("x86/intel_rdt: Enable setting of exclusive mode") Fixes: d9b48c86eb38 ("x86/intel_rdt: Display resource groups' allocations' size in bytes") Fixes: 95f0b77efa57 ("x86/intel_rdt: Initialize new resource group with sane defaults") Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/69a428613a53f10e80594679ac726246020ff94f.1538686926.git.reinette.chatre@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-09dma-direct: document the zone selection logicChristoph Hellwig
What we are doing here isn't quite obvious, so add a comment explaining it. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-10-09Merge tag 'perf-core-for-mingo-4.20-20181008' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: - Fix building the python bindings with python3, which fixes some problems with building with clang on Clear Linux (Eduardo Habkost) - Fix coverity warnings, fixing up some error paths and plugging some temporary small buffer leaks (Sanskriti Sharma) - Adopt a wrapper for strerror_r() for the same reasons as recently for libbpf (Steven Rostedt) - S390 does not support watchpoints in perf test 22', check if that test is supported by the arch. (Thomas Richter) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-09Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-09spi: spi-ep93xx: Use dma_data_direction for ep93xx_spi_dma_{finish,prepare}Nathan Chancellor
Clang warns when one enumerated type is implicitly converted to another. drivers/spi/spi-ep93xx.c:342:62: warning: implicit conversion from enumeration type 'enum dma_transfer_direction' to different enumeration type 'enum dma_data_direction' [-Wenum-conversion] nents = dma_map_sg(chan->device->dev, sgt->sgl, sgt->nents, dir); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~ ./include/linux/dma-mapping.h:428:58: note: expanded from macro 'dma_map_sg' #define dma_map_sg(d, s, n, r) dma_map_sg_attrs(d, s, n, r, 0) ~~~~~~~~~~~~~~~~ ^ drivers/spi/spi-ep93xx.c:348:57: warning: implicit conversion from enumeration type 'enum dma_transfer_direction' to different enumeration type 'enum dma_data_direction' [-Wenum-conversion] dma_unmap_sg(chan->device->dev, sgt->sgl, sgt->nents, dir); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~ ./include/linux/dma-mapping.h:429:62: note: expanded from macro 'dma_unmap_sg' #define dma_unmap_sg(d, s, n, r) dma_unmap_sg_attrs(d, s, n, r, 0) ~~~~~~~~~~~~~~~~~~ ^ drivers/spi/spi-ep93xx.c:377:56: warning: implicit conversion from enumeration type 'enum dma_transfer_direction' to different enumeration type 'enum dma_data_direction' [-Wenum-conversion] dma_unmap_sg(chan->device->dev, sgt->sgl, sgt->nents, dir); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~ ./include/linux/dma-mapping.h:429:62: note: expanded from macro 'dma_unmap_sg' #define dma_unmap_sg(d, s, n, r) dma_unmap_sg_attrs(d, s, n, r, 0) ~~~~~~~~~~~~~~~~~~ ^ 3 warnings generated. dma_{,un}map_sg expect an enum of type dma_data_direction but this driver uses dma_transfer_direction for everything. Convert the driver to use dma_data_direction for these two functions. There are two places that strictly require an enum of type dma_transfer_direction: the direction member in struct dma_slave_config and the direction parameter in dmaengine_prep_slave_sg. To avoid using an explicit cast, add a simple function, ep93xx_dma_data_to_trans_dir, to safely map between the two types because they are not 1 to 1 in meaning. Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-10-08rxrpc: Fix the packet reception routineDavid Howells
The rxrpc_input_packet() function and its call tree was built around the assumption that data_ready() handler called from UDP to inform a kernel service that there is data to be had was non-reentrant. This means that certain locking could be dispensed with. This, however, turns out not to be the case with a multi-queue network card that can deliver packets to multiple cpus simultaneously. Each of those cpus can be in the rxrpc_input_packet() function at the same time. Fix by adding or changing some structure members: (1) Add peer->rtt_input_lock to serialise access to the RTT buffer. (2) Make conn->service_id into a 32-bit variable so that it can be cmpxchg'd on all arches. (3) Add call->input_lock to serialise access to the Rx/Tx state. Note that although the Rx and Tx states are (almost) entirely separate, there's no point completing the separation and having separate locks since it's a bi-phasal RPC protocol rather than a bi-direction streaming protocol. Data transmission and data reception do not take place simultaneously on any particular call. and making the following functional changes: (1) In rxrpc_input_data(), hold call->input_lock around the core to prevent simultaneous producing of packets into the Rx ring and updating of tracking state for a particular call. (2) In rxrpc_input_ping_response(), only read call->ping_serial once, and check it before checking RXRPC_CALL_PINGING as that's a cheaper test. The bit test and bit clear can then be combined. No further locking is needed here. (3) In rxrpc_input_ack(), take call->input_lock after we've parsed much of the ACK packet. The superseded ACK check is then done both before and after the lock is taken. The handing of ackinfo data is split, parsing before the lock is taken and processing with it held. This is keyed on rxMTU being non-zero. Congestion management is also done within the locked section. (4) In rxrpc_input_ackall(), take call->input_lock around the Tx window rotation. The ACKALL packet carries no information and is only really useful after all packets have been transmitted since it's imprecise. (5) In rxrpc_input_implicit_end_call(), we use rx->incoming_lock to prevent calls being simultaneously implicitly ended on two cpus and also to prevent any races with incoming call setup. (6) In rxrpc_input_packet(), use cmpxchg() to effect the service upgrade on a connection. It is only permitted to happen once for a connection. (7) In rxrpc_new_incoming_call(), we have to recheck the routing inside rx->incoming_lock to see if someone else set up the call, connection or peer whilst we were getting there. We can't trust the values from the earlier routing check unless we pin refs on them - which we want to avoid. Further, we need to allow for an incoming call to have its state changed on another CPU between us making it live and us adjusting it because the conn is now in the RXRPC_CONN_SERVICE state. (8) In rxrpc_peer_add_rtt(), take peer->rtt_input_lock around the access to the RTT buffer. Don't need to lock around setting peer->rtt. For reference, the inventory of state-accessing or state-altering functions used by the packet input procedure is: > rxrpc_input_packet() * PACKET CHECKING * ROUTING > rxrpc_post_packet_to_local() > rxrpc_find_connection_rcu() - uses RCU > rxrpc_lookup_peer_rcu() - uses RCU > rxrpc_find_service_conn_rcu() - uses RCU > idr_find() - uses RCU * CONNECTION-LEVEL PROCESSING - Service upgrade - Can only happen once per conn ! Changed to use cmpxchg > rxrpc_post_packet_to_conn() - Setting conn->hi_serial - Probably safe not using locks - Maybe use cmpxchg * CALL-LEVEL PROCESSING > Old-call checking > rxrpc_input_implicit_end_call() > rxrpc_call_completed() > rxrpc_queue_call() ! Need to take rx->incoming_lock > __rxrpc_disconnect_call() > rxrpc_notify_socket() > rxrpc_new_incoming_call() - Uses rx->incoming_lock for the entire process - Might be able to drop this earlier in favour of the call lock > rxrpc_incoming_call() ! Conflicts with rxrpc_input_implicit_end_call() > rxrpc_send_ping() - Don't need locks to check rtt state > rxrpc_propose_ACK * PACKET DISTRIBUTION > rxrpc_input_call_packet() > rxrpc_input_data() * QUEUE DATA PACKET ON CALL > rxrpc_reduce_call_timer() - Uses timer_reduce() ! Needs call->input_lock() > rxrpc_receiving_reply() ! Needs locking around ack state > rxrpc_rotate_tx_window() > rxrpc_end_tx_phase() > rxrpc_proto_abort() > rxrpc_input_dup_data() - Fills the Rx buffer - rxrpc_propose_ACK() - rxrpc_notify_socket() > rxrpc_input_ack() * APPLY ACK PACKET TO CALL AND DISCARD PACKET > rxrpc_input_ping_response() - Probably doesn't need any extra locking ! Need READ_ONCE() on call->ping_serial > rxrpc_input_check_for_lost_ack() - Takes call->lock to consult Tx buffer > rxrpc_peer_add_rtt() ! Needs to take a lock (peer->rtt_input_lock) ! Could perhaps manage with cmpxchg() and xadd() instead > rxrpc_input_requested_ack - Consults Tx buffer ! Probably needs a lock > rxrpc_peer_add_rtt() > rxrpc_propose_ack() > rxrpc_input_ackinfo() - Changes call->tx_winsize ! Use cmpxchg to handle change ! Should perhaps track serial number - Uses peer->lock to record MTU specification changes > rxrpc_proto_abort() ! Need to take call->input_lock > rxrpc_rotate_tx_window() > rxrpc_end_tx_phase() > rxrpc_input_soft_acks() - Consults the Tx buffer > rxrpc_congestion_management() - Modifies the Tx annotations ! Needs call->input_lock() > rxrpc_queue_call() > rxrpc_input_abort() * APPLY ABORT PACKET TO CALL AND DISCARD PACKET > rxrpc_set_call_completion() > rxrpc_notify_socket() > rxrpc_input_ackall() * APPLY ACKALL PACKET TO CALL AND DISCARD PACKET ! Need to take call->input_lock > rxrpc_rotate_tx_window() > rxrpc_end_tx_phase() > rxrpc_reject_packet() There are some functions used by the above that queue the packet, after which the procedure is terminated: - rxrpc_post_packet_to_local() - local->event_queue is an sk_buff_head - local->processor is a work_struct - rxrpc_post_packet_to_conn() - conn->rx_queue is an sk_buff_head - conn->processor is a work_struct - rxrpc_reject_packet() - local->reject_queue is an sk_buff_head - local->processor is a work_struct And some that offload processing to process context: - rxrpc_notify_socket() - Uses RCU lock - Uses call->notify_lock to call call->notify_rx - Uses call->recvmsg_lock to queue recvmsg side - rxrpc_queue_call() - call->processor is a work_struct - rxrpc_propose_ACK() - Uses call->lock to wrap __rxrpc_propose_ACK() And a bunch that complete a call, all of which use call->state_lock to protect the call state: - rxrpc_call_completed() - rxrpc_set_call_completion() - rxrpc_abort_call() - rxrpc_proto_abort() - Also uses rxrpc_queue_call() Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Signed-off-by: David Howells <dhowells@redhat.com>
2018-10-08rxrpc: Fix the rxrpc_tx_packet trace lineDavid Howells
Fix the rxrpc_tx_packet trace line by storing the where parameter. Fixes: 4764c0da69dc ("rxrpc: Trace packet transmission") Signed-off-by: David Howells <dhowells@redhat.com>
2018-10-08rxrpc: Fix connection-level abort handlingDavid Howells
Fix connection-level abort handling to cache the abort and error codes properly so that a new incoming call can be properly aborted if it races with the parent connection being aborted by another CPU. The abort_code and error parameters can then be dropped from rxrpc_abort_calls(). Fixes: f5c17aaeb2ae ("rxrpc: Calls should only have one terminal state") Signed-off-by: David Howells <dhowells@redhat.com>
2018-10-08rxrpc: Only take the rwind and mtu values from latest ACKDavid Howells
Move the out-of-order and duplicate ACK packet check to before the call to rxrpc_input_ackinfo() so that the receive window size and MTU size are only checked in the latest ACK packet and don't regress. Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: David Howells <dhowells@redhat.com>
2018-10-08mtd: maps: gpio-addr-flash: Convert to gpiodRicardo Ribalda Delgado
Convert from legacy gpio API to gpiod. Board files will have to use gpiod_lookup_tables. Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com> Suggested-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-10-08mtd: maps: gpio-addr-flash: Replace array with an integerRicardo Ribalda Delgado
By replacing the array with an integer we can avoid completely the bit comparison loop if the value has not changed (by far the most common case). Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-10-08mtd: maps: gpio-addr-flash: Use order instead of sizeRicardo Ribalda Delgado
By using the order of the window instead of the size, we can replace a lot of expensive division and modulus on the code with simple bit operations. Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-10-08mtd: spi-nor: fsl-quadspi: Don't let -EINVAL on the busAhmad Fatoum
fsl_qspi_get_seqid() may return -EINVAL, but fsl_qspi_init_ahb_read() doesn't check for error codes with the result that -EINVAL could find itself signalled over the bus. In conjunction with the LS1046A SoC's A-009283 errata ("Illegal accesses to SPI flash memory can result in a system hang") this illegal access to SPI flash memory results in a system hang if userspace attempts reading later on. Avoid this by always checking fsl_qspi_get_seqid()'s return value and bail out otherwise. Fixes: e46ecda764dc ("mtd: spi-nor: Add Freescale QuadSPI driver") Cc: stable@vger.kernel.org Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-10-08mtd: devices: m25p80: Make sure WRITE_EN is issued before each writeYogesh Gaur
Some SPI controllers can't write nor->page_size bytes in a single step because their TX FIFO is too small, but when that happens we should make sure a WRITE_EN command before each write access and READ_SR command after each write access is issued. The core is already taking care of that, so all we have to do here is return the actual number of bytes that were written during the spi_mem_exec_op() operation. Signed-off-by: Yogesh Gaur <yogeshnarayan.gaur@nxp.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-10-08mtd: spi-nor: Support controllers with limited TX FIFO sizeYogesh Gaur
Some SPI controllers can't write nor->page_size bytes in a single step because their TX FIFO is too small. Allow nor->write() to return a size that is smaller than the requested write size to gracefully handle this case. Signed-off-by: Yogesh Gaur <yogeshnarayan.gaur@nxp.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-10-08mtd: spi-nor: cadence-quadspi: Use proper enum for dma_[un]map_singleNathan Chancellor
Clang warns when one enumerated type is converted implicitly to another. drivers/mtd/spi-nor/cadence-quadspi.c:962:47: warning: implicit conversion from enumeration type 'enum dma_transfer_direction' to different enumeration type 'enum dma_data_direction' [-Wenum-conversion] dma_dst = dma_map_single(nor->dev, buf, len, DMA_DEV_TO_MEM); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~ ./include/linux/dma-mapping.h:428:66: note: expanded from macro 'dma_map_single' ~~~~~~~~~~~~~~~~~~~~ ^ drivers/mtd/spi-nor/cadence-quadspi.c:997:43: warning: implicit conversion from enumeration type 'enum dma_transfer_direction' to different enumeration type 'enum dma_data_direction' [-Wenum-conversion] dma_unmap_single(nor->dev, dma_dst, len, DMA_DEV_TO_MEM); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~ ./include/linux/dma-mapping.h:429:70: note: expanded from macro 'dma_unmap_single' ~~~~~~~~~~~~~~~~~~~~~~ ^ 2 warnings generated. Use the proper enums from dma_data_direction to satisfy Clang. DMA_FROM_DEVICE = DMA_DEV_TO_MEM = 2 Link: https://github.com/ClangBuiltLinux/linux/issues/108 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-10-08mtd: spi-nor: parse SFDP Sector Map Parameter TableTudor Ambarus
Add support for the SFDP (JESD216B) Sector Map Parameter Table. This table is optional, but when available, we parse it to identify the location and size of sectors within the main data array of the flash memory device and to identify which Erase Types are supported by each sector. Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Reviewed-by: Marek Vasut <marek.vasut@gmail.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-10-08mtd: spi-nor: add support to non-uniform SFDP SPI NOR flash memoriesTudor Ambarus
Based on Cyrille Pitchen's patch https://lkml.org/lkml/2017/3/22/935. This patch is a transitional patch in introducing the support of SFDP SPI memories with non-uniform erase sizes like Spansion s25fs512s. Non-uniform erase maps will be used later when initialized based on the SFDP data. Introduce the memory erase map which splits the memory array into one or many erase regions. Each erase region supports up to 4 erase types, as defined by the JEDEC JESD216B (SFDP) specification. To be backward compatible, the erase map of uniform SPI NOR flash memories is initialized so it contains only one erase region and this erase region supports only one erase command. Hence a single size is used to erase any sector/block of the memory. Besides, since the algorithm used to erase sectors on non-uniform SPI NOR flash memories is quite expensive, when possible, the erase map is tuned to come back to the uniform case. The 'erase with the best command, move forward and repeat' approach was suggested by Cristian Birsan in a brainstorm session, so: Suggested-by: Cristian Birsan <cristian.birsan@microchip.com> Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Reviewed-by: Marek Vasut <marek.vasut@gmail.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-10-08filesystem-dax: Fix dax_layout_busy_page() livelockDan Williams
In the presence of multi-order entries the typical pagevec_lookup_entries() pattern may loop forever: while (index < end && pagevec_lookup_entries(&pvec, mapping, index, min(end - index, (pgoff_t)PAGEVEC_SIZE), indices)) { ... for (i = 0; i < pagevec_count(&pvec); i++) { index = indices[i]; ... } index++; /* BUG */ } The loop updates 'index' for each index found and then increments to the next possible page to continue the lookup. However, if the last entry in the pagevec is multi-order then the next possible page index is more than 1 page away. Fix this locally for the filesystem-dax case by checking for dax-multi-order entries. Going forward new users of multi-order entries need to be similarly careful, or we need a generic way to report the page increment in the radix iterator. Fixes: 5fac7408d828 ("mm, fs, dax: handle layout changes to pinned dax...") Cc: <stable@vger.kernel.org> Cc: Ross Zwisler <zwisler@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-10-08sunvdc: Remove VLA usageKees Cook
In the quest to remove all stack VLA usage from the kernel[1], this moves the math for cookies calculation into macros and allocates a fixed size array for the maximum number of cookies and adds a runtime sanity check. (Note that the size was always fixed, but just hidden from the compiler.) [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com Cc: Jens Axboe <axboe@kernel.dk> Cc: linux-block@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08tools lib traceevent, perf tools: Move struct tep_handler definition in a ↵Tzvetomir Stoyanov
local header file As traceevent is going to be transferred into a proper library, its local data should be protected from the library users. This patch encapsulates struct tep_handler into a local header, not visible outside of the library. It implements also a bunch of new APIs, which library users can use to access tep_handler members. Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: linux trace devel <linux-trace-devel@vger.kernel.org> Cc: tzvetomir stoyanov <tstoyanov@vmware.com> Link: http://lkml.kernel.org/r/20181005122225.522155df@gandalf.local.home Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-08tools lib traceevent: Separate out tep_strerror() for strerror_r() issuesSteven Rostedt (VMware)
While working on having PowerTop use libtracevent as a shared object library, Tzvetomir hit "str_error_r not defined". This was added by commit c3cec9e68f12d ("tools lib traceevent: Use str_error_r()") because strerror_r() has two definitions, where one is GNU specific, and the other is XSI complient. The strerror_r() is in a wrapper str_error_r() to keep the code from having to worry about which compiler is being used. The problem is that str_error_r() is external to libtraceevent, and not part of the library. If it is used as a shared object then the tools using it will need to define that function. I do not want that function defined in libtraceevent itself, as it is out of scope for that library. As there's only a single instance of this call, and its in the traceevent library's own tep_strerror() function, we can copy what was done in perf, and create yet another external file that undefs _GNU_SOURCE to use the more portable version of the function. We don't need to worry about the errors that strerror_r() returns. If the buffer isn't big enough, we simply truncate it. Reported-by: Tzvetomir Stoyanov <tstoyanov@vmware.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com> Cc: linux trace devel <linux-trace-devel@vger.kernel.org> Link: http://lkml.kernel.org/r/20181005121816.484e654f@gandalf.local.home Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-08perf python: More portable way to make CFLAGS work with clangEduardo Habkost
The existing code that tries to make CFLAGS compatible with clang doesn't work with Python 3. Instead of trying to touch _sysconfigdata.build_time_vars directly, change the dictionary returned by disutils.sysconfig.get_config_vars(). This works on both Python 2 and Python 3. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20181005204058.7966-3-ehabkost@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-08perf python: Make clang_has_option() work on Python 3Eduardo Habkost
Use a bytes literal so it works with Python 3's version of Popen(). Note that the b"..." syntax requires Python 2.6+. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20181005204058.7966-2-ehabkost@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-08regulator: stpmic1: add stpmic1 regulator driverpascal paillet
The stpmic1 PMIC embeds several regulators and switches with different capabilities. Signed-off-by: pascal paillet <p.paillet@st.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-10-08perf tools: Free temporary 'sys' string in read_event_files()Sanskriti Sharma
For each system in a given pevent, read_event_files() reads in a temporary 'sys' string. Be sure to free this string before moving onto to the next system and/or leaving read_event_files(). Fixes the following coverity complaints: Error: RESOURCE_LEAK (CWE-772): tools/perf/util/trace-event-read.c:343: overwrite_var: Overwriting "sys" in "sys = read_string()" leaks the storage that "sys" points to. tools/perf/util/trace-event-read.c:353: leaked_storage: Variable "sys" going out of scope leaks the storage it points to. Signed-off-by: Sanskriti Sharma <sansharm@redhat.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Joe Lawrence <joe.lawrence@redhat.com> Link: http://lkml.kernel.org/r/1538490554-8161-6-git-send-email-sansharm@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-08perf tools: Avoid double free in read_event_file()Sanskriti Sharma
The temporary 'buf' buffer allocated in read_event_file() may be freed twice. Move the free() call to the common function exit point. Fixes the following coverity complaints: Error: USE_AFTER_FREE (CWE-825): tools/perf/util/trace-event-read.c:309: double_free: Calling "free" frees pointer "buf" which has already been freed. Signed-off-by: Sanskriti Sharma <sansharm@redhat.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Joe Lawrence <joe.lawrence@redhat.com> Link: http://lkml.kernel.org/r/1538490554-8161-5-git-send-email-sansharm@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-08perf tools: Free 'printk' string in parse_ftrace_printk()Sanskriti Sharma
parse_ftrace_printk() tokenizes and parses a line, calling strdup() each iteration. Add code to free this temporary format string duplicate. Fixes the following coverity complaints: Error: RESOURCE_LEAK (CWE-772): tools/perf/util/trace-event-parse.c:158: overwrite_var: Overwriting "printk" in "printk = strdup(fmt + 1)" leaks the storage that "printk" points to. tools/perf/util/trace-event-parse.c:162: leaked_storage: Variable "printk" going out of scope leaks the storage it points to. Signed-off-by: Sanskriti Sharma <sansharm@redhat.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Joe Lawrence <joe.lawrence@redhat.com> Link: http://lkml.kernel.org/r/1538490554-8161-4-git-send-email-sansharm@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-08perf tools: Cleanup trace-event-info 'tdata' leakSanskriti Sharma
Free tracing_data structure in tracing_data_get() error paths. Fixes the following coverity complaint: Error: RESOURCE_LEAK (CWE-772): leaked_storage: Variable "tdata" going out of scope leaks the storage Signed-off-by: Sanskriti Sharma <sansharm@redhat.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Joe Lawrence <joe.lawrence@redhat.com> Link: http://lkml.kernel.org/r/1538490554-8161-3-git-send-email-sansharm@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-08perf strbuf: Match va_{add,copy} with va_endSanskriti Sharma
Ensure that all code paths in strbuf_addv() call va_end() on the ap_saved copy that was made. Fixes the following coverity complaint: Error: VARARGS (CWE-237): [#def683] tools/perf/util/strbuf.c:106: missing_va_end: va_end was not called for "ap_saved". Signed-off-by: Sanskriti Sharma <sansharm@redhat.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Joe Lawrence <joe.lawrence@redhat.com> Link: http://lkml.kernel.org/r/1538490554-8161-2-git-send-email-sansharm@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-08perf test: S390 does not support watchpoints in test 22Thomas Richter
S390 does not support the perf_event_open system call for attribute type PERF_TYPE_BREAKPOINT. This results in test failure for test 22: [root@s8360046 perf]# ./perf test 22 22: Watchpoint : 22.1: Read Only Watchpoint : FAILED! 22.2: Write Only Watchpoint : FAILED! 22.3: Read / Write Watchpoint : FAILED! 22.4: Modify Watchpoint : FAILED! [root@s8360046 perf]# Add s390 support to avoid these tests being executed on s390 platform: [root@s8360046 perf]# ./perf test 22 [root@s8360046 perf]# ./perf test -v 22 22: Watchpoint : Disabled [root@s8360046 perf]# Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180928105335.67179-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-08perf auxtrace: Include missing asm/bitsperlong.h to get BITS_PER_LONGArnaldo Carvalho de Melo
The auxtrace.h header references BITS_PER_LONG without including the header where it is defined, getting it by luck from some other header, fix it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Sverdlin <alexander.sverdlin@nokia.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-v04ydmbh7tvpcctf3zld9j9s@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-08tools include: Adopt linux/bits.hArnaldo Carvalho de Melo
So that we reduce the difference of tools/include/linux/bitops.h to the original kernel file, include/linux/bitops.h, trying to remove the need to define BITS_PER_LONG, to avoid clashes with asm/bitsperlong.h. And the things removed from tools/include/linux/bitops.h are really in linux/bits.h, so that we can have a copy and then tools/perf/check_headers.sh will tell us when new stuff gets added to linux/bits.h so that we can check if it is useful and if any adjustment needs to be done to the tools/{include,arch}/ copies. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Sverdlin <alexander.sverdlin@nokia.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-y1sqyydvfzo0bjjoj4zsl562@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-08dt-bindings: regulator: document stpmic1 pmic regulatorspascal paillet
The STPMIC1 regulators supply power to the application processor as well as to the external system peripherals such as DDR, Flash memories and system devices. Signed-off-by: pascal paillet <p.paillet@st.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-10-08blk-mq: complete req in softirq context in case of single queueMing Lei
Lot of controllers may have only one irq vector for completing IO request. And usually affinity of the only irq vector is all possible CPUs, however, on most of ARCH, there may be only one specific CPU for handling this interrupt. So if all IOs are completed in hardirq context, it is inevitable to degrade IO performance because of increased irq latency. This patch tries to address this issue by allowing to complete request in softirq context, like the legacy IO path. IOPS is observed as ~13%+ in the following randread test on raid0 over virtio-scsi. mdadm --create --verbose /dev/md0 --level=0 --chunk=1024 --raid-devices=8 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi fio --time_based --name=benchmark --runtime=30 --filename=/dev/md0 --nrfiles=1 --ioengine=libaio --iodepth=32 --direct=1 --invalidate=1 --verify=0 --verify_fatal=0 --numjobs=32 --rw=randread --blocksize=4k Cc: Dongli Zhang <dongli.zhang@oracle.com> Cc: Zach Marano <zmarano@google.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Jianchao Wang <jianchao.w.wang@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-08cpupower: Fix coredump on VMWarePrarit Bhargava
cpupower crashes on VMWare guests. The guests have the AMD PStateDef MSR (0xC0010064 + state number) set to zero. As a result fid and did are zero and the crash occurs because of a divide by zero (cof = fid/did). This can be prevented by checking the enable bit in the PStateDef MSR before calculating cof. By doing this the value of pstate[i] remains zero and the value can be tested before displaying the active Pstates. Check the enable bit in the PstateDef register for all supported families and only print out enabled Pstates. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Stafford Horne <shorne@gmail.com> Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
2018-10-08cpupower: Fix AMD Family 0x17 msr_pstate sizePrarit Bhargava
The msr_pstate data is only 63 bits long and should be 64 bits. Add in the missing bit from res1 for AMD Family 0x17. Reference: https://www.amd.com/system/files/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf, page 138. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Stafford Horne <shorne@gmail.com> Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
2018-10-08tools headers uapi: Sync kvm.h copyArnaldo Carvalho de Melo
To pick up the changes introduced in: 6fbbde9a1969 ("KVM: x86: Control guest reads of MSR_PLATFORM_INFO") That is not yet used in tools such as 'perf trace'. The type of the change in this file, a simple integer parameter to the KVM_CHECK_EXTENSION ioctl should be easier to implement tho, adding to the libbeauty TODO list. This silences this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h' diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Drew Schmitt <dasch@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-67h1bio5bihi1q6dy7hgwwx8@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-08tools arch uapi: Sync the x86 kvm.h copyArnaldo Carvalho de Melo
To get the changes in: d1766202779e ("x86/kvm/lapic: always disable MMIO interface in x2APIC mode") That at this time will not generate changes in tools such as 'perf trace', that still needs more work in tools/perf/examples/bpf/augmented_syscalls.c to need such id -> string tables. This silences the following perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h' diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-yadntj2ok6zpzjwi656onuh0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-08rxrpc: Carry call state out of locked section in rxrpc_rotate_tx_window()David Howells
Carry the call state out of the locked section in rxrpc_rotate_tx_window() rather than sampling it afterwards. This is only used to select tracepoint data, but could have changed by the time we do the tracepoint. Signed-off-by: David Howells <dhowells@redhat.com>
2018-10-08rxrpc: Don't check RXRPC_CALL_TX_LAST after calling rxrpc_rotate_tx_window()David Howells
We should only call the function to end a call's Tx phase if we rotated the marked-last packet out of the transmission buffer. Make rxrpc_rotate_tx_window() return an indication of whether it just rotated the packet marked as the last out of the transmit buffer, carrying the information out of the locked section in that function. We can then check the return value instead of examining RXRPC_CALL_TX_LAST. Fixes: 70790dbe3f66 ("rxrpc: Pass the last Tx packet marker in the annotation buffer") Signed-off-by: David Howells <dhowells@redhat.com>
2018-10-08rxrpc: Don't need to take the RCU read lock in the packet receiverDavid Howells
We don't need to take the RCU read lock in the rxrpc packet receive function because it's held further up the stack in the IP input routine around the UDP receive routines. Fix this by dropping the RCU read lock calls from rxrpc_input_packet(). This simplifies the code. Fixes: 70790dbe3f66 ("rxrpc: Pass the last Tx packet marker in the annotation buffer") Signed-off-by: David Howells <dhowells@redhat.com>
2018-10-08rxrpc: Use the UDP encap_rcv hookDavid Howells
Use the UDP encap_rcv hook to cut the bit out of the rxrpc packet reception in which a packet is placed onto the UDP receive queue and then immediately removed again by rxrpc. Going via the queue in this manner seems like it should be unnecessary. This does, however, require the invention of a value to place in encap_type as that's one of the conditions to switch packets out to the encap_rcv hook. Possibly the value doesn't actually matter for anything other than sockopts on the UDP socket, which aren't accessible outside of rxrpc anyway. This seems to cut a bit of time out of the time elapsed between each sk_buff being timestamped and turning up in rxrpc (the final number in the following trace excerpts). I measured this by making the rxrpc_rx_packet trace point print the time elapsed between the skb being timestamped and the current time (in ns), e.g.: ... 424.278721: rxrpc_rx_packet: ... ACK 25026 So doing a 512MiB DIO read from my test server, with an unmodified kernel: N min max sum mean stddev 27605 2626 7581 7.83992e+07 2840.04 181.029 and with the patch applied: N min max sum mean stddev 27547 1895 12165 6.77461e+07 2459.29 255.02 Signed-off-by: David Howells <dhowells@redhat.com>
2018-10-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcGreg Kroah-Hartman
David writes: "Sparc fixes: 1) Minor fallthru comment tweaks from Gustavo A. R. Silva. 2) VLA removal from Kees Cook. 3) Make sparc vdso Makefile match x86, from Masahiro Yamada. 4) Fix clock divider programming in mach64 driver, from Mikulas Patocka." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc64: fix fall-through annotation sparc32: fix fall-through annotation sparc: vdso: clean-up vdso Makefile oradax: remove redundant null check before kfree sparc64: viohs: Remove VLA usage sbus: Use of_get_child_by_name helper sparc: Convert to using %pOFn instead of device_node.name mach64: detect the dot clock divider correctly on sparc
2018-10-08bcache: panic fix for making cache deviceDongbo Cao
when the nbuckets of cache device is smaller than 1024, making cache device will trigger BUG_ON in kernel, add a condition to avoid this. Reported-by: nitroxis <n@nxs.re> Signed-off-by: Dongbo Cao <cdbdyx@163.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-08bcache: split combined if-condition code into separate onesDongbo Cao
Split the combined '||' statements in if() check, to make the code easier for debug. Signed-off-by: Dongbo Cao <cdbdyx@163.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-08bcache: use MAX_CACHES_PER_SET instead of magic number 8 in ↵Shenghui Wang
__bch_bucket_alloc_set Current cache_set has MAX_CACHES_PER_SET caches most, and the macro is used for " struct cache *cache_by_alloc[MAX_CACHES_PER_SET]; " in the define of struct cache_set. Use MAX_CACHES_PER_SET instead of magic number 8 in __bch_bucket_alloc_set. Signed-off-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-08bcache: replace hard coded number with BUCKET_GC_GEN_MAXColy Li
In extents.c:bch_extent_bad(), number 96 is used as parameter to call btree_bug_on(). The purpose is to check whether stale gen value exceeds BUCKET_GC_GEN_MAX, so it is better to use macro BUCKET_GC_GEN_MAX to make the code more understandable. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-08bcache: remove useless parameter of bch_debug_init()Dongbo Cao
Parameter "struct kobject *kobj" in bch_debug_init() is useless, remove it in this patch. Signed-off-by: Dongbo Cao <cdbdyx@163.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>