Age | Commit message | Author
2019-11-18 | bnxt_en: Abort waiting for firmware response if there is no heartbeat. | Pavan Chebbi
This is especially beneficial during the NVRAM related firmware commands that have longer timeouts. If the BNXT_STATE_FW_FATAL_COND flag gets set while waiting for firmware response, abort and return error. Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
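The abort-on-lost-heartbeat behaviour described above can be pictured as a bounded polling loop that also checks a fatal-condition flag. The following is only a minimal sketch, not the driver's code: `struct fw_wait_ctx`, `poll_fn`, `wait_for_fw_response()`, the demo hooks, and the choice of `-EIO` and `-ETIMEDOUT` as return values are all assumptions made for illustration; only the flag name BNXT_STATE_FW_FATAL_COND comes from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for driver state; this is not the real struct bnxt. */
struct fw_wait_ctx {
	bool fw_fatal_cond; /* models the BNXT_STATE_FW_FATAL_COND flag */
	bool resp_ready;    /* set once the firmware response has arrived */
};

/* Hypothetical hook that performs one polling step. */
typedef void (*poll_fn)(struct fw_wait_ctx *ctx, unsigned int iter);

/*
 * Poll at most max_iters times. Return 0 when the response arrives,
 * -EIO as soon as the fatal flag is seen (assumed error code), and
 * -ETIMEDOUT when the full timeout elapses (assumed error code).
 */
int wait_for_fw_response(struct fw_wait_ctx *ctx, unsigned int max_iters,
			 poll_fn poll)
{
	assert(ctx != NULL && poll != NULL);

	for (unsigned int i = 0; i < max_iters; i++) {
		poll(ctx, i);
		if (ctx->resp_ready)
			return 0;
		if (ctx->fw_fatal_cond)
			return -EIO; /* no heartbeat: stop waiting early */
	}
	return -ETIMEDOUT;
}

/* Demo hook: the firmware answers on the third poll. */
void demo_reply_on_third_poll(struct fw_wait_ctx *ctx, unsigned int iter)
{
	if (iter == 2)
		ctx->resp_ready = true;
}

/* Demo hook: the heartbeat is lost on the second poll. */
void demo_heartbeat_lost(struct fw_wait_ctx *ctx, unsigned int iter)
{
	if (iter == 1)
		ctx->fw_fatal_cond = true;
}

/* Demo hook: the firmware never answers and no fatal flag is raised. */
void demo_silent(struct fw_wait_ctx *ctx, unsigned int iter)
{
	(void)ctx;
	(void)iter;
}
```

The benefit is largest for commands with long timeouts: with the flag check, the loop exits after a couple of iterations instead of running through all `max_iters`.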
2019-11-18 | bnxt_en: Add a warning message for driver initiated reset | Vasundhara Volam
During loss of heartbeat, log this warning message. Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | bnxt_en: Return proper error code for non-existent NVM variable | Vasundhara Volam
For NVM params that are not supported in the current NVM configuration, return the error as -EOPNOTSUPP. Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | bnxt_en: Report health status update after reset is done | Vasundhara Volam
Report health status update to devlink health reporter, once reset is completed. Cc: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | bnxt_en: Set MASTER flag during driver registration. | Vasundhara Volam
The Linux driver is capable of being the master function to handle resets, so we set the flag to let firmware know. Some other drivers, such as DPDK, are not capable and will not set the flag. Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | bnxt_en: Extend ETHTOOL_RESET to hot reset driver. | Vasundhara Volam
If firmware supports hot reset, extend ETHTOOL_RESET to support a hot reset of the driver, which does not require a driver reload after ETHTOOL_RESET. The driver will go through the same coordinated reset sequence as a firmware initiated fatal/non-fatal reset. Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | bnxt_en: Increase firmware response timeout for coredump commands. | Vasundhara Volam
Use the larger HWRM_COREDUMP_TIMEOUT value for coredump related data response from the firmware. These commands take longer than normal commands. Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | bnxt_en: Improve RX buffer error handling. | Michael Chan
When hardware reports RX buffer errors, the latest 57500 chips do not require reset. The packet is discarded by the hardware and the ring will continue to operate. Also, add an rx_buf_errors counter for this type of error. It can help the user to identify if the aggregation ring is too small. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | bnxt_en: Update firmware interface spec to 1.10.1.12. | Michael Chan
The aRFS ring table interface has changed for the 57500 chips. Updating it accordingly so it will work with the latest production firmware. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | Merge branch 'selftests-Add-ethtool-and-scale-tests' | David S. Miller
Ido Schimmel says:
====================
selftests: Add ethtool and scale tests
This patch set adds generic ethtool tests and a mlxsw-specific router scale test for Spectrum-2.
Patches #1-#2 from Danielle add the router scale test for Spectrum-2. It re-uses the same test as Spectrum-1, but it is invoked with a different scale, according to what is queried from devlink-resource.
Patches #3-#5 from Amit are a re-work of the ethtool tests that were posted in the past [1]. Patches #3-#4 add the necessary library routines, whereas patch #5 adds the test itself. The test checks both good and bad flows with autoneg on and off. The test plan is detailed in the commit message.
Last time Andrew and Florian (copied) provided very useful feedback that is incorporated in this set. Namely:
* Parse the value of the different link modes from /usr/include/linux/ethtool.h
* Differentiate between supported and advertised speeds and use the latter in autoneg tests
* Make the test generic and move it to net/forwarding/ instead of being mlxsw-specific
[1] https://patchwork.ozlabs.org/cover/1112903/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | selftests: forwarding: Add speed and auto-negotiation test | Amit Cohen
Check configuration and packet transfer with different variations of autoneg and speed.
Test plan:
1. Test force of same speed with autoneg off
2. Test force of different speeds with autoneg off (should fail)
3. One side is autoneg on and other side sets force of common speeds
4. One side is autoneg on and other side only advertises a subset of the common speeds (one speed of the subset)
5. One side is autoneg on and other side only advertises a subset of the common speeds. Check that highest speed is negotiated
6. Test autoneg on, but each side advertises different speeds (should fail)
Signed-off-by: Amit Cohen <amitc@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | selftests: forwarding: lib.sh: Add wait for dev with timeout | Amit Cohen
Add a function that waits for a device with a maximum number of iterations. This limits the wait and prevents an infinite loop. It will be used by the subsequent patch, which sets two ports to different speeds in order to make sure they cannot negotiate a link. Waiting for all the setup is limited to 10 minutes for each device. Signed-off-by: Amit Cohen <amitc@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | selftests: forwarding: Add ethtool_lib.sh | Amit Cohen
Functions:
1. speeds_arr_get
   The function returns an array of speed values from /usr/include/linux/ethtool.h
   The array looks as follows: [10baseT/Half] = 0, [10baseT/Full] = 1, ...
2. ethtool_set:
   params: cmd
   The function runs ethtool by cmd (ethtool -s cmd) and checks if there was an error in configuration
3. dev_speeds_get:
   params: dev, with_mode (0 or 1), adver (0 or 1)
   return value: Array of supported/advertised link modes with/without mode
   * Example 1: dev_speeds_get swp1 0 0
     return: 1000 10000 40000
   * Example 2: dev_speeds_get swp1 1 1
     return: 1000baseKX/Full 10000baseKR/Full 40000baseCR4/Full
4. common_speeds_get:
   params: dev1, dev2, with_mode (0 or 1), adver (0 or 1)
   return value: Array of common speeds of dev1 and dev2
   * Example: common_speeds_get swp1 swp2 0 0
     return: 1000 10000
     Assuming that swp1 supports 1000 10000 40000 and swp2 supports 1000 10000
Signed-off-by: Amit Cohen <amitc@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
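The common_speeds_get helper is a shell function; the intersection it computes can be sketched in C as follows. This is an illustration only, under the assumption that speeds are plain integers in Mb/s; the function name `common_speeds()` and its signature are hypothetical and do not exist in ethtool_lib.sh.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: copy into out[] every speed from a[] that also
 * appears in b[], keeping the order of a[]. out[] must have room for
 * na entries. Returns the number of common speeds.
 */
size_t common_speeds(const unsigned int *a, size_t na,
		     const unsigned int *b, size_t nb, unsigned int *out)
{
	size_t n = 0;

	assert(a != NULL && b != NULL && out != NULL);

	for (size_t i = 0; i < na; i++) {
		for (size_t j = 0; j < nb; j++) {
			if (a[i] == b[j]) {
				out[n++] = a[i];
				break; /* count each speed of a[] once */
			}
		}
	}
	return n;
}
```

With the values from the commit message (swp1 supporting 1000 10000 40000 and swp2 supporting 1000 10000), the result is the two speeds 1000 and 10000.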
2019-11-18 | selftests: mlxsw: Check devlink device before running test | Danielle Ratson
The scale test for Spectrum-2 should only be invoked for Spectrum-2. Skip the test otherwise. Signed-off-by: Danielle Ratson <danieller@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | selftests: mlxsw: Add router scale test for Spectrum-2 | Danielle Ratson
Same as for Spectrum-1, test the ability to add the maximum number of routes possible to the switch. Invoke the test from the 'resource_scale' wrapper script. Signed-off-by: Danielle Ratson <danieller@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | mlxsw: spectrum_router: Fix determining underlay for a GRE tunnel | Petr Machata
The helper mlxsw_sp_ipip_dev_ul_tb_id() determines the underlay VRF of a GRE tunnel. For a tunnel without a bound device, it uses the same VRF that the tunnel is in. However, in Linux, a GRE tunnel without a bound device uses the main VRF as the underlay. Fix the function accordingly. mlxsw further assumed that moving a tunnel to a different VRF could cause a conflict in the local tunnel endpoint address, which cannot be offloaded. However, the only way that an underlay could be changed by moving the tunnel device itself is if the tunnel device does not have a bound device. But in that case the underlay is always the main VRF, so there is no opportunity to introduce a conflict by moving such a device. Thus this check constitutes dead code and can be removed, which this patch does. Fixes: 6ddb7426a7d4 ("mlxsw: spectrum_router: Introduce loopback RIFs") Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | net: atm: Reduce the severity of logging in unlink_clip_vcc | Aditya Pakki
In case of errors in unlink_clip_vcc, the logging level is set to pr_crit, but failures in clip_setentry are handled by pr_err(). This patch makes the severity consistent across invocations. Signed-off-by: Aditya Pakki <pakki001@umn.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | Merge branch 'page_pool-followup-changes-to-restore-tracepoint-features' | David S. Miller
Jesper Dangaard says:
====================
page_pool: followup changes to restore tracepoint features
This patchset is a followup to Jonathan's patch, which does not release the pool until inflight == 0. That changed page_pool to be responsible for its own delayed destruction instead of relying on the xdp memory model. As the page_pool maintainer, I'm promoting the use of tracepoints to troubleshoot and help driver developers verify correctness when converting a driver to use page_pool. The role of xdp:mem_disconnect has changed, which broke my bpftrace tools for shutdown verification. With these changes, the same capabilities are regained.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | page_pool: extend tracepoint to also include the page PFN | Jesper Dangaard Brouer
The MM tracepoint for page free (called kmem:mm_page_free) doesn't provide the page pointer directly; instead it provides the PFN (Page Frame Number). This is annoying when writing a page_pool leak detector in BPF. This patch changes the page_pool tracepoints to also provide the PFN. The page pointer is still provided to allow other kinds of troubleshooting from BPF. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | page_pool: add destroy attempts counter and rename tracepoint | Jesper Dangaard Brouer
When Jonathan changed the page_pool to become responsible for its own shutdown via a deferred work queue, the disconnect_cnt counter was removed from the xdp memory model tracepoint. This patch changes the page_pool_inflight tracepoint name to page_pool_release, because it reflects the new responsibility better. It also reintroduces a counter that reflects the number of times page_pool_release has been tried. The counter is also used by the code, to only empty the alloc cache once. With a stuck work queue running every second and the counter being 64-bit, it will overrun in approx 584 billion years. For comparison, Earth's life expectancy is 7.5 billion years, before the Sun will engulf, and destroy, the Earth. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
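The "approx 584 billion years" figure can be checked with a few lines of arithmetic: a 64-bit counter holds 2^64 increments, and at one increment per second that is 2^64 seconds. The sketch below assumes a year of 365.25 days; the function name `years_until_u64_overflow()` is hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Years until a 64-bit counter wraps when it is incremented
 * increments_per_second times each second.
 * Assumption: one year is 365.25 days.
 */
double years_until_u64_overflow(double increments_per_second)
{
	const double seconds_per_year = 365.25 * 24.0 * 3600.0;

	assert(increments_per_second > 0.0);

	return ((double)UINT64_MAX / increments_per_second) / seconds_per_year;
}
```

Calling `years_until_u64_overflow(1.0)` gives a value between 584 and 585 billion, which agrees with the estimate in the commit message.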
2019-11-18 | xdp: remove memory poison on free for struct xdp_mem_allocator | Jesper Dangaard Brouer
When looking at the details I realised that the memory poison in __xdp_mem_allocator_rcu_free doesn't make sense. This is because the SLUB allocator uses the first 16 bytes (on 64 bit) for its freelist, which overlap with the members in struct xdp_mem_allocator that were updated. Thus, SLUB already does the "poisoning" for us. I still believe that poisoning memory makes sense in other cases. The kernel has gained various use-after-free detection mechanisms, but enabling those is associated with a huge overhead. Experience is that debugging facilities can change the timing so much that a race condition will not be provoked when they are enabled. Thus, I'm still in favour of poisoning memory where it makes sense. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | net: phy: avoid matching all-ones clause 45 PHY IDs | Russell King
We currently match clause 45 PHYs using any ID read from an MMD marked as present in the "Devices in package" registers 5 and 6. However, this is incorrect. IEEE 802.3 clause 45.2 says: "The definition of the term package is vendor specific and could be a chip, module, or other similar entity." So a package could be more or less than the whole PHY: a PHY could be made up of several modules instantiated onto a single chip, such as the Marvell 88x3310, or some of the MMDs could be disabled according to chip configuration, such as the Broadcom 84881. In the case of the Broadcom 84881, the "Devices in package" registers contain 0xc000009b, meaning that there is a PHYXS present in the package, but all registers in MMD 4 return 0xffff. This leads to our matching code incorrectly binding this PHY to one of our generic PHY drivers. This patch changes the way we determine whether to attempt to match an MMD identifier, or use it to request a module: if the identifier is all-ones, then we skip over it. When reading the identifiers, we initialise phydev->c45_ids.device_ids to all-ones, only reading the device ID if the "Devices in package" registers indicate we should. This avoids the generic drivers incorrectly matching on a PHY ID of 0xffffffff. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
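The two rules in this message (initialise every ID to all-ones and read only the MMDs flagged as present, then skip all-ones IDs when matching) can be sketched as below. This is a simplified illustration and not the phylib code: `NUM_MMDS`, `read_id_fn`, `read_c45_ids()`, `c45_id_matches()`, `demo_read_id()`, and the ID value 0x00abcdef are hypothetical; only the register value 0xc000009b and the all-ones behaviour of MMD 4 come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_MMDS        32          /* one slot per "Devices in package" bit */
#define PHY_ID_ALL_ONES 0xffffffffu

/* Hypothetical accessor that returns the 32-bit identifier of one MMD. */
typedef uint32_t (*read_id_fn)(unsigned int mmd);

/* Start from all-ones and read only the MMDs flagged as present. */
void read_c45_ids(uint32_t devices_in_package, read_id_fn read_id,
		  uint32_t ids[NUM_MMDS])
{
	assert(read_id != NULL && ids != NULL);

	for (unsigned int mmd = 0; mmd < NUM_MMDS; mmd++)
		ids[mmd] = PHY_ID_ALL_ONES;

	/* Bit 0 does not describe an MMD, so start at 1. */
	for (unsigned int mmd = 1; mmd < NUM_MMDS; mmd++) {
		if (devices_in_package & (UINT32_C(1) << mmd))
			ids[mmd] = read_id(mmd);
	}
}

/* Match a driver ID/mask pair, skipping identifiers that are all-ones. */
bool c45_id_matches(const uint32_t ids[NUM_MMDS], uint32_t drv_id,
		    uint32_t drv_mask)
{
	assert(ids != NULL);

	for (unsigned int mmd = 1; mmd < NUM_MMDS; mmd++) {
		if (ids[mmd] == PHY_ID_ALL_ONES)
			continue; /* absent or disabled MMD: never match */
		if ((ids[mmd] & drv_mask) == (drv_id & drv_mask))
			return true;
	}
	return false;
}

/* Demo accessor: MMD 4 reads back all-ones, others a made-up ID. */
uint32_t demo_read_id(unsigned int mmd)
{
	return mmd == 4 ? PHY_ID_ALL_ONES : 0x00abcdefu;
}
```

With `devices_in_package` set to 0xc000009b, bit 4 (PHYXS) is flagged as present, but its all-ones identifier is skipped, so a driver whose ID is 0xffffffff no longer matches.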
2019-11-18 | Merge branch 'Add-support-for-SFPs-behind-PHYs' | David S. Miller
Russell King says: ==================== Add support for SFPs behind PHYs This series adds partial support for SFP cages connected to PHYs, specifically optical SFPs. We add core infrastructure to phylib for this, and arrange for minimal code in the PHY driver - currently, this is code to verify that the module is one that we can support for Marvell 10G PHYs. v2: add yaml binding patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | net: phy: marvell10g: add SFP+ support | Russell King
Add support for SFP+ cages to the Marvell 10G PHY driver. This is slightly complicated by the way phylib works in that we need to use a multi-step process to attach the SFP bus, and we also need to track the phylink state machine to know when the module's transmit disable signal should change state. With appropriate DT changes, this allows the SFP+ cages on the Macchiatobin platform to be functional. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | net: phy: add core phylib sfp support | Russell King
Add core phylib help for supporting SFP sockets on PHYs. This provides a mechanism to inform the SFP layer about PHY up/down events, and also unregister the SFP bus when the PHY is going away. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | dt-bindings: net: add ethernet controller and phy sfp property | Russell King
Document the missing sfp property for ethernet controllers (which has existed for some time) which is being extended to ethernet PHYs. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next | David S. Miller
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
1) Wildcard support for the net,iface set from Kristian Evensen.
2) Offload support for matching on the input interface.
3) Simplify matching on vlan header fields.
4) Add nft_payload_rebuild_vlan_hdr() function to rebuild the vlan header from the vlan sk_buff metadata.
5) Pass extack to nft_flow_cls_offload_setup().
6) Add C-VLAN matching support.
7) Use time64_t in xt_time to fix y2038 overflow, from Arnd Bergmann.
8) Use time_t in nft_meta to fix y2038 overflow, also from Arnd.
9) Add flow_action_entry_next() helper function to flowtable offload infrastructure.
10) Add IPv6 support to the flowtable offload infrastructure.
11) Support for input interface matching from postrouting, from Phil Sutter.
12) Missing check for ndo callback in flowtable offload, from wenxu.
13) Remove conntrack parameter from flow_offload_fill_dir(), from wenxu.
14) Do not pass flow_rule object for rule removal, cookie is sufficient to achieve this.
15) Release flow_rule object in case of error from the offload commit path.
16) Undo offload ruleset updates if transaction fails.
17) Check for error when binding flowtable callbacks, from wenxu.
18) Always unbind flowtable callbacks when unregistering hooks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 | btrfs: drop bdev argument from submit_extent_page | David Sterba
After previous patches removing bdev being passed around to set it to bio, it has become unused in submit_extent_page. So it now has "only" 13 parameters. Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 | btrfs: remove extent_map::bdev | David Sterba
We can now remove the bdev from extent_map. Previous patches made sure that bio_set_dev is called correctly in all places and that we don't need to grab it from latest_bdev or pass it around inside the extent map. Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 | btrfs: drop bio_set_dev where not needed | David Sterba
bio_set_dev sets a bdev to a bio and is not only setting a pointer but also changing some state bits if there was a different bdev set before. This is one thing that's not needed. Another thing is that setting a bdev at bio allocation time is too early and actually does not work with plain redundancy profiles, where each time we submit a bio to a device, the bdev is set correctly. In many places the bio bdev is set to latest_bdev that seems to serve as a stub pointer "just to put something to bio". But we don't have to do that. Where do we know which bdev to set:
* for regular IO: submit_stripe_bio that's called by btrfs_map_bio
* repair IO: repair_io_failure, read or write from specific device
* super block write (using buffer_heads but uses raw bdev) and barriers
* scrub: this does not use all regular IO paths as it needs to reach all copies, verify and fixup eventually, and for that all bdev management is independent
* raid56: rbio_add_io_page, for the RMW write
* integrity-checker: does its own low-level block tracking
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 | btrfs: get bdev directly from fs_devices in submit_extent_page | David Sterba
This is a preparatory patch to remove the @bdev parameter from submit_extent_page. It can't be removed completely, because the cgroups need it for wbc when initializing the bio:
  wbc_init_bio
    bio_associate_blkg_from_css
      dereference bdev->bi_disk->queue
The bdev pointer is the same as latest_bdev, thus no functional change. We can retrieve it from fs_devices that's reachable through several dereferences. The local variable shadows the parameter, but that's only temporary. Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 | perf parse: Report initial event parsing error | Ian Rogers
Record the first event parsing error and report. Implementing feedback from Jiri Olsa: https://lkml.org/lkml/2019/10/28/680
An example error is:
  $ tools/perf/perf stat -e c/c/
  WARNING: multiple event parsing errors
  event syntax error: 'c/c/'
    \___ unknown term
  valid terms: event,filter_rem,filter_opc0,edge,filter_isoc,filter_tid,filter_loc,filter_nc,inv,umask,filter_opc1,tid_en,thresh,filter_all_op,filter_not_nm,filter_state,filter_nm,config,config1,config2,name,period,percore
  Initial error:
  event syntax error: 'c/c/'
    \___ Cannot find PMU `c'. Missing kernel support?
  Run 'perf list' for a list of valid events
  Usage: perf stat [<options>] [<command>]
    -e, --event <event>   event selector. use 'perf list' to list available events
Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Allison Randal <allison@lohutok.net> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: http://lore.kernel.org/lkml/20191116074652.9960-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-18 | perf probe: Trace a magic number if variable is not found | Masami Hiramatsu
Trace a magic number as an immediate value if the target variable is not found at some of the probe points that belong to one probe event. This is useful when you trace a source code line with local variables that is compiled into several instructions, where some of the variables are optimized out at some of those instructions. In that case, with this feature, perf probe traces a magic number in place of the missing variables and folds those probes into one event.
E.g. without this patch:
  # perf probe -D "pud_page_vaddr pud"
  Failed to find 'pud' in this function.
  Failed to find 'pud' in this function.
  Failed to find 'pud' in this function.
  Failed to find 'pud' in this function.
  Failed to find 'pud' in this function.
  Failed to find 'pud' in this function.
  Failed to find 'pud' in this function.
  Failed to find 'pud' in this function.
  Failed to find 'pud' in this function.
  Failed to find 'pud' in this function.
  Failed to find 'pud' in this function.
  Failed to find 'pud' in this function.
  Failed to find 'pud' in this function.
  Failed to find 'pud' in this function.
  Failed to find 'pud' in this function.
  Failed to find 'pud' in this function.
  p:probe/pud_page_vaddr _text+23480787 pud=%ax:x64
  p:probe/pud_page_vaddr _text+23808453 pud=%bp:x64
  p:probe/pud_page_vaddr _text+23558082 pud=%ax:x64
  p:probe/pud_page_vaddr _text+328373 pud=%r8:x64
  p:probe/pud_page_vaddr _text+348448 pud=%bx:x64
  p:probe/pud_page_vaddr _text+23816818 pud=%bx:x64
With this patch:
  # perf probe -D "pud_page_vaddr pud" | head
  spurious_kernel_fault is blacklisted function, skip it.
  vmalloc_fault is blacklisted function, skip it.
  p:probe/pud_page_vaddr _text+23480787 pud=%ax:x64
  p:probe/pud_page_vaddr _text+149051 pud=\deade12d:x64
  p:probe/pud_page_vaddr _text+23808453 pud=%bp:x64
  p:probe/pud_page_vaddr _text+315926 pud=\deade12d:x64
  p:probe/pud_page_vaddr _text+23807209 pud=\deade12d:x64
  p:probe/pud_page_vaddr _text+23557365 pud=%ax:x64
  p:probe/pud_page_vaddr _text+314097 pud=%di:x64
  p:probe/pud_page_vaddr _text+314015 pud=\deade12d:x64
  p:probe/pud_page_vaddr _text+313893 pud=\deade12d:x64
  p:probe/pud_page_vaddr _text+324083 pud=\deade12d:x64
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Link: http://lore.kernel.org/lkml/157406476931.24476.6261475888681844285.stgit@devnote2 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-18 | perf probe: Support DW_AT_const_value constant value | Masami Hiramatsu
Support DW_AT_const_value for variable assignment instead of location. Note that this requires ftrace supporting immediate value. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Link: http://lore.kernel.org/lkml/157406476012.24476.16096289871757175775.stgit@devnote2 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-18 | perf probe: Support multiprobe event | Masami Hiramatsu
Support multiprobe events if the event is based on a function and line and the kernel supports it. In this case, perf probe creates the first probe with an event and tries to append the following probes to that event, since those probes must be on the same source code line.
Before this patch:
  # perf probe -a vfs_read:18
  Added new events:
    probe:vfs_read_L18 (on vfs_read:18)
    probe:vfs_read_L18_1 (on vfs_read:18)
  You can now use it in all perf tools, such as:
    perf record -e probe:vfs_read_L18_1 -aR sleep 1
  #
After this patch (on a multiprobe supported kernel):
  # perf probe -a vfs_read:18
  Added new events:
    probe:vfs_read_L18 (on vfs_read:18)
    probe:vfs_read_L18 (on vfs_read:18)
  You can now use it in all perf tools, such as:
    perf record -e probe:vfs_read_L18 -aR sleep 1
  #
Committer testing:
On a kernel that doesn't support multiprobe events, after this patch:
  # uname -a
  Linux quaco 5.3.8-200.fc30.x86_64 #1 SMP Tue Oct 29 14:46:22 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  # grep append /sys/kernel/debug/tracing/README
  be modified by appending '.descending' or '.ascending' to a
  can be modified by appending any of the following modifiers
  #
  # perf probe -a vfs_read:18
  Added new events:
    probe:vfs_read_L18 (on vfs_read:18)
    probe:vfs_read_L18_1 (on vfs_read:18)
  You can now use it in all perf tools, such as:
    perf record -e probe:vfs_read_L18_1 -aR sleep 1
  # perf probe -l
    probe:vfs_read_L18 (on vfs_read:18@fs/read_write.c)
    probe:vfs_read_L18_1 (on vfs_read:18@fs/read_write.c)
  #
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Link: http://lore.kernel.org/lkml/157406475010.24476.586290752591512351.stgit@devnote2 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-18 | perf probe: Generate event name with line number | Masami Hiramatsu
Generate the event name from the function name and line number as <function>_L<line_number>. Note that this applies only to a new event that is defined by a line number within a function (except for line 0). If there is another event on the same line, you have to use the "-f" option. In that case, the new event gets a "_1" suffix.
e.g.
  # perf probe -a kernel_read:2
  Added new event:
    probe:kernel_read_L2 (on kernel_read:2)
  You can now use it in all perf tools, such as:
    perf record -e probe:kernel_read_L2 -aR sleep 1
But if we omit the line number or give line 0, the name has no suffix.
  # perf probe -a kernel_read:0
  Added new event:
    probe:kernel_read (on kernel_read)
  You can now use it in all perf tools, such as:
    perf record -e probe:kernel_read -aR sleep 1
    probe:kernel_read (on kernel_read@linux-5.0.0/fs/read_write.c)
    probe:kernel_read_L2 (on kernel_read:2@linux-5.0.0/fs/read_write.c)
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Link: http://lore.kernel.org/lkml/157406474026.24476.2828897745502059569.stgit@devnote2 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
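The naming rule above is small enough to sketch directly. The helper below is hypothetical (`probe_event_name()` is not a perf function) and covers only the two cases spelled out in the message: a positive line number gives <function>_L<line_number>, and line 0 or an omitted line gives the bare function name. The "_1" suffix used with "-f" is left out.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write the event name for a probe on func at the
 * given line into buf. Returns the length snprintf() reports.
 */
int probe_event_name(char *buf, size_t len, const char *func, int line)
{
	assert(buf != NULL && func != NULL && len > 0);

	if (line > 0)
		return snprintf(buf, len, "%s_L%d", func, line);

	/* Line 0 (or no line given): no suffix. */
	return snprintf(buf, len, "%s", func);
}
```

For the examples in the message, `probe_event_name(buf, sizeof(buf), "kernel_read", 2)` fills `buf` with `kernel_read_L2`, and passing line 0 yields `kernel_read`.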
2019-11-18 | perf probe: Do not show non representive lines by perf-probe -L | Masami Hiramatsu
Since perf probe -L shows non-representative lines, it can mislead users about where they can put probes. This patch stops showing such non-representative lines so that users can see which lines can be probed.
  # perf probe -L kernel_read
  <kernel_read@/build/linux-pvZVvI/linux-5.0.0/fs/read_write.c:0>
        0  ssize_t kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
           {
        2         mm_segment_t old_fs;
                  ssize_t result;
                  old_fs = get_fs();
        6         set_fs(get_ds());
                  /* The cast to a user pointer is valid due to the set_fs() */
        8         result = vfs_read(file, (void __user *)buf, count, pos);
        9         set_fs(old_fs);
       10         return result;
           }
           EXPORT_SYMBOL(kernel_read);
Committer testing:
Before:
  # perf probe -L kernel_read
  <kernel_read@/usr/src/debug/kernel-5.3.fc30/linux-5.3.8-200.fc30.x86_64/fs/read_write.c:0>
        0  ssize_t kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
        1  {
        2         mm_segment_t old_fs;
        3         ssize_t result;
        5         old_fs = get_fs();
        6         set_fs(KERNEL_DS);
                  /* The cast to a user pointer is valid due to the set_fs() */
        8         result = vfs_read(file, (void __user *)buf, count, pos);
        9         set_fs(old_fs);
       10         return result;
           }
           EXPORT_SYMBOL(kernel_read);
  #
See the 1, 3, 5 lines? They shouldn't be there. After this patch:
  # perf probe -L kernel_read
  <kernel_read@/usr/src/debug/kernel-5.3.fc30/linux-5.3.8-200.fc30.x86_64/fs/read_write.c:0>
        0  ssize_t kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
           {
        2         mm_segment_t old_fs;
                  ssize_t result;
                  old_fs = get_fs();
        6         set_fs(KERNEL_DS);
                  /* The cast to a user pointer is valid due to the set_fs() */
        8         result = vfs_read(file, (void __user *)buf, count, pos);
        9         set_fs(old_fs);
       10         return result;
           }
           EXPORT_SYMBOL(kernel_read);
  #
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Link: http://lore.kernel.org/lkml/157406473064.24476.2913278267727587314.stgit@devnote2 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-18 | perf probe: Verify given line is a representive line | Masami Hiramatsu
Verify that the user-given probe line is a representative line (one that doesn't share its address with other lines, or that has the least line number among the lines sharing the same address), and if it is not, show what the representative line is. Without this fix, the user can put a probe on a line that is not a representative line. But since it is not a representative line, perf probe -l shows the representative line number instead of the user-given line number.
e.g. (put kernel_read:3, but listed as kernel_read:2)
  # perf probe -a kernel_read:3
  Added new event:
    probe:kernel_read (on kernel_read:3)
  You can now use it in all perf tools, such as:
    perf record -e probe:kernel_read -aR sleep 1
  # perf probe -l
    probe:kernel_read (on kernel_read:2@linux-5.0.0/fs/read_write.c)
With this fix, perf probe doesn't allow the user to put a probe on a non-representative line, and tells the user what the representative line is.
  # perf probe -a kernel_read:3
  This line is sharing the addrees with other lines.
  Please try to probe at kernel_read:2 instead.
    Error: Failed to add events.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Link: http://lore.kernel.org/lkml/157406472071.24476.14915451439785001021.stgit@devnote2 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-18perf probe: Show correct statement line number by perf probe -lMasami Hiramatsu
dwarf_getsrc_die() can return a line which is neither a statement nor the least line number among the lines which share the same address. This can lead perf probe --list to show an incorrect line number for the probed address.

To fix this, introduce cu_getsrc_die(), which returns only a statement line that is also the least line number (we call it the representative line for an address), and use it in cu_find_lineinfo().

Also, if the given address is the entry address of a real function, cu_find_lineinfo() returns the function's declared line number instead of the start line number of the function body.

For example, without this change perf probe -l shows an incorrect line, as below.

  # perf probe -a kernel_read:2
  Added new event:
    probe:kernel_read    (on kernel_read:2)

  You can now use it in all perf tools, such as:

          perf record -e probe:kernel_read -aR sleep 1

  # perf probe -l
    probe:kernel_read    (on kernel_read:1@linux-5.0.0/fs/read_write.c)

With this fix, it shows the correct line number, as below:

  # perf probe -l
    probe:kernel_read    (on kernel_read:2@linux-5.0.0/fs/read_write.c)

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Link: http://lore.kernel.org/lkml/157406471067.24476.17463149618465494448.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
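The selection rule described in this commit message (among the rows that share an address, keep only statement rows and take the least line number) can be sketched in plain C. This is only an illustration under stated assumptions: struct line_entry, representative_line() and is_representative() are hypothetical names over a simplified table, and they are not the libdw or perf probe implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified line-table row; not the libdw representation. */
struct line_entry {
	unsigned long addr;   /* address the row maps to */
	int lineno;           /* source line number */
	bool is_stmt;         /* row marks the beginning of a statement */
};

/*
 * Return the representative line for addr: the smallest line number among
 * the statement rows that share this address, or -1 if there is none.
 */
static int representative_line(const struct line_entry *tbl, size_t n,
			       unsigned long addr)
{
	int best = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (tbl[i].addr != addr || !tbl[i].is_stmt)
			continue;
		if (best < 0 || tbl[i].lineno < best)
			best = tbl[i].lineno;
	}
	return best;
}

/*
 * True when lineno is the representative line of the address it maps to.
 * Only the first row found for lineno is considered in this sketch.
 */
static bool is_representative(const struct line_entry *tbl, size_t n, int lineno)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (tbl[i].lineno == lineno)
			return representative_line(tbl, n, tbl[i].addr) == lineno;
	}
	return false;
}
```

With a table in which lines 2 and 3 share one address, is_representative() rejects line 3 and accepts line 2, which mirrors the kernel_read:3 versus kernel_read:2 behavior shown in the examples.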
2019-11-18x86/insn: Add some Intel instructions to the opcode mapAdrian Hunter
Add to the opcode map the following instructions:

        cldemote
        tpause
        umonitor
        umwait
        movdiri
        movdir64b
        enqcmd
        enqcmds
        encls
        enclu
        enclv
        pconfig
        wbnoinvd

For information about the instructions, refer to the Intel SDM May 2019 (325462-070US) and Intel Architecture Instruction Set Extensions May 2019 (319433-037).

The instruction decoding can be tested using the perf tools' "x86 instruction decoder - new instructions" test as follows:

  $ perf test -v "new " 2>&1 | grep -i cldemote
  Decoded ok: 0f 1c 00                    cldemote (%eax)
  Decoded ok: 0f 1c 05 78 56 34 12        cldemote 0x12345678
  Decoded ok: 0f 1c 84 c8 78 56 34 12     cldemote 0x12345678(%eax,%ecx,8)
  Decoded ok: 0f 1c 00                    cldemote (%rax)
  Decoded ok: 41 0f 1c 00                 cldemote (%r8)
  Decoded ok: 0f 1c 04 25 78 56 34 12     cldemote 0x12345678
  Decoded ok: 0f 1c 84 c8 78 56 34 12     cldemote 0x12345678(%rax,%rcx,8)
  Decoded ok: 41 0f 1c 84 c8 78 56 34 12  cldemote 0x12345678(%r8,%rcx,8)
  $ perf test -v "new " 2>&1 | grep -i tpause
  Decoded ok: 66 0f ae f3                 tpause %ebx
  Decoded ok: 66 0f ae f3                 tpause %ebx
  Decoded ok: 66 41 0f ae f0              tpause %r8d
  $ perf test -v "new " 2>&1 | grep -i umonitor
  Decoded ok: 67 f3 0f ae f0              umonitor %ax
  Decoded ok: f3 0f ae f0                 umonitor %eax
  Decoded ok: 67 f3 0f ae f0              umonitor %eax
  Decoded ok: f3 0f ae f0                 umonitor %rax
  Decoded ok: 67 f3 41 0f ae f0           umonitor %r8d
  $ perf test -v "new " 2>&1 | grep -i umwait
  Decoded ok: f2 0f ae f0                 umwait %eax
  Decoded ok: f2 0f ae f0                 umwait %eax
  Decoded ok: f2 41 0f ae f0              umwait %r8d
  $ perf test -v "new " 2>&1 | grep -i movdiri
  Decoded ok: 0f 38 f9 03                 movdiri %eax,(%ebx)
  Decoded ok: 0f 38 f9 88 78 56 34 12     movdiri %ecx,0x12345678(%eax)
  Decoded ok: 48 0f 38 f9 03              movdiri %rax,(%rbx)
  Decoded ok: 48 0f 38 f9 88 78 56 34 12  movdiri %rcx,0x12345678(%rax)
  $ perf test -v "new " 2>&1 | grep -i movdir64b
  Decoded ok: 66 0f 38 f8 18                      movdir64b (%eax),%ebx
  Decoded ok: 66 0f 38 f8 88 78 56 34 12          movdir64b 0x12345678(%eax),%ecx
  Decoded ok: 67 66 0f 38 f8 1c                   movdir64b (%si),%bx
  Decoded ok: 67 66 0f 38 f8 8c 34 12             movdir64b 0x1234(%si),%cx
  Decoded ok: 66 0f 38 f8 18                      movdir64b (%rax),%rbx
  Decoded ok: 66 0f 38 f8 88 78 56 34 12          movdir64b 0x12345678(%rax),%rcx
  Decoded ok: 67 66 0f 38 f8 18                   movdir64b (%eax),%ebx
  Decoded ok: 67 66 0f 38 f8 88 78 56 34 12       movdir64b 0x12345678(%eax),%ecx
  $ perf test -v "new " 2>&1 | grep -i enqcmd
  Decoded ok: f2 0f 38 f8 18                      enqcmd (%eax),%ebx
  Decoded ok: f2 0f 38 f8 88 78 56 34 12          enqcmd 0x12345678(%eax),%ecx
  Decoded ok: 67 f2 0f 38 f8 1c                   enqcmd (%si),%bx
  Decoded ok: 67 f2 0f 38 f8 8c 34 12             enqcmd 0x1234(%si),%cx
  Decoded ok: f3 0f 38 f8 18                      enqcmds (%eax),%ebx
  Decoded ok: f3 0f 38 f8 88 78 56 34 12          enqcmds 0x12345678(%eax),%ecx
  Decoded ok: 67 f3 0f 38 f8 1c                   enqcmds (%si),%bx
  Decoded ok: 67 f3 0f 38 f8 8c 34 12             enqcmds 0x1234(%si),%cx
  Decoded ok: f2 0f 38 f8 18                      enqcmd (%rax),%rbx
  Decoded ok: f2 0f 38 f8 88 78 56 34 12          enqcmd 0x12345678(%rax),%rcx
  Decoded ok: 67 f2 0f 38 f8 18                   enqcmd (%eax),%ebx
  Decoded ok: 67 f2 0f 38 f8 88 78 56 34 12       enqcmd 0x12345678(%eax),%ecx
  Decoded ok: f3 0f 38 f8 18                      enqcmds (%rax),%rbx
  Decoded ok: f3 0f 38 f8 88 78 56 34 12          enqcmds 0x12345678(%rax),%rcx
  Decoded ok: 67 f3 0f 38 f8 18                   enqcmds (%eax),%ebx
  Decoded ok: 67 f3 0f 38 f8 88 78 56 34 12       enqcmds 0x12345678(%eax),%ecx
  $ perf test -v "new " 2>&1 | grep -i enqcmds
  Decoded ok: f3 0f 38 f8 18                      enqcmds (%eax),%ebx
  Decoded ok: f3 0f 38 f8 88 78 56 34 12          enqcmds 0x12345678(%eax),%ecx
  Decoded ok: 67 f3 0f 38 f8 1c                   enqcmds (%si),%bx
  Decoded ok: 67 f3 0f 38 f8 8c 34 12             enqcmds 0x1234(%si),%cx
  Decoded ok: f3 0f 38 f8 18                      enqcmds (%rax),%rbx
  Decoded ok: f3 0f 38 f8 88 78 56 34 12          enqcmds 0x12345678(%rax),%rcx
  Decoded ok: 67 f3 0f 38 f8 18                   enqcmds (%eax),%ebx
  Decoded ok: 67 f3 0f 38 f8 88 78 56 34 12       enqcmds 0x12345678(%eax),%ecx
  $ perf test -v "new " 2>&1 | grep -i encls
  Decoded ok: 0f 01 cf    encls
  Decoded ok: 0f 01 cf    encls
  $ perf test -v "new " 2>&1 | grep -i enclu
  Decoded ok: 0f 01 d7    enclu
  Decoded ok: 0f 01 d7    enclu
  $ perf test -v "new " 2>&1 | grep -i enclv
  Decoded ok: 0f 01 c0    enclv
  Decoded ok: 0f 01 c0    enclv
  $ perf test -v "new " 2>&1 | grep -i pconfig
  Decoded ok: 0f 01 c5    pconfig
  Decoded ok: 0f 01 c5    pconfig
  $ perf test -v "new " 2>&1 | grep -i wbnoinvd
  Decoded ok: f3 0f 09    wbnoinvd
  Decoded ok: f3 0f 09    wbnoinvd

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20191115135447.6519-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-18x86/insn: perf tools: Add some instructions to the new instructions testAdrian Hunter
Add to the "x86 instruction decoder - new instructions" test the following instructions:

        cldemote
        tpause
        umonitor
        umwait
        movdiri
        movdir64b
        enqcmd
        enqcmds
        encls
        enclu
        enclv
        pconfig
        wbnoinvd

For information about the instructions, refer to the Intel SDM May 2019 (325462-070US) and Intel Architecture Instruction Set Extensions May 2019 (319433-037).

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20191115135447.6519-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-18selftests, bpf: Workaround an alu32 sub-register spilling issueYonghong Song
Currently, with the latest llvm trunk, the test_progs selftest fails on the object file test_seg6_loop.o with the following error in the verifier:

  infinite loop detected at insn 76

The byte code sequence looks like below (note that alu32 has been turned off by default for better generated code in general):

      48:       w3 = 100
      49:       *(u32 *)(r10 - 68) = r3
      ...
  ;             if (tlv.type == SR6_TLV_PADDING) {
      76:       if w3 == 5 goto -18 <LBB0_19>
      ...
      85:       r1 = *(u32 *)(r10 - 68)
  ;     for (int i = 0; i < 100; i++) {
      86:       w1 += -1
      87:       if w1 == 0 goto +5 <LBB0_20>
      88:       *(u32 *)(r10 - 68) = r1

The main reason for the verification failure is the partial spill at r10 - 68 for the induction variable "i".

The current verifier only handles spills of 8-byte values. The above 4-byte value spilled to the stack is treated as STACK_MISC and its content is not saved. For the above example:

    w3 = 100
      R3_w=inv100 fp-64_w=inv1086626730498
    *(u32 *)(r10 - 68) = r3
      R3_w=inv100 fp-64_w=inv1086626730498
    ...
    r1 = *(u32 *)(r10 - 68)
      R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff))
      fp-64=inv1086626730498

To resolve this issue, either the verifier needs to be extended to track sub-registers in spilling, or llvm needs to be enhanced to prevent sub-register spilling in the register allocation phase. The former will increase verifier complexity and the latter will need some llvm "hacking".

Let us work around this issue by declaring the induction variable as "long" type so spilling will happen at non-sub-register level. We can revisit this later if sub-register spilling causes similar or other verification issues.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191117214036.1309510-1-yhs@fb.com
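At the source level, the workaround described above amounts to widening the loop's induction variable. The sketch below shows only that change in ordinary C: visit(), walk_int() and walk_long() are hypothetical names, the loop body is a placeholder rather than the selftest's TLV parsing, and compiling it for a host CPU does not reproduce the verifier behavior. It assumes, as on the BPF target, that "long" is 64 bits wide.

```c
/* Hypothetical stand-in for the loop body; not the selftest's code. */
static int visit(long i)
{
	return (int)(i & 1);
}

/*
 * 32-bit induction variable: with alu32 code generation the compiler may
 * keep "i" in a 32-bit sub-register and spill only 4 bytes to the stack.
 */
static int walk_int(void)
{
	int hits = 0;

	for (int i = 0; i < 100; i++)
		hits += visit(i);
	return hits;
}

/*
 * Workaround from the commit message: declare the induction variable as
 * "long", so that a spill of "i" is a full-register (8-byte) spill.
 */
static int walk_long(void)
{
	int hits = 0;

	for (long i = 0; i < 100; i++)
		hits += visit(i);
	return hits;
}
```

Both loops compute the same result; the only intended difference is the width of the value the compiler has to spill.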
2019-11-18selftests, bpf: Fix test_tc_tunnel hangingJiri Benc
When run_kselftests.sh is run, it hangs after test_tc_tunnel.sh. The reason is that test_tc_tunnel.sh ensures the server ('nc -l') is running all the time, starting it again every time it is expected to terminate. The exception is the final client_connect: the server is not started anymore, which ensures no process is kept running after the test is finished.

For a sit test, though, the script is terminated prematurely without the final client_connect and the 'nc' process keeps running. This in turn causes the run_one function in kselftest/runner.sh to hang forever, waiting for the runaway process to finish.

Ensure a remaining server is terminated on cleanup.

Fixes: f6ad6accaa99 ("selftests/bpf: expand test_tc_tunnel with SIT encap")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/bpf/60919291657a9ee89c708d8aababc28ebe1420be.1573821780.git.jbenc@redhat.com
2019-11-18selftests, bpf: xdping is not meant to be run standaloneJiri Benc
The actual test to run is test_xdping.sh, which is already in TEST_PROGS. The xdping program alone is not runnable with 'make run_tests'; it immediately fails due to missing arguments.

Move xdping to TEST_GEN_PROGS_EXTENDED so that it is built but not run.

Fixes: cd5385029f1d ("selftests/bpf: measure RTT from xdp using xdping")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/4365c81198f62521344c2215909634407184387e.1573821726.git.jbenc@redhat.com
2019-11-18perf map: Move seldom used ->flags field to second cachelineArnaldo Carvalho de Melo
So we start with:

  $ pahole -C map ~/bin/perf
  struct map {
          union {
                  struct rb_node rb_node __attribute__((__aligned__(8))); /*     0    24 */
                  struct list_head node;                                  /*     0    16 */
          } __attribute__((__aligned__(8)));                              /*     0    24 */
          u64                        start;                               /*    24     8 */
          u64                        end;                                 /*    32     8 */
          _Bool                      erange_warned:1;                     /*    40: 0  1 */
          _Bool                      priv:1;                              /*    40: 1  1 */

          /* XXX 6 bits hole, try to pack */
          /* XXX 3 bytes hole, try to pack */

          u32                        prot;                                /*    44     4 */
          u32                        flags;                               /*    48     4 */

          /* XXX 4 bytes hole, try to pack */

          u64                        pgoff;                               /*    56     8 */
          /* --- cacheline 1 boundary (64 bytes) --- */
          u64                        reloc;                               /*    64     8 */
          u32                        maj;                                 /*    72     4 */
          u32                        min;                                 /*    76     4 */
          u64                        ino;                                 /*    80     8 */
          u64                        ino_generation;                      /*    88     8 */
          u64                        (*map_ip)(struct map *, u64);        /*    96     8 */
          u64                        (*unmap_ip)(struct map *, u64);      /*   104     8 */
          struct dso *               dso;                                 /*   112     8 */
          refcount_t                 refcnt;                              /*   120     4 */

          /* size: 128, cachelines: 2, members: 17 */
          /* sum members: 116, holes: 2, sum holes: 7 */
          /* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits */
          /* padding: 4 */
          /* forced alignments: 1 */
  } __attribute__((__aligned__(8)));
  $

'flags' is seldom used (when printing details about the map or with the "cacheline" sort order), so we can move it to the second cacheline, which allows combining it with 'refcnt', which is only four bytes:

  $ pahole -C map ~/bin/perf
  struct map {
          union {
                  struct rb_node rb_node __attribute__((__aligned__(8))); /*     0    24 */
                  struct list_head node;                                  /*     0    16 */
          } __attribute__((__aligned__(8)));                              /*     0    24 */
          u64                        start;                               /*    24     8 */
          u64                        end;                                 /*    32     8 */
          _Bool                      erange_warned:1;                     /*    40: 0  1 */
          _Bool                      priv:1;                              /*    40: 1  1 */

          /* XXX 6 bits hole, try to pack */
          /* XXX 3 bytes hole, try to pack */

          u32                        prot;                                /*    44     4 */
          u64                        pgoff;                               /*    48     8 */
          u64                        reloc;                               /*    56     8 */
          /* --- cacheline 1 boundary (64 bytes) --- */
          u32                        maj;                                 /*    64     4 */
          u32                        min;                                 /*    68     4 */
          u64                        ino;                                 /*    72     8 */
          u64                        ino_generation;                      /*    80     8 */
          u64                        (*map_ip)(struct map *, u64);        /*    88     8 */
          u64                        (*unmap_ip)(struct map *, u64);      /*    96     8 */
          struct dso *               dso;                                 /*   104     8 */
          refcount_t                 refcnt;                              /*   112     4 */
          u32                        flags;                               /*   116     4 */

          /* size: 120, cachelines: 2, members: 17 */
          /* sum members: 116, holes: 1, sum holes: 3 */
          /* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits */
          /* forced alignments: 1 */
          /* last cacheline: 56 bytes */
  } __attribute__((__aligned__(8)));
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-2cdw3zlw1mkamaf7nqtdlxfi@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
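The effect of moving a 4-byte member next to another 4-byte member can be reproduced with a much smaller structure. The sketch below uses hypothetical, cut-down structs (demo_before and demo_after are not perf's struct map) and assumes an ABI where uint64_t is 8-byte aligned, such as x86_64; on other ABIs the offsets in the comments may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-ins for the layout idea; not perf's struct map. */
struct demo_before {
	uint8_t  bits;    /* offset 0, followed by a 3-byte hole */
	uint32_t prot;    /* offset 4 */
	uint32_t flags;   /* offset 8, followed by a 4-byte hole before pgoff */
	uint64_t pgoff;   /* offset 16 */
	uint32_t refcnt;  /* offset 24, followed by 4 bytes of tail padding */
};

struct demo_after {
	uint8_t  bits;    /* offset 0, followed by a 3-byte hole */
	uint32_t prot;    /* offset 4 */
	uint64_t pgoff;   /* offset 8, no hole in front of it anymore */
	uint32_t refcnt;  /* offset 16 */
	uint32_t flags;   /* offset 20, fills what used to be tail padding */
};

/* Bytes saved by the reordering under the assumed ABI. */
static size_t demo_bytes_saved(void)
{
	return sizeof(struct demo_before) - sizeof(struct demo_after);
}
```

Under the stated alignment assumption the reordering removes both the 4-byte hole and the 4 bytes of tail padding, the same kind of saving pahole reports above (128 bytes down to 120).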
2019-11-18perf map: Use bitmap for booleansArnaldo Carvalho de Melo
The map->priv and map->erange_warned fields are seldom used: the first only in tests/vmlinux-kallsyms.c, the latter only when hist_entry__inc_addr_samples() returns -ERANGE in 'perf top', which are really rare occasions, so make them a bool bitfield. This will open up space for other members on the first cacheline.

  $ pahole -C map ~/bin/perf
  struct map {
          union {
                  struct rb_node rb_node __attribute__((__aligned__(8))); /*     0    24 */
                  struct list_head node;                                  /*     0    16 */
          } __attribute__((__aligned__(8)));                              /*     0    24 */
          u64                        start;                               /*    24     8 */
          u64                        end;                                 /*    32     8 */
          _Bool                      erange_warned:1;                     /*    40: 0  1 */
          _Bool                      priv:1;                              /*    40: 1  1 */

          /* XXX 6 bits hole, try to pack */
          /* XXX 3 bytes hole, try to pack */

          u32                        prot;                                /*    44     4 */
          u32                        flags;                               /*    48     4 */

          /* XXX 4 bytes hole, try to pack */

          u64                        pgoff;                               /*    56     8 */
          /* --- cacheline 1 boundary (64 bytes) --- */
          u64                        reloc;                               /*    64     8 */
          u32                        maj;                                 /*    72     4 */
          u32                        min;                                 /*    76     4 */
          u64                        ino;                                 /*    80     8 */
          u64                        ino_generation;                      /*    88     8 */
          u64                        (*map_ip)(struct map *, u64);        /*    96     8 */
          u64                        (*unmap_ip)(struct map *, u64);      /*   104     8 */
          struct dso *               dso;                                 /*   112     8 */
          refcount_t                 refcnt;                              /*   120     4 */

          /* size: 128, cachelines: 2, members: 17 */
          /* sum members: 116, holes: 2, sum holes: 7 */
          /* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits */
          /* padding: 4 */
          /* forced alignments: 1 */
  } __attribute__((__aligned__(8)));
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-g5545pcq4ff0wr17tfb1piqt@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
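The space effect of turning two booleans into one-bit bitfields can be shown in isolation. This is a minimal sketch with hypothetical struct names (flags_plain, flags_packed); only the member names erange_warned and priv come from the commit message, and the exact sizes depend on the compiler and ABI (the comments assume GCC or Clang on x86_64).

```c
#include <stdbool.h>
#include <stddef.h>

/* Two ordinary booleans: one byte each with the assumed compilers. */
struct flags_plain {
	bool erange_warned;
	bool priv;
};

/* The same two flags as one-bit bitfields: they share a single byte. */
struct flags_packed {
	bool erange_warned:1;
	bool priv:1;
};

/* Bitfield members are read and written like ordinary members. */
static struct flags_packed mark_priv(struct flags_packed f)
{
	f.priv = true;
	return f;
}
```

The trade-off is that a bitfield member has no address, so code that takes a pointer to one of these flags would need to change; for rarely touched flags such as these that is usually acceptable.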
2019-11-18drm/i915: Protect request peeking with RCUChris Wilson
Since execlists_active() is no longer protected by the engine->active.lock, we need to protect the request pointer with RCU to prevent it from being freed as we evaluate whether or not we need to preempt.

Fixes: df403069029d ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191104090158.2959-2-chris@chris-wilson.co.uk
(cherry picked from commit 7d148635253328dda7cfe55d57e3c828e9564427)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit 8eb4704b124cbd44f189709959137d77063ecfa1)
(cherry picked from commit 7e27238e149ce4f00d9cd801fe3aa0ea55e986a2)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-11-18watchdog: jz4740: Drop dependency on MACH_JZ47xxPaul Cercueil
Depending on MACH_JZ47xx prevents us from creating a generic kernel that works on more than one MIPS board. Instead, we just depend on MIPS being set.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20191023174714.14362-3-paul@crapouillou.net
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18watchdog: jz4740: Use regmap provided by TCU driverPaul Cercueil
Since we broke the ABI by changing the clock, the driver was also updated to use the regmap provided by the TCU driver.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Tested-by: Mathieu Malaterre <malat@debian.org>
Tested-by: Artur Rojek <contact@artur-rojek.eu>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20191023174714.14362-2-paul@crapouillou.net
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18watchdog: jz4740: Use WDT clock provided by TCU driverPaul Cercueil
Instead of requesting the "ext" clock and handling the watchdog clock divider and gating in the watchdog driver, we now request and use the "wdt" clock that is supplied by the ingenic-timer "TCU" driver.

The major benefit is that the watchdog's clock rate and parent can now be specified from within devicetree, instead of being hardcoded in the driver.

Also, this driver no longer pokes into the TCU registers to enable/disable the clock, as this is now handled by the TCU driver.

On the bad side, we break the ABI with devicetree, as we now request a different clock. In this very specific case it is still okay, as every Ingenic JZ47xx-based board out there compiles the devicetree within the kernel; so there is still time to push breaking changes, in order to get a clean devicetree that won't break once it mustn't.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Tested-by: Mathieu Malaterre <malat@debian.org>
Tested-by: Artur Rojek <contact@artur-rojek.eu>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20191023174714.14362-1-paul@crapouillou.net
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>