git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2017-07-26	scsi: qedf: Limit number of CQs	Thomas Bogendoerfer
	FCOE offloading failed with: [qed_sp_fcoe_func_start:150(sp-0-3b:00.02)]Cannot satisfy CQ amount. CQs requested 8, CQs available 6. Aborting function start [qed_fcoe_start:821()]Failed to start fcoe [__qedf_probe:3041]:6: Cannot start FCoE function. The reason is a newly introduced check in the qed main part. This change also provides the information about how many CQs are available, so we simply limit the number of requested CQs.. Fixes: 3c5da9427802 ("qed: Share additional information with qedf") Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Acked-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-07-26	scsi: bnx2i: Simplify cpu hotplug code	Thomas Gleixner
	The CPU hotplug related code of this driver can be simplified by: 1) Consolidating the callbacks into a single state. The CPU thread can be torn down on the CPU which goes offline. There is no point in delaying that to the CPU dead state 2) Let the core code invoke the online/offline callbacks and remove the extra for_each_online_cpu() loops. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-07-26	scsi: bnx2fc: Simplify CPU hotplug code	Thomas Gleixner
	The CPU hotplug related code of this driver can be simplified by: 1) Consolidating the callbacks into a single state. The CPU thread can be torn down on the CPU which goes offline. There is no point in delaying that to the CPU dead state 2) Let the core code invoke the online/offline callbacks and remove the extra for_each_online_cpu() loops. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-07-26	scsi: bnx2i: Prevent recursive cpuhotplug locking	Thomas Gleixner
	The BNX2I module init/exit code installs/removes the hotplug callbacks with the cpu hotplug lock held. This worked with the old CPU locking implementation which allowed recursive locking, but with the new percpu rwsem based mechanism this is not longer allowed. Use the _cpuslocked() variants to fix this. Reported-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-07-26	scsi: bnx2fc: Prevent recursive cpuhotplug locking	Thomas Gleixner
	The BNX2FC module init/exit code installs/removes the hotplug callbacks with the cpu hotplug lock held. This worked with the old CPU locking implementation which allowed recursive locking, but with the new percpu rwsem based mechanism this is not longer allowed. Use the _cpuslocked() variants to fix this. Reported-by: kernel test robot <fengguang.wu@intel.com> Acked-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-07-26	scsi: bnx2fc: Plug CPU hotplug race	Thomas Gleixner
	bnx2fc_process_new_cqes() has protection against CPU hotplug, which relies on the per cpu thread pointer. This protection is racy because it happens only partially with the per cpu fp_work_lock held. If the CPU is unplugged after the lock is dropped, the wakeup code can dereference a NULL pointer or access freed and potentially reused memory. Restructure the code so the thread check and wakeup happens with the fp_work_lock held. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-07-27	drm: exynos: mark pm functions as __maybe_unused	Arnd Bergmann
	The rework of the exynos DRM clock handling introduced warnings for configurations that have CONFIG_PM disabled: drivers/gpu/drm/exynos/exynos_hdmi.c:736:13: error: 'hdmi_clk_disable_gates' defined but not used [-Werror=unused-function] static void hdmi_clk_disable_gates(struct hdmi_context hdata) ^~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/exynos/exynos_hdmi.c:717:12: error: 'hdmi_clk_enable_gates' defined but not used [-Werror=unused-function] static int hdmi_clk_enable_gates(struct hdmi_context hdata) The problem is that the PM functions themselves are inside of an #ifdef, but some functions they call are not. This patch removes the #ifdef and instead marks the PM functions as __maybe_unused, which is a more reliable way to get it right. Link: https://patchwork.kernel.org/patch/8436281/ Fixes: 9be7e9898444 ("drm/exynos/hdmi: clock code re-factoring") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2017-07-27	drm/exynos: select CEC_CORE if CEC_NOTIFIER	Hans Verkuil
	If the s5p-cec driver is a module and the drm exynos driver is built-in, then the CEC core will be a module also, causing the CEC notifier to fail (will be compiled as empty functions). To prevent this select CEC_CORE if CEC_NOTIFIER is set to ensure the CEC core is also built into the kernel. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2017-07-27	drm/exynos/hdmi: fix disable sequence	Andrzej Hajda
	The "Fixes" patch was incorrectly merged, as a result PHY is prematurely powered off and for example Odroid-U3 cannot disable TV power domain when HDMI cable is unplugged. Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Fixes: 625e63e2 ("drm/exynos/hdmi: fix pipeline disable order") Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2017-07-27	drm/exynos: mic: add a bridge at probe	Inki Dae
	This patch moves drm_bridge_add call into probe. It doesn't need to call drm_bridge_add call every time bind callback is called. Changelog v2 - moved drm_bridge_remove call into remove callback. - corrected description. Suggested-by: Andrzej Hajda <a.hajda@samsung.com> Reviewed-by: Andrzej Hajda <a.hajda@samsung.com> Reviewed-by: Hoegeun Kwon <hoegeun.kwon@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2017-07-27	drm/exynos/dsi: Remove error handling for bridge_node DT parsing	Hoegeun Kwon
	Remove the error handling of bridge_node because the bridge_node is optional. For example, In case of Exynos SoC, a bridge device such as mDNIe and MIC could be placed between Display Controller and MIPI DSI device but the bridge device is optional. Signed-off-by: Hoegeun Kwon <hoegeun.kwon@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2017-07-27	drm/exynos: dsi: do not try to find bridge	Inki Dae
	It doesn't need to try to find a bridge if bridge node doesn't exist. Reviewed-by: Shuah Khan <shuahkh@osg.samsung.com> Tested-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2017-07-27	drm: exynos: hdmi: make of_device_ids const.	Arvind Yadav
	of_device_ids are not supposed to change at runtime. All functions working with of_device_ids provided by <linux/of.h> work with const of_device_ids. So mark the non-const structs as const. File size before: text data bss dec hex filename 12294 1192 0 13486 34ae drivers/gpu/drm/exynos/exynos_hdmi.o File size after constify hdmi_match_types. text data bss dec hex filename 13318 176 0 13494 34b6 drivers/gpu/drm/exynos/exynos_hdmi.o Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Reviewed-by: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2017-07-27	drm: exynos: constify mixer_match_types and *_mxr_drv_data.	Arvind Yadav
	File size before: text data bss dec hex filename 9983 1424 0 11407 2c8f drivers/gpu/drm/exynos/exynos_mixer.o File size after constify: text data bss dec hex filename 11231 176 0 11407 2c8f drivers/gpu/drm/exynos/exynos_mixer.o Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Reviewed-by: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2017-07-27	exynos_drm: Clean up duplicated assignment in exynos_drm_driver	Gabriel Krisman Bertazi
	num_ioctls is already assigned when declaring the exynos_drm_driver structure. No need to duplicate it here. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk> Reviewed-by: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2017-07-26	bpf: don't zero out the info struct in bpf_obj_get_info_by_fd()	Jakub Kicinski
	The buffer passed to bpf_obj_get_info_by_fd() should be initialized to zeros. Kernel will enforce that to guarantee we can safely extend info structures in the future. Making the bpf_obj_get_info_by_fd() call in libbpf perform the zeroing is problematic, however, since some members of the info structures may need to be initialized by the callers (for instance pointers to buffers to which kernel is to dump translated and jited images). Remove the zeroing and fix up the in-tree callers before any kernel has been released with this code. As Daniel points out this seems to be the intended operation anyway, since commit 95b9afd3987f ("bpf: Test for bpf ID") is itself setting the buffer pointers before calling bpf_obj_get_info_by_fd(). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-26	netpoll: Fix device name check in netpoll_setup()	Matthias Kaehlcke
	Apparently netpoll_setup() assumes that netpoll.dev_name is a pointer when checking if the device name is set: if (np->dev_name) { ... However the field is a character array, therefore the condition always yields true. Check instead whether the first byte of the array has a non-zero value. Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-27	Merge branch 'drm-fixes-4.13' of git://people.freedesktop.org/~agd5f/linux ↵	Dave Airlie
	into drm-fixes Three misc amd fixes. * 'drm-fixes-4.13' of git://people.freedesktop.org/~agd5f/linux: drm/amd/powerplay: fix AVFS voltage offset for Vega10 drm/amdgpu/gfx9: simplify and fix GRBM index selection drm/amdgpu: Fix blocking in RCU critical section(v2)
2017-07-26	NFS: Use raw NFS access mask in nfs4_opendata_access()	Anna Schumaker
	Commit bd8b2441742b ("NFS: Store the raw NFS access mask in the inode's access cache") changed how the access results are stored after an access() call. An NFS v4 OPEN might have access bits returned with the opendata, so we should use the NFS4_ACCESS values when determining the return value in nfs4_opendata_access(). Fixes: bd8b2441742b ("NFS: Store the raw NFS access mask in the inode's access cache") Reported-by: Eryu Guan <eguan@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Tested-by: Takashi Iwai <tiwai@suse.de>
2017-07-26	bonding: commit link status change after propose	WANG Cong
	Commit de77ecd4ef02 ("bonding: improve link-status update in mii-monitoring") moves link status commitment into bond_mii_monitor(), but it still relies on the return value of bond_miimon_inspect() as the hint. We need to return non-zero as long as we propose a link status change. Fixes: de77ecd4ef02 ("bonding: improve link-status update in mii-monitoring") Reported-by: Benjamin Gilbert <benjamin.gilbert@coreos.com> Tested-by: Benjamin Gilbert <benjamin.gilbert@coreos.com> Cc: Mahesh Bandewar <maheshb@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-26	memory: atmel-ebi: Fix smc cycle xlate converter	Alexander Dahl
	The converter function for translating ns timings in register values was initialized with a wrong function pointer. This resulted in wrong register values also for the setup and pulse registers when configuring the EBI interface trough dts. Includes a small fix in a comment of the smc driver, which was probably just a copy'n'paste mistake. Signed-off-by: Alexander Dahl <ada@thorsis.com> Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com> Acked-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-07-26	memory: atmel-ebi: Allow t_DF timings of zero ns	Alexander Dahl
	As reported in [1] and in [2] it's not possible to set the device tree property 'atmel,smc-tdf-ns' to zero, although the SoC allows a setting of 0ns for the t_DF time. Allow this setting by doing the same thing as in the atmel nand controller driver by setting ncycles to ATMEL_SMC_MODE_TDF_MIN if zero is set in the dts. [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-March/490966.html [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-July/520652.html Suggested-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Alexander Dahl <ada@thorsis.com> Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-07-26	memory: atmel-ebi: Fix smc timing return value evaluation	Alexander Dahl
	Setting optional EBI/SMC properties through device tree always fails due to wrong evaluation of the return value of atmel_ebi_xslate_smc_timings(). If you put some of those properties in your dts file, but not 'atmel,smc-tdf-ns' the local variable 'required' in atmel_ebi_xslate_smc_timings() stays on 'false' after the first 'if' block. This leads to setting 'ret' to -EINVAL in the first run of the following 'for' loop which is then the return value of this function. However if you set 'atmel,smc-tdf-ns' in the dts file and everything in atmel_ebi_xslate_smc_timings() works well, it returns the content of 'required' which is 'true' then. So the function atmel_ebi_xslate_smc_timings() always returns non-zero which lets its call in atmel_ebi_xslate_smc_config() always fail and thus returning -EINVAL, so the EBI configuration for this node fails. Judging from the following code evaluating the local 'required' variable in atmel_ebi_xslate_smc_config() and the call of caps->xlate_config in atmel_ebi_dev_setup() it's probably right to only let the call fail if a negative error code is returned. Signed-off-by: Alexander Dahl <ada@thorsis.com> Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-07-26	vfio/pci: Use pci_try_reset_function() on initial open	Alex Williamson
	Device lock bites again; if a device .remove() callback races a user calling ioctl(VFIO_GROUP_GET_DEVICE_FD), the unbind request will hold the device lock, but the user ioctl may have already taken a vfio_device reference. In the case of a PCI device, the initial open will attempt to reset the device, which again attempts to get the device lock, resulting in deadlock. Use the trylock PCI reset interface and return error on the open path if reset fails due to lock contention. Link: https://lkml.org/lkml/2017/7/25/381 Reported-by: Wen Congyang <wencongyang2@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-07-26	perf annotate stdio: Set enough columns for --show-total-period	Arnaldo Carvalho de Melo
	Now that we set the first column header according to wether --show-total-period is being used, we need to size it accordingly. Based-on-a-patch-by: Taeung Song <treeze.taeung@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/n/tip-pu504ffnit4m334k09hxcbs3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-26	perf sort: Use default sort if evlist is empty	David Carrillo-Cisneros
	Fixes bug noted by Jiri in https://lkml.org/lkml/2017/6/13/755 and caused by commit d49dadea7862 ("perf tools: Make 'trace' or 'trace_fields' sort key default for tracepoint events") not taking into account that evlist is empty in pipe-mode. Before this commit, pipe mode will only show bogus "100.00% N/A" instead of correct output as follows: $ perf record -o - sleep 1 \| perf report -i - # To display the perf.data header info, please use --header/--header-only options. # [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] # # Total Lost Samples: 0 # # Samples: 8 of event 'cycles:ppH' # Event count (approx.): 145658 # # Overhead Trace output # ........ ............ # 100.00% N/A Correct output, after patch: $ perf record -o - sleep 1 \| perf report -i - # To display the perf.data header info, please use --header/--header-only options. # [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] # # Total Lost Samples: 0 # # Samples: 8 of event 'cycles:ppH' # Event count (approx.): 191331 # # Overhead Command Shared Object Symbol # ........ ....... ................. ................................. # 81.63% sleep libc-2.19.so [.] _exit 13.58% sleep ld-2.19.so [.] do_lookup_x 2.34% sleep [kernel.kallsyms] [k] context_switch 2.34% sleep libc-2.19.so [.] __GI___libc_nanosleep 0.11% perf [kernel.kallsyms] [k] __intel_pmu_enable_a Reported-by: Jiri Olsa <jolsa@kernel.org> Report-Link: https://lkml.kernel.org/r/20170613185422.GA6092@krava Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Fixes: d49dadea7862 ("perf tools: Make 'trace' or 'trace_fields' sort key default for tracepoint events") Link: https://lkml.kernel.org/r/20170721051157.47331-1-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-26	dm, dax: Make sure dm_dax_flush() is called if device supports it	Vivek Goyal
	Currently dm_dax_flush() is not being called, even if underlying dax device supports write cache, because DAXDEV_WRITE_CACHE is not being propagated up to the DM dax device. If the underlying dax device supports write cache, set DAXDEV_WRITE_CACHE on the DM dax device. This will cause dm_dax_flush() to be called. Fixes: abebfbe2f7 ("dm: add ->flush() dax operation support") Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-07-26	dm verity fec: fix GFP flags used with mempool_alloc()	NeilBrown
	mempool_alloc() cannot fail for GFP_NOIO allocation, so there is no point testing for failure. One place the code tested for failure was passing "0" as the GFP flags. This is most unusual and is probably meant to be GFP_NOIO, so that is changed. Also, allocation from ->extra_pool and ->prealloc_pool are repeated before releasing the previous allocation. This can deadlock if the code is servicing a write under high memory pressure. To avoid deadlocks, change these to use GFP_NOWAIT and leave the error handling in place. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-07-26	dm zoned: use GFP_NOIO in I/O path	Damien Le Moal
	Use GFP_NOIO for memory allocations in the I/O path. Other memory allocations in the initialization path can use GFP_KERNEL. Reported-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-07-26	perf annotate: Do not overwrite perf_sample->weight	Arnaldo Carvalho de Melo
	When we parse an event we may get a value from the kernel in response to PERF_SAMPLE_WEIGHT being set in perf_event_attr->sample_type, and if it is not set, then perf_sample->weight will be set to zero, which should be ok according to a discussion with Andi Kleen [1]: 1: https://lkml.kernel.org/r/20170724174637.GS3044@two.firstfloor.org Acked-by: Andi Kleen <andi@firstfloor.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-8ev8ufk3lzmvgz37yg9nv3qz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-26	include/linux/vfio.h: Guard powerpc-specific functions with ↵	Murilo Opsfelder Araujo
	CONFIG_VFIO_SPAPR_EEH When CONFIG_EEH=y and CONFIG_VFIO_SPAPR_EEH=n, build fails with the following: drivers/vfio/pci/vfio_pci.o: In function `.vfio_pci_release': vfio_pci.c:(.text+0xa98): undefined reference to `.vfio_spapr_pci_eeh_release' drivers/vfio/pci/vfio_pci.o: In function `.vfio_pci_open': vfio_pci.c:(.text+0x1420): undefined reference to `.vfio_spapr_pci_eeh_open' In this case, vfio_pci.c should use the empty definitions of vfio_spapr_pci_eeh_open and vfio_spapr_pci_eeh_release functions. This patch fixes it by guarding these function definitions with CONFIG_VFIO_SPAPR_EEH, the symbol that controls whether vfio_spapr_eeh.c is built, which is where the non-empty versions of these functions are. We need to make use of IS_ENABLED() macro because CONFIG_VFIO_SPAPR_EEH is a tristate option. This issue was found during a randconfig build. Logs are here: http://kisskb.ellerman.id.au/kisskb/buildresult/12982362/ Signed-off-by: Murilo Opsfelder Araujo <mopsfelder@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-07-26	Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost	Linus Torvalds
	Pull virtio fixes and cleanups from Michael Tsirkin: "Fixes some minor issues all over the codebase" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio-net: fix module unloading virtio-balloon: coding format cleanup virtio-balloon: deflate via a page list virtio_blk: Use sysfs_match_string() helper
2017-07-26	perf stat: Use group read for event groups	Jiri Olsa
	Make perf stat use group read if there are groups defined. The group read will get the values for all member of groups within a single syscall instead of calling read syscall for every event. We can see considerable less amount of kernel cycles spent on single group read, than reading each event separately, like for following perf stat command: # perf stat -e {cycles,instructions} -I 10 -a sleep 1 Monitored with "perf stat -r 5 -e '{cycles:u,cycles:k}'" Before: 24,325,676 cycles:u 297,040,775 cycles:k 1.038554134 seconds time elapsed After: 25,034,418 cycles:u 158,256,395 cycles:k 1.036864497 seconds time elapsed The perf_evsel__open fallback changes contributed by Andi Kleen. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20170726120206.9099-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-26	perf evsel: Add read_counter()	Jiri Olsa
	Add perf_evsel__read_counter() to read single or group counter. After calling this function the counter's evsel::counts struct is filled with values for the counter and member of its group if there are any. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20170726120206.9099-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-26	perf tools: Add perf_evsel__read_size function	Jiri Olsa
	Currently we use the size of struct perf_counts_values to read the event, which prevents us to put any new member to the struct. Adding perf_evsel__read_size to return size of the buffer needed for event read. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20170726120206.9099-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-26	Merge tag 'perf-core-for-mingo-4.14-20170725' of ↵	Ingo Molnar
	git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvemends and fixes for v4.14: New features: - Filter out 'sshd' in the tracer ancestry in 'perf trace' syswide tracing, to elliminate tracing loops (Arnaldo Carvalho de Melo) - Support lookup of symbols in other mount namespaces in 'perf top' (Krister Johansen) - Initial 'clone' syscall args beautifier in 'perf trace' (Arnaldo Carvalho de Melo) User visible changes: - Ignore 'fd' and 'offset' args for MAP_ANONYMOUS in 'perf trace' (Arnaldo Carvalho de Melo) - Process tracing data in 'perf annotate' pipe mode (David Carrillo-Cisneros) - Make 'perf report --branch-history' work without callgraphs(-g) option in perf record (Jin Yao) - Tag branch type/flag on "to" and tag cycles on "from" in 'perf report' (Jin Yao) Fixes: - Fix jvmti linker error when libelf config is disabled (Sudeep Holla) - Fix cgroups refcount usage (Arnaldo Carvalho de Melo) - Fix kernel symbol adjustment for s390x (Thomas Richter) - Fix 'perf report --stdio --show-total-period', it was showing the number of samples, not the total period (Taeung Song) Infrastructure changes: - Add perf_sample dictionary to tracepoint handlers in 'perf script' python, which were already present for other types of events (hardware, etc) (Arun Kalyanasundaram) - Make build fail on vendor events JSON parse error (Andi Kleen) - Adopt strstarts() from the kernel (Arnaldo Carvalho de Melo) Arch specific changes: - Set no_aux_samples for the tracking event in Intel PT (Kan Liang) - Always set no branch for Intel PT dummy event (Kan Liang) Trivial changes: - Simplify some error handlers in 'perf script' (Dan Carpenter) - Add EXCLUDE_EXTLIBS and EXTRA_PERFLIBS to makefile (David Carrillo-Cisneros) Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-26	Merge branch 'kvm-ppc-fixes' of ↵	Paolo Bonzini
	git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-master Two commits which fix host crashes. Signed-off-by: Paolo BOnzini <pbonzini@redhat.com>
2017-07-26	KVM: LAPIC: Fix reentrancy issues with preempt notifiers	Wanpeng Li
	Preempt can occur in the preemption timer expiration handler: CPU0 CPU1 preemption timer vmexit handle_preemption_timer(vCPU0) kvm_lapic_expired_hv_timer hv_timer_is_use == true sched_out sched_in kvm_arch_vcpu_load kvm_lapic_restart_hv_timer restart_apic_timer start_hv_timer already-expired timer or sw timer triggerd in the window start_sw_timer cancel_hv_timer /* back in kvm_lapic_expired_hv_timer */ cancel_hv_timer WARN_ON(!apic->lapic_timer.hv_timer_in_use); ==> Oops This can be reproduced if CONFIG_PREEMPT is enabled. ------------[ cut here ]------------ WARNING: CPU: 4 PID: 2972 at /home/kernel/linux/arch/x86/kvm//lapic.c:1563 kvm_lapic_expired_hv_timer+0x9e/0xb0 [kvm] CPU: 4 PID: 2972 Comm: qemu-system-x86 Tainted: G OE 4.13.0-rc2+ #16 RIP: 0010:kvm_lapic_expired_hv_timer+0x9e/0xb0 [kvm] Call Trace: handle_preemption_timer+0xe/0x20 [kvm_intel] vmx_handle_exit+0xb8/0xd70 [kvm_intel] kvm_arch_vcpu_ioctl_run+0xdd1/0x1be0 [kvm] ? kvm_arch_vcpu_load+0x47/0x230 [kvm] ? kvm_arch_vcpu_load+0x62/0x230 [kvm] kvm_vcpu_ioctl+0x340/0x700 [kvm] ? kvm_vcpu_ioctl+0x340/0x700 [kvm] ? __fget+0xfc/0x210 do_vfs_ioctl+0xa4/0x6a0 ? __fget+0x11d/0x210 SyS_ioctl+0x79/0x90 do_syscall_64+0x81/0x220 entry_SYSCALL64_slow_path+0x25/0x25 ------------[ cut here ]------------ WARNING: CPU: 4 PID: 2972 at /home/kernel/linux/arch/x86/kvm//lapic.c:1498 cancel_hv_timer.isra.40+0x4f/0x60 [kvm] CPU: 4 PID: 2972 Comm: qemu-system-x86 Tainted: G W OE 4.13.0-rc2+ #16 RIP: 0010:cancel_hv_timer.isra.40+0x4f/0x60 [kvm] Call Trace: kvm_lapic_expired_hv_timer+0x3e/0xb0 [kvm] handle_preemption_timer+0xe/0x20 [kvm_intel] vmx_handle_exit+0xb8/0xd70 [kvm_intel] kvm_arch_vcpu_ioctl_run+0xdd1/0x1be0 [kvm] ? kvm_arch_vcpu_load+0x47/0x230 [kvm] ? kvm_arch_vcpu_load+0x62/0x230 [kvm] kvm_vcpu_ioctl+0x340/0x700 [kvm] ? kvm_vcpu_ioctl+0x340/0x700 [kvm] ? __fget+0xfc/0x210 do_vfs_ioctl+0xa4/0x6a0 ? __fget+0x11d/0x210 SyS_ioctl+0x79/0x90 do_syscall_64+0x81/0x220 entry_SYSCALL64_slow_path+0x25/0x25 This patch fixes it by making the caller of cancel_hv_timer, start_hv_timer and start_sw_timer be in preemption-disabled regions, which trivially avoid any reentrancy issue with preempt notifier. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> [Add more WARNs. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26	tools/kvm_stat: add '-f help' to get the available event list	Lin Ma
	Signed-off-by: Lin Ma <lma@suse.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26	tools/kvm_stat: use variables instead of hard paths in help output	Lin Ma
	Using variables instead of hard paths makes the requirements information more accurate. Signed-off-by: Lin Ma <lma@suse.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26	Merge tag 'kvm-s390-master-4.13-1' of ↵	Paolo Bonzini
	git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: fixup missing srcu lock We need to hold the srcu lock when accessing memory slots during migration Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26	KVM: nVMX: Fix loss of L2's NMI blocking state	Wanpeng Li
	Run kvm-unit-tests/eventinj.flat in L1 w/ ept=0 on both L0 and L1: Before NMI IRET test Sending NMI to self NMI isr running stack 0x461000 Sending nested NMI to self After nested NMI to self Nested NMI isr running rip=40038e After iret After NMI to self FAIL: NMI Commit 4c4a6f790ee862 (KVM: nVMX: track NMI blocking state separately for each VMCS) tracks NMI blocking state separately for vmcs01 and vmcs02. However it is not enough: - The L2 (kvm-unit-tests/eventinj.flat) generates NMI that will fault on IRET, so the L2 can generate #PF which can be intercepted by L0. - L0 walks L1's guest page table and sees the mapping is invalid, it resumes the L1 guest and injects the #PF into L1. At this point the vmcs02 has nmi_known_unmasked=true. - L1 sets set bit 3 (blocking by NMI) in the interruptibility-state field of vmcs12 (and fixes the shadow page table) before resuming L2 guest. - L1 executes VMRESUME to resume L2, causing a vmexit to L0 - during VMRESUME emulation, prepare_vmcs02 sets bit 3 in the interruptibility-state field of vmcs02, but nmi_known_unmasked is still true. - L2 immediately exits to L0 with another page fault, because L0 still has not updated the NGVA->HPA page tables. However, nmi_known_unmasked is true so vmx_recover_nmi_blocking does not do anything. The fix is to update nmi_known_unmasked when preparing vmcs02 from vmcs12. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26	KVM: nVMX: Fix posted intr delivery when vcpu is in guest mode	Wincy Van
	The PI vector for L0 and L1 must be different. If dest vcpu0 is in guest mode while vcpu1 is delivering a non-nested PI to vcpu0, there wont't be any vmexit so that the non-nested interrupt will be delayed. Signed-off-by: Wincy Van <fanwenyi0529@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26	x86: irq: Define a global vector for nested posted interrupts	Wincy Van
	We are using the same vector for nested/non-nested posted interrupts delivery, this may cause interrupts latency in L1 since we can't kick the L2 vcpu out of vmx-nonroot mode. This patch introduces a new vector which is only for nested posted interrupts to solve the problems above. Signed-off-by: Wincy Van <fanwenyi0529@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26	KVM: x86: do mask out upper bits of PAE CR3	Paolo Bonzini
	This reverts the change of commit f85c758dbee54cc3612a6e873ef7cecdb66ebee5, as the behavior it modified was intended. The VM is running in 32-bit PAE mode, and Table 4-7 of the Intel manual says: Table 4-7. Use of CR3 with PAE Paging Bit Position(s) Contents 4:0 Ignored 31:5 Physical address of the 32-Byte aligned page-directory-pointer table used for linear-address translation 63:32 Ignored (these bits exist only on processors supporting the Intel-64 architecture) To placate the static checker, write the mask explicitly as an unsigned long constant instead of using a 32-bit unsigned constant. Cc: Dan Carpenter <dan.carpenter@oracle.com> Fixes: f85c758dbee54cc3612a6e873ef7cecdb66ebee5 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26	KVM: make pid available for uevents without debugfs	Claudio Imbrenda
	Simplify and improve the code so that the PID is always available in the uevent even when debugfs is not available. This adds a userspace_pid field to struct kvm, as per Radim's suggestion, so that the PID can be retrieved on destruction too. Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com> Fixes: 286de8f6ac9202 ("KVM: trigger uevents when creating or destroying a VM") Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26	udp: unbreak build lacking CONFIG_XFRM	Paolo Abeni
	We must use pre-processor conditional block or suitable accessors to manipulate skb->sp elsewhere builds lacking the CONFIG_XFRM will break. Fixes: dce4551cb2ad ("udp: preserve head state for IP_CMSG_PASSSEC") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-26	ASoC: codecs: msm8916-analog: fix DIG_CLK_CTL_RXD3_CLK_EN define	Damien Riegel
	The wrong bit is assigned to DIG_CLK_CTL_RXD3_CLK_EN, change it for the correct one. Signed-off-by: Damien Riegel <damien.riegel@savoirfairelinux.com> Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-07-26	nvme: validate admin queue before unquiesce	Scott Bauer
	With a misbehaving controller it's possible we'll never enter the live state and create an admin queue. When we fail out of reset work it's possible we failed out early enough without setting up the admin queue. We tear down queues after a failed reset, but needed to do some more sanitization. Fixes 443bd90f2cca: "nvme: host: unquiesce queue in nvme_kill_queues()" [ 189.650995] nvme nvme1: pci function 0000:0b:00.0 [ 317.680055] nvme nvme0: Device not ready; aborting reset [ 317.680183] nvme nvme0: Removing after probe failure status: -19 [ 317.681258] kasan: GPF could be caused by NULL-ptr deref or user memory access [ 317.681397] general protection fault: 0000 [#1] SMP KASAN [ 317.682984] CPU: 3 PID: 477 Comm: kworker/3:2 Not tainted 4.13.0-rc1+ #5 [ 317.683112] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016 [ 317.683284] Workqueue: events nvme_remove_dead_ctrl_work [nvme] [ 317.683398] task: ffff8803b0990000 task.stack: ffff8803c2ef0000 [ 317.683516] RIP: 0010:blk_mq_unquiesce_queue+0x2b/0xa0 [ 317.683614] RSP: 0018:ffff8803c2ef7d40 EFLAGS: 00010282 [ 317.683716] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 1ffff1006fbdcde3 [ 317.683847] RDX: 0000000000000038 RSI: 1ffff1006f5a9245 RDI: 0000000000000000 [ 317.683978] RBP: ffff8803c2ef7d58 R08: 1ffff1007bcdc974 R09: 0000000000000000 [ 317.684108] R10: 1ffff1007bcdc975 R11: 0000000000000000 R12: 00000000000001c0 [ 317.684239] R13: ffff88037ad49228 R14: ffff88037ad492d0 R15: ffff88037ad492e0 [ 317.684371] FS: 0000000000000000(0000) GS:ffff8803de6c0000(0000) knlGS:0000000000000000 [ 317.684519] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 317.684627] CR2: 0000002d1860c000 CR3: 000000045b40d000 CR4: 00000000003406e0 [ 317.684758] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 317.684888] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 317.685018] Call Trace: [ 317.685084] nvme_kill_queues+0x4d/0x170 [nvme_core] [ 317.685185] nvme_remove_dead_ctrl_work+0x3a/0x90 [nvme] [ 317.685289] process_one_work+0x771/0x1170 [ 317.685372] worker_thread+0xde/0x11e0 [ 317.685452] ? pci_mmcfg_check_reserved+0x110/0x110 [ 317.685550] kthread+0x2d3/0x3d0 [ 317.685617] ? process_one_work+0x1170/0x1170 [ 317.685704] ? kthread_create_on_node+0xc0/0xc0 [ 317.685785] ret_from_fork+0x25/0x30 [ 317.685798] Code: 0f 1f 44 00 00 55 48 b8 00 00 00 00 00 fc ff df 48 89 e5 41 54 4c 8d a7 c0 01 00 00 53 48 89 fb 4c 89 e2 48 c1 ea 03 48 83 ec 08 <80> 3c 02 00 75 50 48 8b bb c0 01 00 00 e8 33 8a f9 00 0f ba b3 [ 317.685872] RIP: blk_mq_unquiesce_queue+0x2b/0xa0 RSP: ffff8803c2ef7d40 [ 317.685908] ---[ end trace a3f8704150b1e8b4 ]--- Signed-off-by: Scott Bauer <scott.bauer@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-07-26	xfs: fix multi-AG deadlock in xfs_bunmapi	Christoph Hellwig
	Just like in the allocator we must avoid touching multiple AGs out of order when freeing blocks, as freeing still locks the AGF and can cause the same AB-BA deadlocks as in the allocation path. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Nikolay Borisov <n.borisov.lkml@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>