summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-02-15net: xilinx_emaclite: fix freezes due to unordered I/OAnssi Hannula
The xilinx_emaclite uses __raw_writel and __raw_readl for register accesses. Those functions do not imply any kind of memory barriers and they may be reordered. The driver does not seem to take that into account, though, and the driver does not satisfy the ordering requirements of the hardware. For clear examples, see xemaclite_mdio_write() and xemaclite_mdio_read() which try to set MDIO address before initiating the transaction. I'm seeing system freezes with the driver with GCC 5.4 and current Linux kernels on Zynq-7000 SoC immediately when trying to use the interface. In commit 123c1407af87 ("net: emaclite: Do not use microblaze and ppc IO functions") the driver was switched from non-generic in_be32/out_be32 (memory barriers, big endian) to __raw_readl/__raw_writel (no memory barriers, native endian), so apparently the device follows system endianness and the driver was originally written with the assumption of memory barriers. Rather than try to hunt for each case of missing barrier, just switch the driver to use iowrite32/ioread32/iowrite32be/ioread32be depending on endianness instead. Tested on little-endian Zynq-7000 ARM SoC FPGA. Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Fixes: 123c1407af87 ("net: emaclite: Do not use microblaze and ppc IO functions") Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-15net: xilinx_emaclite: fix receive buffer overflowAnssi Hannula
xilinx_emaclite looks at the received data to try to determine the Ethernet packet length but does not properly clamp it if proto_type == ETH_P_IP or 1500 < proto_type <= 1518, causing a buffer overflow and a panic via skb_panic() as the length exceeds the allocated skb size. Fix those cases. Also add an additional unconditional check with WARN_ON() at the end. Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Fixes: bb81b2ddfa19 ("net: add Xilinx emac lite device driver") Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-15bpf: Rebuild bpf.o for any dependency updateMickaël Salaün
This is needed to force a rebuild of bpf.o when one of its dependencies (e.g. uapi/linux/bpf.h) is updated. Add a phony target. Signed-off-by: Mickaël Salaün <mic@digikod.net> Cc: Alexei Starovoitov <ast@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David S. Miller <davem@davemloft.net> Cc: Wang Nan <wangnan0@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-15bpf: Remove redundant ifdefMickaël Salaün
Remove a useless ifdef __NR_bpf as requested by Wang Nan. Inline one-line static functions as it was in the bpf_sys.h file. Signed-off-by: Mickaël Salaün <mic@digikod.net> Cc: Alexei Starovoitov <ast@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David S. Miller <davem@davemloft.net> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/r/828ab1ff-4dcf-53ff-c97b-074adb895006@huawei.com Acked-by: Wang Nan <wangnan0@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-15mlx4: do not use rwlock in fast pathEric Dumazet
Using a reader-writer lock in fast path is silly, when we can instead use RCU or a seqlock. For mlx4 hwstamp clock, a seqlock is the way to go, removing two atomic operations and false sharing. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-15goldfish: Sanitize the broken interrupt handlerThomas Gleixner
This interrupt handler is broken in several ways: - It loops forever when the op code is not decodeable - It never returns IRQ_HANDLED because the only way to exit the loop returns IRQ_NONE unconditionally. The whole concept of this is broken. Creating devices in an interrupt handler is beyond any point of sanity. Make it at least behave halfways sane so accidental users do not have to deal with a hard to debug lockup. Fixes: e809c22b8fb028 ("goldfish: add the goldfish virtual bus") Reported-by: Gabriel C <nix.or.die@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-15x86/platform/goldfish: Prevent unconditional loadingThomas Gleixner
The goldfish platform code registers the platform device unconditionally which causes havoc in several ways if the goldfish_pdev_bus driver is enabled: - Access to the hardcoded physical memory region, which is either not available or contains stuff which is completely unrelated. - Prevents that the interrupt of the serial port can be requested - In case of a spurious interrupt it goes into a infinite loop in the interrupt handler of the pdev_bus driver (which needs to be fixed seperately). Add a 'goldfish' command line option to make the registration opt-in when the platform is compiled in. I'm seriously grumpy about this engineering trainwreck, which has seven SOBs from Intel developers for 50 lines of code. And none of them figured out that this is broken. Impressive fail! Fixes: ddd70cf93d78 ("goldfish: platform device for x86") Reported-by: Gabriel C <nix.or.die@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-15USB: serial: keyspan: drop header fileJohan Hovold
Move all declarations and definitions in keyspan.h to keyspan.c, which is the only place were they are used. This specifically moves the driver device-id tables and usb-serial driver definitions to the source file where they are expected to be found. While at it, fix up some multi-line comments and minor white-space issues (spaces instead of tabs and superfluous white space). Note that the information in the comment header of the removed header file is also present in the source file. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Johan Hovold <johan@kernel.org>
2017-02-15USB: serial: io_edgeport: drop io-tables header fileJohan Hovold
Move the driver device-id tables and usb-serial driver definitions to the source file where they are expected to be found. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Johan Hovold <johan@kernel.org>
2017-02-15PCI/MSI: Document pci_alloc_irq_vectors(), deprecate pci_enable_msi()Christoph Hellwig
Document pci_alloc_irq_vectors() instead of the deprecated pci_enable_msi() and pci_enable_msix() APIs. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-02-15block: do not allow updates through sysfs until registration completesTahsin Erdogan
When a new disk shows up, sysfs queue directory is created before elevator is registered. This allows a user to attempt a scheduler switch even though the initial registration hasn't completed yet. In one scenario, blk_register_queue() calls elv_register_queue() and right before cfq_registered_queue() is called, another process executes elevator_switch() and replaces q->elevator with deadline scheduler. When cfq_registered_queue() executes it interprets e->elevator_data as struct cfq_data even though it is actually struct deadline_data. Grab q->sysfs_lock in blk_register_queue() to synchronize with sysfs callers. Signed-off-by: Tahsin Erdogan <tahsin@google.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-15PCI/PME: Restore pcie_pme_driver.removeYinghai Lu
In addition to making PME non-modular, d7def2040077 ("PCI/PME: Make explicitly non-modular") removed the pcie_pme_driver .remove() method, pcie_pme_remove(). pcie_pme_remove() freed the PME IRQ that was requested in pci_pme_probe(). The fact that we don't free the IRQ after d7def2040077 causes the following crash when removing a PCIe port device via /sys: ------------[ cut here ]------------ kernel BUG at drivers/pci/msi.c:370! invalid opcode: 0000 [#1] SMP Modules linked in: CPU: 1 PID: 14509 Comm: sh Tainted: G W 4.8.0-rc1-yh-00012-gd29438d RIP: 0010:[<ffffffff9758bbf5>] free_msi_irqs+0x65/0x190 ... Call Trace: [<ffffffff9758cda4>] pci_disable_msi+0x34/0x40 [<ffffffff97583817>] cleanup_service_irqs+0x27/0x30 [<ffffffff97583e9a>] pcie_port_device_remove+0x2a/0x40 [<ffffffff97584250>] pcie_portdrv_remove+0x40/0x50 [<ffffffff97576d7b>] pci_device_remove+0x4b/0xc0 [<ffffffff9785ebe6>] __device_release_driver+0xb6/0x150 [<ffffffff9785eca5>] device_release_driver+0x25/0x40 [<ffffffff975702e4>] pci_stop_bus_device+0x74/0xa0 [<ffffffff975704ea>] pci_stop_and_remove_bus_device_locked+0x1a/0x30 [<ffffffff97578810>] remove_store+0x50/0x70 [<ffffffff9785a378>] dev_attr_store+0x18/0x30 [<ffffffff97260b64>] sysfs_kf_write+0x44/0x60 [<ffffffff9725feae>] kernfs_fop_write+0x10e/0x190 [<ffffffff971e13f8>] __vfs_write+0x28/0x110 [<ffffffff970b0fa4>] ? percpu_down_read+0x44/0x80 [<ffffffff971e53a7>] ? __sb_start_write+0xa7/0xe0 [<ffffffff971e53a7>] ? __sb_start_write+0xa7/0xe0 [<ffffffff971e1f04>] vfs_write+0xc4/0x180 [<ffffffff971e3089>] SyS_write+0x49/0xa0 [<ffffffff97001a46>] do_syscall_64+0xa6/0x1b0 [<ffffffff9819201e>] entry_SYSCALL64_slow_path+0x25/0x25 ... RIP [<ffffffff9758bbf5>] free_msi_irqs+0x65/0x190 RSP <ffff89ad3085bc48> ---[ end trace f4505e1dac5b95d3 ]--- Segmentation fault Restore pcie_pme_remove(). [bhelgaas: changelog] Fixes: d7def2040077 ("PCI/PME: Make explicitly non-modular") Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> CC: stable@vger.kernel.org # v4.9+
2017-02-15tracing/probes: Fix a warning message to show correct maximum lengthMasami Hiramatsu
Since tracing/*probe_events will accept a probe definition up to 4096 - 2 ('\n' and '\0') bytes, it must show 4094 instead of 4096 in warning message. Note that there is one possible case of exceed 4094. If user prepare 4096 bytes null-terminated string and syscall write it with the count == 4095, then it can be accepted. However, if user puts a '\n' after that, it must rejected. So IMHO, the warning message should indicate shorter one, since it is safer. Link: http://lkml.kernel.org/r/148673290462.2579.7966778294009665632.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-02-15lightnvm: set default lun range when no luns are specifiedMatias Bjørling
The create target ioctl takes a lun begin and lun end parameter, which defines the range of luns to initialize a target with. If the user does not set the parameters, it default to only using lun 0. Instead, defaults to use all luns in the OCSSD, as it is the usual behaviour users want. Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-15lightnvm: fix off-by-one error on target initializationMatias Bjørling
If one specifies the end lun id to be the absolute number of luns, without taking zero indexing into account, the lightnvm core will pass the off-by-one end lun id to target creation, which then panics during nvm_ioctl_dev_create. Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-15ARM: dts: armada-385-linksys: fix DSA compatible propertyRalph Sennhauser
The switch to the new DSA binding used "marvell,mv88e6095" for the compatible property which doesn't exist, use "marvell,mv88e6085" instead. Fixes: 455b82f03f52 ("ARM: dts: armada-385-linksys: Utilize new DSA binding") Signed-off-by: Ralph Sennhauser <ralph.sennhauser@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2017-02-15of: introduce of_graph_get_remote_nodeRob Herring
The OF graph API leaves too much of the graph walking to clients when in many cases the driver doesn't care about accessing the port or endpoint nodes. The drivers typically just want the device connected via a particular graph connection. of_graph_get_remote_node provides this functionality. Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Philipp Zabel <p.zabel@pengutronix.de>
2017-02-15IB/cma: Destination and source addr families must matchMoni Shoua
The destination address in a listening rdma_id does not have an address family. Since address family in both sides of a connection must be the same in rdma_bind_addr() we set the address family of the destination to the address family of the source. This patch serves the logic in cma_port_is_unique() which requires to know if destination address that is associated with a rdma_id is any address (cma_zero_addr() and cma_loopback_addr()). This can happen when port reuse is checked for a port number that is being listened to. Fixes: 19b752a19dce ("IB/cma: Allow port reuse for rdma_id") Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-15IB/cma: Add default RoCE TOS to CMA configfsMajd Dibbiny
Add new entry to the RDMA-CM configfs that allows users to select default TOS for RDMA-CM QPs. This is useful for users that want to control the TOS for legacy applications without changing their code. Application that sets the TOS explicitly using the rdma_set_option API will continue to work as expected, meaning overriding the configfs value. CC: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-15IB/core: Remove pointer casting from void to net_deviceParav Pandit
This patch avoids unnecessary type casting from void to net_device. CC: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-15IB/IPoIB: Add destination address when re-queue packetErez Shitrit
When sending packet to destination that was not resolved yet via path query, the driver keeps the skb and tries to re-send it again when the path is resolved. But when re-sending via dev_queue_xmit the kernel doesn't call to dev_hard_header, so IPoIB needs to keep 20 bytes in the skb and to put the destination address inside them. In that way the dev_start_xmit will have the correct destination, and the driver won't take the destination from the skb->data, while nothing exists there, which causes to packet be be dropped. The test flow is: 1. Run the SM on remote node, 2. Restart the driver. 4. Ping some destination, 3. Observe that first ICMP request will be dropped. Fixes: fc791b633515 ("IB/ipoib: move back IB LL address into the hard header") Cc: <stable@vger.kernel.org> # v4.8+ Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Tested-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-15ARM: dts: Fix typo in armada-xp-98dx4251Chris Packham
The compatible should be 98dx4251 not 98dx4521. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2017-02-15IB/mlx5: Fix configuration of port capabilitiesEli Cohen
When the "ib_virt" cap is set, configuration of port capabilities need to be done through mlx5_core_modify_hca_vport_context. Since modify_hca_vport_context accepts mask and value, there is no need to read the port capabilities and calculate the new cap values so we avoid the mutex when ib_virt is set. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-15KVM: svm: inititalize hash table structures directlyDavid Hildenbrand
The hashtable and guarding spinlock are global data structures, we can inititalize them statically. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170124212116.4568-1-david@redhat.com> Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-15perf tools: Add missing parse_events_error() prototypeArnaldo Carvalho de Melo
As pointed out by clang, we were not providing a prototype for a function before using it: util/parse-events.y:699:6: error: conflicting types for 'parse_events_error' void parse_events_error(YYLTYPE *loc, void *data, ^ /tmp/build/perf/util/parse-events-bison.c:2224:7: note: previous implicit declaration is here yyerror (&yylloc, _data, scanner, YY_("syntax error")); ^ /tmp/build/perf/util/parse-events-bison.c:65:25: note: expanded from macro 'yyerror' #define yyerror parse_events_error 1 error generated. One line fix it. Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170215130605.GC4020@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-15tracing: Fix return value check in trace_benchmark_reg()Wei Yongjun
In case of error, the function kthread_run() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR(). Link: http://lkml.kernel.org/r/20170112135502.28556-1-weiyj.lk@gmail.com Cc: stable@vger.kernel.org Fixes: 81dc9f0e ("tracing: Add tracepoint benchmark tracepoint") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-02-15tracing: Use modern function declarationArnd Bergmann
We get a lot of harmless warnings about this header file at W=1 level because of an unusual function declaration: kernel/trace/trace.h:766:1: error: 'inline' is not at beginning of declaration [-Werror=old-style-declaration] This moves the inline statement where it normally belongs, avoiding the warning. Link: http://lkml.kernel.org/r/20170123122521.3389010-1-arnd@arndb.de Fixes: 4046bf023b06 ("ftrace: Expose ftrace_hash_empty and ftrace_lookup_ip") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-02-15jump_label: Reduce the size of struct static_keyJason Baron
The static_key->next field goes mostly unused. The field is used for associating module uses with a static key. Most uses of struct static_key define a static key in the core kernel and make use of it entirely within the core kernel, or define the static key in a module and make use of it only from within that module. In fact, of the ~3,000 static keys defined, I found only about 5 or so that did not fit this pattern. Thus, we can remove the static_key->next field entirely and overload the static_key->entries field. That is, when all the static_key uses are contained within the same module, static_key->entries continues to point to those uses. However, if the static_key uses are not contained within the module where the static_key is defined, then we allocate a struct static_key_mod, store a pointer to the uses within that struct static_key_mod, and have the static key point at the static_key_mod. This does incur some extra memory usage when a static_key is used in a module that does not define it, but since there are only a handful of such cases there is a net savings. In order to identify if the static_key->entries pointer contains a struct static_key_mod or a struct jump_entry pointer, bit 1 of static_key->entries is set to 1 if it points to a struct static_key_mod and is 0 if it points to a struct jump_entry. We were already using bit 0 in a similar way to store the initial value of the static_key. This does mean that allocations of struct static_key_mod and that the struct jump_entry tables need to be at least 4-byte aligned in memory. As far as I can tell all arches meet this criteria. For my .config, the patch increased the text by 778 bytes, but reduced the data + bss size by 14912, for a net savings of 14,134 bytes. text data bss dec hex filename 8092427 5016512 790528 13899467 d416cb vmlinux.pre 8093205 5001600 790528 13885333 d3df95 vmlinux.post Link: http://lkml.kernel.org/r/1486154544-4321-1-git-send-email-jbaron@akamai.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Joe Perches <joe@perches.com> Signed-off-by: Jason Baron <jbaron@akamai.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-02-15tracing/probe: Show subsystem name in messagesMasami Hiramatsu
Show "trace_probe:", "trace_kprobe:" and "trace_uprobe:" headers for each warning/error/info message. This will help people to notice that kprobe/uprobe events caused those messages. Link: http://lkml.kernel.org/r/148646647813.24658.16705315294927615333.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-02-15tracing/hwlat: Update old comment about migrationLuiz Capitulino
The ftrace hwlat does support a cpumask. Link: http://lkml.kernel.org/r/20170213122517.6e211955@redhat.com Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-02-15timers: Make flags output in the timer_start tracepoint usefulThomas Gleixner
The timer flags in the timer_start trace event contain lots of useful information, but the meaning is not clear in the trace output. Making tools rely on the bit positions is bad as they might change over time. Decode the flags in the print out. Tools can retrieve the bits and their meaning from the trace format file. Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1702101639290.4036@nanos Requested-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-02-15Merge ath-next from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.gitKalle Valo
ath.git patches for 4.11. Major changes: ath10k * when trying older firmware versions don't confuse user with error messages ath9k * fix crash in AP mode (regression) * fix relayfs crash (regression) * fix initialisation with AR9340 and AR9550
2017-02-15tracing: Have traceprobe_probes_write() not access userspace unnecessarilySteven Rostedt (VMware)
The code in traceprobe_probes_write() reads up to 4096 bytes from userpace for each line. If userspace passes in several lines to execute, the code will do a large read for each line, even though, it is highly likely that the first read from userspace received all of the lines at once. I changed the logic to do a single read from userspace, and to only read from userspace again if not all of the read from userspace made it in. I tested this by adding printk()s and writing files that would test -1, ==, and +1 the buffer size, to make sure that there's no overflows and that if a single line is written with +1 the buffer size, that it fails properly. Link: http://lkml.kernel.org/r/20170209180458.5c829ab2@gandalf.local.home Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-02-15kvm: nVMX: Refactor nested_vmx_run()Jim Mattson
Nested_vmx_run is split into two parts: the part that handles the VMLAUNCH/VMRESUME instruction, and the part that modifies the vcpu state to transition from VMX root mode to VMX non-root mode. The latter will be used when restoring the checkpointed state of a vCPU that was in VMX operation when a snapshot was taken. Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-15kvm: nVMX: Split VMCS checks from nested_vmx_run()Jim Mattson
The checks performed on the contents of the vmcs12 are extracted from nested_vmx_run so that they can be used to validate a vmcs12 that has been restored from a checkpoint. Signed-off-by: Jim Mattson <jmattson@google.com> [Change prepare_vmcs02 and nested_vmx_load_cr3's last argument to u32, to match check_vmentry_postreqs. Update comments for singlestep handling. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-15kvm: nVMX: Refactor nested_get_vmcs12_pages()Jim Mattson
Perform the checks on vmcs12 state early, but defer the gpa->hpa lookups until after prepare_vmcs02. Later, when we restore the checkpointed state of a vCPU in guest mode, we will not be able to do the gpa->hpa lookups when the restore is done. Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-15kvm: nVMX: Refactor handle_vmptrld()Jim Mattson
Handle_vmptrld is split into two parts: the part that handles the VMPTRLD instruction, and the part that establishes the current VMCS pointer. The latter will be used when restoring the checkpointed state of a vCPU that had a valid VMCS pointer when a snapshot was taken. Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-15kvm: nVMX: Refactor handle_vmon()Jim Mattson
Handle_vmon is split into two parts: the part that handles the VMXON instruction, and the part that modifies the vcpu state to transition from legacy mode to VMX operation. The latter will be used when restoring the checkpointed state of a vCPU that was in VMX operation when a snapshot was taken. Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-15kvm: nVMX: Prepare for checkpointing L2 stateJim Mattson
Split prepare_vmcs12 into two parts: the part that stores the current L2 guest state and the part that sets up the exit information fields. The former will be used when checkpointing the vCPU's VMX state. Modify prepare_vmcs02 so that it can construct a vmcs02 midway through L2 execution, using the checkpointed L2 guest state saved into the cached vmcs12 above. Signed-off-by: Jim Mattson <jmattson@google.com> [Rebasing: add from_vmentry argument to prepare_vmcs02 instead of using vmx->nested.nested_run_pending, because it is no longer 1 at the point prepare_vmcs02 is called. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-15kvm: x86: do not use KVM_REQ_EVENT for APICv interrupt injectionPaolo Bonzini
Since bf9f6ac8d749 ("KVM: Update Posted-Interrupts Descriptor when vCPU is blocked", 2015-09-18) the posted interrupt descriptor is checked unconditionally for PIR.ON. Therefore we don't need KVM_REQ_EVENT to trigger the scan and, if NMIs or SMIs are not involved, we can avoid the complicated event injection path. Calling kvm_vcpu_kick if PIR.ON=1 is also useless, though it has been there since APICv was introduced. However, without the KVM_REQ_EVENT safety net KVM needs to be much more careful about races between vmx_deliver_posted_interrupt and vcpu_enter_guest. First, the IPI for posted interrupts may be issued between setting vcpu->mode = IN_GUEST_MODE and disabling interrupts. If that happens, kvm_trigger_posted_interrupt returns true, but smp_kvm_posted_intr_ipi doesn't do anything about it. The guest is entered with PIR.ON, but the posted interrupt IPI has not been sent and the interrupt is only delivered to the guest on the next vmentry (if any). To fix this, disable interrupts before setting vcpu->mode. This ensures that the IPI is delayed until the guest enters non-root mode; it is then trapped by the processor causing the interrupt to be injected. Second, the IPI may be issued between kvm_x86_ops->sync_pir_to_irr(vcpu) and vcpu->mode = IN_GUEST_MODE. In this case, kvm_vcpu_kick is called but it (correctly) doesn't do anything because it sees vcpu->mode == OUTSIDE_GUEST_MODE. Again, the guest is entered with PIR.ON but no posted interrupt IPI is pending; this time, the fix for this is to move the RVI update after IN_GUEST_MODE. Both issues were mostly masked by the liberal usage of KVM_REQ_EVENT, though the second could actually happen with VT-d posted interrupts. In both race scenarios KVM_REQ_EVENT would cancel guest entry, resulting in another vmentry which would inject the interrupt. This saves about 300 cycles on the self_ipi_* tests of vmexit.flat. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-15KVM: x86: do not scan IRR twice on APICv vmentryPaolo Bonzini
Calls to apic_find_highest_irr are scanning IRR twice, once in vmx_sync_pir_from_irr and once in apic_search_irr. Change sync_pir_from_irr to get the new maximum IRR from kvm_apic_update_irr; now that it does the computation, it can also do the RVI write. In order to avoid complications in svm.c, make the callback optional. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-15KVM: vmx: move sync_pir_to_irr from apic_find_highest_irr to callersPaolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-15KVM: x86: preparatory changes for APICv cleanupsPaolo Bonzini
Add return value to __kvm_apic_update_irr/kvm_apic_update_irr. Move vmx_sync_pir_to_irr around. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-15kvm: nVMX: move nested events check to kvm_vcpu_runningPaolo Bonzini
vcpu_run calls kvm_vcpu_running, not kvm_arch_vcpu_runnable, and the former does not call check_nested_events. Once KVM_REQ_EVENT is removed from the APICv interrupt injection path, however, this would leave no place to trigger a vmexit from L2 to L1, causing a missed interrupt delivery while in guest mode. This is caught by the "ack interrupt on exit" test in vmx.flat. [This does not change the calls to check_nested_events in inject_pending_event. That is material for a separate cleanup.] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-15KVM: vmx: clear pending interrupts on KVM_SET_LAPICPaolo Bonzini
Pending interrupts might be in the PI descriptor when the LAPIC is restored from an external state; we do not want them to be injected. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-15kvm: vmx: Use the hardware provided GPA instead of page walkPaolo Bonzini
As in the SVM patch, the guest physical address is passed by VMX to x86_emulate_instruction already, so mark the GPA as available in vcpu->arch. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-15drm: Resurrect atomic rmfb code, v3Maarten Lankhorst
This was somehow lost between v3 and the merged version in Maarten's patch merged as: commit f2d580b9a8149735cbc4b59c4a8df60173658140 Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Date: Wed May 4 14:38:26 2016 +0200 drm/core: Do not preserve framebuffer on rmfb, v4. This introduces a slight behavioral change to rmfb. Instead of disabling a crtc when the primary plane is disabled, we try to preserve it. Apart from old versions of the vmwgfx xorg driver, there is nothing depending on rmfb disabling a crtc. Since vmwgfx is a legacy driver we can safely only disable the plane with atomic. If this commit is rejected by the driver then we will still fall back to the old behavior and turn off the crtc. v2: - Remove plane->fb assignment, done by drm_atomic_clean_old_fb. - Add WARN_ON when atomic_remove_fb fails. - Always call drm_atomic_state_put. v3: - Use drm_drv_uses_atomic_modeset - Handle the case where the first plane-disable-only commit fails with -EINVAL. Some drivers do not support this, fall back to disabling all crtc's in this case. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/66fc3da5-697b-1613-0a67-a5293209f0dc@linux.intel.com
2017-02-15perf pmu: Fix check for unset alias->unit arrayArnaldo Carvalho de Melo
The alias->unit field is an array, so to check that it is not set we should see if it is an empty string, i.e. alias->unit[0], instead of checking alias->unit != NULL, as this will _always_ evaluate to 'true'. Pointed out by clang. Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170214182435.GD4458@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-15arm64/kprobes: consistently handle MRS/MSR with XZRMark Rutland
Now that we have XZR-safe helpers for fiddling with registers, use these in the arm64 kprobes code rather than open-coding the logic. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-02-15arm64: cpufeature: correctly handle MRS to XZRMark Rutland
In emulate_mrs() we may erroneously write back to the user SP rather than XZR if we trap an MRS instruction where Xt == 31. Use the new pt_regs_write_reg() helper to handle this correctly. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Fixes: 77c97b4ee21290f5 ("arm64: cpufeature: Expose CPUID registers by emulation") Cc: Andre Przywara <andre.przywara@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>