linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message	Author
2016-12-08	Merge branch 'stmmac-DMA-burst'	David S. Miller
	Niklas Cassel says: ==================== net: stmmac: make DMA programmable burst length more configurable Make DMA programmable burst length more configurable in the stmmac driver. This is done by adding support for independent pbl for tx/rx through DT. More fine-grained tuning of pbl is possible thanks to a DT property saying that we should NOT multiply pbl values by x8/x4 in hardware. All new DT properties are optional, and created in a way that will not affect any existing DT configurations. Changes since V1: Created cover-letter. Rebased patch set against next-20161205, since conflicting patches to stmmac_platform.c have been merged since V1. Changes since V2: Moved default value initialization of pbl to stmmac_platform.c and added a check for pbl != 0 in stmmac_main.c, to catch a possible pbl == 0 from pci glue. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08	net: smmac: allow configuring lower pbl values	Niklas Cassel
	The driver currently always sets the PBLx8/PBLx4 bit, which means that the pbl values configured via the pbl/txpbl/rxpbl DT properties are always multiplied by 8/4 in the hardware. In order to allow the DT to configure lower pbl values, while at the same time not changing behavior of any existing device trees using the pbl/txpbl/rxpbl settings, add a property to disable the multiplication of the pbl by 8/4 in the hardware. Suggested-by: Rabin Vincent <rabinv@axis.com> Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Acked-by: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
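The effect of the PBLx8/PBLx4 bit described above can be modelled in a few lines. This is only an illustrative sketch, not driver code: the function name and the `multiply` and `factor` parameters are hypothetical, and the unit "beats" is an assumption about how the burst length is counted.

```python
def effective_burst_length(pbl: int, multiply: bool = True, factor: int = 8) -> int:
    """Model the burst length the DMA engine ends up using, in beats.

    pbl      -- value taken from the pbl/txpbl/rxpbl DT property
    multiply -- True models the PBLx8/PBLx4 bit being set (the behaviour
                before this commit); False models the new opt-out property
    factor   -- 8 or 4, depending on the IP (hypothetical parameter)
    """
    if factor not in (4, 8):
        raise ValueError("factor must be 4 or 8")
    return pbl * factor if multiply else pbl


# With the bit set, a DT value of 8 programs a burst of 64.
print(effective_burst_length(8))
# With multiplication disabled, the DT value is used as-is.
print(effective_burst_length(8, multiply=False))
```

Existing device trees correspond to `multiply=True`, which is why the new property is opt-in.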
2016-12-08	net: stmmac: add support for independent DMA pbl for tx/rx	Niklas Cassel
	GMAC and newer support independent programmable burst lengths for DMA tx/rx. Add new optional devicetree properties representing this. To be backwards compatible, snps,pbl will still be valid, but snps,txpbl/snps,rxpbl will override the value in snps,pbl if set. If the IP is synthesized to use the AXI interface, there is a register and a matching DT property inside the optional stmmac-axi-config DT node for controlling burst lengths, named snps,blen. However, using this register, it is not possible to control tx and rx independently. Also, this register is not available if the IP was synthesized with, e.g., the AHB interface. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Acked-by: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
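The override rule for snps,pbl, snps,txpbl and snps,rxpbl can be sketched as follows. This is a model of the rule as stated in the commit message, not the driver's code; the function name is hypothetical, and treating an unset property as `None` or 0 is an assumption.

```python
from typing import Optional, Tuple


def resolve_pbl(pbl: int,
                txpbl: Optional[int] = None,
                rxpbl: Optional[int] = None) -> Tuple[int, int]:
    """Return (tx_pbl, rx_pbl) after applying the override rule.

    snps,pbl is the common fallback; snps,txpbl and snps,rxpbl each
    replace it for their own direction when they are set.
    """
    tx = txpbl if txpbl else pbl  # unset (None or 0) falls back to pbl
    rx = rxpbl if rxpbl else pbl
    return tx, rx


print(resolve_pbl(8))            # only snps,pbl given
print(resolve_pbl(8, txpbl=16))  # tx overridden, rx falls back
```

A device tree that only sets snps,pbl therefore behaves exactly as before.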
2016-12-08	net: stmmac: dwmac1000: fix define DMA_BUS_MODE_RPBL_MASK	Niklas Cassel
	DMA_BUS_MODE_RPBL_MASK is really 6 bits, just like DMA_BUS_MODE_PBL_MASK. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Acked-by: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08	net: stmmac: stmmac_platform: fix parsing of DT binding	Niklas Cassel
	commit 64c3b252e9fc ("net: stmmac: fixed the pbl setting with DT") changed the parsing of the DT binding. Before 64c3b252e9fc, snps,fixed-burst and snps,mixed-burst were parsed regardless of whether the property snps,pbl existed or not. After the commit, fixed burst and mixed burst are only parsed if snps,pbl exists. Now when snps,aal has been added, it too is only parsed if snps,pbl exists. Since the DT binding does not specify that fixed burst, mixed burst or aal depend on snps,pbl being specified, undo changes introduced by 64c3b252e9fc. The issue commit 64c3b252e9fc ("net: stmmac: fixed the pbl setting with DT") tries to address is solved in another way: The databook specifies that all values other than 1, 2, 4, 8, 16, or 32 result in undefined behavior, so snps,pbl = <0> is invalid. If pbl is 0 after parsing, set pbl to DEFAULT_DMA_PBL. This handles the case where the property is omitted, and also handles the case where the property is specified without any data. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Acked-by: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
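The fallback described above (a parsed pbl of 0 becomes DEFAULT_DMA_PBL) can be modelled like this. It is a sketch only: the helper names are hypothetical, and the value 8 for DEFAULT_DMA_PBL is an assumption, since the commit message does not state it.

```python
DEFAULT_DMA_PBL = 8  # assumed value; not given in the commit message
VALID_PBL = (1, 2, 4, 8, 16, 32)  # values the databook defines, per the message


def normalize_pbl(parsed: int) -> int:
    """Apply the fallback: 0 means 'omitted or empty', so use the default."""
    return DEFAULT_DMA_PBL if parsed == 0 else parsed


def is_valid_pbl(pbl: int) -> bool:
    """True if pbl is one of the values with defined hardware behaviour."""
    return pbl in VALID_PBL


print(normalize_pbl(0))   # property omitted or given without data
print(normalize_pbl(16))  # explicit value is kept
print(is_valid_pbl(0))    # 0 itself is never a valid setting
```

Because 0 is invalid anyway, using it as the "not specified" marker loses nothing.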
2016-12-08	net: stmmac: simplify the common DMA init API	Niklas Cassel
	Use struct stmmac_dma_cfg *dma_cfg as an argument rather than using all the struct members as individual arguments. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Acked-by: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08	net: stmmac: return error if no DMA configuration is found	Niklas Cassel
	All drivers except the pci glue layer call stmmac_probe_config_dt. stmmac_probe_config_dt does a kzalloc of dma_cfg. The pci glue layer does a kzalloc of dma_cfg explicitly, so all current drivers do a kzalloc of dma_cfg. Return an error if no DMA configuration is found; that way we can assume that the DMA configuration always exists. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Acked-by: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08	NET: usb: cdc_mbim: add quirk for supporting Telit LE922A	Daniele Palmas
	Telit LE922A MBIM based composition does not work properly with altsetting toggle done in cdc_ncm_bind_common. This patch adds CDC_MBIM_FLAG_AVOID_ALTSETTING_TOGGLE quirk to avoid this procedure that, instead, is mandatory for other modems. Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Reviewed-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08	net: ethernet: slicoss: use module_pci_driver()	Tobias Klauser
	Use module_pci_driver() to get rid of some boilerplate code. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08	spi: mvebu: fix baudrate calculation for armada variant	Uwe Kleine-König
	The calculation of SPR and SPPR doesn't round correctly at several places which might result in baud rates that are too big. For example with tclk_hz = 250000001 and target rate 25000000 it determined a divider of 10 which is wrong. Instead of fixing all the corner cases replace the calculation by an algorithm without a loop which should even be quicker to execute apart from being correct. Fixes: df59fa7f4bca ("spi: orion: support armada extended baud rates") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Mark Brown <broonie@kernel.org>
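The rounding problem in the example above comes down to ceiling division: the divider must be rounded up so that the resulting rate never exceeds the target. The sketch below shows only that step. The function name is hypothetical, and the mapping of the divider onto the SPR and SPPR register fields, which the real driver has to perform, is not modelled.

```python
def min_divider(tclk_hz: int, target_hz: int) -> int:
    """Smallest integer divider d such that tclk_hz / d <= target_hz."""
    if tclk_hz <= 0 or target_hz <= 0:
        raise ValueError("clock rates must be positive")
    return -(-tclk_hz // target_hz)  # ceiling division without floats


tclk_hz, target_hz = 250000001, 25000000
print(min_divider(tclk_hz, target_hz))  # 11
print(tclk_hz > 10 * target_hz)         # True: a divider of 10 is too fast
```

With a divider of 10 the bus would run at slightly more than 25 MHz, which is why the commit calls that result wrong.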
2016-12-08	USB: OHCI: nxp: fix code warnings	Manjunath Goudar
	This patch fixes the following checkpatch.pl warnings: WARNING: Missing a blank line after declarations WARNING: braces {} are not necessary for single statement blocks Signed-off-by: Manjunath Goudar <csmanjuvijay@gmail.com> Acked-by: Vladimir Zapolskiy <vz@mleia.com> Cc: Sylvain Lemieux <slemieux.tyco@gmail.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-usb@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-12-08	USB: OHCI: nxp: remove useless extern declaration	Manjunath Goudar
	Remove usb_disabled() extern declaration as it is already declared as extern in include/linux/usb.h. Signed-off-by: Manjunath Goudar <csmanjuvijay@gmail.com> Acked-by: Vladimir Zapolskiy <vz@mleia.com> Cc: Sylvain Lemieux <slemieux.tyco@gmail.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-usb@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-12-08	USB: OHCI: at91: remove useless extern declaration	Manjunath Goudar
	Remove usb_disabled() extern declaration as it is already declared as extern in include/linux/usb.h. Signed-off-by: Manjunath Goudar <csmanjuvijay@gmail.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linux-usb@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-12-08	usb: misc: rio500: fix result type for error message	Kim Jae Joong
	Fix variable type for dev_err about usb_bulk_msg() Signed-off-by: Kim Jae Joong <climbbb.kim@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-12-08	Merge branch 'cls_flower-ICMP'	David S. Miller
	Simon Horman says: ==================== net/sched: cls_flower: Support matching on ICMP this series adds support for matching on ICMP type and code to cls_flower. Changes v5->v6: * Restore missing signed-off-by Changes v4->v5: * Drop all helpers Changes v3->v4: * Do not add icmp to struct flow_keys, it is not needed * Do not test for ICMP protocols in packet in __skb_flow_dissect, this is also not needed Changes v2->v3: * Add FLOW_DISSECTOR_KEY_ICMP and use separate structure for ICMP Changes v1->v2: * Include all dissector helpers in first patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08	net/sched: cls_flower: Support matching on ICMP type and code	Simon Horman
	Support matching on ICMP type and code.
	Example usage:
	  tc qdisc add dev eth0 ingress
	  tc filter add dev eth0 protocol ip parent ffff: flower \
	      indev eth0 ip_proto icmp type 8 code 0 action drop
	  tc filter add dev eth0 protocol ipv6 parent ffff: flower \
	      indev eth0 ip_proto icmpv6 type 128 code 0 action drop
	Signed-off-by: Simon Horman <simon.horman@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08	flow dissector: ICMP support	Simon Horman
	Allow dissection of ICMP(V6) type and code. This should only occur if a packet is ICMP(V6) and the dissector has FLOW_DISSECTOR_KEY_ICMP set. There are currently no users of FLOW_DISSECTOR_KEY_ICMP. A follow-up patch will allow FLOW_DISSECTOR_KEY_ICMP to be used by the flower classifier. Signed-off-by: Simon Horman <simon.horman@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08	net: stmmac: stmmac_platform: use correct setup function for gmac4	Niklas Cassel
	devicetree binding for stmmac states: - compatible: Should be "snps,dwmac-<ip_version>", "snps,dwmac" For backwards compatibility: "st,spear600-gmac" is also supported. Previously, when specifying "snps,dwmac-4.10a", "snps,dwmac" as your compatible string, plat_stmmacenet_data would have both has_gmac and has_gmac4 set. This would lead to stmmac_hw_init calling dwmac1000_setup rather than dwmac4_setup, resulting in a non-functional driver. This happened since the check for has_gmac is done before the check for has_gmac4. However, the order should not matter, so it does not make sense to have both set. If something is valid for both, you should do as the stmmac_interrupt does: if (priv->plat->has_gmac || priv->plat->has_gmac4) ... The places where it was obvious that the author actually meant if (has_gmac || has_gmac4) rather than if (has_gmac) have been updated. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Acked-by: Alexandre TORGUE <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08	net: stmmac: dwmac-generic: add missing compatible strings	Niklas Cassel
	devicetree binding for stmmac states: - compatible: Should be "snps,dwmac-<ip_version>", "snps,dwmac" For backwards compatibility: "st,spear600-gmac" is also supported. Since dwmac-generic.c calls stmmac_probe_config_dt explicitly, another alternative would have been to remove all compatible strings other than "snps,dwmac" and "st,spear600-gmac" from dwmac-generic.c. However, that would probably do more harm than good, since when trying to figure out what hardware a certain driver supports, you usually look at the compatible strings in the struct of_device_id, and not in some function defined in a completely different file. No functional change intended. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08	bindings: net: stmmac: correct note about TSO	Niklas Cassel
	snps,tso was previously placed under AXI BUS Mode parameters, suggesting that the property should be in the stmmac-axi-config node. TSO (TCP Segmentation Offloading) has nothing to do with AXI BUS Mode parameters, and the parser actually expects it to be in the root node, not in the stmmac-axi-config. Also added a note about snps,tso only being available on GMAC4 and newer. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Acked-by: Alexandre TORGUE <alexandre.torgue@st.com> Acked-by: Rob Herring <robh@kernel.org> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08	net: ll_temac: Utilize of_get_mac_address()	Tobias Klauser
	Do not open code getting the MAC address exclusively from the "local-mac-address" property, but instead use of_get_mac_address() which looks up the MAC address using the 3 typical property names. Also avoid casting away the const qualifier of the return value by making temac_init_mac_address() take a const void* address. Follows commit b34296a9c047 ("net: ethoc: Utilize of_get_mac_address()"). Cc: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08	net: axienet: Utilize of_get_mac_address()	Tobias Klauser
	Do not open code getting the MAC address exclusively from the "local-mac-address" property, but instead use of_get_mac_address() which looks up the MAC address using the 3 typical property names. Also avoid casting away the const qualifier of the return value by making axienet_set_mac_address() take a const void* address. Follows commit b34296a9c047 ("net: ethoc: Utilize of_get_mac_address()"). Cc: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08	Merge branch 'cls_flower-flags'	David S. Miller
	Or Gerlitz says: ==================== net/sched: cls_flower: Add support for matching on dissection flags This series adds the UAPI to provide a set of flags for matching, where the flags provided from user-space are mapped to flow-dissector flags. The 1st flag allows matching on whether the packet is an IP fragment and corresponds to the FLOW_DIS_IS_FRAGMENT flag. v2->v3: - replace BIT() with << (kbuild test robot) v1->v2: - dropped the flow dissector patch (#1) as no changes are needed there (Jiri) - applied code review comments from Jiri to the flower patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08	net/mlx5e: Offload TC matching on packets being IP fragments	Or Gerlitz
	Enable offloading of matching on packets being fragments. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08	net/sched: cls_flower: Add support for matching on flags	Or Gerlitz
	Add UAPI to provide a set of flags for matching, where the flags provided from user-space are mapped to flow-dissector flags. The 1st flag allows matching on whether the packet is an IP fragment and corresponds to the FLOW_DIS_IS_FRAGMENT flag. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08	net: mvneta: Indent some statements	Dan Carpenter
	These two statements were not indented correctly so it's sort of confusing. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08	drivers: net: xgene: uninitialized variable in xgene_enet_free_pagepool()	Dan Carpenter
	We never set "slots" in this function. Fixes: a9380b0f7be8 ("drivers: net: xgene: Add support for Jumbo frame") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Iyappan Subramanian <isubramanian@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08	vhost: remove unnecessary smp_mb from vhost_work_queue	Peng Tao
	test_and_set_bit() already implies a memory barrier. Signed-off-by: Peng Tao <bergwolf@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08	vhost-vsock: remove unused vq variable	Peng Tao
	Signed-off-by: Peng Tao <bergwolf@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08	icmp: correct return value of icmp_rcv()	Zhang Shengju
	Currently, icmp_rcv() always returns zero on a packet delivery upcall. To make its behavior more compliant with the way this API should be used, this patch changes it to return NET_RX_SUCCESS when the packet is properly handled, and NET_RX_DROP otherwise. Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08	spi: Add support for Armada 3700 SPI Controller	Romain Perier
	The Marvell Armada 3700 SoC includes an SPI controller. This controller supports up to 4 SPI slave devices with dedicated chip selects, SPI modes 0/1/2 and 3, CPIO or FIFO mode with DMA transfers, and different SPI transfer modes (Single, Dual or Quad). This commit adds basic driver support for FIFO mode. In this mode, dedicated registers are used to store the instruction, the address, the read mode and the data. Write and Read FIFOs are used to store the outgoing or incoming data. The data FIFOs are accessible via DMA or by the CPU. Only the CPU is supported for now. Signed-off-by: Romain Perier <romain.perier@free-electrons.com> Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-12-08	spi: armada-3700: Add documentation for the Armada 3700 SPI Controller	Romain Perier
	This adds the devicetree bindings documentation for the SPI controller present in the Marvell Armada 3700 SoCs. Signed-off-by: Romain Perier <romain.perier@free-electrons.com> Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-12-08	rpmsg: qcom_smd: Correct return value for O_NONBLOCK	Bjorn Andersson
	qcom_smd_send() should return -EAGAIN for non-blocking channels with insufficient space, so that we can propagate this event to user space. Fixes: 53e2822e56c7 ("rpmsg: Introduce Qualcomm SMD backend") Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2016-12-08	can: peak: fix bad memory access and free sequence	추지호
	Fix a bad memory access while disconnecting. netdev is freed before the private data is freed, and dev is accessed after netdev has been freed. This causes a slub problem and raises a kernel oops with the slub debugger config enabled. Signed-off-by: Jiho Chu <jiho.chu@samsung.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2016-12-08	KVM: x86: Handle the kthread worker using the new API	Petr Mladek
	Use the new API to create and destroy the "kvm-pit" kthread worker. The API hides some implementation details. In particular, kthread_create_worker() allocates and initializes struct kthread_worker. It runs the kthread the right way and stores task_struct into the worker structure. kthread_destroy_worker() flushes all pending works, stops the kthread and frees the structure. This patch does not change the existing behavior except for dynamically allocating struct kthread_worker and storing only the pointer of this structure. It is compile tested only because I did not find an easy way to run the code. Well, it should be pretty safe given the nature of the change. Signed-off-by: Petr Mladek <pmladek@suse.com> Message-Id: <1476877847-11217-1-git-send-email-pmladek@suse.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-08	KVM: nVMX: invvpid handling improvements	Jan Dakinevich
	- Expose all invalidation types to the L1 - Reject invvpid instruction, if L1 passed zero vpid value to single context invalidations Signed-off-by: Jan Dakinevich <jan.dakinevich@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-08	KVM: nVMX: check host CR3 on vmentry and vmexit	Ladi Prosek
	This commit adds missing host CR3 checks. Before entering guest mode, the value of CR3 is checked for reserved bits. After returning, nested_vmx_load_cr3 is called to set the new CR3 value and check and load PDPTRs. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: nVMX: introduce nested_vmx_load_cr3 and call it on vmentry	Ladi Prosek
	Loading CR3 as part of emulating vmentry is different from regular CR3 loads, as implemented in kvm_set_cr3, in several ways. * different rules are followed to check CR3 and it is desirable for the caller to distinguish between the possible failures * PDPTRs are not loaded if PAE paging and nested EPT are both enabled * many MMU operations are not necessary This patch introduces nested_vmx_load_cr3 suitable for CR3 loads as part of nested vmentry and vmexit, and makes use of it on the nested vmentry path. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: nVMX: propagate errors from prepare_vmcs02	Ladi Prosek
	It is possible that prepare_vmcs02 fails to load the guest state. This patch adds the proper error handling for such a case. L1 will receive an INVALID_STATE vmexit with the appropriate exit qualification if it happens. A failure to set guest CR3 is the only error propagated from prepare_vmcs02 at the moment. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT	Ladi Prosek
	KVM does not correctly handle L1 hypervisors that emulate L2 real mode with PAE and EPT, such as Hyper-V. In this mode, the L1 hypervisor populates guest PDPTE VMCS fields and leaves guest CR3 uninitialized because it is not used (see 26.3.2.4 Loading Page-Directory-Pointer-Table Entries). KVM always dereferences CR3 and tries to load PDPTEs if PAE is on. This leads to two related issues: 1) On the first nested vmentry, the guest PDPTEs, as populated by L1, are overwritten in ept_load_pdptrs because the registers are believed to have been loaded in load_pdptrs as part of kvm_set_cr3. This is incorrect. L2 is running with PAE enabled but PDPTRs have been set up by L1. 2) When L2 is about to enable paging and loads its CR3, we, again, attempt to load PDPTEs in load_pdptrs called from kvm_set_cr3. There are no guarantees that this will succeed (it's just a CR3 load, paging is not enabled yet) and if it doesn't, kvm_set_cr3 returns early without persisting the CR3 which is then lost and L2 crashes right after it enables paging. This patch replaces the kvm_set_cr3 call with a simple register write if PAE and EPT are both on. CR3 is not to be interpreted in this case. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: nVMX: load GUEST_EFER after GUEST_CR0 during emulated VM-entry	David Matlack
	vmx_set_cr0() modifies GUEST_EFER and "IA-32e mode guest" in the current VMCS. Call vmx_set_efer() after vmx_set_cr0() so that emulated VM-entry is more faithful to VMCS12. This patch correctly causes VM-entry to fail when "IA-32e mode guest" is 1 and GUEST_CR0.PG is 0. Previously this configuration would succeed and "IA-32e mode guest" would silently be disabled by KVM. Signed-off-by: David Matlack <dmatlack@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: nVMX: generate MSR_IA32_CR{0,4}_FIXED1 from guest CPUID	David Matlack
	MSR_IA32_CR{0,4}_FIXED1 define which bits in CR0 and CR4 are allowed to be 1 during VMX operation. Since the set of allowed-1 bits is the same in and out of VMX operation, we can generate these MSRs entirely from the guest's CPUID. This lets userspace avoid having to save/restore these MSRs. This patch also initializes MSR_IA32_CR{0,4}_FIXED1 from the CPU's MSRs by default. This is saner than the current default of -1ull, which includes bits that the host CPU does not support. Signed-off-by: David Matlack <dmatlack@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: nVMX: fix checks on CR{0,4} during virtual VMX operation	David Matlack
	KVM emulates MSR_IA32_VMX_CR{0,4}_FIXED1 with the value -1ULL, meaning all CR0 and CR4 bits are allowed to be 1 during VMX operation. This does not match real hardware, which disallows the high 32 bits of CR0 to be 1, and disallows reserved bits of CR4 to be 1 (including bits which are defined in the SDM but missing according to CPUID). A guest can induce a VM-entry failure by setting these bits in GUEST_CR0 and GUEST_CR4, despite MSR_IA32_VMX_CR{0,4}_FIXED1 indicating they are valid. Since KVM has allowed all bits to be 1 in CR0 and CR4, the existing checks on these registers do not verify must-be-0 bits. Fix these checks to identify must-be-0 bits according to MSR_IA32_VMX_CR{0,4}_FIXED1. This patch should introduce no change in behavior in KVM, since these MSRs are still -1ULL. Signed-off-by: David Matlack <dmatlack@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
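The must-be-1 and must-be-0 check described above can be written as a single comparison. The sketch below is a model under stated assumptions, not KVM's code: the function name and the small example masks are hypothetical, `fixed0` holds the bits that must be 1, `fixed1` holds the bits that are allowed to be 1, and `fixed0` is assumed to be a subset of `fixed1`.

```python
ALL_ONES = 0xFFFFFFFFFFFFFFFF  # models the -1ULL value mentioned in the message


def fixed_bits_valid(val: int, fixed0: int, fixed1: int) -> bool:
    """True if val sets every fixed0 bit and no bit outside fixed1."""
    return ((val & fixed1) | fixed0) == val


# Hypothetical masks: bit 0 must be 1, only bits 0-3 may be 1.
fixed0, fixed1 = 0x1, 0xF
print(fixed_bits_valid(0x3, fixed0, fixed1))     # must-be-1 set, nothing reserved
print(fixed_bits_valid(0x2, fixed0, fixed1))     # must-be-1 bit is missing
print(fixed_bits_valid(0x13, fixed0, fixed1))    # bit 4 is a must-be-0 bit
print(fixed_bits_valid(0x13, fixed0, ALL_ONES))  # -1ULL forbids nothing
```

The last call shows why the commit expects no behaviour change while FIXED1 is still -1ULL: with every bit allowed to be 1, the must-be-0 half of the check can never fail.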
2016-12-08	KVM: nVMX: support restore of VMX capability MSRs	David Matlack
	The VMX capability MSRs advertise the set of features the KVM virtual CPU can support. This set of features varies across different host CPUs and KVM versions. This patch aims to address both sources of differences, allowing VMs to be migrated across CPUs and KVM versions without guest-visible changes to these MSRs. Note that cross-KVM-version migration is only supported from this point forward. When the VMX capability MSRs are restored, they are audited to check that the set of features advertised is a subset of what KVM and the CPU support. Since the VMX capability MSRs are read-only, they do not need to be on the default MSR save/restore lists. The userspace hypervisor can set the values of these MSRs or read them from KVM at VCPU creation time, and restore the same value after every save/restore. Signed-off-by: David Matlack <dmatlack@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: nVMX: generate non-true VMX MSRs based on true versions	David Matlack
	The "non-true" VMX capability MSRs can be generated from their "true" counterparts, by OR-ing the default1 bits. The default1 bits are fixed and defined in the SDM. Since we can generate the non-true VMX MSRs from the true versions, there's no need to store both in struct nested_vmx. This also lets userspace avoid having to restore the non-true MSRs. Note this does not preclude emulating MSR_IA32_VMX_BASIC[55]=0. To do so, we simply need to set all the default1 bits in the true MSRs (such that the true MSRs and the generated non-true MSRs are equal). Signed-off-by: David Matlack <dmatlack@google.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
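The derivation described above is a single OR. The sketch below models only that step on the part of a control MSR that reports must-be-1 settings. The function name is hypothetical, and the mask 0x16 (used here as the default1 bits of the pin-based controls) is an assumption taken from memory of the SDM, so it should be checked against the manual before being relied on.

```python
PIN_BASED_DEFAULT1 = 0x16  # assumed default1 mask; verify against the SDM


def non_true_from_true(true_bits: int, default1: int) -> int:
    """Derive the 'non-true' must-be-1 bits by OR-ing in the default1 bits."""
    return true_bits | default1


# A true MSR that lets every default1 bit be cleared...
print(hex(non_true_from_true(0x0, PIN_BASED_DEFAULT1)))
# ...and one that already sets them all, so true and non-true are equal.
print(hex(non_true_from_true(0x16, PIN_BASED_DEFAULT1)))
```

The second case corresponds to the note about MSR_IA32_VMX_BASIC[55]=0: once all default1 bits are set in the true MSRs, the generated non-true values are identical to them.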
2016-12-08	KVM: x86: Do not clear RFLAGS.TF when a singlestep trap occurs.	Kyle Huey
	The trap flag stays set until software clears it. Signed-off-by: Kyle Huey <khuey@kylehuey.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: x86: Add kvm_skip_emulated_instruction and use it.	Kyle Huey
	kvm_skip_emulated_instruction calls both kvm_x86_ops->skip_emulated_instruction and kvm_vcpu_check_singlestep, skipping the emulated instruction and generating a trap if necessary. Replacing skip_emulated_instruction calls with kvm_skip_emulated_instruction is straightforward, except for: - ICEBP, which is already inside a trap, so avoid triggering another trap. - Instructions that can trigger exits to userspace, such as the IO insns, MOVs to CR8, and HALT. If kvm_skip_emulated_instruction does trigger a KVM_GUESTDBG_SINGLESTEP exit, and the handling code for IN/OUT/MOV CR8/HALT also triggers an exit to userspace, the latter will take precedence. The singlestep will be triggered again on the next instruction, which is the current behavior. - Task switch instructions which would require additional handling (e.g. the task switch bit) and are instead left alone. - Cases where VMLAUNCH/VMRESUME do not proceed to the next instruction, which do not trigger singlestep traps as mentioned previously. Signed-off-by: Kyle Huey <khuey@kylehuey.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: VMX: Move skip_emulated_instruction out of nested_vmx_check_vmcs12	Kyle Huey
	We can't return both the pass/fail boolean for the vmcs and the upcoming continue/exit-to-userspace boolean for skip_emulated_instruction out of nested_vmx_check_vmcs, so move skip_emulated_instruction out of it instead. Additionally, VMENTER/VMRESUME only trigger singlestep exceptions when they advance the IP to the following instruction, not when they a) succeed, b) fail MSR validation or c) throw an exception. Add a separate call to skip_emulated_instruction that will later not be converted to the variant that checks the singlestep flag. Signed-off-by: Kyle Huey <khuey@kylehuey.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: VMX: Reorder some skip_emulated_instruction calls	Kyle Huey
	The functions being moved ahead of skip_emulated_instruction here don't need updated IPs, and skipping the emulated instruction at the end will make it easier to return its value. Signed-off-by: Kyle Huey <khuey@kylehuey.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: x86: Add a return value to kvm_emulate_cpuid	Kyle Huey
	Once skipping the emulated instruction can potentially trigger an exit to userspace (via KVM_GUESTDBG_SINGLESTEP) kvm_emulate_cpuid will need to propagate a return value. Signed-off-by: Kyle Huey <khuey@kylehuey.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>