Age | Commit message | Author
2010-08-01 | ahci_platform: Remove unneeded ahci_driver.probe assignment | Anton Vorontsov
The driver is using platform_driver_probe() during initialization, so ahci_driver.probe hook is never used. But it causes the following (harmless, luckily) section mismatch: WARNING: vmlinux.o(.data+0x2fb20): Section mismatch in reference from the variable ahci_driver to the function .init.text:ahci_probe() This patch removes the ahci_driver.probe assignment, thus fixes the warning. p.s. Note that there's another patch[1] from Rene Bolldorf that tried to solve the same issue by __refdata annotation. __refdata says that this reference is actually OK, but in fact it is not OK, because dereferencing .probe() will cause problems. So the proper fix is to remove the assignment. [1] http://kerneltrap.org/mailarchive/linux-kernel/2010/3/18/4549547 Signed-off-by: Anton Vorontsov <avorontsov@mvista.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2010-08-01 | ahci_platform: Provide for vendor specific init | Jassi Brar
Some AHCI implementations may use Vendor Specific HBA[A0h, FFh] and/or Port[70h, 7Fh] registers to 'prepare' for initialization. For that, the platform needs memory mapped address of AHCI registers. This patch adds the 'mmio' argument and reorders the call to platform init function. Signed-off-by: Jassi Brar <jassi.brar@samsung.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2010-08-02 | x86,mmiotrace: Add support for tracing STOS instruction | Marcin Slusarz
Add support for stos access tracing with mmiotrace. Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Acked-by: Pekka Paalanen <pq@iki.fi> Cc: Nouveau <nouveau@lists.freedesktop.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20100731205101.GA5860@joi.lan> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-08-02 | perf, sched migration: Librarize task states and event headers helpers | Frederic Weisbecker
Librarize the task state and event headers helpers as they can be generally useful. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Nikhil Rao <ncrao@google.com> Cc: Tom Zanussi <tzanussi@gmail.com>
2010-08-02 | perf, sched migration: Librarize the GUI class | Frederic Weisbecker
Export the GUI facility in the common library path. It is going to be useful for other scheduler views. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Nikhil Rao <ncrao@google.com> Cc: Tom Zanussi <tzanussi@gmail.com>
2010-08-02 | perf, sched migration: Make the GUI class client agnostic | Frederic Weisbecker
Make the perf migration GUI generic so that it can be reused for other kinds of trace painting. No more notion of CPUs or runqueue from the GUI class, it's now used as a library by the trace parser. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Nikhil Rao <ncrao@google.com> Cc: Tom Zanussi <tzanussi@gmail.com>
2010-08-02 | perf, sched migration: Make it vertically scrollable | Frederic Weisbecker
With scheduler traces covering more than two CPUs, the rectangles for CPU 3 and above are not visible. This makes the vertical navigation scrollable so that all of the CPU rectangles are reachable. We also want to be able to zoom vertically, so that the CPU rectangles fit the screen as well as possible, but that's for later. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Nikhil Rao <ncrao@google.com> Cc: Tom Zanussi <tzanussi@gmail.com>
2010-08-02 | perf, sched migration: Parameterize cpu height and spacing | Nikhil Rao
Without vertical zoom, it is not possible to see all CPUs in a trace taken on a larger machine. This patch parameterizes the height and spacing of CPUs so that you can fit more cpus into the screen. Ideally we should dynamically size/space the CPU rectangles with some minimum threshold. Until then, this patch is a stop-gap. Signed-off-by: Nikhil Rao <ncrao@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
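The effect of parameterizing height and spacing can be illustrated with a small sketch. This is not the code from the patch: the function names, the default pixel values and the `y_offset` margin are all assumptions chosen for illustration.

```python
def cpu_rect_y(cpu, rect_height=35, rect_space=20, y_offset=100):
    """Return the (top, bottom) pixel rows of one CPU's rectangle.

    All defaults are hypothetical values, not taken from the patch.
    """
    top = y_offset + cpu * (rect_height + rect_space)
    return top, top + rect_height


def cpus_that_fit(screen_height, rect_height=35, rect_space=20, y_offset=100):
    """Count how many whole CPU rectangles fit without scrolling."""
    usable = screen_height - y_offset
    if usable < rect_height:
        return 0
    # The first rectangle needs only its height; each further one
    # needs a spacing gap plus its height.
    return 1 + (usable - rect_height) // (rect_height + rect_space)
```

Shrinking both parameters is what lets more CPUs share the same screen: with these assumed numbers, a 600-pixel window holds 9 rectangles at the defaults and 33 with a height of 10 and a spacing of 5.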
2010-08-02 | perf, sched migration: Fix key bindings | Nikhil Rao
EVT_KEY_DOWN and EVT_LEFT_DOWN events are not bound to the RootFrame event handler. As a result, zoom/scroll via keyboard events do not work. This patch adds the missing bindings. Signed-off-by: Nikhil Rao <ncrao@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-08-02 | perf, sched migration: Ignore unhandled task states | Frederic Weisbecker
Stop printing an error message when we don't have the letter for a given task state. All we need to know is if the task is in the TASK_RUNNING state. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Nikhil Rao <ncrao@google.com> Cc: Tom Zanussi <tzanussi@gmail.com>
2010-08-02 | perf, sched migration: Handle ignored migrate out events | Frederic Weisbecker
Migrate out events may happen on tasks that are not in the runqueue; for example, this is the case for tasks that are sleeping. In this case, we don't want to log the migrate out event in the source runqueue, because the task is not actually in the runqueue and we have already logged its sleep event. This fixes timeslices that spuriously propagate a sleep event from the previous timeslice. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Nikhil Rao <ncrao@google.com> Cc: Tom Zanussi <tzanussi@gmail.com>
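A minimal sketch of the rule described above, assuming a runqueue snapshot modelled as an immutable tuple of PIDs. The `RunQueue` class and its method names are hypothetical and do not reproduce the script's actual classes.

```python
class RunQueue:
    """Immutable snapshot of the tasks on one CPU's runqueue (illustrative)."""

    def __init__(self, tasks=()):
        self.tasks = tuple(tasks)

    def sleep(self, pid):
        # A task going to sleep leaves the runqueue; this event is logged.
        return RunQueue(t for t in self.tasks if t != pid)

    def migrate_out(self, pid):
        # A sleeping task is no longer queued here, so the migrate out
        # event is ignored: the same snapshot is returned and no new
        # event is recorded for this runqueue.
        if pid not in self.tasks:
            return self
        return RunQueue(t for t in self.tasks if t != pid)
```

Returning the unchanged snapshot is the point of the sketch: the caller can tell that nothing happened and skip creating a timeslice for the source CPU.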
2010-08-02 | perf: New migration tool overview | Frederic Weisbecker
This brings a GUI tool that displays an overview of the proportion of task load on each CPU. The CPUs' forward progress is cut into timeslices. A new timeslice is created for every runqueue event: a task gets pushed out of or pulled into the runqueue. For each timeslice, every CPU rectangle is colored with a shade of red that describes the local load against the total load. The more red the rectangle, the higher the given CPU's load. This load is the number of tasks running on the CPU, without any distinction between the scheduler policies of the tasks, for now. Also for each timeslice, the event origin is depicted on the CPU that triggered it using a thin colored line on top of the rectangle timeslice. These events are:
* sleep: a task went to sleep and has then been pulled out of the runqueue. The origin color in the thin line is dark blue.
* wake up: a task woke up and has then been pushed into the runqueue. The origin color is yellow.
* wake up new: a new task woke up and has then been pushed into the runqueue. The origin color is green.
* migrate in: a task migrated into the runqueue due to a load balancing operation. The origin color is violet.
* migrate out: reverse of the previous one. Migrate in events usually have paired migrate out events in another runqueue. The origin color is light blue.
Clicking on a timeslice provides the runqueue event details and the runqueue state. The CPU rectangles can be navigated using the usual arrow controls. Horizontal zooming in/out is possible with the "+" and "-" buttons.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Venkatesh Pallipadi <venki@google.com> Cc: Pierre Tardy <tardyp@gmail.com> Cc: Nikhil Rao <ncrao@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com>
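The coloring rule can be sketched as follows. This is an illustration under assumptions, not the tool's code: the linear white-to-red ramp, the function name and the `EVENT_COLORS` table are hypothetical; only the event names and color names come from the description above.

```python
# Event origin colors as named in the commit message (names only; the
# exact RGB values used by the tool are not stated there).
EVENT_COLORS = {
    "sleep": "dark blue",
    "wake up": "yellow",
    "wake up new": "green",
    "migrate in": "violet",
    "migrate out": "light blue",
}


def red_shade(local_load, total_load):
    """Map a CPU's share of the total load to an (R, G, B) triple.

    Assumed ramp: white when the CPU runs no task, pure red when it
    runs every task counted in total_load.
    """
    ratio = local_load / total_load if total_load else 0.0
    level = int(0xFF - ratio * 0xFF)  # green and blue fade as load rises
    return (0xFF, level, level)
```

For instance, a CPU running 1 of 4 runnable tasks gets a light pink, while a CPU running all of them gets full red.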
2010-08-02 | tracing: Drop cpparg() macro | Frederic Weisbecker
Drop the cpparg() macro that wraps CPP parameters. We already have the PARAM() macro for that, no need to have several versions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Li Zefan <lizf@cn.fujitsu.com>
2010-08-02 | perf: Use tracepoint_synchronize_unregister() to flush any pending tracepoint call | Frederic Weisbecker
We use synchronize_sched() to ensure a tracepoint won't be called while/after we release the perf buffers it references. But the tracepoint API has its own helper for that: tracepoint_synchronize_unregister(). Use it instead as it's self-explanatory and eases maintenance. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Li Zefan <lizf@cn.fujitsu.com>
2010-08-01 | powerpc/5200/i2c: improve i2c bus error recovery | Albrecht Dreß
This patch improves the recovery of the MPC's I2C bus from errors like bus hangs resulting in timeouts:
1. make the bus timeout configurable, as it depends on the bus clock and the attached slave chip(s); default is still 1 second;
2. detect any of the cases indicated by the CF, BB and RXAK MSR flags if a timeout occurs, and add a missing (required) MAL reset;
3. use a more reliable method to fixup the bus if a hang has been detected. The sequence is sent 9 times which seems to be necessary if a slave "misses" more than one clock cycle. For 400 kHz bus speed, the fixup is also ~70us (81us vs. 150us) faster.
Tested on a custom Lite5200b derived board, with a Dallas RTC, AD sensors and NXP IO expander chips attached to the i2c.
Changes vs. v1:
- use improved bus fixup sequence for all chips (not only the 5200)
- calculate real clock from defaults if no clock is given in the device tree
- better description (I hope) of the changes.
I didn't split the changes in this file into three parts as recommended by Grant, as they actually belong together (i.e. they address one single problem, just in three places of one single source file).
Signed-off-by: Albrecht Dreß <albrecht.dress@arcor.de> [grant.likely@secretlab.ca: fixup for ->node to ->dev.of_node transition] Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-08-01 | of/xilinxfb: update tft compatible versions | Adrian Alonso
* Add tft display module compatibility for new hardware modules Signed-off-by: Adrian Alonso <aalonso00@gmail.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-08-01 | powerpc/fsl-diu-fb: Support setting display mode using EDID | Anatolij Gustschin
Adds support for encoding display mode information in the device tree using a verbatim EDID block. If the EDID entry in the DIU node is present, the driver will build a mode database from the EDID data and allow setting the display modes from this database. Otherwise the display mode will be set using mode entries from the driver's internal database as usual. This patch also updates the device tree bindings. Signed-off-by: Anatolij Gustschin <agust@denx.de> Acked-by: Timur Tabi <timur@freescale.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-08-01 | powerpc/5121: doc/dts-bindings: update doc of FSL DIU bindings | Anatolij Gustschin
Update compatible and interrupt properties description. Furthermore an example for the MPC5121 has been added. Signed-off-by: Anatolij Gustschin <agust@denx.de> Acked-by: Timur Tabi <timur@freescale.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-08-01 | powerpc/5121: shared DIU framebuffer support | Anatolij Gustschin
MPC5121 DIU configuration/setup as initialized by the boot loader currently will get lost while booting Linux. As a result displaying the boot splash is not possible through the boot process. To prevent this we reserve configured DIU frame buffer address range while booting and preserve AOI descriptor and gamma table so that DIU continues displaying through the whole boot process. On first open from user space DIU frame buffer driver releases the reserved frame buffer area and continues to operate as usual. Signed-off-by: John Rigby <jcrigby@gmail.com> Signed-off-by: Anatolij Gustschin <agust@denx.de> Acked-by: Timur Tabi <timur@freescale.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-08-01 | powerpc/5121: move fsl-diu-fb.h to include/linux | Anatolij Gustschin
Some DIU structures will be used in platform code in subsequent MPC5121 DIU patch, so we move this header to be able to include it elsewhere. Signed-off-by: Anatolij Gustschin <agust@denx.de> Acked-by: Timur Tabi <timur@freescale.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-08-01 | powerpc/5121: fsl-diu-fb: fix issue with re-enabling DIU area descriptor | Anatolij Gustschin
On MPC5121e Rev 2.0 re-configuring the DIU area descriptor by writing new descriptor address doesn't always work. As a result, DIU continues to display using old area descriptor even if the new one has been written to the descriptor register of the plane. Add the code from Freescale MPC5121EADS BSP for writing descriptor addresses properly. This fixes the problem for Rev 2.0 silicon. Signed-off-by: Anatolij Gustschin <agust@denx.de> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-08-01 | powerpc/512x: add clock structure for Video-IN (VIU) unit | Anatolij Gustschin
Allows using clk_get()/clk_enable()/clk_disable() for VIU clock in the v4l2 video driver. Signed-off-by: Hongjun Chen <hong-jun.chen@freescale.com> Signed-off-by: Anatolij Gustschin <agust@denx.de> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-08-01 | powerpc/5121: add initial support for PDM360NG board | Anatolij Gustschin
Adds IFM PDM360NG device tree and platform code. Currently the following is supported:
- Spansion S29GL512P 256 MB NOR flash
- ST Micro NAND 1 GiB flash
- DIU, please use "fbcon=map:5 video=fslfb:800x480-32@60" at the kernel command line to enable PrimeView PM070WL3 Display support.
- FEC
- I2C
- RTC, EEPROM
- MSCAN
- PSC UART, please pass "console=tty0 console=ttyPSC5,115200" on the kernel command line.
- SPI, ADS7845 Touchscreen
- USB0/1 Host
- USB0 OTG Host/Device
- VIU, Overlay/Capture support
Signed-off-by: Markus Fischer <markus.fischer.ec@ifm.com> Signed-off-by: Wolfgang Grandegger <wg@denx.de> Signed-off-by: Michael Weiss <michael.weiss@ifm.com> Signed-off-by: Detlev Zundel <dzu@denx.de> Signed-off-by: Anatolij Gustschin <agust@denx.de> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-08-01 | powerpc/512x: Group mpc512x board's selection menu | Anatolij Gustschin
Allow board selection in a drop-down board sub-menu like many other platforms do.
Before the patch:
  ...
  [ ] Freescale MPC5121E ADS
  [ ] Generic support for simple MPC5121 based boards
  [ ] 52xx-based boards
  ...
Patched:
  ...
  [*] 512x-based boards
  [ ]   Freescale MPC5121E ADS
  [ ]   Generic support for simple MPC5121 based boards
  [ ] 52xx-based boards
  ...
This is a cleanup before adding new board selection entry.
Signed-off-by: Anatolij Gustschin <agust@denx.de> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-08-01 | x86-32, asm: Directly access per-cpu GDT | Brian Gerst
Use a direct per-cpu reference for the GDT instead of using a scratch register. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1280594903-6341-2-git-send-email-brgerst@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-08-01 | x86-64, asm: Directly access per-cpu IST | Brian Gerst
Use a direct per-cpu reference for the IST instead of using a scratch register. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1280594903-6341-1-git-send-email-brgerst@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-08-01 | Linux 2.6.35 [v2.6.35] | Linus Torvalds
2010-08-01 | NFS: Fix a typo in include/linux/nfs_fs.h | Trond Myklebust
nfs_commit_inode() needs to be defined irrespectively of whether or not we are supporting NFSv3 and NFSv4. Allow the compiler to optimise away code in the NFSv2-only case by converting it into an inlined stub function. Reported-and-tested-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-01 | ext4: force block allocation on quota_off | Dmitry Monakhov
Perform full sync procedure so that any delayed allocation blocks are allocated so quota will be consistent. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-08-01 | ext4: fix freeze deadlock under IO | Eric Sandeen
Commit 6b0310fbf087ad6 caused a regression resulting in deadlocks when freezing a filesystem which had active IO; the vfs_check_frozen level (SB_FREEZE_WRITE) did not let the freeze-related IO syncing through. Duh. Changing the test to FREEZE_TRANS should let the normal freeze syncing get through the fs, but still block any transactions from starting once the fs is completely frozen. I tested this by running fsstress in the background while periodically snapshotting the fs and running fsck on the result. I ran into occasional deadlocks, but different ones. I think this is a fine fix for the problem at hand, and the other deadlocky things will need more investigation. Reported-by: Phillip Susi <psusi@cfl.rr.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-08-01 | workqueue: mark init_workqueues() as early_initcall() | Suresh Siddha
Mark init_workqueues() as early_initcall() and thus it will be initialized before smp bringup. init_workqueues() registers for the hotcpu notifier and thus it should cope with the processors that are brought online after the workqueues are initialized. x86 smp bringup code uses workqueues and uses a workaround for the cold boot process (as the workqueues are initialized post smp_init()). Marking init_workqueues() as early_initcall() will pave the way for cleaning up this code. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org>
2010-08-01 | workqueue: explain for_each_*cwq_cpu() iterators | Tejun Heo
The for_each_*cwq_cpu() iterators are similar to regular CPU iterators, except that they also consider the pseudo CPU number used for unbound workqueues. Explain them. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org>
2010-08-01 | KVM: Remove unnecessary divide operations | Joerg Roedel
This patch converts unnecessary divide and modulo operations in the KVM large page related code into logical operations. This allows converting gfn_t to u64 without breaking 32-bit builds. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
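The general technique is the usual power-of-two identity: dividing by 2^k equals a right shift by k, and the remainder equals a mask with 2^k - 1. This matters on 32-bit kernel builds because a plain 64-bit division there would need a compiler helper routine that the kernel does not link against. The sketch below only demonstrates the identity; the function names are illustrative and are not the ones used in the KVM code.

```python
def is_power_of_two(n):
    """True when n has exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


def div_pow2(x, n):
    """x // n for a power-of-two n, using a shift instead of a divide."""
    assert is_power_of_two(n)
    return x >> (n.bit_length() - 1)


def mod_pow2(x, n):
    """x % n for a power-of-two n, using a mask instead of a modulo."""
    assert is_power_of_two(n)
    return x & (n - 1)
```

The identity holds only for non-negative values and power-of-two divisors, which is why both helpers assert on `n`.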
2010-08-01 | KVM: Fix IOMMU memslot reference warning | Sheng Yang
This patch fixes the following warning.
===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
include/linux/kvm_host.h:259 invoked rcu_dereference_check() without protection!
other info that might help us debug this:
rcu_scheduler_active = 1, debug_locks = 0
no locks held by qemu-system-x86/29679.
stack backtrace:
Pid: 29679, comm: qemu-system-x86 Not tainted 2.6.35-rc3+ #200
Call Trace:
 [<ffffffff810a224e>] lockdep_rcu_dereference+0xa8/0xb1
 [<ffffffffa018a06f>] kvm_iommu_unmap_memslots+0xc9/0xde [kvm]
 [<ffffffffa018a0c4>] kvm_iommu_unmap_guest+0x40/0x4e [kvm]
 [<ffffffffa018f772>] kvm_arch_destroy_vm+0x1a/0x186 [kvm]
 [<ffffffffa01800d0>] kvm_put_kvm+0x110/0x167 [kvm]
 [<ffffffffa0180ecc>] kvm_vcpu_release+0x18/0x1c [kvm]
 [<ffffffff81156f5d>] fput+0x22a/0x3a0
 [<ffffffff81152288>] filp_close+0xb4/0xcd
 [<ffffffff8106599f>] put_files_struct+0x1b7/0x36b
 [<ffffffff81065830>] ? put_files_struct+0x48/0x36b
 [<ffffffff8131ee59>] ? do_raw_spin_unlock+0x118/0x160
 [<ffffffff81065bc0>] exit_files+0x6d/0x75
 [<ffffffff81068348>] do_exit+0x47d/0xc60
 [<ffffffff8177e7b5>] ? _raw_spin_unlock_irq+0x30/0x36
 [<ffffffff81068bfa>] do_group_exit+0xcf/0x134
 [<ffffffff81080790>] get_signal_to_deliver+0x732/0x81d
 [<ffffffff81095996>] ? cpu_clock+0x4e/0x60
 [<ffffffff81002082>] do_notify_resume+0x117/0xc43
 [<ffffffff810a2fa3>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff81080d79>] ? sys_rt_sigtimedwait+0x2b5/0x3bf
 [<ffffffff8177d9f2>] ? trace_hardirqs_off_thunk+0x3a/0x3c
 [<ffffffff81003221>] ? sysret_signal+0x5/0x3d
 [<ffffffff8100343b>] int_signal+0x12/0x17
Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 | KVM: PPC: Make use of hash based Shadow MMU | Alexander Graf
We just introduced generic functions to handle shadow pages on PPC. This patch makes the respective backends make use of them, getting rid of a lot of duplicate code along the way. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 | KVM: PPC: Add generic hpte management functions | Alexander Graf
Currently the shadow paging code keeps an array of entries it knows about. Whenever the guest invalidates an entry, we loop through that array, trying to invalidate matching parts. While this is a really simple implementation, it is probably the most inefficient one possible. So instead, let's keep an array of lists around that are indexed by a hash. This way each PTE can be added by 4 list_add, removed by 4 list_del invocations and the search only needs to loop through entries that share the same hash. This patch implements said lookup and exports generic functions that both the 32-bit and 64-bit backend can use. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
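The idea of an array of lists indexed by a hash can be sketched as below. This is a simplified model under assumptions: it keeps a single index keyed by page number, whereas the patch describes four lists per PTE, and the class name, the toy masking hash and the tuple layout are all hypothetical.

```python
class HashedPteTable:
    """Array of buckets indexed by a hash of the page number (illustrative)."""

    def __init__(self, bits=4):
        self.mask = (1 << bits) - 1
        self.buckets = [[] for _ in range(1 << bits)]

    def _bucket(self, page):
        # Toy hash: the low bits of the page number.
        return self.buckets[page & self.mask]

    def add(self, page, pte):
        self._bucket(page).append((page, pte))

    def invalidate(self, page):
        """Drop every entry for `page`; return (entries scanned, entries removed)."""
        bucket = self._bucket(page)
        scanned = len(bucket)
        bucket[:] = [entry for entry in bucket if entry[0] != page]
        return scanned, scanned - len(bucket)
```

With 64 entries spread over 16 buckets, an invalidation scans 4 entries instead of all 64, which is the saving the commit message describes.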
2010-08-01 | KVM: MMU: cleanup FNAME(fetch)() functions | Xiao Guangrong
Clean up this function, since we already get the direct sp's access. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 | KVM: MMU: fix direct sp's access corrupted | Xiao Guangrong
If the mapping is writable but the dirty flag is not set, we find the read-only direct sp and set up the mapping. If a write #PF then occurs, we mark this mapping writable in the read-only direct sp, and now other genuinely read-only mappings will happily write to it without a #PF. This may break the guest's COW. Fixed by re-installing the mapping when a write #PF occurs. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 | KVM: MMU: fix conflict access permissions in direct sp | Xiao Guangrong
In no-direct mapping, we mark an sp as 'direct' when we map the guest's large page, but its access is encoded from the upper page-structure entries only and does not include the last mapping, which causes an access conflict. For example, take this mapping:
        [W]
      / PDE1 -> |---|
  P[W]          |   | LPA
      \ PDE2 -> |---|
        [R]
P has two children, PDE1 and PDE2, and both PDE1 and PDE2 map the same large page (LPA). P's access is WR, PDE1's access is WR, and PDE2's access is RO (just consider read-write permissions here). When the guest accesses PDE1, we create a direct sp for LPA; the sp's access comes from P, which is W, so we mark the ptes W in this sp. Then, when the guest accesses PDE2, we find LPA's shadow page, which is the same one as PDE1's, and mark the ptes RO. So, if the guest accesses PDE1 again, an incorrect #PF occurs. Fixed by encoding the last mapping's access into the direct shadow page. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 | KVM: MMU: fix writable sync sp mapping | Xiao Guangrong
When we sync many unsync sps at one time (in mmu_sync_children()), we may map an spte writable. This is dangerous if one unsync sp's mapped gfn is another unsync page's gfn. For example:
SP1.pte[0] = P
SP2.gfn's pfn = P
[SP1.pte[0] = SP2.gfn's pfn]
First, we write-protect SP1 and SP2, but SP1 and SP2 are still unsync sps. Then we sync SP1 first: it detects that SP1.pte[0].gfn has only one unsync sp, namely SP2, so it maps it writable. But we plan to sync SP2 soon, and at this point SP2->unsync is not reliable, since we later sync SP2 while SP2->gfn is already writable. So the final result is: SP2 is a sync page but SP2.gfn is writable. This bug will corrupt the guest's page table. Fixed by making the mapping read-only if the mapped gfn has shadow pages. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 | KVM: VMX: Execute WBINVD to keep data consistency with assigned devices | Sheng Yang
Some guest device drivers may leverage "Non-Snoop" I/O and explicitly WBINVD or CLFLUSH a RAM space. Since migration may occur before WBINVD or CLFLUSH, we need to maintain data consistency either by:
1: flushing the cache (wbinvd) when the guest is scheduled out if there is no wbinvd exit, or
2: executing wbinvd on all dirty physical CPUs when a guest wbinvd exits.
Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
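The two strategies can be modelled with a toy bookkeeping class. This is only a sketch of the policy as described above, not the KVM implementation: the `Vcpu` class, its method names and the `flushed` log standing in for a real cache flush are all assumptions.

```python
class Vcpu:
    """Toy model of when cache flushes happen for one virtual CPU."""

    def __init__(self, has_wbinvd_exit):
        self.has_wbinvd_exit = has_wbinvd_exit
        self.dirty_cpus = set()   # physical CPUs whose caches may hold guest data
        self.flushed = []         # log of physical CPUs flushed, in order

    def _flush(self, cpu):
        self.flushed.append(cpu)  # stands in for executing wbinvd on `cpu`

    def run_on(self, cpu):
        # Strategy 2 needs to remember every CPU the guest has run on.
        if self.has_wbinvd_exit:
            self.dirty_cpus.add(cpu)

    def sched_out(self, cpu):
        # Strategy 1: no exit is available, so flush eagerly on every
        # schedule-out, before the guest can migrate elsewhere.
        if not self.has_wbinvd_exit:
            self._flush(cpu)

    def guest_wbinvd(self):
        # Strategy 2: the guest's wbinvd exited, so flush every CPU
        # that may still hold dirty lines, then start over.
        for cpu in sorted(self.dirty_cpus):
            self._flush(cpu)
        self.dirty_cpus.clear()
```

The trade-off shown here is that strategy 1 pays a flush on each schedule-out, while strategy 2 defers the cost until the guest actually asks for a flush.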
2010-08-01 | KVM: Document KVM specific review items | Avi Kivity
Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 | KVM: Simplify vcpu_enter_guest() mmu reload logic slightly | Avi Kivity
No need to reload the mmu in between two different vcpu->requests checks. kvm_mmu_reload() may trigger KVM_REQ_TRIPLE_FAULT, but that will be caught during atomic guest entry later. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 | KVM: Search the LAPIC's for one that will accept a PIC interrupt | Chris Lalancette
Older versions of 32-bit linux have a "Checking 'hlt' instruction" test where they repeatedly call the 'hlt' instruction, and then expect a timer interrupt to kick the CPU out of halt. This happens before any LAPIC or IOAPIC setup happens, which means that all of the APIC's are in virtual wire mode at this point. Unfortunately, the current implementation of virtual wire mode is hardcoded to only kick the BSP, so if a crash+kexec occurs on a different vcpu, it will never get kicked. This patch makes pic_unlock() do the equivalent of kvm_irq_delivery_to_apic() for the IOAPIC code. That is, it runs through all of the vcpus looking for one that is in virtual wire mode. In the normal case where LAPICs and IOAPICs are configured, this won't be used at all. In the bootstrap phase of a modern OS, before the LAPICs and IOAPICs are configured, this will have exactly the same behavior as today; VCPU0 is always looked at first, so it will always get out of the loop after the first iteration. This will only go through the loop more than once during a kexec/kdump, in which case it will only do it a few times until the kexec'ed kernel programs the LAPIC and IOAPIC. Signed-off-by: Chris Lalancette <clalance@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
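The search described above amounts to a first-match scan over the vcpus in order. The sketch below assumes each vcpu is represented by a dictionary with a hypothetical `accepts_pic_intr` flag; it illustrates the loop shape only and is not the pic_unlock() code.

```python
def pick_vcpu_for_pic(vcpus):
    """Return the id of the first vcpu willing to take a PIC interrupt.

    `vcpus` is an ordered list of dicts with hypothetical keys
    "id" and "accepts_pic_intr". Returns None when no vcpu accepts.
    """
    for vcpu in vcpus:
        if vcpu["accepts_pic_intr"]:
            return vcpu["id"]
    return None
```

During a normal boot vcpu 0 is checked first and accepts, so the loop ends after one iteration; only in the kexec/kdump case, where a different vcpu is the one still in virtual wire mode, does the scan go further.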
2010-08-01 | KVM: ia64: cleanup kvm_ia64_sync_dirty_log() | Takuya Yoshikawa
kvm_ia64_sync_dirty_log() is a helper function for kvm_vm_ioctl_get_dirty_log() which copies ia64's arch specific dirty bitmap to general one in memslot. So doing sanity checks in this function is unnatural. We move these checks outside of this and change the prototype appropriately. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 | KVM: ia64: fix dirty_log_lock spin_lock section not to include get_dirty_log() | Takuya Yoshikawa
kvm_get_dirty_log() calls copy_to_user(). So we need to narrow the dirty_log_lock spin_lock section not to include this. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>
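The reason for narrowing the section is that copy_to_user() may sleep on a page fault, and sleeping is not allowed while holding a spinlock. The general pattern is to snapshot the shared state under the lock and do the slow copy after releasing it. The sketch below shows that pattern in user-space Python with a `threading.Lock`; the `DirtyLog` class and the `copy_out` callback are hypothetical stand-ins, not the ia64 KVM code.

```python
import threading


class DirtyLog:
    """Dirty bitmap guarded by a lock, with the slow copy done outside it."""

    def __init__(self, nbits):
        self.lock = threading.Lock()
        self.bitmap = bytearray(nbits)

    def mark_dirty(self, index):
        with self.lock:
            self.bitmap[index] = 1

    def get_dirty_log(self, copy_out):
        # Short critical section: take a private snapshot and reset.
        with self.lock:
            snapshot = bytes(self.bitmap)
            for i in range(len(self.bitmap)):
                self.bitmap[i] = 0
        # The potentially blocking copy runs with the lock released.
        copy_out(snapshot)
```

A caller can check the property directly: inside `copy_out`, `log.lock.locked()` is false, so writers calling `mark_dirty()` are not held up by a slow consumer.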
2010-08-01 | KVM: PPC: Make BAT only guest segments work | Alexander Graf
When a guest sets its SR entry to invalid, we may still find a corresponding entry in a BAT. So we need to make sure we're not faulting on invalid SR entries, but instead just claim them to be BAT resolved. This resolves breakage experienced when using libogc based guests. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 | KVM: PPC: Use kernel hash function | Alexander Graf
The linux kernel already provides a hash function. Let's reuse that instead of reinventing the wheel! Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 | KVM: PPC: Remove obsolete kvmppc_mmu_find_pte | Alexander Graf
Initially we had to search for pte entries to invalidate them. Since the logic has improved since then, we can just get rid of the search function. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 | KVM: Fix a race condition for usage of is_hwpoison_address() | Huang Ying
is_hwpoison_address accesses the page table, so the caller must hold current->mm->mmap_sem in read mode. So fix its usage in hva_to_pfn of kvm accordingly. Comment is_hwpoison_address to remind other users. Reported-by: Avi Kivity <avi@redhat.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>