summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-03-16KVM: Refactor error handling for setting memory regionSean Christopherson
Replace a big pile o' gotos with returns to make it more obvious what error code is being returned, and to prepare for refactoring the functional, i.e. post-checks, portion of __kvm_set_memory_region(). Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: Explicitly free allocated-but-unused dirty bitmapSean Christopherson
Explicitly free an allocated-but-unused dirty bitmap instead of relying on kvm_free_memslot() if an error occurs in __kvm_set_memory_region(). There is no longer a need to abuse kvm_free_memslot() to free arch specific resources as arch specific code is now called only after the common flow is guaranteed to succeed. Arch code can still fail, but it's responsible for its own cleanup in that case. Eliminating the error path's abuse of kvm_free_memslot() paves the way for simplifying kvm_free_memslot(), i.e. dropping its @dont param. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: Drop kvm_arch_create_memslot()Sean Christopherson
Remove kvm_arch_create_memslot() now that all arch implementations are effectively nops. Removing kvm_arch_create_memslot() eliminates the possibility for arch specific code to allocate memory prior to setting a memslot, which sets the stage for simplifying kvm_free_memslot(). Cc: Janosch Frank <frankja@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: x86: Allocate memslot resources during prepare_memory_region()Sean Christopherson
Allocate the various metadata structures associated with a new memslot during kvm_arch_prepare_memory_region(), which paves the way for removing kvm_arch_create_memslot() altogether. Moving x86's memory allocation only changes the order of kernel memory allocations between x86 and common KVM code. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: PPC: Move memslot memory allocation into prepare_memory_region()Sean Christopherson
Allocate the rmap array during kvm_arch_prepare_memory_region() to pave the way for removing kvm_arch_create_memslot() altogether. Moving PPC's memory allocation only changes the order of kernel memory allocations between PPC and common KVM code. No functional change intended. Acked-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: Don't free new memslot if allocation of said memslot failsSean Christopherson
The two implementations of kvm_arch_create_memslot() in x86 and PPC are both good citizens and free up all local resources if creation fails. Return immediately (via a superfluous goto) instead of calling kvm_free_memslot(). Note, the call to kvm_free_memslot() is effectively an expensive nop in this case as there are no resources to be freed. No functional change intended. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: Reinstall old memslots if arch preparation failsSean Christopherson
Reinstall the old memslots if preparing the new memory region fails after invalidating a to-be-{re}moved memslot. Remove the superfluous 'old_memslots' variable so that it's somewhat clear that the error handling path needs to free the unused memslots, not simply the 'old' memslots. Fixes: bc6678a33d9b9 ("KVM: introduce kvm->srcu and convert kvm_set_memory_region to SRCU update") Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: x86: Allocate new rmap and large page tracking when moving memslotSean Christopherson
Reallocate a rmap array and recalcuate large page compatibility when moving an existing memslot to correctly handle the alignment properties of the new memslot. The number of rmap entries required at each level is dependent on the alignment of the memslot's base gfn with respect to that level, e.g. moving a large-page aligned memslot so that it becomes unaligned will increase the number of rmap entries needed at the now unaligned level. Not updating the rmap array is the most obvious bug, as KVM accesses garbage data beyond the end of the rmap. KVM interprets the bad data as pointers, leading to non-canonical #GPs, unexpected #PFs, etc... general protection fault: 0000 [#1] SMP CPU: 0 PID: 1909 Comm: move_memory_reg Not tainted 5.4.0-rc7+ #139 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:rmap_get_first+0x37/0x50 [kvm] Code: <48> 8b 3b 48 85 ff 74 ec e8 6c f4 ff ff 85 c0 74 e3 48 89 d8 5b c3 RSP: 0018:ffffc9000021bbc8 EFLAGS: 00010246 RAX: ffff00617461642e RBX: ffff00617461642e RCX: 0000000000000012 RDX: ffff88827400f568 RSI: ffffc9000021bbe0 RDI: ffff88827400f570 RBP: 0010000000000000 R08: ffffc9000021bd00 R09: ffffc9000021bda8 R10: ffffc9000021bc48 R11: 0000000000000000 R12: 0030000000000000 R13: 0000000000000000 R14: ffff88827427d700 R15: ffffc9000021bce8 FS: 00007f7eda014700(0000) GS:ffff888277a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f7ed9216ff8 CR3: 0000000274391003 CR4: 0000000000162eb0 Call Trace: kvm_mmu_slot_set_dirty+0xa1/0x150 [kvm] __kvm_set_memory_region.part.64+0x559/0x960 [kvm] kvm_set_memory_region+0x45/0x60 [kvm] kvm_vm_ioctl+0x30f/0x920 [kvm] do_vfs_ioctl+0xa1/0x620 ksys_ioctl+0x66/0x70 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x4c/0x170 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f7ed9911f47 Code: <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 21 6f 2c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffc00937498 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 0000000001ab0010 RCX: 00007f7ed9911f47 RDX: 0000000001ab1350 RSI: 000000004020ae46 RDI: 0000000000000004 RBP: 000000000000000a R08: 0000000000000000 R09: 00007f7ed9214700 R10: 00007f7ed92149d0 R11: 0000000000000246 R12: 00000000bffff000 R13: 0000000000000003 R14: 00007f7ed9215000 R15: 0000000000000000 Modules linked in: kvm_intel kvm irqbypass ---[ end trace 0c5f570b3358ca89 ]--- The disallow_lpage tracking is more subtle. Failure to update results in KVM creating large pages when it shouldn't, either due to stale data or again due to indexing beyond the end of the metadata arrays, which can lead to memory corruption and/or leaking data to guest/userspace. Note, the arrays for the old memslot are freed by the unconditional call to kvm_free_memslot() in __kvm_set_memory_region(). Fixes: 05da45583de9b ("KVM: MMU: large page support") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: x86: Move gpa_val and gpa_available into the emulator contextSean Christopherson
Move the GPA tracking into the emulator context now that the context is guaranteed to be initialized via __init_emulate_ctxt() prior to dereferencing gpa_{available,val}, i.e. now that seeing a stale gpa_available will also trigger a WARN due to an invalid context. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: x86: Add EMULTYPE_PF when emulation is triggered by a page faultSean Christopherson
Add a new emulation type flag to explicitly mark emulation related to a page fault. Move the propation of the GPA into the emulator from the page fault handler into x86_emulate_instruction, using EMULTYPE_PF as an indicator that cr2 is valid. Similarly, don't propagate cr2 into the exception.address when it's *not* valid. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: apic: remove unused function apic_lvt_vector()Miaohe Lin
The function apic_lvt_vector() is unused now, remove it. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: VMX: Add 'else' to split mutually exclusive caseMiaohe Lin
Each if branch in handle_external_interrupt_irqoff() is mutually exclusive. Add 'else' to make it clear and also avoid some unnecessary check. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: x86: eliminate some unreachable codeMiaohe Lin
These code are unreachable, remove them. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: x86: Fix print format and coding styleMiaohe Lin
Use %u to print u32 var and correct some coding style. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: vmx: rewrite the comment in vmx_get_mt_maskChia-I Wu
Better reflect the structure of the code and metion why we could not always honor the guest. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Cc: Gurchetan Singh <gurchetansingh@chromium.org> Cc: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: selftests: Convert some printf's to pr_info'sAndrew Jones
We leave some printf's because they inform the user the test is being skipped. QUIET should not disable those. We also leave the printf's used for help text. Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: selftests: Rework debug message printingAndrew Jones
There were a few problems with the way we output "debug" messages. The first is that we used DEBUG() which is defined when NDEBUG is not defined, but NDEBUG will never be defined for kselftests because it relies too much on assert(). The next is that most of the DEBUG() messages were actually "info" messages, which users may want to turn off if they just want a silent test that either completes or asserts. Finally, a debug message output from a library function, and thus for all tests, was annoying when its information wasn't interesting for a test. Rework these messages so debug messages only output when DEBUG is defined and info messages output unless QUIET is defined. Also name the functions pr_debug and pr_info and make sure that when they're disabled we eat all the inputs. The later avoids unused variable warnings when the variables were only defined for the purpose of printing. Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: selftests: Time guest demand pagingBen Gardon
In order to quantify demand paging performance, time guest execution during demand paging. Signed-off-by: Ben Gardon <bgardon@google.com> [Move timespec-diff to test_util.h] Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: selftests: Support multiple vCPUs in demand paging testBen Gardon
Most VMs have multiple vCPUs, the concurrent execution of which has a substantial impact on demand paging performance. Add an option to create multiple vCPUs to each access disjoint regions of memory. Signed-off-by: Ben Gardon <bgardon@google.com> [guest_code() can't return, use GUEST_ASSERT(). Ensure the number of guests pages is compatible with the host.] Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: selftests: Add support for vcpu_args_set to aarch64 and s390xBen Gardon
Currently vcpu_args_set is only implemented for x86. This makes writing tests with multiple vCPUs difficult as each guest vCPU must either a.) do the same thing or b.) derive some kind of unique token from it's registers or the architecture. To simplify the process of writing tests with multiple vCPUs for s390 and aarch64, add set args functions for those architectures. Signed-off-by: Ben Gardon <bgardon@google.com> [Fixed array index (num => i) and made some style changes.] Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: selftests: Pass args to vCPU in global vCPU args structBen Gardon
In preparation for supporting multiple vCPUs in the demand paging test, pass arguments to the vCPU in a consolidated global struct instead of syncing multiple globals. Signed-off-by: Ben Gardon <bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: selftests: Add memory size parameter to the demand paging testBen Gardon
Add an argument to allow the demand paging test to work on larger and smaller guest sizes. Signed-off-by: Ben Gardon <bgardon@google.com> [Rewrote parse_size() to simplify and provide user more flexibility as to how sizes are input. Also fixed size overflow assert.] Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: selftests: Add configurable demand paging delayBen Gardon
When running the demand paging test with the -u option, the User Fault FD handler essentially adds an arbitrary delay to page fault resolution. To enable better simulation of a real demand paging scenario, add a configurable delay to the UFFD handler. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Ben Gardon <bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: selftests: Add demand paging content to the demand paging testBen Gardon
The demand paging test is currently a simple page access test which, while potentially useful, doesn't add much versus the existing dirty logging test. To improve the demand paging test, add a basic userfaultfd demand paging implementation. Signed-off-by: Ben Gardon <bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: arm64: Use the correct timer structure to access the physical counterKarimAllah Ahmed
Use the physical timer structure when reading the physical counter instead of using the virtual timer structure. Thankfully, nothing is accessing this code path yet (at least not until we enable save/restore of the physical counter). It doesn't hurt for this to be correct though. Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> [maz: amended commit log] Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Fixes: 84135d3d18da ("KVM: arm/arm64: consolidate arch timer trap handlers") Link: https://lore.kernel.org/r/1584351546-5018-1-git-send-email-karahmed@amazon.de
2020-03-16irqchip/gic-v4: Provide irq_retrigger to avoid circular locking dependencyMarc Zyngier
On a very heavily loaded D05 with GICv4, I managed to trigger the following lockdep splat: [ 6022.598864] ====================================================== [ 6022.605031] WARNING: possible circular locking dependency detected [ 6022.611200] 5.6.0-rc4-00026-geee7c7b0f498 #680 Tainted: G E [ 6022.618061] ------------------------------------------------------ [ 6022.624227] qemu-system-aar/7569 is trying to acquire lock: [ 6022.629789] ffff042f97606808 (&p->pi_lock){-.-.}, at: try_to_wake_up+0x54/0x7a0 [ 6022.637102] [ 6022.637102] but task is already holding lock: [ 6022.642921] ffff002fae424cf0 (&irq_desc_lock_class){-.-.}, at: __irq_get_desc_lock+0x5c/0x98 [ 6022.651350] [ 6022.651350] which lock already depends on the new lock. [ 6022.651350] [ 6022.659512] [ 6022.659512] the existing dependency chain (in reverse order) is: [ 6022.666980] [ 6022.666980] -> #2 (&irq_desc_lock_class){-.-.}: [ 6022.672983] _raw_spin_lock_irqsave+0x50/0x78 [ 6022.677848] __irq_get_desc_lock+0x5c/0x98 [ 6022.682453] irq_set_vcpu_affinity+0x40/0xc0 [ 6022.687236] its_make_vpe_non_resident+0x6c/0xb8 [ 6022.692364] vgic_v4_put+0x54/0x70 [ 6022.696273] vgic_v3_put+0x20/0xd8 [ 6022.700183] kvm_vgic_put+0x30/0x48 [ 6022.704182] kvm_arch_vcpu_put+0x34/0x50 [ 6022.708614] kvm_sched_out+0x34/0x50 [ 6022.712700] __schedule+0x4bc/0x7f8 [ 6022.716697] schedule+0x50/0xd8 [ 6022.720347] kvm_arch_vcpu_ioctl_run+0x5f0/0x978 [ 6022.725473] kvm_vcpu_ioctl+0x3d4/0x8f8 [ 6022.729820] ksys_ioctl+0x90/0xd0 [ 6022.733642] __arm64_sys_ioctl+0x24/0x30 [ 6022.738074] el0_svc_common.constprop.3+0xa8/0x1e8 [ 6022.743373] do_el0_svc+0x28/0x88 [ 6022.747198] el0_svc+0x14/0x40 [ 6022.750761] el0_sync_handler+0x124/0x2b8 [ 6022.755278] el0_sync+0x140/0x180 [ 6022.759100] [ 6022.759100] -> #1 (&rq->lock){-.-.}: [ 6022.764143] _raw_spin_lock+0x38/0x50 [ 6022.768314] task_fork_fair+0x40/0x128 [ 6022.772572] sched_fork+0xe0/0x210 [ 6022.776484] copy_process+0x8c4/0x18d8 [ 6022.780742] _do_fork+0x88/0x6d8 [ 6022.784478] kernel_thread+0x64/0x88 [ 6022.788563] rest_init+0x30/0x270 [ 6022.792390] arch_call_rest_init+0x14/0x1c [ 6022.796995] start_kernel+0x498/0x4c4 [ 6022.801164] [ 6022.801164] -> #0 (&p->pi_lock){-.-.}: [ 6022.806382] __lock_acquire+0xdd8/0x15c8 [ 6022.810813] lock_acquire+0xd0/0x218 [ 6022.814896] _raw_spin_lock_irqsave+0x50/0x78 [ 6022.819761] try_to_wake_up+0x54/0x7a0 [ 6022.824018] wake_up_process+0x1c/0x28 [ 6022.828276] wakeup_softirqd+0x38/0x40 [ 6022.832533] __tasklet_schedule_common+0xc4/0xf0 [ 6022.837658] __tasklet_schedule+0x24/0x30 [ 6022.842176] check_irq_resend+0xc8/0x158 [ 6022.846609] irq_startup+0x74/0x128 [ 6022.850606] __enable_irq+0x6c/0x78 [ 6022.854602] enable_irq+0x54/0xa0 [ 6022.858431] its_make_vpe_non_resident+0xa4/0xb8 [ 6022.863557] vgic_v4_put+0x54/0x70 [ 6022.867469] kvm_arch_vcpu_blocking+0x28/0x38 [ 6022.872336] kvm_vcpu_block+0x48/0x490 [ 6022.876594] kvm_handle_wfx+0x18c/0x310 [ 6022.880938] handle_exit+0x138/0x198 [ 6022.885022] kvm_arch_vcpu_ioctl_run+0x4d4/0x978 [ 6022.890148] kvm_vcpu_ioctl+0x3d4/0x8f8 [ 6022.894494] ksys_ioctl+0x90/0xd0 [ 6022.898317] __arm64_sys_ioctl+0x24/0x30 [ 6022.902748] el0_svc_common.constprop.3+0xa8/0x1e8 [ 6022.908046] do_el0_svc+0x28/0x88 [ 6022.911871] el0_svc+0x14/0x40 [ 6022.915434] el0_sync_handler+0x124/0x2b8 [ 6022.919951] el0_sync+0x140/0x180 [ 6022.923773] [ 6022.923773] other info that might help us debug this: [ 6022.923773] [ 6022.931762] Chain exists of: [ 6022.931762] &p->pi_lock --> &rq->lock --> &irq_desc_lock_class [ 6022.931762] [ 6022.942101] Possible unsafe locking scenario: [ 6022.942101] [ 6022.948007] CPU0 CPU1 [ 6022.952523] ---- ---- [ 6022.957039] lock(&irq_desc_lock_class); [ 6022.961036] lock(&rq->lock); [ 6022.966595] lock(&irq_desc_lock_class); [ 6022.973109] lock(&p->pi_lock); [ 6022.976324] [ 6022.976324] *** DEADLOCK *** This is happening because we have a pending doorbell that requires retrigger. As SW retriggering is done in a tasklet, we trigger the circular dependency above. The easy cop-out is to provide a retrigger callback that doesn't require acquiring any extra lock. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200310184921.23552-5-maz@kernel.org
2020-03-16ARM: sa1111: Fix irq_retrigger callback return valueMarc Zyngier
The irq_retrigger callback is supposed to return 0 when retrigger has failed, and a non-zero value otherwise. Tell the core code that the driver has succedded in using the HW to retrigger the interrupt (if ever). Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200310184921.23552-4-maz@kernel.org
2020-03-16irqchip/atmel-aic5: Fix irq_retrigger callback return valueMarc Zyngier
The irq_retrigger callback is supposed to return 0 when retrigger has failed, and a non-zero value otherwise. Tell the core code that the driver has succedded in using the HW to retrigger the interrupt. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200310184921.23552-3-maz@kernel.org
2020-03-16irqchip/atmel-aic: Fix irq_retrigger callback return valueMarc Zyngier
The irq_retrigger callback is supposed to return 0 when retrigger has failed, and a non-zero value otherwise. Tell the core code that the driver has succedded in using the HW to retrigger the interrupt. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200310184921.23552-2-maz@kernel.org
2020-03-16irqchip/gic-v3-its: Probe ITS page size for all GITS_BASERn registersMarc Zyngier
The GICv3 ITS driver assumes that once it has latched on a page size for a given BASER register, it can use the same page size as the maximum page size for all subsequent BASER registers. Although it worked so far, nothing in the architecture guarantees this, and Nianyao Tang hit this problem on some undisclosed implementation. Let's bite the bullet and probe the the supported page size on all BASER registers before starting to populate the tables. This simplifies the setup a bit, at the expense of a few additional MMIO accesses. Signed-off-by: Marc Zyngier <maz@kernel.org> Reported-by: Nianyao Tang <tangnianyao@huawei.com> Tested-by: Nianyao Tang <tangnianyao@huawei.com> Link: https://lore.kernel.org/r/1584089195-63897-1-git-send-email-zhangshaokun@hisilicon.com
2020-03-16irqchip/bcm2835: Quiesce IRQs left enabled by bootloaderLukas Wunner
Per the spec, the BCM2835's IRQs are all disabled when coming out of power-on reset. Its IRQ driver assumes that's still the case when the kernel boots and does not perform any initialization of the registers. However the Raspberry Pi Foundation's bootloader leaves the USB interrupt enabled when handing over control to the kernel. Quiesce IRQs and the FIQ if they were left enabled and log a message to let users know that they should update the bootloader once a fixed version is released. If the USB interrupt is not quiesced and the USB driver later on claims the FIQ (as it does on the Raspberry Pi Foundation's downstream kernel), interrupt latency for all other peripherals increases and occasional lockups occur. That's because both the FIQ and the normal USB interrupt fire simultaneously: On a multicore Raspberry Pi, if normal interrupts are routed to CPU 0 and the FIQ to CPU 1 (hardcoded in the Foundation's kernel), then a USB interrupt causes CPU 0 to spin in bcm2836_chained_handle_irq() until the FIQ on CPU 1 has cleared it. Other peripherals' interrupts are starved as long. I've seen CPU 0 blocked for up to 2.9 msec. eMMC throughput on a Compute Module 3 irregularly dips to 23.0 MB/s without this commit but remains relatively constant at 23.5 MB/s with this commit. The lockups occur when CPU 0 receives a USB interrupt while holding a lock which CPU 1 is trying to acquire while the FIQ is temporarily disabled on CPU 1. At best users get RCU CPU stall warnings, but most of the time the system just freezes. Fixes: 89214f009c1d ("ARM: bcm2835: add interrupt controller driver") Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Link: https://lore.kernel.org/r/f97868ba4e9b86ddad71f44ec9d8b3b7d8daa1ea.1582618537.git.lukas@wunner.de
2020-03-16irqchip/sifive-plic: Add support for multiple PLICsAtish Patra
Current, PLIC driver can support only 1 PLIC on the board. However, there can be multiple PLICs present on a two socket systems in RISC-V. Modify the driver so that each PLIC handler can have a information about individual PLIC registers and an irqdomain associated with it. Tested on two socket RISC-V system based on VCU118 FPGA connected via OmniXtend protocol. Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20200302231146.15530-3-atish.patra@wdc.com
2020-03-16irqchip/sifive-plic: Enable/Disable external interrupts upon cpu online/offlineAtish Patra
Currently, PLIC threshold is only initialized once in the beginning. However, threshold can be set to disabled if a CPU is marked offline with CPU hotplug feature. This will not allow to change the irq affinity to a CPU that just came online. Add PLIC specific CPU hotplug callbacks and enable the threshold when a CPU comes online. Take this opportunity to move the external interrupt enable code from trap init to PLIC driver as well. On cpu offline path, the driver performs the exact opposite operations i.e. disable the interrupt and the threshold. Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20200302231146.15530-2-atish.patra@wdc.com
2020-03-16MIPS: c-r4k: Invalidate BMIPS5000 ZSCM prefetch linesKamal Dasu
Zephyr secondary cache is 256KB, 128B lines. 32B sectors. A secondary cache line can contain two instruction cache lines (64B), or four data cache lines (32B). Hardware prefetch Cache detects stream access, and prefetches ahead of processor access. Add support to invalidate BMIPS5000 cpu zephyr secondary cache module (ZSCM) on DMA from device so that data returned is coherent during DMA read operations. Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2020-03-16MIPS: pass non-NULL dev_id on shared request_irq()afzal mohammed
Recently all usages of setup_irq() was replaced by request_irq(). request_irq() does a few sanity checks that were not done in setup_irq(), if they fail irq registration will fail. One of the check is to ensure that non-NULL dev_id is passed in the case of shared irq. This caused malta on qemu to hang. Fix it by passing handler as dev_id to all request_irq()'s that are shared. For sni, instead of passing non-NULL dev_id, remove shared irq flags. Fixes: ac8fd122e070 ("MIPS: Replace setup_irq() by request_irq()") Reported-by: Nathan Chancellor <natechancellor@gmail.com> Suggested-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: afzal mohammed <afzal.mohd.ma@gmail.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2020-03-16floppy: rename the global "fdc" variable to "current_fdc"Willy Tarreau
This is done in order to remove the confusion that arises at some places in the code where local variables or arguments shadow the global variable. It is already visible that some places are a bit awkward and iterate over the global variable, for the sole reason that they used to rely on it being named "fdc" in order to get the correct address when using FD_DOR. These ones are easy to spot by searching for "for (current_fdc...". Some more cleanup is definitely possible. For example "fdc_state[current_fdc].somefield" is used all over the code and would probably be better with "fdc_state->somefield" with fdc_state being set when current_fdc is assigned. This would require to pass the pointer to the current state instead of the current_fdc to the I/O functions. Link: https://lore.kernel.org/r/20200301195555.11154-7-w@1wt.eu Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-16floppy: separate the FDC's base address from its registersWilly Tarreau
FDC registers FD_STATUS, FD_DATA, FD_DOR, FD_DIR and FD_DCR used to be defined relative to FD_IOPORT, which is the FDC's base address, itself a macro depending on the "fdc" local or global variable. This patch changes this so that the register macros above now only reference the address offset, and that the FDC's address is explicitly passed in each call to fd_inb() and fd_outb(), thus removing the macro. With this change there is no more implicit usage of the local/global "fdc" variable. One place in the ARM code used to check if the port was equal to FD_DOR, this was changed to testing the register by applying a mask to the port, as was already done in the sparc code. There are still occurrences of fd_inb() and fd_outb() in the PARISC code and these ones remain unaffected since they already used to work with a base address and a register offset. The sparc, m68k and parisc code could now be slightly cleaned up to benefit from the macro definitions above instead of the equivalent hard-coded values. Link: https://lore.kernel.org/r/20200301195555.11154-6-w@1wt.eu Cc: Ian Molton <spyro@f2s.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-16floppy: introduce new functions fdc_inb() and fdc_outb()Willy Tarreau
These two functions replace fd_inb() and fd_outb() in that they take the FDC in argument. This will ease the separation of the base address and the port everywhere the code is used. Link: https://lore.kernel.org/r/20200301195555.11154-5-w@1wt.eu Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-16floppy: prepare ARM code to simplify base address separationWilly Tarreau
The fd_outb() macro on ARM relies on a special fd_setdor() macro when the register is FD_DOR and both will need to be changed to accept a separate base address. Let's just remerge them to simplify the change and make this code more easily reviewable. Link: https://lore.kernel.org/r/20200301195555.11154-4-w@1wt.eu Cc: Ian Molton <spyro@f2s.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-16floppy: remove incomplete support for second FDC from ARM codeWilly Tarreau
The ARM code was written with the apparent hope to one day support a second FDC except that the code was incomplete and only touches the first one, which is also reflected by N_FDC==1. However this made its fd_outb() macro artificially depend on the global or local "fdc" variable. Let's get rid of this and make it explicit it doesn't rely on this variable anymore. Link: https://lore.kernel.org/r/20200301195555.11154-3-w@1wt.eu Cc: Ian Molton <spyro@f2s.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-16floppy: remove dead code for drives scanning on ARMWilly Tarreau
On ARM, function fd_scandrives pre-dates Git era, is #ifed 0 out, not used, and cannot even compile since it references an fdc variable that's not declared anywhere (supposed to be the global one that we're turning to current_fdc apparently). There was also an ifdefde out include of mach/floppy.h that does not exist anymore either. Let's get rid of them since they complicate the fixing of the driver. Link: https://lore.kernel.org/r/20200301195555.11154-2-w@1wt.eu Cc: Ian Molton <spyro@f2s.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-16floppy: cleanup: expand the reply_buffer macrosWilly Tarreau
Several macros were used to access reply_buffer[] at discrete positions without making it obvious they were relying on this. These ones have been replaced by their offset in the reply buffer to make these accesses more obvious. Link: https://lore.kernel.org/r/20200224212352.8640-11-w@1wt.eu Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-16floppy: cleanup: expand the R/W / format command macrosWilly Tarreau
Various macros were used to access raw_cmd for R/W or format commands without making it obvious that raw_cmd->cmd[] was used. Let's expand the macros to make this more obvious. Link: https://lore.kernel.org/r/20200224212352.8640-10-w@1wt.eu Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-16floppy: cleanup: expand macro DRWEWilly Tarreau
This macro doesn't bring much value and only slightly obfuscates the code by silently using global variable "current_drive", let's expand it. Link: https://lore.kernel.org/r/20200224212352.8640-9-w@1wt.eu Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-16floppy: cleanup: expand macro DRSWilly Tarreau
This macro doesn't bring much value and only slightly obfuscates the code by silently using global variable "current_drive", let's expand it. Link: https://lore.kernel.org/r/20200224212352.8640-8-w@1wt.eu Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-16floppy: cleanup: expand macro DPWilly Tarreau
This macro doesn't bring much value and only slightly obfuscates the code by silently using global variable "current_drive", let's expand it. Link: https://lore.kernel.org/r/20200224212352.8640-7-w@1wt.eu Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-16floppy: cleanup: expand macro UDRWEWilly Tarreau
This macro doesn't bring much value and only slightly obfuscates the code by silently using local variable "drive", let's expand it. Link: https://lore.kernel.org/r/20200224212352.8640-6-w@1wt.eu Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-16floppy: cleanup: expand macro UDRSWilly Tarreau
This macro doesn't bring much value and only slightly obfuscates the code by silently using local variable "drive", let's expand it. Link: https://lore.kernel.org/r/20200224212352.8640-5-w@1wt.eu Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-16floppy: cleanup: expand macro UDPWilly Tarreau
This macro doesn't bring much value and only slightly obfuscates the code by silently using local variable "drive", let's expand it. Link: https://lore.kernel.org/r/20200224212352.8640-4-w@1wt.eu Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-16floppy: cleanup: expand macro UFDCSWilly Tarreau
This macro doesn't bring much value and only slightly obfuscates the code by silently using local variable "drive", let's expand it. Link: https://lore.kernel.org/r/20200224212352.8640-3-w@1wt.eu Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>