summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-05-04blk-mq: split blk_mq_alloc_and_init_hctx into two partsMing Lei
Split blk_mq_alloc_and_init_hctx into two parts, and one is blk_mq_alloc_hctx() for allocating all hctx resources, another is blk_mq_init_hctx() for initializing hctx, which serves as counter-part of blk_mq_exit_hctx(). Cc: Dongli Zhang <dongli.zhang@oracle.com> Cc: James Smart <james.smart@broadcom.com> Cc: Bart Van Assche <bart.vanassche@wdc.com> Cc: linux-scsi@vger.kernel.org Cc: Martin K . Petersen <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James E . J . Bottomley <jejb@linux.vnet.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: James Smart <james.smart@broadcom.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-04blk-mq: free hw queue's resource in hctx's release handlerMing Lei
Once blk_cleanup_queue() returns, tags shouldn't be used any more, because blk_mq_free_tag_set() may be called. Commit 45a9c9d909b2 ("blk-mq: Fix a use-after-free") fixes this issue exactly. However, that commit introduces another issue. Before 45a9c9d909b2, we are allowed to run queue during cleaning up queue if the queue's kobj refcount is held. After that commit, queue can't be run during queue cleaning up, otherwise oops can be triggered easily because some fields of hctx are freed by blk_mq_free_queue() in blk_cleanup_queue(). We have invented ways for addressing this kind of issue before, such as: 8dc765d438f1 ("SCSI: fix queue cleanup race before queue initialization is done") c2856ae2f315 ("blk-mq: quiesce queue before freeing queue") But still can't cover all cases, recently James reports another such kind of issue: https://marc.info/?l=linux-scsi&m=155389088124782&w=2 This issue can be quite hard to address by previous way, given scsi_run_queue() may run requeues for other LUNs. Fixes the above issue by freeing hctx's resources in its release handler, and this way is safe becasue tags isn't needed for freeing such hctx resource. This approach follows typical design pattern wrt. kobject's release handler. Cc: Dongli Zhang <dongli.zhang@oracle.com> Cc: James Smart <james.smart@broadcom.com> Cc: Bart Van Assche <bart.vanassche@wdc.com> Cc: linux-scsi@vger.kernel.org, Cc: Martin K . Petersen <martin.petersen@oracle.com>, Cc: Christoph Hellwig <hch@lst.de>, Cc: James E . J . Bottomley <jejb@linux.vnet.ibm.com>, Reported-by: James Smart <james.smart@broadcom.com> Fixes: 45a9c9d909b2 ("blk-mq: Fix a use-after-free") Cc: stable@vger.kernel.org Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: James Smart <james.smart@broadcom.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-04blk-mq: move cancel of requeue_work into blk_mq_releaseMing Lei
With holding queue's kobject refcount, it is safe for driver to schedule requeue. However, blk_mq_kick_requeue_list() may be called after blk_sync_queue() is done because of concurrent requeue activities, then requeue work may not be completed when freeing queue, and kernel oops is triggered. So moving the cancel of requeue_work into blk_mq_release() for avoiding race between requeue and freeing queue. Cc: Dongli Zhang <dongli.zhang@oracle.com> Cc: James Smart <james.smart@broadcom.com> Cc: Bart Van Assche <bart.vanassche@wdc.com> Cc: linux-scsi@vger.kernel.org, Cc: Martin K . Petersen <martin.petersen@oracle.com>, Cc: Christoph Hellwig <hch@lst.de>, Cc: James E . J . Bottomley <jejb@linux.vnet.ibm.com>, Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: James Smart <james.smart@broadcom.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-04blk-mq: grab .q_usage_counter when queuing request from plug code pathMing Lei
Just like aio/io_uring, we need to grab 2 refcount for queuing one request, one is for submission, another is for completion. If the request isn't queued from plug code path, the refcount grabbed in generic_make_request() serves for submission. In theroy, this refcount should have been released after the sumission(async run queue) is done. blk_freeze_queue() works with blk_sync_queue() together for avoiding race between cleanup queue and IO submission, given async run queue activities are canceled because hctx->run_work is scheduled with the refcount held, so it is fine to not hold the refcount when running the run queue work function for dispatch IO. However, if request is staggered into plug list, and finally queued from plug code path, the refcount in submission side is actually missed. And we may start to run queue after queue is removed because the queue's kobject refcount isn't guaranteed to be grabbed in flushing plug list context, then kernel oops is triggered, see the following race: blk_mq_flush_plug_list(): blk_mq_sched_insert_requests() insert requests to sw queue or scheduler queue blk_mq_run_hw_queue Because of concurrent run queue, all requests inserted above may be completed before calling the above blk_mq_run_hw_queue. Then queue can be freed during the above blk_mq_run_hw_queue(). Fixes the issue by grab .q_usage_counter before calling blk_mq_sched_insert_requests() in blk_mq_flush_plug_list(). This way is safe because the queue is absolutely alive before inserting request. Cc: Dongli Zhang <dongli.zhang@oracle.com> Cc: James Smart <james.smart@broadcom.com> Cc: linux-scsi@vger.kernel.org, Cc: Martin K . Petersen <martin.petersen@oracle.com>, Cc: Christoph Hellwig <hch@lst.de>, Cc: James E . J . Bottomley <jejb@linux.vnet.ibm.com>, Reviewed-by: Bart Van Assche <bvanassche@acm.org> Tested-by: James Smart <james.smart@broadcom.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-03Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: - PPC and ARM bugfixes from submaintainers - Fix old Windows versions on AMD (recent regression) - Fix old Linux versions on processors without EPT - Fixes for LAPIC timer optimizations * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits) KVM: nVMX: Fix size checks in vmx_set_nested_state KVM: selftests: make hyperv_cpuid test pass on AMD KVM: lapic: Check for in-kernel LAPIC before deferencing apic pointer KVM: fix KVM_CLEAR_DIRTY_LOG for memory slots of unaligned size x86/kvm/mmu: reset MMU context when 32-bit guest switches PAE KVM: x86: Whitelist port 0x7e for pre-incrementing %rip Documentation: kvm: fix dirty log ioctl arch lists KVM: VMX: Move RSB stuffing to before the first RET after VM-Exit KVM: arm/arm64: Don't emulate virtual timers on userspace ioctls kvm: arm: Skip stage2 huge mappings for unaligned ipa backed by THP KVM: arm/arm64: Ensure vcpu target is unset on reset failure KVM: lapic: Convert guest TSC to host time domain if necessary KVM: lapic: Allow user to disable adaptive tuning of timer advancement KVM: lapic: Track lapic timer advance per vCPU KVM: lapic: Disable timer advancement if adaptive tuning goes haywire x86: kvm: hyper-v: deal with buggy TLB flush requests from WS2012 KVM: x86: Consider LAPIC TSC-Deadline timer expired if deadline too short KVM: PPC: Book3S: Protect memslots while validating user address KVM: PPC: Book3S HV: Perserve PSSCR FAKE_SUSPEND bit on guest exit KVM: arm/arm64: vgic-v3: Retire pending interrupts on disabling LPIs ...
2019-05-03parisc: Update huge TLB page support to use per-pagetable spinlockJohn David Anglin
This patch updates the parisc huge TLB page support to use per-pagetable spinlocks. This patch requires Mikulas' per-pagetable spinlock patch and the revised TLB serialization patch from Helge and myself. With Mikulas' patch, we need to use the per-pagetable spinlock for page table updates. The TLB lock is only used to serialize TLB flushes on machines with the Merced bus. Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-03parisc: Use per-pagetable spinlockMikulas Patocka
PA-RISC uses a global spinlock to protect pagetable updates in the TLB fault handlers. When multiple cores are taking TLB faults simultaneously, the cache line containing the spinlock becomes a bottleneck. This patch embeds the spinlock in the top level page directory, so that every process has its own lock. It improves performance by 30% when doing parallel compilations. At least on the N class systems, only one PxTLB inter processor broadcast can be active at any one time on the Merced bus. If a Merced bus is found, this patch serializes the TLB flushes with the pa_tlb_flush_lock spinlock. v1: Initial patch by Mikulas v2: Added Merced detection by Helge v3: Revised TLB serialization by Dave & Helge Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-03parisc: Allow live-patching of __meminit functionsHelge Deller
When making the text sections writeable with set_kernel_text_rw(1), include all text sections including those in the __init section. Otherwise functions marked with __meminit will stay read-only. Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # 4.20+
2019-05-03parisc: Add memory barrier to asm pdc and sync instructionsHelge Deller
Add compiler memory barriers to ensure the compiler doesn't reorder memory operations around these instructions. Cc: stable@vger.kernel.org # v4.20+ Fixes: 3847dab77421 ("parisc: Add alternative coding infrastructure") Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-03parisc: Add memory clobber to TLB purgesJohn David Anglin
The pdtlb and pitlb instructions are strongly ordered. The asms invoking these instructions should be compiler memory barriers to ensure the compiler doesn't reorder memory operations around these instructions. Signed-off-by: John David Anglin <dave.anglin@bell.net> CC: stable@vger.kernel.org # v4.20+ Fixes: 3847dab77421 ("parisc: Add alternative coding infrastructure") Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-03parisc: Use ldcw instruction for SMP spinlock release barrierJohn David Anglin
There are only a couple of instructions that can function as a memory barrier on parisc. Currently, we use the sync instruction as a memory barrier when releasing a spinlock. However, the ldcw instruction is a better barrier when we have a handy memory location since it operates in the cache on coherent machines. This patch updates the spinlock release code to use ldcw. I also changed the "stw,ma" instructions to "stw" instructions as it is not an adequate barrier. Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-03parisc: Remove lock code to serialize TLB operations in pacache.SJohn David Anglin
TLB operations only need to be serialized on machines with the Merced (Stretch) bus. The only machines in this category are L and N class, and they require a 64-bit PA 2.0 kernel. On these machines, we use local TLB purges in the tmpalias routines. We don't need to serialize TLB purges on all other machines. Thus, the lock/unlock code can be removed when CONFIG_PA20 is not defined. Further, when CONFIG_PA20 is not defined, alternative patching converts the TLB purges to local purges when PA 2.0 hardware has been detected. Signed-off-by: John David Anglin <dave.anglin@bell.net> Tested-By: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-03parisc: Switch from DISCONTIGMEM to SPARSEMEMHelge Deller
The commit 1c30844d2dfe ("mm: reclaim small amounts of memory when an external fragmentation event occurs") breaks memory management on a parisc c8000 workstation with this memory layout: 0) Start 0x0000000000000000 End 0x000000003fffffff Size 1024 MB 1) Start 0x0000000100000000 End 0x00000001bfdfffff Size 3070 MB 2) Start 0x0000004040000000 End 0x00000040ffffffff Size 3072 MB With the patch 1c30844d2dfe, the kernel will incorrectly reclaim the first zone when it fills up, ignoring the fact that there are two completely free zones. Basiscally, it limits cache size to 1GiB. The parisc kernel is currently using the DISCONTIGMEM implementation, but isn't NUMA. Avoid this issue or strange work-arounds by switching to the more commonly used SPARSEMEM implementation. Reported-by: Mikulas Patocka <mpatocka@redhat.com> Fixes: 1c30844d2dfe ("mm: reclaim small amounts of memory when an external fragmentation event occurs") Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-03parisc: enable wide mode earlySven Schnelle
The idle task might have been allocated above 4GB. With the current code we cannot access that memory because the CPU is still running in narrow mode. This was found on a J5000 machine and the patch is required to enable SPARSEMEM on that machine. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-03parisc: update feature listsSven Schnelle
Update lists to reflect that we have now KGDB and kretprobes support. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-03parisc: Show n/a if product number not availableHelge Deller
Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-03parisc: remove unused flags parameter in __patch_text()Sven Schnelle
It's not used by patch_map()/patch_unmap(), so lets remove it. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-03doc: update kprobes supported architecture listSven Schnelle
Now that kprobes and kretprobes are implemented, update the list in Documentation to reflect that. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-03parisc: Implement kretprobesSven Schnelle
Implement kretprobes on parisc, parts stolen from powerpc. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-03parisc: remove kprobes.h from generic-ySven Schnelle
We're providing our own version now. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-03parisc: Implement kprobesSven Schnelle
Implement kprobes support for PA-RISC. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-03parisc: add functions required by KPROBE_EVENTSSven Schnelle
implement regs_get_register(), regs_get_kernel_stack_nth() and regs_within_kernel_stack() Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-03parisc: PA-Linux requires at least 32 MB RAMHelge Deller
Even a 32-bit kernel requires at least 27 MB to decompress itself, so halt the system with a message if the system has less memory than 32 MB. Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-03parisc: Skip registering LED when running in QEMUHelge Deller
No need to spend CPU cycles when we run on QEMU. Signed-off-by: Helge Deller <deller@gmx.de> CC: stable@vger.kernel.org # v4.9+
2019-05-03parisc: Tune LASI LAN for QEMUHelge Deller
Do not loose cycles when we run on QEMU, and fix one trivial typo. Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-03parisc: Export running_on_qemu symbol for modulesHelge Deller
Signed-off-by: Helge Deller <deller@gmx.de> CC: stable@vger.kernel.org # v4.9+
2019-05-03parisc: add KGDB supportSven Schnelle
This patch add KGDB support to PA-RISC. It also implements single-stepping utilizing the recovery counter. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-03parisc: add parisc code patchingSven Schnelle
Instead of re-mapping the whole kernel text with RWX rights add a patch_text() which can be used to replace instructions in the kernel .text section. Based on the ARM implementation. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-03parisc: add set_fixmap()/clear_fixmap()Sven Schnelle
These functions will be used for adding code patching functions later. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-03parisc: Consider stack randomization for mmap base only when necessaryAlexandre Ghiti
Do not offset mmap base address because of stack randomization if current task does not want randomization. Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-03hwmon: (lm75) Add support for TMP75BIker Perez del Palomar Sustatxa
The TMP75B has a different control register, supports 12-bit resolution and the default conversion rate is 37 Hz. Signed-off-by: Iker Perez del Palomar Sustatxa <iker.perez@codethink.co.uk> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2019-05-03dt-bindings: hwmon: Add tmp75b to lm75.txtIker Perez del Palomar Sustatxa
Update the LM75's devicetree definition to allow Texas Instruments TMP75B be probed. Signed-off-by: Iker Perez del Palomar Sustatxa <iker.perez@codethink.co.uk> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2019-05-03livepatch: Remove duplicated code for early initializationPetr Mladek
kobject_init() call added one more operation that has to be done when doing the early initialization of both static and dynamic livepatch structures. It would have been easier when the early initialization code was not duplicated. Let's deduplicate it for future generations of livepatching hackers. The patch does not change the existing behavior. Signed-off-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Acked-by: Joe Lawrence <joe.lawrence@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2019-05-03livepatch: Remove custom kobject state handlingPetr Mladek
kobject_init() always succeeds and sets the reference count to 1. It allows to always free the structures via kobject_put() and the related release callback. Note that the custom kobject state handling was used only because we did not know that kobject_put() can and actually should get called even when kobject_init_and_add() fails. The patch should not change the existing behavior. Suggested-by: "Tobin C. Harding" <tobin@kernel.org> Signed-off-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Acked-by: Joe Lawrence <joe.lawrence@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2019-05-03Merge branch 'i2c/for-current-fixed' of ↵Linus Torvalds
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "I2C driver bugfixes and a MAINTAINERS update for you" * 'i2c/for-current-fixed' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: Prevent runtime suspend of adapter when Host Notify is required i2c: synquacer: fix enumeration of slave devices MAINTAINERS: friendly takeover of i2c-gpio driver i2c: designware: ratelimit 'transfer when suspended' errors i2c: imx: correct the method of getting private data in notifier_call
2019-05-03nohz_full: Allow the boot CPU to be nohz_fullNicholas Piggin
Allow the boot CPU/CPU0 to be nohz_full. Have the boot CPU take the do_timer duty during boot until a housekeeping CPU can take over. This is supported when CONFIG_PM_SLEEP_SMP is not configured, or when it is configured and the arch allows suspend on non-zero CPUs. nohz_full has been trialed at a large supercomputer site and found to significantly reduce jitter. In order to deploy it in production, they need CPU0 to be nohz_full because their job control system requires the application CPUs to start from 0, and the housekeeping CPUs are placed higher. An equivalent job scheduling that uses CPU0 for housekeeping could be achieved by modifying their system, but it is preferable if nohz_full can support their environment without modification. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lkml.kernel.org/r/20190411033448.20842-6-npiggin@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-05-03sched/isolation: Require a present CPU in housekeeping maskNicholas Piggin
During housekeeping mask setup, currently a possible CPU is required. That does not guarantee the CPU would be available at boot time, so check to ensure that at least one present CPU is in the mask. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lkml.kernel.org/r/20190411033448.20842-5-npiggin@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-05-03kernel/cpu: Allow non-zero CPU to be primary for suspend / kexec freezeNicholas Piggin
This patch provides an arch option, ARCH_SUSPEND_NONZERO_CPU, to opt-in to allowing suspend to occur on one of the housekeeping CPUs rather than hardcoded CPU0. This will allow CPU0 to be a nohz_full CPU with a later change. It may be possible for platforms with hardware/firmware restrictions on suspend/wake effectively support this by handing off the final stage to CPU0 when kernel housekeeping is no longer required. Another option is to make housekeeping / nohz_full mask dynamic at runtime, but the complexity could not be justified at this time. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lkml.kernel.org/r/20190411033448.20842-4-npiggin@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-05-03power/suspend: Add function to disable secondaries for suspendNicholas Piggin
This adds a function to disable secondary CPUs for suspend that are not necessarily non-zero / non-boot CPUs. Platforms will be able to use this to suspend using non-zero CPUs. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lkml.kernel.org/r/20190411033448.20842-3-npiggin@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-05-03intel_th: msu: Add current window trackingAlexander Shishkin
Now that we have a way to switch between MSC buffer windows, add code to track the current window. The hardware register NWSA that contains the address of the next window is unfortunately not always usable, and since the driver has full control of the window switching, there is no reason not to keep this on the software side. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-03intel_th: msu: Add a sysfs attribute to trigger window switchAlexander Shishkin
Now that we have the means to trigger a window switch for the MSU trace store, add a sysfs file to allow triggering it from userspace. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-03intel_th: msu: Correct the block wrap detectionAlexander Shishkin
In multi window mode the MSU will set "window wrap" bit to indicate block wrapping as well. Take this into account when checking data blocks. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-03intel_th: Add switch triggering supportAlexander Shishkin
Add support for asserting window switch trigger when tracing to MSU output ports. This allows for software controlled switching between windows of the MSU buffer, which can be used for double buffering while exporting the trace data further from the MSU. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-03intel_th: gth: Factor out trace start/stopAlexander Shishkin
The trace enable/disable functions of the GTH include the code that starts and stops trace flom from the sources. This start/stop functionality will also be used in the window switch trigger sequence. Factor out start/stop code from the larger trace enable/disable code in preparation for the window switch sequence. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-03intel_th: msu: Factor out pipeline drainingAlexander Shishkin
The code that waits for the pipeline empty condition of the MSU is currently called in the path that disables the trace. We will also need this in the window switch trigger sequence. Therefore, factor out this code and make it accessible to the GTH device. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-03intel_th: msu: Switch over to scatterlistAlexander Shishkin
Instead of using a home-grown array of pointers to the DMA pages, switch over to scatterlist data types and accessors, which has all the convenient accessors, can be used to batch-map DMA memory and is convenient for passing around between different layers, which will be useful when MSU buffer management has to cross the boundaries of the MSU driver. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-03intel_th: msu: Replace open-coded list_{first,last,next}_entry variantsAlexander Shishkin
There are a few places in the code where open-coded versions of list entry accessors list_first_entry()/list_last_entry()/list_next_entry() are used. Replace those with the standard macros. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-03intel_th: Only report useful IRQs to subdevicesAlexander Shishkin
The only type of IRQ triggering event that is useful to us at the moment is the "last block" interrupt of the MSU. This interrupt can only be enabled via "MINTCTL" register that doesn't exist in earlier version of the Intel TH. Enumerate the presence of MINTCTL via per-device driver data structure and only instantiate the IRQ resource for subdevices if this capability is present. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-03intel_th: msu: Start handling IRQsAlexander Shishkin
We intend to use the interrupt to detect Last Block condition in the MSU driver, which we can use for double-buffering software-managed data transfers. Add an interrupt handler to the MSU driver. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-03intel_th: pci: Use MSI interrupt signallingAlexander Shishkin
Since Intel TH is capable of MSI interrupt signalling, make use of it. The way it works is, each of the 7 interrupt triggering events has its own vector in this mode, as opposed to interrupt line delivery, where all events are signalled via the same line. Failing to enable MSI, the driver falls back to using an interrupt line. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>