2016-06-27  sched/cgroup: Fix cpu_cgroup_fork() handling  (Vincent Guittot)

A new fair task is detached and attached from/to task_group with:

    cgroup_post_fork()
        ss->fork(child) := cpu_cgroup_fork()
            sched_move_task()
                task_move_group_fair()

Which is wrong, because at this point in fork() the task isn't fully initialized and it cannot 'move' to another group, because it's not attached to any group as yet.

In fact, cpu_cgroup_fork() needs only a small part of sched_move_task(), so we can just call this small part directly instead of sched_move_task(). And the task doesn't really migrate because it is not yet attached, so we need the following sequence:

    do_fork()
        sched_fork()
            __set_task_cpu()
        cgroup_post_fork()
            set_task_rq()           # set task group and runqueue
        wake_up_new_task()
            select_task_rq()        # can select a new cpu
            __set_task_cpu()
            post_init_entity_util_avg()
                attach_task_cfs_rq()
            activate_task()
                enqueue_task()

This patch makes that happen.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> [ Added TASK_SET_GROUP to set depth properly. ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
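A minimal sketch of the "small part" idea (the helper name follows the changelog's TASK_SET_GROUP note; the body is an illustration, not the verbatim patch): only point the not-yet-attached task at its group and runqueue, with no detach/attach dance.

    /* Sketch: set group/runqueue and fix up entity depth; safe on a
     * task that is not attached to any group yet. */
    static void task_set_group_fair(struct task_struct *p)
    {
        struct sched_entity *se = &p->se;

        set_task_rq(p, task_cpu(p));
        se->depth = se->parent ? se->parent->depth + 1 : 0;
    }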
2016-06-27  sched/fair: Fix PELT integrity for new groups  (Peter Zijlstra)

Vincent reported that when a new task is moved into a new cgroup it gets attached twice to the load tracking:

    sched_move_task()
        task_move_group_fair()
            detach_task_cfs_rq()
            set_task_rq()
            attach_task_cfs_rq()
                attach_entity_load_avg()
                    se->avg.last_load_update = cfs_rq->avg.last_load_update; // == 0

    enqueue_entity()
        enqueue_entity_load_avg()
            update_cfs_rq_load_avg()
                now = clock()
                __update_load_avg(&cfs_rq->avg)
                    cfs_rq->avg.last_load_update = now
                    // ages load/util for: now - 0, load/util -> 0
            if (migrated)
                attach_entity_load_avg()
                    se->avg.last_load_update = cfs_rq->avg.last_load_update; // now != 0

The problem is that we don't update cfs_rq load_avg before all entity attach/detach operations. Only enqueue_task() and migrate_task() do this.

By fixing this, the above will not happen, because the sched_move_task() attach will have updated cfs_rq's last_load_update time before attach, and in turn the attach will have set the entity's last_load_update stamp.

Note that there is a further problem with sched_move_task() calling detach on a task that hasn't yet been attached; this will be taken care of in a subsequent patch.

Reported-by: Vincent Guittot <vincent.guittot@linaro.org> Tested-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yuyang Du <yuyang.du@intel.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
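The shape of the fix might look like this (a hedged sketch with a simplified update_cfs_rq_load_avg() signature; the real patch touches both the attach and detach paths):

    /* Sketch: age cfs_rq->avg (and its last_load_update stamp) before
     * attaching, so attach_entity_load_avg() copies a fresh stamp
     * instead of 0. */
    static void attach_task_cfs_rq(struct task_struct *p)
    {
        struct sched_entity *se = &p->se;
        struct cfs_rq *cfs_rq = cfs_rq_of(se);

        update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq);
        attach_entity_load_avg(cfs_rq, se);
    }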
2016-06-27  sched/fair: Fix and optimize the fork() path  (Peter Zijlstra)

The task_fork_fair() callback already calls __set_task_cpu() and takes rq->lock. If we move the sched_class::task_fork callback in sched_fork() under the existing p->pi_lock, right after its set_task_cpu() call, we can avoid doing two such calls and omit the IRQ disabling on the rq->lock.

Change to __set_task_cpu() to skip the migration bits; this is a new task, not a migration. Similarly, make wake_up_new_task() use __set_task_cpu() for the same reason: the task hasn't actually migrated as it hasn't ever run.

This cures the problem of calling migrate_task_rq_fair(), which does remove_entity_load_avg(), on tasks that have never been added to the load avg to begin with.

This bug would result in transiently messed-up load_avg values, averaged out after a few dozen milliseconds. This is probably the reason why this bug was not found for such a long time.

Reported-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
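The reworked ordering in sched_fork() might look roughly like this (a sketch under the changelog's description, not the verbatim diff):

    /* Sketch: do the cpu assignment and the task_fork callback in one
     * p->pi_lock section. __set_task_cpu() skips the migration
     * callbacks, which is correct for a task that has never run. */
    raw_spin_lock_irqsave(&p->pi_lock, flags);
    __set_task_cpu(p, smp_processor_id());
    if (p->sched_class->task_fork)
        p->sched_class->task_fork(p);
    raw_spin_unlock_irqrestore(&p->pi_lock, flags);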
2016-06-27  locking/qspinlock: Use __this_cpu_dec() instead of full-blown this_cpu_dec()  (Pan Xinhui)
queued_spin_lock_slowpath() should not worry about another queued_spin_lock_slowpath() running in interrupt context and changing node->count by accident, because node->count keeps the same value every time we enter/leave queued_spin_lock_slowpath(). On some architectures this_cpu_dec() will save/restore irq flags, which has high overhead. Use the much cheaper __this_cpu_dec() instead. Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Waiman.Long@hpe.com Link: http://lkml.kernel.org/r/1465886247-3773-1-git-send-email-xinhui.pan@linux.vnet.ibm.com [ Rewrote changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
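The change itself is a one-liner; a sketch of the pattern (the per-cpu variable name is taken from the qspinlock code, but treat it as illustrative):

    /* node->count is only modified in balanced enter/leave pairs of the
     * slowpath on this cpu, so an interrupting slowpath always restores
     * it. The irq-safe this_cpu_dec(), with its flags save/restore on
     * some architectures, is therefore unnecessary. */
    __this_cpu_dec(mcs_nodes[0].count);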
2016-06-27  Merge branch 'sched/urgent' into sched/core, to pick up fixes  (Ingo Molnar)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-27  perf/x86/intel: Add {rd,wr}lbr_{to,from} wrappers  (Peter Zijlstra)

The whole rdmsr()/wrmsr() handling for lbr_from got a little unwieldy with the sign-extension quirk; provide a few simple wrappers to clean things up.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org>
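The wrappers plausibly look like this (a sketch; the quirk helpers come from the earlier patches in this series):

    static inline u64 rdlbr_from(unsigned int idx)
    {
        u64 val;

        rdmsrl(x86_pmu.lbr_from + idx, val);
        return lbr_from_signext_quirk_rd(val);
    }

    static inline void wrlbr_from(unsigned int idx, u64 val)
    {
        wrmsrl(x86_pmu.lbr_from + idx, lbr_from_signext_quirk_wr(val));
    }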
2016-06-27  perf/x86/intel: Add MSR_LAST_BRANCH_FROM_x quirk for ctx switch  (David Carrillo-Cisneros)

Add quirk for context switch to save/restore the value of MSR_LAST_BRANCH_FROM_x when LBR is enabled and there is potential for kernel addresses to be in the lbr_from register.

To test this patch, use a perf tool and kernel with the patch next in this series. That patch removes the work-around that masked the hw bug:

    $ ./lbr_perf record --call-graph lbr -e cycles:k sleep 1

where lbr_perf is the patched perf tool, that allows to specify :k on lbr mode. The above command will trigger a #GPF:

    WARNING: CPU: 28 PID: 14096 at arch/x86/mm/extable.c:65 ex_handler_wrmsr_unsafe+0x70/0x80
    unchecked MSR access error: WRMSR to 0x681 (tried to write 0x1fffffff81010794)
    ...
    Call Trace:
     [<ffffffff8167af49>] dump_stack+0x4d/0x63
     [<ffffffff810b9b15>] __warn+0xe5/0x100
     [<ffffffff810b9be9>] warn_slowpath_fmt+0x49/0x50
     [<ffffffff810abb40>] ex_handler_wrmsr_unsafe+0x70/0x80
     [<ffffffff810abc42>] fixup_exception+0x42/0x50
     [<ffffffff81079d1a>] do_general_protection+0x8a/0x160
     [<ffffffff81684ec2>] general_protection+0x22/0x30
     [<ffffffff810101b9>] ? intel_pmu_lbr_sched_task+0xc9/0x380
     [<ffffffff81009d7c>] intel_pmu_sched_task+0x3c/0x60
     [<ffffffff81003a2b>] x86_pmu_sched_task+0x1b/0x20
     [<ffffffff81192a5b>] perf_pmu_sched_task+0x6b/0xb0
     [<ffffffff8119746d>] __perf_event_task_sched_in+0x7d/0x150
     [<ffffffff810dd9dc>] finish_task_switch+0x15c/0x200
     [<ffffffff8167f894>] __schedule+0x274/0x6cc
     [<ffffffff8167fdd9>] schedule+0x39/0x90
     [<ffffffff81675398>] exit_to_usermode_loop+0x39/0x89
     [<ffffffff810028ce>] prepare_exit_to_usermode+0x2e/0x30
     [<ffffffff81683c1b>] retint_user+0x8/0x10

Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Stephane Eranian <eranian@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1466533874-52003-5-git-send-email-davidcc@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-27  perf/x86/intel: Fix trivial formatting and style bug  (David Carrillo-Cisneros)

Replace spaces with tabs in the LBR_FROM_* constants to align with a newly defined constant. Use BIT_ULL().

Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Stephane Eranian <eranian@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1466533874-52003-4-git-send-email-davidcc@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-27  perf/x86/intel: Fix MSR_LAST_BRANCH_FROM_x bug when no TSX  (David Carrillo-Cisneros)

Intel's SDM states that bits 61:62 in MSR_LAST_BRANCH_FROM_x are the TSX flags for formats with LBR_TSX flags (i.e. LBR_FORMAT_EIP_EFLAGS2). However, when the CPU has TSX support deactivated, bits 61:62 actually behave as follows:

  - For wrmsr(), bits 61:62 are considered part of the sign extension.
  - When capturing branches, the LBR hw will always clear bits 61:62, regardless of the sign extension.

Therefore, if:

  1) LBR has TSX format.
  2) CPU has no TSX support enabled.

... then any value passed to wrmsr() must be sign extended to 63 bits and any value from rdmsr() must be converted to have a sign extension of 61 bits, ignoring the values at the TSX flags.

This bug was masked by the work-around to the Intel CPU bug BJ94, "LBR May Contain Incorrect Information When Using FREEZE_LBRS_ON_PMI", in Document Number: 324643-037US. The aforementioned work-around uses hw flags to filter out all kernel branches, limiting the LBR callstack to user-level execution only. Since user addresses are not sign extended, they do not trigger the wrmsr() bug in MSR_LAST_BRANCH_FROM_x when saved/restored at context switch.

To verify the hw bug:

    $ perf record -b -e cycles sleep 1
    $ rdmsr -p 0 0x680
    0x1fffffffb0b9b0cc
    $ wrmsr -p 0 0x680 0x1fffffffb0b9b0cc
    write(): Input/output error

The quirk for LBR_FROM_ MSRs is required before calls to wrmsrl() and after rdmsrl(). This patch introduces it for the wrmsrl()'s done when testing LBR support. A future patch in this series adds the quirk for context switch, which would be required if the LBR callstack is to be enabled for ring 0.

Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Stephane Eranian <eranian@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1466533874-52003-3-git-send-email-davidcc@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
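The conversions described above amount to simple bit manipulation. A standalone illustration (helper names and exact bit choices here are assumptions for clarity, not the kernel's code):

    #include <stdint.h>

    /* Bits 61:62 hold the TSX flags in the LBR_FORMAT_EIP_EFLAGS2 layout. */
    #define LBR_TSX_BITS (3ULL << 61)

    /* Before wrmsr(): extend the sign (bit 60 under the 61-bit scheme)
     * into bits 61:62, as the no-TSX hardware expects. */
    uint64_t quirk_wr(uint64_t val)
    {
        return (val & ~LBR_TSX_BITS) |
               (((val >> 60) & 1) ? LBR_TSX_BITS : 0);
    }

    /* After rdmsr(): TSX is off, so the flags must read as zero,
     * leaving a value sign-extended to 61 bits. */
    uint64_t quirk_rd(uint64_t val)
    {
        return val & ~LBR_TSX_BITS;
    }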
2016-06-27  perf/x86/intel: Print LBR support statement after validation  (David Carrillo-Cisneros)
The following commit: 338b522ca43c ("perf/x86/intel: Protect LBR and extra_regs against KVM lying") added an additional test to LBR support detection that is performed after printing the LBR support statement to dmesg. Move the LBR support output after the very last test, to make sure we print the true status of LBR support. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Stephane Eranian <eranian@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1466533874-52003-2-git-send-email-davidcc@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-27  Merge tag 'v4.7-rc5' into perf/core, to pick up fixes  (Ingo Molnar)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-27  sched/fair: Fix calc_cfs_shares() fixed point arithmetics width confusion  (Peter Zijlstra)

Commit:

    fde7d22e01aa ("sched/fair: Fix overly small weight for interactive group entities")

did something non-obvious, and did it in a way that was buggy yet latent. The problem was exposed for real by a later commit in the v4.7 merge window:

    2159197d6677 ("sched/core: Enable increased load resolution on 64-bit kernels")

... after which tg->load_avg and cfs_rq->load.weight had different units (10-bit fixed point and 20-bit fixed point respectively).

Add a comment to explain the use of cfs_rq->load.weight over the 'natural' cfs_rq->avg.load_avg, and add scale_load_down() to correct for the difference in units. Since this is (now, as per a previous commit) the only user of calc_tg_weight(), collapse it.

The effect of this bug should be randomly inconsistent SMP balancing of cgroup workloads.

Reported-by: Jirka Hladky <jhladky@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 2159197d6677 ("sched/core: Enable increased load resolution on 64-bit kernels") Fixes: fde7d22e01aa ("sched/fair: Fix overly small weight for interactive group entities") Signed-off-by: Ingo Molnar <mingo@kernel.org>
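In code terms the unit fix is a single conversion (a sketch; scale_load_down() is the existing helper that drops the extra 10 bits of load resolution):

    /* tg->load_avg is 10-bit fixed point while cfs_rq->load.weight is
     * 20-bit fixed point on 64-bit kernels; scale the weight down
     * before mixing the two units. */
    load = scale_load_down(cfs_rq->load.weight);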
2016-06-27  sched/fair: Fix effective_load() to consistently use smoothed load  (Peter Zijlstra)

Starting with the following commit:

    fde7d22e01aa ("sched/fair: Fix overly small weight for interactive group entities")

calc_tg_weight() doesn't compute the right value as expected by effective_load().

The difference is in the 'correction' term. In order to ensure \Sum rw_j >= rw_i we cannot use tg->load_avg directly, since that might be lagging a correction on the current cfs_rq->avg.load_avg value. Therefore we use:

    tg->load_avg - cfs_rq->tg_load_avg_contrib + cfs_rq->avg.load_avg

Now, per the referenced commit, calc_tg_weight() doesn't use cfs_rq->avg.load_avg, as is later used in @w, but uses cfs_rq->load.weight instead. So stop using calc_tg_weight() and do it explicitly.

The effects of this bug are wake_affine() making randomly poor choices in cgroup-intense workloads.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> # v4.3+ Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: fde7d22e01aa ("sched/fair: Fix overly small weight for interactive group entities") Signed-off-by: Ingo Molnar <mingo@kernel.org>
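Done explicitly, the weight computation reads roughly as follows (a sketch following the changelog's formula):

    /* W = @wg + \Sum rw_j */
    W = wg + atomic_long_read(&tg->load_avg);

    /* Correct the lagging tg->load_avg with this cfs_rq's current
     * contribution, ensuring \Sum rw_j >= rw_i. */
    W -= cfs_rq->tg_load_avg_contrib;
    W += w;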
2016-06-27  hwrng: omap - Fix assumption that runtime_get_sync will always succeed  (Nishanth Menon)

pm_runtime_get_sync() does return an error value that must be checked for error conditions, else, for various reasons, the device may not be enabled and the system will crash due to lack of clock to the hardware module.

Before:

    [   12.562784] [00000000] *pgd=fe193835
    [   12.562792] Internal error: : 1406 [#1] SMP ARM
    [...]
    [   12.562864] CPU: 1 PID: 241 Comm: modprobe Not tainted 4.7.0-rc4-next-20160624 #2
    [   12.562867] Hardware name: Generic DRA74X (Flattened Device Tree)
    [   12.562872] task: ed51f140 ti: ed44c000 task.ti: ed44c000
    [   12.562886] PC is at omap4_rng_init+0x20/0x84 [omap_rng]
    [   12.562899] LR is at set_current_rng+0xc0/0x154 [rng_core]

After the proper checks:

    [   94.366705] omap_rng 48090000.rng: _od_fail_runtime_resume: FIXME: missing hwmod/omap_dev info
    [   94.375767] omap_rng 48090000.rng: Failed to runtime_get device -19
    [   94.382351] omap_rng 48090000.rng: initialization failed.

Fixes: 665d92fa85b5 ("hwrng: OMAP: convert to use runtime PM") Cc: Paul Walmsley <paul@pwsan.com> Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
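The standard idiom for such a check looks like this (a sketch of the pattern, not the exact diff; the error message mirrors the "After" log above):

    ret = pm_runtime_get_sync(dev);
    if (ret < 0) {
        dev_err(dev, "Failed to runtime_get device %d\n", ret);
        /* balance the usage count even though resume failed */
        pm_runtime_put_noidle(dev);
        return ret;
    }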
2016-06-27  MAINTAINERS: update maintainer for qat  (Tadeusz Struk)
Add Giovanni and Salvatore who will take over the qat maintenance. Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-27  crypto: sha1-mb - rename sha-mb to sha1-mb  (Megha Dey)

Until now, there was only support for the SHA1 multi-buffer algorithm; hence, there was just one sha-mb folder. Now, with the introduction of the SHA256 multi-buffer algorithm, it is logical to rename the existing folder to sha1-mb.

Signed-off-by: Megha Dey <megha.dey@linux.intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-27  crypto: tcrypt - Add speed tests for SHA multibuffer algorithms  (Megha Dey)

The existing test suite that measures the speed of the SHA algorithms assumes serial (single-buffer) computation of data. With the SHA multi-buffer algorithms, we work on 8 lanes of data in parallel; hence the need to introduce a new test suite to measure the speed of these algorithms.

Signed-off-by: Megha Dey <megha.dey@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-27  crypto: sha256-mb - Crypto computation (x8 AVX2)  (Megha Dey)

This patch introduces the assembly routines to do SHA256 computation on buffers belonging to several jobs at once. The routines are optimized with AVX2 instructions, which provide 8 data lanes, and use AVX2 registers.

Signed-off-by: Megha Dey <megha.dey@linux.intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-27  crypto: sha256-mb - Algorithm data structures  (Megha Dey)
This patch introduces the data structures and prototypes of functions needed for computing SHA256 hash using multi-buffer. Included are the structures of the multi-buffer SHA256 job, job scheduler in C and x86 assembly. Signed-off-by: Megha Dey <megha.dey@linux.intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-27  crypto: sha256-mb - submit/flush routines for AVX2  (Megha Dey)
This patch introduces the routines used to submit and flush buffers belonging to SHA256 crypto jobs to the SHA256 multibuffer algorithm. It is implemented mostly in assembly optimized with AVX2 instructions. Signed-off-by: Megha Dey <megha.dey@linux.intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-27  crypto: sha256-mb - Enable multibuffer support  (Megha Dey)
Add the config CRYPTO_SHA256_MB which will enable the computation using the SHA256 multi-buffer algorithm. Signed-off-by: Megha Dey <megha.dey@linux.intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-27  crypto: sha256-mb - SHA256 multibuffer job manager and glue code  (Megha Dey)

This patch introduces the multi-buffer job manager, which is responsible for submitting scatter-gather buffers from several SHA256 jobs to the multi-buffer algorithm. It also contains the flush routine that's called by the crypto daemon to complete the job when no new jobs arrive before the deadline of maximum latency of a SHA256 crypto job.

The SHA256 multi-buffer crypto algorithm is defined and initialized in this patch.

Signed-off-by: Megha Dey <megha.dey@linux.intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-27  Documentation: devicetree: bindings: Add BCM5301x binding  (Florian Fainelli)
Document the binding used by the Broadcom BCM5301x (Northstar) SoC random number generator. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-27  net: smsc911x: Fix bug where PHY interrupts are overwritten by 0  (Jeremy Linton)

By default, mdiobus_alloc() sets the PHYs to polling mode, but a pointer-sized memcpy means that a couple of IRQ entries end up being overwritten with a value of 0. This means that PHY_POLL is disabled, and it results in unpredictable behavior depending on the PHY's location on the MDIO bus. Remove that memcpy and the now-unused phy_irq member to force the SMSC911x PHYs into polling mode 100% of the time.

Fixes: e7f4dc3536a4 ("mdio: Move allocation of interrupts into core") Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
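The root cause is a size mismatch in the removed copy. A sketch of the buggy pattern (field names are illustrative, based on the changelog's description):

    /* Buggy: after the mdio core change, mii_bus->irq is a pointer
     * (the table now lives in the core), so sizeof() here yields
     * pointer-size bytes -- zeroing two int-sized irq slots that the
     * core had initialized to PHY_POLL (-1). */
    memcpy(pdata->mii_bus->irq, pdata->phy_irq,
           sizeof(pdata->mii_bus->irq));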
2016-06-26  Linux 4.7-rc5 (tag: v4.7-rc5)  (Linus Torvalds)
2016-06-26  staging: fsl-mc: convert mc command build/parse to use C structs  (Ioana Radulescu)

The layer abstracting the building of commands and extraction of responses is currently based on macros that shift and mask the command fields; it requires exposing offset/size values as macro parameters and makes the code harder to read.

For clarity and maintainability, instead use an implementation based on mapping the MC command definitions to C structures. These structures contain the hardware command fields (which are naturally aligned), and individual fields use little-endian ordering (the byte ordering of the hardware). As such, there is no need to perform the conversion between core and hardware (LE) endianness in mc_send_command(); instead, each individual field in a command is converted separately, if needed, by the function building the command or extracting the response.

This patch does not introduce functional changes; both the hardware ABIs and the APIs exposed for the DPAA2 objects remain the same.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: Stuart Yoder <stuart.yoder@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
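A hedged sketch of the pattern (the struct and field names are illustrative, not the actual fsl-mc definitions):

    /* Command payload mapped as a naturally-aligned, little-endian struct. */
    struct dprc_cmd_open {
        __le32 container_id;
    };

    /* Per-field endian conversion happens where the command is built,
     * not in mc_send_command(). */
    static void dprc_build_open_cmd(struct mc_command *cmd, u32 container_id)
    {
        struct dprc_cmd_open *cmd_params = (struct dprc_cmd_open *)cmd->params;

        cmd_params->container_id = cpu_to_le32(container_id);
    }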
2016-06-26  staging: fsl-mc: properly set hwirq in msi set_desc  (Stuart Yoder)
For an MSI domain the hwirq is an arbitrary but unique id to identify an interrupt. Previously the hwirq was set to the MSI index of the interrupt, but that only works if there is one DPRC. Additional DPRCs require an expanded namespace. Use both the ICID (which is unique per DPRC) and the MSI index to compose a hwirq value. Signed-off-by: Stuart Yoder <stuart.yoder@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
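A sketch of the composition (the function name follows the fsl-mc MSI code, but the exact encoding shown is an assumption):

    /* Compose a hwirq that is unique across DPRCs: the ICID
     * distinguishes the DPRC, the MSI index distinguishes the
     * interrupt within it. */
    static irq_hw_number_t fsl_mc_domain_calc_hwirq(struct fsl_mc_device *dev,
                                                    struct msi_desc *desc)
    {
        return ((irq_hw_number_t)dev->icid << 16) | desc->fsl_mc.msi_index;
    }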
2016-06-26  staging: fsl-mc: dprc: fix ordering problem freeing resources in remove of dprc  (Stuart Yoder)

When unbinding a dprc from the dprc driver, the cleanup of the resource pools must happen after the irq pool cleanup is done.

Signed-off-by: Stuart Yoder <stuart.yoder@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-26  staging: fsl-mc: dprc: add missing irq free  (Stuart Yoder)

Add the missing free of the Linux IRQ when tearing down interrupts.

Signed-off-by: Stuart Yoder <stuart.yoder@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-26  staging: fsl-mc: fix asymmetry in destroy of mc_io  (Bharat Bhushan)

An mc_io represents a mapped MC portal. Previously, an mc_io was created for the root dprc in fsl_mc_bus_probe() and for child dprcs in dprc_probe(), but the free of that data structure happened in the general bus remove callback. This asymmetry resulted in some bugs due to unwanted destruction of the mc_io object in some scenarios (e.g. vfio).

Fix this bug by making things symmetric: the mc_io created in fsl_mc_bus_probe() is freed in fsl_mc_bus_remove(), and the mc_io created in dprc_probe() is freed in dprc_remove().

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@nxp.com> [Stuart: added check for root dprc and reworded commit message] Signed-off-by: Stuart Yoder <stuart.yoder@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-26  staging: fsl-mc: make fsl_mc_is_root_dprc() global  (Stuart Yoder)

Make fsl_mc_is_root_dprc() global so that the dprc driver can use it.

Signed-off-by: Stuart Yoder <stuart.yoder@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-26  staging: fsl-mc: export mc_get_version  (Stuart Yoder)

Some drivers (built as modules) rely on mc_get_version(), so export it.

Signed-off-by: Stuart Yoder <stuart.yoder@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-26  staging: fsl-mc: add support for device table matching  (Stuart Yoder)
Move the definition of fsl_mc_device_id to its proper location in mod_devicetable.h, and add fsl-mc bus support to devicetable-offsets.c and file2alias.c to enable device table matching. With this patch udev based module loading of fsl-mc drivers is supported. Signed-off-by: Stuart Yoder <stuart.yoder@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
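The id struct that lands in mod_devicetable.h is small; a sketch of it and of a driver match table (the vendor constant shown is an assumption):

    struct fsl_mc_device_id {
        __u16 vendor;
        const char obj_type[16];
    };

    /* A driver then matches on entries like: */
    static const struct fsl_mc_device_id dprc_match_id_table[] = {
        { .vendor = 0x1957 /* assumed FSL_MC_VENDOR_FREESCALE */,
          .obj_type = "dprc" },
        { /* sentinel */ }
    };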
2016-06-26  staging: fsl-mc: clean up the device id struct  (Stuart Yoder)

- rename the struct used for fsl-mc device ids to be more consistent with other busses
- remove the now obsolete and unused version fields

Signed-off-by: Stuart Yoder <stuart.yoder@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-26  staging: fsl-mc: implement uevent callback and set the modalias  (Stuart Yoder)
Replace placeholder code in the uevent callback to properly set the MODALIAS env variable. Signed-off-by: Stuart Yoder <stuart.yoder@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-26  staging: fsl-mc: add support for the modalias sysfs attribute  (Stuart Yoder)
In order to support uevent based module loading implement modalias support for the fsl-mc bus driver. Aliases are based on vendor and object/device id and are of the form "fsl-mc:vNdN". Signed-off-by: Stuart Yoder <stuart.yoder@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
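A sketch of the resulting attribute, following the "fsl-mc:vNdN" format from the changelog (treat the exact field names as assumptions):

    static ssize_t modalias_show(struct device *dev,
                                 struct device_attribute *attr, char *buf)
    {
        struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev);

        /* e.g. "fsl-mc:v00001957ddprc" */
        return sprintf(buf, "fsl-mc:v%08Xd%s\n",
                       mc_dev->obj_desc.vendor, mc_dev->obj_desc.type);
    }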
2016-06-26  usb: renesas_usbhs: make usbhs_write32() static  (Ben Dooks)

The usbhs_write32() function is not used outside of the rcar3.c file, so fix the following sparse warning by making it static:

    drivers/usb/renesas_usbhs/rcar3.c:26:6: warning: symbol 'usbhs_write32' was not declared. Should it be static?

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-26  usb: early/ehci-dbgp: make it explicitly non-modular  (Paul Gortmaker)

The Kconfig currently controlling compilation of this code is:

    arch/x86/Kconfig.debug:config EARLY_PRINTK_DBGP
    arch/x86/Kconfig.debug:        bool "Early printk via EHCI debug port"

... meaning that it currently is not being built as a module by anyone. Let's remove the couple of traces of modularity, so that when reading the driver there is no doubt it is builtin-only.

Since module_init() translates to device_initcall() in the non-modular case, the init ordering remains unchanged with this commit.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linux-usb@vger.kernel.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-26  xhci: get rid of platform data  (Heikki Krogerus)
No more users for it. Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-26  usb: dwc3: host: use built-in property instead of platform data  (Heikki Krogerus)

This should allow xhci to remove its handling of platform data.

Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-26  xhci: plat: adapt to unified device property interface  (Heikki Krogerus)

Request the only property that the driver needs using the unified device property interface, so it will be available for all types of platforms, not just the ones using DT.

Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
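With the unified API the lookup works regardless of firmware type (a sketch; "usb3-lpm-capable" is the property xhci-plat reads from DT):

    /* device_property_read_bool() covers DT, ACPI and built-in
     * properties alike, unlike of_property_read_bool(). */
    if (device_property_read_bool(&pdev->dev, "usb3-lpm-capable"))
        xhci->quirks |= XHCI_LPM_SUPPORT;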
2016-06-26  xhci: rename and simplify last_trb_on_last_seg() helper  (Mathias Nyman)

It's only used with rings that have link TRBs.

Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-26  xhci: remove enqueue_is_link() helper  (Mathias Nyman)

It's only used in one place; replace it with the trb_is_link() helper.

Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-26  xhci: rework inc_deq() and fix off by one error  (Mathias Nyman)

inc_deq() is called both for rings with link TRBs and for the event ring without link TRBs. The last_trb() check in inc_deq() has an off-by-one error, going beyond the allocated array when checking the trb against &seg->trbs[TRBS_PER_SEGMENT], and the whole of inc_deq() depends on this.

Rewrite the inc_deq() function, remove the faulty last_trb() helper, and add new last_trb_on_seg() and last_trb_on_ring() helpers.

Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-26  xhci: use and add separate function for checking for link trbs  (Mathias Nyman)

Add a new is_link_trb() function that only checks for link TRBs. We want to split the generic last_trb() function, which is used both for event rings without link TRBs and for endpoint and command rings with links. This will allow us to more easily check for link TRBs added mid-segment.

Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
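Sketches of the helpers these last few entries introduce (per the changelogs; exact bodies are assumptions):

    /* Is this TRB a link TRB? Valid on any ring. */
    static bool trb_is_link(union xhci_trb *trb)
    {
        return TRB_TYPE_LINK_LE32(trb->link.control);
    }

    /* Last TRB of a segment -- note the -1; the old check was off by one. */
    static bool last_trb_on_seg(struct xhci_segment *seg, union xhci_trb *trb)
    {
        return trb == &seg->trbs[TRBS_PER_SEGMENT - 1];
    }

    static bool last_trb_on_ring(struct xhci_ring *ring,
                                 struct xhci_segment *seg, union xhci_trb *trb)
    {
        return last_trb_on_seg(seg, trb) && (seg->next == ring->first_seg);
    }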
2016-06-26  xhci: clean up event ring checks from inc_enq()  (Mathias Nyman)

Remove the event-ring-related checks in inc_enq(). Host hardware is the producer of events on the event ring; the driver will not queue anything, or call inc_enq(), for the event ring.

Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-26  xhci: TD-fragment, align the unsplittable case with a bounce buffer  (Mathias Nyman)

If the last TRB before a link is not packet-size aligned, and is not splittable, then use a bounce buffer for that chunk of max-packet-size unalignable data.

Allocate a max-packet-size bounce buffer for every segment of a bulk endpoint ring at the same time as allocating the ring. If we need to align the data before the link TRB in that segment, then copy the data to the segment's bounce buffer, DMA map it, and enqueue it. Once the TD finishes, or is cancelled, unmap it.

For IN transfers we need to first map the bounce buffer, then queue it; after it finishes, copy the bounce buffer to the original sg list, and finally unmap it.

Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
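For OUT transfers the bounce path might look like this (a hedged sketch of the sequence the changelog describes; variable names are illustrative):

    /* Gather the unalignable tail of the sg list into the segment's
     * bounce buffer, then map and enqueue that instead. */
    sg_pcopy_to_buffer(urb->sg, urb->num_sgs, seg->bounce_buf,
                       new_buff_len, enqd_len);
    seg->bounce_dma = dma_map_single(dev, seg->bounce_buf,
                                     max_packet, DMA_TO_DEVICE);
    /* On TD completion (or cancel), dma_unmap_single() the buffer;
     * for IN transfers, copy it back to the sg list first. */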
2016-06-26  xhci: align the last trb before link if it is easily splittable  (Mathias Nyman)

The TD fragments section 4.11.7.1 of the xHCI specs places additional requirements on how TRBs in TDs must be organized: TD fragments shall not span transfer ring segments, and TD fragments must be packet aligned.

Normally we don't care about TD fragments -- one TD is one big fragment -- but if a TD spans ring segments it will be treated as two fragments, and we need to comply with the alignment requirements. For us this means that the payload data must be packet aligned in the last TRB before a link TRB.

In most mass storage bulk transfers we are lucky, as the block size aligns nicely with the packet size and there are no issues. However, USB network adapters using scatterlists can hit this alignment issue, and usbtest in the kernel triggers it in minutes.

This patch is a partial solution: it solves the easy case, when the last TRB before the link TRB contains a packet boundary. If that is the case, then just split the TRB at the boundary. If not, then just print a debug message and continue as we have always done, hoping for the best.

Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
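The easy case reduces to trimming the TRB length back to a packet boundary (a sketch of the arithmetic, not the verbatim diff; enqd_len, trb_buff_len and max_packet are assumed names):

    /* enqd_len bytes of the TD are already queued; shorten this TRB so
     * the running total ends on a max-packet boundary before the link
     * TRB. The guard keeps the TRB from shrinking to zero. */
    if (trb_buff_len > (enqd_len + trb_buff_len) % max_packet)
        trb_buff_len -= (enqd_len + trb_buff_len) % max_packet;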
2016-06-26  xhci: don't rely on precalculated value of needed trbs in the enqueue loop  (Mathias Nyman)

Queue TRBs until all payload data in the URB is transferred. The actual number of TRBs might need to change from the pre-calculated number when the packet alignment restrictions for TD fragments in xHCI 4.11.7.1 are taken into account.

The long-term plan is to get rid of calculating the needed TRBs in advance altogether; it's an unnecessary extra walk through the scatterlist. This change also allows some bulk queue function simplifications.

Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-26  xhci: use boolean to indicate last trb in td remainder calculation  (Mathias Nyman)

We only need to know whether we are queuing the last TRB for a TD when calculating the TD remainder field; the total number of TRBs left is not used. We won't be able to trust the pre-calculated number of TRBs if we need to align TRB data by splitting or merging TRBs in order to comply with the data alignment requirements in xHCI specs section 4.11.7.1.

Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>