Age | Commit message (Collapse) | Author |
|
Several subsystems in the kernel (task scheduler and/or thermal at the
time of writing) can benefit from knowing about the energy consumed by
CPUs. Yet, this information can come from different sources (DT or
firmware for example), in different formats, hence making it hard to
exploit without a standard API.
As an attempt to address this, introduce a centralized Energy Model
(EM) management framework which aggregates the power values provided
by drivers into a table for each performance domain in the system. The
power cost tables are made available to interested clients (e.g. task
scheduler or thermal) via platform-agnostic APIs. The overall design
is represented by the diagram below (focused on Arm-related drivers as
an example, but applicable to any architecture):
+---------------+ +-----------------+ +-------------+
| Thermal (IPA) | | Scheduler (EAS) | | Other |
+---------------+ +-----------------+ +-------------+
| | em_pd_energy() |
| | em_cpu_get() |
+-----------+ | +--------+
| | |
v v v
+---------------------+
| |
| Energy Model |
| |
| Framework |
| |
+---------------------+
^ ^ ^
| | | em_register_perf_domain()
+----------+ | +---------+
| | |
+---------------+ +---------------+ +--------------+
| cpufreq-dt | | arm_scmi | | Other |
+---------------+ +---------------+ +--------------+
^ ^ ^
| | |
+--------------+ +---------------+ +--------------+
| Device Tree | | Firmware | | ? |
+--------------+ +---------------+ +--------------+
Drivers (typically, but not limited to, CPUFreq drivers) can register
data in the EM framework using the em_register_perf_domain() API. The
calling driver must provide a callback function with a standardized
signature that will be used by the EM framework to build the power
cost tables of the performance domain. This design should offer a lot of
flexibility to calling drivers which are free of reading information
from any location and to use any technique to compute power costs.
Moreover, the capacity states registered by drivers in the EM framework
are not required to match real performance states of the target. This
is particularly important on targets where the performance states are
not known by the OS.
The power cost coefficients managed by the EM framework are specified in
milli-watts. Although the two potential users of those coefficients (IPA
and EAS) only need relative correctness, IPA specifically needs to
compare the power of CPUs with the power of other components (GPUs, for
example), which are still expressed in absolute terms in their
respective subsystems. Hence, specifying the power of CPUs in
milli-watts should help transitioning IPA to using the EM framework
without introducing new problems by keeping units comparable across
sub-systems.
On the longer term, the EM of other devices than CPUs could also be
managed by the EM framework, which would enable to remove the absolute
unit. However, this is not absolutely required as a first step, so this
extension of the EM framework is left for later.
On the client side, the EM framework offers APIs to access the power
cost tables of a CPU (em_cpu_get()), and to estimate the energy
consumed by the CPUs of a performance domain (em_pd_energy()). Clients
such as the task scheduler can then use these APIs to access the shared
data structures holding the Energy Model of CPUs.
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: adharmap@codeaurora.org
Cc: chris.redpath@arm.com
Cc: currojerez@riseup.net
Cc: dietmar.eggemann@arm.com
Cc: edubezval@gmail.com
Cc: gregkh@linuxfoundation.org
Cc: javi.merino@kernel.org
Cc: joel@joelfernandes.org
Cc: juri.lelli@redhat.com
Cc: morten.rasmussen@arm.com
Cc: patrick.bellasi@arm.com
Cc: pkondeti@codeaurora.org
Cc: skannan@codeaurora.org
Cc: smuckle@google.com
Cc: srinivas.pandruvada@linux.intel.com
Cc: thara.gopinath@linaro.org
Cc: tkjos@google.com
Cc: valentin.schneider@arm.com
Cc: vincent.guittot@linaro.org
Cc: viresh.kumar@linaro.org
Link: https://lkml.kernel.org/r/20181203095628.11858-4-quentin.perret@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Schedutil requests frequency by aggregating utilization signals from
the scheduler (CFS, RT, DL, IRQ) and applying a 25% margin on top of
them. Since Energy Aware Scheduling (EAS) needs to be able to predict
the frequency requests, it needs to forecast the decisions made by the
governor.
In order to prepare the introduction of EAS, introduce
schedutil_freq_util() to centralize the aforementioned signal
aggregation and make it available to both schedutil and EAS. Since
frequency selection and energy estimation still need to deal with RT and
DL signals slightly differently, schedutil_freq_util() is called with a
different 'type' parameter in those two contexts, and returns an
aggregated utilization signal accordingly. While at it, introduce the
map_util_freq() function which is designed to make schedutil's 25%
margin usable easily for both sugov and EAS.
As EAS will be able to predict schedutil's frequency requests more
accurately than any other governor by design, it'd be sensible to make
sure EAS cannot be used without schedutil. This will be done later, once
EAS has actually been introduced.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: adharmap@codeaurora.org
Cc: chris.redpath@arm.com
Cc: currojerez@riseup.net
Cc: dietmar.eggemann@arm.com
Cc: edubezval@gmail.com
Cc: gregkh@linuxfoundation.org
Cc: javi.merino@kernel.org
Cc: joel@joelfernandes.org
Cc: juri.lelli@redhat.com
Cc: morten.rasmussen@arm.com
Cc: patrick.bellasi@arm.com
Cc: pkondeti@codeaurora.org
Cc: rjw@rjwysocki.net
Cc: skannan@codeaurora.org
Cc: smuckle@google.com
Cc: srinivas.pandruvada@linux.intel.com
Cc: thara.gopinath@linaro.org
Cc: tkjos@google.com
Cc: valentin.schneider@arm.com
Cc: vincent.guittot@linaro.org
Cc: viresh.kumar@linaro.org
Link: https://lkml.kernel.org/r/20181203095628.11858-3-quentin.perret@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
By default, arch_scale_cpu_capacity() is only visible from within the
kernel/sched folder. Relocate it to include/linux/sched/topology.h to
make it visible to other clients needing to know about the capacity of
CPUs, such as the Energy Model framework.
This also shrinks the <linux/sched/topology.h> public header.
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: adharmap@codeaurora.org
Cc: chris.redpath@arm.com
Cc: currojerez@riseup.net
Cc: dietmar.eggemann@arm.com
Cc: edubezval@gmail.com
Cc: gregkh@linuxfoundation.org
Cc: javi.merino@kernel.org
Cc: joel@joelfernandes.org
Cc: juri.lelli@redhat.com
Cc: morten.rasmussen@arm.com
Cc: patrick.bellasi@arm.com
Cc: pkondeti@codeaurora.org
Cc: rjw@rjwysocki.net
Cc: skannan@codeaurora.org
Cc: smuckle@google.com
Cc: srinivas.pandruvada@linux.intel.com
Cc: thara.gopinath@linaro.org
Cc: tkjos@google.com
Cc: valentin.schneider@arm.com
Cc: vincent.guittot@linaro.org
Cc: viresh.kumar@linaro.org
Link: https://lkml.kernel.org/r/20181203095628.11858-2-quentin.perret@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
::smt_gain is used to compute the capacity of CPUs of a SMT core with the
constraint 1 < ::smt_gain < 2 in order to be able to compute number of CPUs
per core. The field has_free_capacity of struct numa_stat, which was the
last user of this computation of number of CPUs per core, has been removed
by:
2d4056fafa19 ("sched/numa: Remove numa_has_capacity()")
We can now remove this constraint on core capacity and use the defautl value
SCHED_CAPACITY_SCALE for SMT CPUs. With this remove, SCHED_CAPACITY_SCALE
becomes the maximum compute capacity of CPUs on every systems. This should
help to simplify some code and remove fields like rd->max_cpu_capacity
Furthermore, arch_scale_cpu_capacity() is used with a NULL sd in several other
places in the code when it wants the capacity of a CPUs to scale
some metrics like in pelt, deadline or schedutil. In case on SMT, the value
returned is not the capacity of SMT CPUs but default SCHED_CAPACITY_SCALE.
So remove it.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1535548752-4434-4-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Sparse reports the current declaration of two perf percpu variables
with this warning:
warning: incorrect type in initializer (different address spaces)
expected void const [noderef] <asn:3>*__vpp_verify
got struct perf_cpu_context *<noident>
While it's normally perfectly fine to place GCC attributes anywhere
in the definition, this particular attribute is for a checking
compiler's such as Sparse's benefit, which doesn't want __percpu
on pointers.
So reorder the attribute to come after the structure type, not after
the pointer type.
[ mingo: Rewrote the changelog. ]
Signed-off-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/1543310012-7967-1-git-send-email-mojha@codeaurora.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
It turns out the version field in the lock_class structure isn't used
anywhere. Just remove it.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: iommu@lists.linux-foundation.org
Cc: kasan-dev@googlegroups.com
Link: https://lkml.kernel.org/r/1542653726-5655-2-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
With the only caller now gone, we can clean up this part of dma-debug's
exposed internals and make way to tweak the allocation behaviour.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Qian Cai <cai@lca.pw>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
The secure boot mode may not be detected on boot for some reason (eg.
buggy firmware). This patch attempts one more time to detect the
secure boot mode.
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
On x86, there are two methods of verifying a kexec'ed kernel image
signature being loaded via the kexec_file_load syscall - an architecture
specific implementaton or a IMA KEXEC_KERNEL_CHECK appraisal rule. Neither
of these methods verify the kexec'ed kernel image signature being loaded
via the kexec_load syscall.
Secure boot enabled systems require kexec images to be signed. Therefore,
this patch loads an IMA KEXEC_KERNEL_CHECK policy rule on secure boot
enabled systems not configured with CONFIG_KEXEC_VERIFY_SIG enabled.
When IMA_APPRAISE_BOOTPARAM is configured, different IMA appraise modes
(eg. fix, log) can be specified on the boot command line, allowing unsigned
or invalidly signed kernel images to be kexec'ed. This patch permits
enabling IMA_APPRAISE_BOOTPARAM or IMA_ARCH_POLICY, but not both.
Signed-off-by: Eric Richter <erichte@linux.ibm.com>
Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Peter Jones <pjones@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
Builtin IMA policies can be enabled on the boot command line, and replaced
with a custom policy, normally during early boot in the initramfs. Build
time IMA policy rules were recently added. These rules are automatically
enabled on boot and persist after loading a custom policy.
There is a need for yet another type of policy, an architecture specific
policy, which is derived at runtime during kernel boot, based on the
runtime secure boot flags. Like the build time policy rules, these rules
persist after loading a custom policy.
This patch adds support for loading an architecture specific IMA policy.
Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
Co-Developed-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
|
|
Deferred invalidation is an ECS specific feature. It will not be
supported when IOMMU works in scalable mode. As we deprecated the
ECS support, remove deferred invalidation and cleanup the code.
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Liu Yi L <yi.l.liu@intel.com>
Cc: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
This patch enables the current SVA (Shared Virtual Address)
implementation to work in the scalable mode.
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
This adds an interface to setup the PASID entries for first
level page table translation.
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
This patch enables the translation for requests without PASID in
the scalable mode by setting up the root and context entries.
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
This adds the interfaces to setup or tear down the structures
for second level page table translations. This includes types
of second level only translation and pass through.
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Intel vt-d spec rev3.0 requires software to use 256-bit
descriptors in invalidation queue. As the spec reads in
section 6.5.2:
Remapping hardware supporting Scalable Mode Translations
(ECAP_REG.SMTS=1) allow software to additionally program
the width of the descriptors (128-bits or 256-bits) that
will be written into the Queue. Software should setup the
Invalidation Queue for 256-bit descriptors before progra-
mming remapping hardware for scalable-mode translation as
128-bit descriptors are treated as invalid descriptors
(see Table 21 in Section 6.5.2.10) in scalable-mode.
This patch adds 256-bit invalidation descriptor support
if the hardware presents scalable mode capability.
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
So that they could also be used in other source files.
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Liu Yi L <yi.l.liu@intel.com>
Cc: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The Intel vt-d spec rev3.0 introduces a new translation
mode called scalable mode, which enables PASID-granular
translations for first level, second level, nested and
pass-through modes. At the same time, the previous
Extended Context (ECS) mode is deprecated (no production
ever implements ECS).
This patch adds enumeration for Scalable Mode and removes
the deprecated ECS enumeration. It provides a boot time
option to disable scalable mode even hardware claims to
support it.
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
After the recent refactoring, snd_hdac_display_power() doesn't return
any error, hence it can be defined to return void.
This makes many error checks redundant and allows us to reduce them
gracefully.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
The current HD-audio code manages the DRM audio power via too complex
redirections, and this seems even still unbalanced in a corner case as
Intel DRM CI has been intermittently reporting. This patch is a big
surgery for addressing the complexity and the possible unbalance.
Basically the patch changes the display PM in the following ways:
- Both HD-audio controller and codec drivers call a single helper,
snd_hdac_display_power(). (Formerly, the display power control from
a codec was done indirectly via link_power bus ops.)
- snd_hdac_display_power() receives the codec address index. For
turning on/off from the controller, pass HDA_CODEC_IDX_CONTROLLER.
- snd_hdac_display_power() doesn't manage refcounts any longer, but
keeps the power status in bitmap. If any of controller or codecs is
turned on, the function updates the DRM power state via get_power()
or put_power().
Also this refactor allows us more cleanup:
- The link_power bus ops is dropped, so there is no longer indirect
management, as mentioned in the above.
- hdac_device link_power_control flag is moved to hda_codec
display_power_control flag, as it's only for HDA legacy.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106525
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
This patch adds implementation for global suspend/resume for
devfreq framework. System suspend will next use these functions.
Suggested-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Suggested-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Lukasz Luba <l.luba@partner.samsung.com>
Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
|
|
The patch prepares devfreq device for handling suspend/resume
functionality. The new fields will store needed information during this
process. Devfreq framework handles opp-suspend DT entry and there is no
need of modyfications in the drivers code. It uses atomic variables to
make sure no race condition affects the process.
Suggested-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Suggested-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Lukasz Luba <l.luba@partner.samsung.com>
Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
|
|
Daniel Borkmann says:
====================
pull-request: bpf-next 2018-12-11
The following pull-request contains BPF updates for your *net-next* tree.
It has three minor merge conflicts, resolutions:
1) tools/testing/selftests/bpf/test_verifier.c
Take first chunk with alignment_prevented_execution.
2) net/core/filter.c
[...]
case bpf_ctx_range_ptr(struct __sk_buff, flow_keys):
case bpf_ctx_range(struct __sk_buff, wire_len):
return false;
[...]
3) include/uapi/linux/bpf.h
Take the second chunk for the two cases each.
The main changes are:
1) Add support for BPF line info via BTF and extend libbpf as well
as bpftool's program dump to annotate output with BPF C code to
facilitate debugging and introspection, from Martin.
2) Add support for BPF_ALU | BPF_ARSH | BPF_{K,X} in interpreter
and all JIT backends, from Jiong.
3) Improve BPF test coverage on archs with no efficient unaligned
access by adding an "any alignment" flag to the BPF program load
to forcefully disable verifier alignment checks, from David.
4) Add a new bpf_prog_test_run_xattr() API to libbpf which allows for
proper use of BPF_PROG_TEST_RUN with data_out, from Lorenz.
5) Extend tc BPF programs to use a new __sk_buff field called wire_len
for more accurate accounting of packets going to wire, from Petar.
6) Improve bpftool to allow dumping the trace pipe from it and add
several improvements in bash completion and map/prog dump,
from Quentin.
7) Optimize arm64 BPF JIT to always emit movn/movk/movk sequence for
kernel addresses and add a dedicated BPF JIT backend allocator,
from Ard.
8) Add a BPF helper function for IR remotes to report mouse movements,
from Sean.
9) Various cleanups in BPF prog dump e.g. to make UAPI bpf_prog_info
member naming consistent with existing conventions, from Yonghong
and Song.
10) Misc cleanups and improvements in allowing to pass interface name
via cmdline for xdp1 BPF example, from Matteo.
11) Fix a potential segfault in BPF sample loader's kprobes handling,
from Daniel T.
12) Fix SPDX license in libbpf's README.rst, from Andrey.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
NXP (Freecale) imx HSIC design has some special requirements, add
some flags at host code to handle them.
Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Tested-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
|
|
This adds a function named devm_gpiod_unhinge() that removes
the resource management from a GPIO descriptor.
I am not sure if this is the best anglosaxon name for the
function, no other managed resources have an equivalent
currently, but I chose "unhinge" as the closest intuitive
thing I could imagine that fits Rusty Russell's API design
criterions "the obvious use is the correct one" and
"the name tells you how to use it".
The idea came out of a remark from Mark Brown that it should
be possible to handle over management of a resource from
devres to the regulator core, and indeed we can do that.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
This function already exist inside gpiolib, we were just
reluctant to make it available to the kernel at large as
the devm_* seemed to be enough for anyone.
However we found out that regulators need to do their own
lifecycle/refcounting on GPIO descriptors and explicitly
call gpiod_put() when done with a descriptor, so export
this function so we can hand the refcounting over to the
regulator core for these descriptors after retrieveal.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The egdev mechanism was replaced by the TC indirect block notifications
platform.
Signed-off-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Cc: John Hurley <john.hurley@netronome.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Changed the is_gretap_dev and is_ip6gretap_dev logic from structure
comparison to string comparison of the rtnl_link_ops kind field.
This approach aligns with the current identification methods and function
names of vxlan and geneve network devices.
Convert mlxsw to use these helpers and use them in downstream mlx5 patch.
Signed-off-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
mlx5-next shared branch with rdma subtree to avoid mlx5 rdma v.s. netdev
conflicts.
Highlights:
1) RDMA ODP (On Demand Paging) improvements and moving ODP logic to
mlx5 RDMA driver
2) Improved mlx5 core driver and device events handling and provided API
for upper layers to subscribe to device events.
3) RDMA only code cleanup from mlx5 core
4) Add helper to get CQE opcode
5) Rework handling of port module events
6) shared mlx5_ifc.h updates to avoid conflicts
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
In uapi bpf.h, currently we have the following fields in
the struct bpf_prog_info:
__u32 func_info_cnt;
__u32 line_info_cnt;
__u32 jited_line_info_cnt;
The above field names "func_info_cnt" and "line_info_cnt"
also appear in union bpf_attr for program loading.
The original intention is to keep the names the same
between bpf_prog_info and bpf_attr
so it will imply what we returned to user space will be
the same as what the user space passed to the kernel.
Such a naming convention in bpf_prog_info is not consistent
with other fields like:
__u32 nr_jited_ksyms;
__u32 nr_jited_func_lens;
This patch made this adjustment so in bpf_prog_info
newly introduced *_info_cnt becomes nr_*_info.
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This isn't used anywhere across the mlx5 driver stack,
remove it.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Update the flow steering command formatting according to the extended
destination API.
Note that the FW dictates that multi destination FTEs that involve at
least one encap must use the extended destination format, while single
destination ones must use the legacy format.
Using extended destination format requires FW support. Check for its
capabilities and return error if not supported.
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Change the driver flow destination struct to use bit flags with the vhca
id valid being the 1st one. The flags field is more extendable and will
be used in downstream patch.
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Extended destinations provide the ability to configure different
encapsulation properties per destination on a single FTE. This is
needed for use-cases such as remote mirroring over tunneled networks.
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
GRE RFC defines a 32 bit key field. NVGRE RFC splits the 32 bit
key field to 24 bit VSID (gre_key_h) and 8 bit flow entropy (gre_key_l).
Define the two key parsing alternatives in a union, thus enabling both
access methods.
Signed-off-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Will be used in downstream patch to monitor counter changes
by the HCA and report it to the driver by an event.
The driver will update its counters cached data accordingly.
Signed-off-by: Eyal Davidovich <eyald@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Add support for SFDP (JESD216B) 4-byte Address Instruction Table. This
table is optional but when available, we parse it to get the 4-byte
address op codes supported by the memory.
Using these op codes is stateless as opposed to entering the 4-byte
address mode or setting the Base Address Register (BAR).
Flashes that have the 4BAIT table declared can now support
SPINOR_OP_PP_1_1_4_4B and SPINOR_OP_PP_1_4_4_4B opcodes.
Tested on MX25L25673G.
Signed-off-by: Cyrille Pitchen <cyrille.pitchen@microchip.com>
[tudor.ambarus@microchip.com:
- rework erase and page program logic,
- pass DMA-able buffer to spi_nor_read_sfdp(),
- introduce SPI_NOR_HAS_4BAIT
- various minor updates.]
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
|
|
Add SPDX tags to replace the license boiler-plate and fix the
MODULE_LICENSE() definition in spi-nor.c to match the license text
(GPL v2).
Interestingly, spi-nor.h and spi-nor.c do not use the same license
(GPL v2+ for spi-nor.h, GPL v2 for spi-nor.c).
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com>
|
|
Some flash_info entries have the SPI_NOR_4B_OPCODES flag set to let the
core know that the flash supports 4B opcode. While this solution works
fine for id-based caps detection, it doesn't work that well when relying
on SFDP-based caps detection. Let's add an SNOR_F_4B_OPCODES flag so
that the SFDP parsing code can set it when appropriate.
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com>
|
|
Some MICRON related macros in spi-nor domain were ST.
Rename entries related to STMicroelectronics under macro SNOR_MFR_ST.
Added entry of MFR Id for Micron flashes, 0x002C.
Signed-off-by: Yogesh Gaur <yogeshnarayan.gaur@nxp.com>
Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
|
|
Add the flag LED_INIT_DEFAULT_TRIGGER for indicating that trigger
being set is a default trigger for the LED class device, and
thus it should be initialized with settings provided in the fwnode.
Set the flag in the led_trigger_set_default(). It is expected to be
cleared in the activate() op of a trigger after trigger fwnode
initialization data is parsed and applied. This should happen only
once after LED class device registration, to allow leaving triggers
in the idle state on re-apply and let the users apply their own
settings without interference from the default ones.
Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
|
|
On i.MX6QP/i.MX6Q/i.MX6DL, there are DCIC1/DCIC2 clocks
gate in CCM_CCGR0 register, add them into clock tree for
clock management.
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID subsystem fixes from Jiri Kosina:
- two device-specific quirks from Hans de Goede and Nic Soudée
- reintroduction of (mistakenly remocved) ABS_RESERVED from Peter
Hutterer
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
Input: restore EV_ABS ABS_RESERVED
HID: quirks: fix RetroUSB.com devices
HID: ite: Add USB id match for another ITE based keyboard rfkill key quirk
|
|
Merge in kexec_file_load() support from Akashi Takahiro.
|
|
The function ftrace_replace_code() is the ftrace engine that does the
work to modify all the nops into the calls to the function callback in
all the functions being traced.
The generic version which is normally called from stop machine, but an
architecture can implement a non stop machine version and still use the
generic ftrace_replace_code(). When an architecture does this,
ftrace_replace_code() may be called from a schedulable context, where
it can allow the code to be preemptible, and schedule out.
In order to allow an architecture to make ftrace_replace_code()
schedulable, a new command flag is added called:
FTRACE_MAY_SLEEP
Which can be or'd to the command that is passed to
ftrace_modify_all_code() that calls ftrace_replace_code() and will have
it call cond_resched() in the loop that modifies the nops into the
calls to the ftrace trampolines.
Link: http://lkml.kernel.org/r/20181204192903.8193-1-anders.roxell@linaro.org
Link: http://lkml.kernel.org/r/20181205183303.828422192@goodmis.org
Reported-by: Anders Roxell <anders.roxell@linaro.org>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
The trace_add/remove_event_call_nolock() functions were added to allow
the tace_add/remove_event_call() code be called when the event_mutex
lock was already taken. Now that all callers are done within the
event_mutex, there's no reason to have two different interfaces.
Remove the current wrapper trace_add/remove_event_call()s and rename the
_nolock versions back to the original names.
Link: http://lkml.kernel.org/r/154140866955.17322.2081425494660638846.stgit@devbox
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
The previous patches deleted all the code that needed the second value
returned from part_in_flight - now the kernel only uses the first value.
Consequently, part_in_flight (and blk_mq_in_flight) may be changed so that
it only returns one value.
This patch just refactors the code, there's no functional change.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Now when part_round_stats is gone, we can switch to per-cpu in-flight
counters.
We use the local-atomic type local_t, so that if part_inc_in_flight or
part_dec_in_flight is reentrantly called from an interrupt, the value will
be correct.
The other counters could be corrupted due to reentrant interrupt, but the
corruption only results in slight counter skew - the in_flight counter
must be exact, so it needs local_t.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|