Age | Commit message (Collapse) | Author |
|
Struct pci_host_bridge can be allocated by PCI host bridge drivers which
usually allocate and map memory through devm managed interfaces.
Add a devm version for the pci_alloc_host_bridge() interface to simplify
PCI host controller driver porting and simplify the driver failure paths.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
|
|
Commit a52d1443bba1 ("PCI: Export host bridge registration interface")
exported the pci_alloc_host_bridge() interface so that PCI host controllers
drivers can make use of it.
Introduce pci_alloc_host_bridge() kernel counterpart to free the host
bridge data structures, pci_free_host_bridge(), export it and update kernel
functions releasing host bridge objects allocated memory to make use of it.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
|
|
At the point where the kvm-vfio pseudo device wants to release its
vfio group reference, we can't always acquire a new reference to make
that happen. The group can be in a state where we wouldn't allow a
new reference to be added. This new helper function allows a caller
to match a file to a group to facilitate this. Given a file and
group, report if they match. Thus the caller needs to already have a
group reference to match to the file. This allows the deletion of a
group without acquiring a new reference.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Cc: stable@vger.kernel.org
|
|
Some irq controllers have writeonly/multipurpose register layouts. In
those cases we read invalid data back. Here we add the option
mask_writeonly as masking option.
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Currently, cgroup only supports delegation to !root users and cgroup
namespaces don't get any special treatments. This limits the
usefulness of cgroup namespaces as they by themselves can't be safe
delegation boundaries. A process inside a cgroup can change the
resource control knobs of the parent in the namespace root and may
move processes in and out of the namespace if cgroups outside its
namespace are visible somehow.
This patch adds a new mount option "nsdelegate" which makes cgroup
namespaces delegation boundaries. If set, cgroup behaves as if write
permission based delegation took place at namespace boundaries -
writes to the resource control knobs from the namespace root are
denied and migration crossing the namespace boundary aren't allowed
from inside the namespace.
This allows cgroup namespace to function as a delegation boundary by
itself.
v2: Silently ignore nsdelegate specified on !init mounts.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Aravind Anbudurai <aru7@fb.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Eric Biederman <ebiederm@xmission.com>
|
|
svc_rdma_marshal.c has one remaining exported function --
svc_rdma_xdr_decode_req -- and it has a single call site. Take
the same approach as the sendto path, and move this function
into the source file where it is called.
This is a refactoring change only.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Update to get f0c3192ceee3 "virtio_net: lower limit on buffer size".
That bug was interfering with my nfsd testing.
|
|
Many subsystems will not use refcount_t unless there is a way to build the
kernel so that there is no regression in speed compared to atomic_t. This
adds CONFIG_REFCOUNT_FULL to enable the full refcount_t implementation
which has the validation but is slightly slower. When not enabled,
refcount_t uses the basic unchecked atomic_t routines, which results in
no code changes compared to just using atomic_t directly.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: David Windsor <dwindsor@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Hans Liljestrand <ishkamiel@gmail.com>
Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
Cc: Jann Horn <jannh@google.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: arozansk@redhat.com
Cc: axboe@kernel.dk
Cc: linux-arch <linux-arch@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170621200026.GA115679@beast
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
No need to differentiate fabrics from pci/loop, also lower
it to 32 as we don't really need 256 inflight admin commands.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
dmam_alloc_noncoherent is a trivial wrapper around dmam_alloc_attrs,
that hardcodes one particular flag. Make the devres code more
flexible by allowing the callers to pass arbitrary flags.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
|
|
This function was never used since it was added.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
|
|
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
And update the documentation - dma_mapping_error has been supported
everywhere for a long time.
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Current busy-wait loops are implemented by repeatedly calling cpu_relax()
to give an arch option for a low-latency option to improve power and/or
SMT resource contention.
This poses some difficulties for powerpc, which has SMT priority setting
instructions (priorities determine how ifetch cycles are apportioned).
powerpc's cpu_relax() is implemented by setting a low priority then
setting normal priority. This has several problems:
- Changing thread priority can have some execution cost and potential
impact to other threads in the core. It's inefficient to execute them
every time around a busy-wait loop.
- Depending on implementation details, a `low ; medium` sequence may
not have much if any affect. Some software with similar pattern
actually inserts a lot of nops between, in order to cause a few fetch
cycles with the low priority.
- The busy-wait loop runs with regular priority. This might only be a few
fetch cycles, but if there are several threads running such loops, they
could cause a noticable impact on a non-idle thread.
Implement spin_begin, spin_end primitives that can be used around busy
wait loops, which default to no-ops. And spin_cpu_relax which defaults to
cpu_relax.
This will allow architectures to hook the entry and exit of busy-wait
loops, and will allow powerpc to set low SMT priority at entry, and
normal priority at exit.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
'arm/core', 'x86/vt-d', 'x86/amd', 's390' and 'core' into next
|
|
The run_wake flag in struct dev_pm_info is used to indicate whether
or not the device is capable of generating remote wakeup signals at
run time (or in the system working state), but the distinction
between runtime remote wakeup and system wakeup signaling has always
been rather artificial. The only practical reason for it to exist
at the core level was that ACPI and PCI treated those two cases
differently, but that's not the case any more after recent changes.
For this reason, get rid of the run_wake flag and, when applicable,
use device_set_wakeup_capable() and device_can_wakeup() instead of
device_set_run_wake() and device_run_wake(), respectively.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
After previous changes it is not necessary to distinguish between
device wakeup for run time and device wakeup from system sleep states
any more, so rework the PCI device wakeup settings code accordingly.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
The pme_interrupt flag in struct pci_dev is set when PMEs generated
by the device are going to be signaled via root port PME interrupts.
Ironically enough, that information is only used by the code setting
up device wakeup through ACPI which returns as soon as it sees the
pme_interrupt flag set while setting up "remote runtime wakeup".
That is questionable, however, because in theory there may be PCIe
devices using out-of-band PME signaling under root ports handled
by the native PME code or devices requiring wakeup power setup to be
carried out by AML. For such devices, ACPI wakeup should be invoked
regardless of whether or not native PME signaling is used in general.
For this reason, drop the pme_interrupt flag and rework the code
using it which then allows the ACPI-based device wakeup handling
in PCI to be consolidated to use one code path for both "runtime
remote wakeup" and system wakeup (from sleep states).
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Require all dax-drivers to register a ->copy_from_iter() operation so
that it is clear which dax_operations are optional and which must be
implemented for filesystem-dax to operate.
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Now that all callers of the pmem api have been converted to dax helpers that
call back to the pmem driver, we can remove include/linux/pmem.h and
asm/pmem.h.
Cc: <x86@kernel.org>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Oliver O'Halloran <oohall@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Kill this globally defined wrapper and move to libnvdimm so that we can
ultimately remove include/linux/pmem.h and asm/pmem.h.
Cc: <x86@kernel.org>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Only used inside the bounce code, and opencoding it makes it more obvious
what is going on.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This adds support for Directives in NVMe, particular for the Streams
directive. Support for Directives is a new feature in NVMe 1.3. It
allows a user to pass in information about where to store the data, so
that it the device can do so most effiently. If an application is
managing and writing data with different life times, mixing differently
retentioned data onto the same locations on flash can cause write
amplification to grow. This, in turn, will reduce performance and life
time of the device.
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Useful to verify that things are working the way they should.
Reading the file will return number of kb written with each
write hint. Writing the file will reset the statistics. No care
is taken to ensure that we don't race on updates.
Drivers will write to q->write_hints[] if they handle a given
write hint.
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
No functional changes in this patch, we just use up some holes
in the bio and request structures to define a write hint that
we psas down the stack.
Ensure that we don't merge requests that have different life time
hints assigned to them, and that we inherit the write hint when
cloning a bio.
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Define a set of write life time hints:
RWH_WRITE_LIFE_NOT_SET No hint information set
RWH_WRITE_LIFE_NONE No hints about write life time
RWH_WRITE_LIFE_SHORT Data written has a short life time
RWH_WRITE_LIFE_MEDIUM Data written has a medium life time
RWH_WRITE_LIFE_LONG Data written has a long life time
RWH_WRITE_LIFE_EXTREME Data written has an extremely long life time
The intent is for these values to be relative to each other, no
absolute meaning should be attached to these flag names.
Add an fcntl interface for querying these flags, and also for
setting them as well:
F_GET_RW_HINT Returns the read/write hint set on the
underlying inode.
F_SET_RW_HINT Set one of the above write hints on the
underlying inode.
F_GET_FILE_RW_HINT Returns the read/write hint set on the
file descriptor.
F_SET_FILE_RW_HINT Set one of the above write hints on the
file descriptor.
The user passes in a 64-bit pointer to get/set these values, and
the interface returns 0/-1 on success/error.
Sample program testing/implementing basic setting/getting of write
hints is below.
Add support for storing the write life time hint in the inode flags
and in struct file as well, and pass them to the kiocb flags. If
both a file and its corresponding inode has a write hint, then we
use the one in the file, if available. The file hint can be used
for sync/direct IO, for buffered writeback only the inode hint
is available.
This is in preparation for utilizing these hints in the block layer,
to guide on-media data placement.
/*
* writehint.c: get or set an inode write hint
*/
#include <stdio.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdbool.h>
#include <inttypes.h>
#ifndef F_GET_RW_HINT
#define F_LINUX_SPECIFIC_BASE 1024
#define F_GET_RW_HINT (F_LINUX_SPECIFIC_BASE + 11)
#define F_SET_RW_HINT (F_LINUX_SPECIFIC_BASE + 12)
#endif
static char *str[] = { "RWF_WRITE_LIFE_NOT_SET", "RWH_WRITE_LIFE_NONE",
"RWH_WRITE_LIFE_SHORT", "RWH_WRITE_LIFE_MEDIUM",
"RWH_WRITE_LIFE_LONG", "RWH_WRITE_LIFE_EXTREME" };
int main(int argc, char *argv[])
{
uint64_t hint;
int fd, ret;
if (argc < 2) {
fprintf(stderr, "%s: file <hint>\n", argv[0]);
return 1;
}
fd = open(argv[1], O_RDONLY);
if (fd < 0) {
perror("open");
return 2;
}
if (argc > 2) {
hint = atoi(argv[2]);
ret = fcntl(fd, F_SET_RW_HINT, &hint);
if (ret < 0) {
perror("fcntl: F_SET_RW_HINT");
return 4;
}
}
ret = fcntl(fd, F_GET_RW_HINT, &hint);
if (ret < 0) {
perror("fcntl: F_GET_RW_HINT");
return 3;
}
printf("%s: hint %s\n", argv[1], str[hint]);
close(fd);
return 0;
}
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Inorder to support recording of tgid, the following changes are made:
* Introduce a new API (tracing_record_taskinfo) to additionally record the tgid
along with the task's comm at the same time. This has has the benefit of not
setting trace_cmdline_save before all the information for a task is saved.
* Add a new API tracing_record_taskinfo_sched_switch to record task information
for 2 tasks at a time (previous and next) and use it from sched_switch probe.
* Preserve the old API (tracing_record_cmdline) and create it as a wrapper
around the new one so that existing callers aren't affected.
* Reuse the existing sched_switch and sched_wakeup probes to record tgid
information and add a new option 'record-tgid' to enable recording of tgid
When record-tgid option isn't enabled to being with, we take care to make sure
that there's isn't memory or runtime overhead.
Link: http://lkml.kernel.org/r/20170627020155.5139-1-joelaf@google.com
Cc: kernel-team@android.com
Cc: Ingo Molnar <mingo@redhat.com>
Tested-by: Michael Sartain <mikesart@gmail.com>
Signed-off-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
The subset of wake-enabled host events is defined by the EC, but the EC
may still send non-wake host events if we're in the process of
suspending. Get the mask of wake-enabled host events from the EC and
filter out non-wake events to prevent spurious aborted suspend
attempts.
Signed-off-by: Shawn Nematbakhsh <shawnn@chromium.org>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Acked-for-MFD-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Benson Leung <bleung@chromium.org>
|
|
Add Innova IPSec ESP crypto offload configuration paths.
Detect Innova IPSec device and set the NETIF_F_HW_ESP flag.
Configure Security Associations using the API introduced in a previous
patch.
Add Software-parser hardware descriptor layout
Software-Parser (swp) is a hardware feature in ConnectX which allows the
host software to specify protocol header offsets in the TX path, thus
overriding the hardware parser.
This is useful for protocols that the ASIC may not be able to parse on
its own.
Note that due to inline metadata, XDP is not supported in Innova IPSec.
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Add routines for manipulating the hardware IPSec SA database (SADB).
In Innova IPSec, a Security Association (SA) is added or deleted
via a command message over the SBU connection.
The HW then sends a response message over the same connection.
Add implementation for Innova IPSec (FPGA-based) hardware.
These routines will be used by the IPSec offload support in a later patch
However they may also be used by others such as RDMA and RoCE IPSec.
mlx5/accel is a middle acceleration layer to allow mlx5e and other ULPs
to work directly with mlx5_core rather than Innova FPGA or other mlx5
acceleration providers.
In this patchset we add Innova IPSec support and mlx5/accel delegates
IPSec offloads to Innova routines.
In the future, when IPSec/TLS or any other acceleration gets integrated
into ConnectX chip, mlx5/accel layer will provide the integrated
acceleration, rather than the Innova one.
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Add interface to initialize and interact with Innova FPGA SBU
connections.
A client driver may use these functions to set up a high-speed DMA
connection with its SBU hardware logic, and send/receive messages
over this connection.
A later patch in this patchset will make use of these functions for
Innova IPSec offload in mlx5 Ethernet driver.
Add commands to retrieve Innova FPGA SBU capabilities, and to
read/write Innova FPGA configuration space registers and memory,
over internal I2C.
At high level, the FPGA configuration space is divided such:
0x00000000 - 0x007fffff is reserved for the SBU
0x00800000 - 0xffffffff is reserved for the Shell
0x400000000 - ... is DDR memory
A later patchset will add support for accessing FPGA CrSpace and memory
over a high-speed connection. This is the reason for the ACCESS_TYPE
enumeration, which currently only supports I2C.
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
The Innova FPGA includes shell hardware and Sandbox-Unit (SBU) hardware.
The shell hardware is handled by mlx5_core itself, while the SBU is
handled by a client driver.
Reset the SBU to a well-known initial state when initializing a new
device, and set the FPGA to bypass mode when uninitializing a device.
This allows the client driver to assume that its device has been
reset when a new device is detected.
During SBU reset, the FPGA is put into SBU-bypass mode. In this mode
packets do not pass through the SBU, so it cannot affect the network
data stream at all.
A factory-image does not have an SBU, so skip these flows.
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
The FPGA QP is a high-bandwidth communication channel between the host
CPU and the FPGA device. It allows performing DMA operations between
host memory and the FPGA logic via the ConnectX chip.
Add ConnectX FW commands which create and manipulate FPGA QPs.
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Previously, only mlx5_ib enabled RoCE on the port, but FPGA needs it as
well.
Add support for counting number of enables, so that FPGA and IB can work
in parallel and independently.
Program the HW to enable RoCE on the first enable call, and program to
disable RoCE on the last disable call.
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Reserved GIDs are entries in the GID table in use by the mlx5_core
and its submodules (e.g. FPGA, SRIOV, E-Swtich, netdev).
The entries are reserved at the high indexes of the GID table.
A mlx5 submodule may reserve a certain amount of GIDs for its own use
during the load sequence by calling mlx5_core_reserve_gids, and must
also take care to un-reserve these GIDs when it closes.
Reservation is only allowed during the load sequence and before any
interfaces (e.g. mlx5_ib or mlx5_en) are up.
After reservation, a submodule may call mlx5_core_reserved_gid_alloc/
free to allocate entries from the reserved GIDs pool.
Reserve a GID table entry for every supported FPGA QP.
A later patch in the patchset will remove them from being reported to
IB core.
Another such patch will make use of these for FPGA QPs in Innova NIC.
Added lib/mlx5.h to serve as a library for mlx5 submodlues, and to
expose only public mlx5 API, more mlx5 library files will be added in
future submissions.
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Draining the health workqueue will ignore future health works including
the one that report hardware failure and thus we can't enter error state
Instead cancel the recovery flow and make sure only recovery flow won't
be scheduled.
Fixes: 5e44fca50470 ('net/mlx5: Only cancel recovery work when cleaning up device')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
The function converts strings like ttyS0 and ttyUSB0 to dev_t like
(4, 64) and (188, 0). It does this by scanning tty_drivers list for
corresponding device name and index. If the driver is not registered,
this function returns -ENODEV. It also acquires tty_mutex.
Signed-off-by: Okash Khawaja <okash.khawaja@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The goal of this change is to give users a uniform and meaningful
result when they read /sys/...cpufreq/scaling_cur_freq
on modern x86 hardware, as compared to what they get today.
Modern x86 processors include the hardware needed
to accurately calculate frequency over an interval --
APERF, MPERF, and the TSC.
Here we provide an x86 routine to make this calculation
on supported hardware, and use it in preference to any
driver driver-specific cpufreq_driver.get() routine.
MHz is computed like so:
MHz = base_MHz * delta_APERF / delta_MPERF
MHz is the average frequency of the busy processor
over a measurement interval. The interval is
defined to be the time between successive invocations
of aperfmperf_khz_on_cpu(), which are expected to to
happen on-demand when users read sysfs attribute
cpufreq/scaling_cur_freq.
As with previous methods of calculating MHz,
idle time is excluded.
base_MHz above is from TSC calibration global "cpu_khz".
This x86 native method to calculate MHz returns a meaningful result
no matter if P-states are controlled by hardware or firmware
and/or if the Linux cpufreq sub-system is or is-not installed.
When this routine is invoked more frequently, the measurement
interval becomes shorter. However, the code limits re-computation
to 10ms intervals so that average frequency remains meaningful.
Discerning users are encouraged to take advantage of
the turbostat(8) utility, which can gracefully handle
concurrent measurement intervals of arbitrary length.
Signed-off-by: Len Brown <len.brown@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Linux 4.12-rc7
Needed at least rc6 for drm-misc-next-fixes, may as well go to rc7
|
|
When a module filter is added to set_ftrace_filter, if the module is not
loaded, it is cached. This should be considered an active filter, and
function tracing should be filtered by this. That is, if a cached module
filter is the only filter set, then no function tracing should be happening,
as all the functions available will be filtered out.
This makes sense, as the reason to add a cached module filter, is to trace
the module when you load it. There shouldn't be any other tracing happening
until then.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
When writing in a module filter into set_ftrace_filter for a module that is
not yet loaded, it it cached, and will be executed when the module is loaded
(although that is not implemented yet at this commit). Display the list of
cached modules to be traced.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Merge in arm64 ACPI RAS support (APEI/GHES) from Tyler Baicar.
|
|
Thierry bisected boot failures to this simplification commit.
Reverts: 3f1d472055bb ("ktime: Simplify ktime_compare implementation")
Reported-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mariusz Skamra <mariuszx.skamra@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next
Jonathan writes:
Second set of IIO new device support, features and cleanups for the 4.13 cycle.
A few reverts here. One was a general failure to notice a device was already
supported by another driver. The second is due to a review comment pointing
out that the original patch was a bad idea and would break existing systems.
Reverts
* bma180
- Revert addition of support for the BMA250E it is already supported by
the bmc150-accel and better supported at that. Oops.
* hi8435
- The fix for cleanup of the reset gpio stuff isn't a good way to go. It
breaks systems where an inverting level convertor is used. The right fix
is to make the original devicetree correct - even if it involves patching
the devicetree in kernel.
New Device Support
* stm32-adc
- STM32H7 support and bindings.
Features
* core
- add a hardware triggered operating mode for systems in which the actual
trigger is never seen by the kernel. This is typically only used when
a device 'can' use other triggers, but if a particular magic one is
enabled the interrupt is effectively handled in hardware and we never see
it.
* st-lsm6dsx
- support active low interrupts.
* stm32-adc
- Make the core adc clock optional as not all hardware supported requires it.
- Make the bus clock optional in the per instance driver as it may be shared
by all instances of the ADC and is handled by the core.
- Rework to have a data structure representing the device type specific
elements.
* stm32-trigger (and counter)
- Use the INDIO_HARDWARE_TRIGGERED_MODE where appropriate.
- Add an attribute to configure device modes for quadrature counting etc.
Clean ups and minor fixes
* IIO core.
- use __sysfs_match_string() helper rather than open coding the same.
* ad7791
- use sysfs_match_string() helper rather than open coding the same.
* aspeed-adc
- handle return value of clk_prepare_enable
* cpcap
- Fix default register values and ensure the battery thermistor is enabled
correctly.
- Fix the reported die temperature where we can - docs are lacking.
- Remove the hung interrupt quirk as no longer happens due to fix in the
mfd driver.
* hi8435
- Remove &s from hi8435_info definition as unneeded and inconsistent.
* hid-sensor-trgger
- Add kconfig depends on IIO_BUFFER (fixes patch in previous series)
* ina2xx
- Make the use of iio_info_mask* elements consistent for all channels.
This doesn't have any visible effect, but acts as clear documentation of
which channels various resulting attributes apply to.
* lpc32xx
- handle the return value of clk_prepare_enable.
* meson-saradc
- NULL instead of 0 for pointer.
* mma9551
- use NULL for GPIO connection ID to aid implementation fo ACPI support.
Here the connection ID doesn't actually tell us anything and it is much
easier to deal with the driver if it's not there.
* mpu6050
- Fix lock issues through use of a local mux.
- Replace sprintf with scnprintf as appropriate.
- Check whoami against all known values. This allows for a small number of
boards where we are really fishing for the part not being present at all.
It is unfortunately common to have undescribed changes to use newer chips.
We paper over this but just emitting a warning for those cases as long as
we know about.
* mxs-lradc
- Fix some non static warnings.
* rcar-adc
- Part of making the naming for this part consistent across the kernel.
* st_accel
- drop some spi_device_id entries for variants with no SPI support
* st_magn
- drop some spi_device_id entries for variants with no SPI support.
* sx9500
- Use devm_gpiod_get instead of indexed value with an index of 0 on all
occasions.
* twl4030
- Drop unused twl4030_get_madc_conversion as callers removed now throughout
kernel.
- Unexport twl4030_madc_conversion() as no used only within this driver.
- Drop twl4030_madc_user_params as not used now.
- Drop twl4030_madc_request.func_cb as not used now.
- Fold the twl4030-madc.h header into the driver as no longer used anywhere
else in the kernel.
* xilinx
- Handle the return value of clk_prepare_enable
|
|
As we change the user space type for the timerfd and posix timer
functions to newer data types, we need some form of conversion
helpers to avoid duplicating that logic.
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Add helper functions to convert between struct timespec64 and
struct timespec at userspace boundaries.
This is a preparatory patch to use timespec64 as the basic type
internally in the kernel as timespec is not y2038 safe on 32 bit systems.
The patch helps the cause by containing all data conversions at the
userspace boundaries within these functions.
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
"A few fixes for timekeeping and timers:
- Plug a subtle race due to a missing READ_ONCE() in the timekeeping
code where reloading of a pointer results in an inconsistent
callback argument being supplied to the clocksource->read function.
- Correct the CLOCK_MONOTONIC_RAW sub-nanosecond accounting in the
time keeping core code, to prevent a possible discontuity.
- Apply a similar fix to the arm64 vdso clock_gettime()
implementation
- Add missing includes to clocksource drivers, which relied on
indirect includes which fails in certain configs.
- Use the proper iomem pointer for read/iounmap in a probe function"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
arm64/vdso: Fix nsec handling for CLOCK_MONOTONIC_RAW
time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting
time: Fix clock->read(clock) race around clocksource changes
clocksource: Explicitly include linux/clocksource.h when needed
clocksource/drivers/arm_arch_timer: Fix read and iounmap of incorrect variable
|
|
Looks like commit f663dd9aaf9e ("net: core: explicitly select a txq before doing l2 forwarding")
has removed the need for this dedicated xmit function [it even explicitly
states so in its commit log message] but it hasn't removed the definition
of the ndo.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
CC: Jason Wang <jasowang@redhat.com>
CC: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|