Age | Commit message (Collapse) | Author |
|
If the current cpu is the one which has the hrtimer based broadcast
queued then we better return busy immediately instead of going through
loops and hoops to figure that out.
[ Split out from a larger combo patch ]
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Suzuki Poulose <Suzuki.Poulose@arm.com>
Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Cc: Catalin Marinas <Catalin.Marinas@arm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
|
|
Tell the idle code not to go deep if the broadcast IPI is about to
arrive.
[ Split out from a larger combo patch ]
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Suzuki Poulose <Suzuki.Poulose@arm.com>
Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Cc: Catalin Marinas <Catalin.Marinas@arm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
|
|
If the system is in periodic mode and the broadcast device is hrtimer
based, return busy as we have no proper handling for this.
[ Split out from a larger combo patch ]
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Suzuki Poulose <Suzuki.Poulose@arm.com>
Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Cc: Catalin Marinas <Catalin.Marinas@arm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
|
|
We need to check more than the periodic mode for proper operation in
all runtime combinations. To avoid code duplication move the check
into the enter state handling.
No functional change.
[ Split out from a larger combo patch ]
Reported-and-tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Suzuki Poulose <Suzuki.Poulose@arm.com>
Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Cc: Catalin Marinas <Catalin.Marinas@arm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
|
|
Add a check for a installed broadcast device to the oneshot control
function and return busy if not.
[ Split out from a larger combo patch ]
Reported-and-tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Suzuki Poulose <Suzuki.Poulose@arm.com>
Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Cc: Catalin Marinas <Catalin.Marinas@arm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
|
|
Currently the broadcast busy check, which prevents the idle code from
going into deep idle, works only in one shot mode.
If NOHZ and HIGHRES are off (config or command line) there is no
sanity check at all, so under certain conditions cpus are allowed to
go into deep idle, where the local timer stops, and are not woken up
again because there is no broadcast timer installed or a hrtimer based
broadcast device is not evaluated.
Move tick_broadcast_oneshot_control() into the common code and provide
proper subfunctions for the various config combinations.
The common check in tick_broadcast_oneshot_control() is for the C3STOP
misfeature flag of the local clock event device. If its not set, idle
can proceed. If set, further checks are necessary.
Provide checks for the trivial cases:
- If broadcast is disabled in the config, then return busy
- If oneshot mode (NOHZ/HIGHES) is disabled in the config, return
busy if the broadcast device is hrtimer based.
- If oneshot mode is enabled in the config call the original
tick_broadcast_oneshot_control() function. That function needs
extra checks which will be implemented in seperate patches.
[ Split out from a larger combo patch ]
Reported-and-tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Suzuki Poulose <Suzuki.Poulose@arm.com>
Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Cc: Catalin Marinas <Catalin.Marinas@arm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
|
|
The broadcast code shuts down the local clock event unconditionally
even if no broadcast device is installed or if the broadcast device is
hrtimer based.
Add proper sanity checks.
[ Split out from a larger combo patch ]
Reported-and-tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Suzuki Poulose <Suzuki.Poulose@arm.com>
Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Cc: Catalin Marinas <Catalin.Marinas@arm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
|
|
The hrtimer based broadcast vehicle can cause a hrtimer recursion
which went unnoticed until we changed the hrtimer expiry code to keep
track of the currently running timer.
local_timer_interrupt()
local_handler()
hrtimer_interrupt()
expire_hrtimers()
broadcast_hrtimer()
send_ipis()
local_handler()
hrtimer_interrupt()
....
Solution is simple: Prevent the local handler call from the broadcast
code when the broadcast 'device' is hrtimer based.
[ Split out from a larger combo patch ]
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Suzuki Poulose <Suzuki.Poulose@arm.com>
Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Cc: Catalin Marinas <Catalin.Marinas@arm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
|
|
Move EDAC_SUPPORT to the right place.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The __machine__findnew_compat() function is called only from
__machine__findnew_vdso_compat() which is called only from
machine__findnew_vdso() which already holds machine->dsos.lock, so
remove locking from __machine__findnew_compat().
This manifests itself tracing 32-bit programs with a 64-bit perf.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1436267618-20521-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This patch reorders the nodes alphabetically
Signed-off-by: Walter Lozano <walter@vanguardiasur.com.ar>
Signed-off-by: Dinh Nguyen <dinguyen@opensource.altera.com>
|
|
For those parts of the arm64 ACPI code that need to check GICC subtables
in the MADT, use the new BAD_MADT_GICC_ENTRY macro instead of the previous
BAD_MADT_ENTRY. The new macro takes into account differences in the size
of the GICC subtable that the old macro did not; this caused failures even
though the subtable entries are valid.
Fixes: aeb823bbacc2 ("ACPICA: ACPI 6.0: Add changes for FADT table.")
Signed-off-by: Al Stone <al.stone@linaro.org>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The BAD_MADT_ENTRY() macro is designed to work for all of the subtables
of the MADT. In the ACPI 5.1 version of the spec, the struct for the
GICC subtable (struct acpi_madt_generic_interrupt) is 76 bytes long; in
ACPI 6.0, the struct is 80 bytes long. But, there is only one definition
in ACPICA for this struct -- and that is the 6.0 version. Hence, when
BAD_MADT_ENTRY() compares the struct size to the length in the GICC
subtable, it fails if 5.1 structs are in use, and there are systems in
the wild that have them.
This patch adds the BAD_MADT_GICC_ENTRY() that checks the GICC subtable
only, accounting for the difference in specification versions that are
possible. The BAD_MADT_ENTRY() will continue to work as is for all other
MADT subtables.
This code is being added to an arm64 header file since that is currently
the only architecture using the GICC subtable of the MADT. As a GIC is
specific to ARM, it is also unlikely the subtable will be used elsewhere.
Fixes: aeb823bbacc2 ("ACPICA: ACPI 6.0: Add changes for FADT table.")
Signed-off-by: Al Stone <al.stone@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net>
[catalin.marinas@arm.com: extra brackets around macro arguments]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
This patch fixes the formating of DTS bindings for the adxl34x digital
accelerometer, and updates the compatible string after the deprecation
of "adxl345x" compatible string.
Signed-off-by: Walter Lozano <walter@vanguardiasur.com.ar>
Acked-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Signed-off-by: Dinh Nguyen <dinguyen@opensource.altera.com>
|
|
Currently, per-blkcg data is freed each time a policy is deactivated,
that is also upon scheduler switch. However, when switching from a
scheduler implementing a policy which requires per-blkcg data to
another one, that same policy might be active on other devices, and
therefore those same per-blkcg data could be still in use.
This commit lets per-blkcg data be freed when the blkcg is freed
instead of on policy deactivation.
Signed-off-by: Arianna Avanzini <avanzini.arianna@gmail.com>
Reported-and-tested-by: Michael Kaminsky <kaminsky@cs.cmu.edu>
Fixes: e48453c3 ("block, cgroup: implement policy-specific per-blkcg data")
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
use FIELD_SIZEOF instead of open coding
Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
bio_integrity_alloc() and bio_integrity_free() assume that if a bio was
allocated from a bioset that that bioset also had its bio_integrity_pool
allocated using bioset_integrity_create(). This is a very bad
assumption given that bioset_create() and bioset_integrity_create() are
completely disjoint. Not all callers of bioset_create() have been
trained to also call bioset_integrity_create() -- and they may not care
to be.
Fix this by falling back to kmalloc'ing 'struct bio_integrity_payload'
rather than force all bioset consumers to (wastefully) preallocate a
bio_integrity_pool that they very likely won't actually need (given the
niche nature of the current block integrity support).
Otherwise, a NULL pointer "Kernel BUG" with a trace like the following
will be observed (as seen on s390x using zfcp storage) because dm-io
doesn't use bioset_integrity_create() when creating its bioset:
[ 791.643338] Call Trace:
[ 791.643339] ([<00000003df98b848>] 0x3df98b848)
[ 791.643341] [<00000000002c5de8>] bio_integrity_alloc+0x48/0xf8
[ 791.643348] [<00000000002c6486>] bio_integrity_prep+0xae/0x2f0
[ 791.643349] [<0000000000371e38>] blk_queue_bio+0x1c8/0x3d8
[ 791.643355] [<000000000036f8d0>] generic_make_request+0xc0/0x100
[ 791.643357] [<000000000036f9b2>] submit_bio+0xa2/0x198
[ 791.643406] [<000003ff801f9774>] dispatch_io+0x15c/0x3b0 [dm_mod]
[ 791.643419] [<000003ff801f9b3e>] dm_io+0x176/0x2f0 [dm_mod]
[ 791.643423] [<000003ff8074b28a>] do_reads+0x13a/0x1a8 [dm_mirror]
[ 791.643425] [<000003ff8074b43a>] do_mirror+0x142/0x298 [dm_mirror]
[ 791.643428] [<0000000000154fca>] process_one_work+0x18a/0x3f8
[ 791.643432] [<000000000015598a>] worker_thread+0x132/0x3b0
[ 791.643435] [<000000000015d49a>] kthread+0xd2/0xd8
[ 791.643438] [<00000000005bc0ca>] kernel_thread_starter+0x6/0xc
[ 791.643446] [<00000000005bc0c4>] kernel_thread_starter+0x0/0xc
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
The control flag for the bias control is MAX8973_CONTROL_BIAS_ENABLE
rather than MAX8973_BIAS_ENABLE which is macro for the bits in
register.
Fix this typo.
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
According to the datasheet 'pm', 'psr' and 'div2' should never be all 0.
Since commit 541b03ad6cfe ("ASoC: fsl_ssi: Fix the incorrect limitation of
the bit clock rate") this can happen, because for some bitclock rates
'pm' = 0 seems to be a valid choice but does not work due to hardware
restrictions. This results into a bad hardware behaviour (slow audio for
example). Feature tested on a i.MX25.
Signed-off-by: Juergen Borleis <jbe@pengutronix.de>
Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
If NO_DMA=y:
ERROR: "dma_unmap_single" [drivers/spi/spi-zynqmp-gqspi.ko] undefined!
ERROR: "dma_mapping_error" [drivers/spi/spi-zynqmp-gqspi.ko] undefined!
ERROR: "dma_map_single" [drivers/spi/spi-zynqmp-gqspi.ko] undefined!
Add a dependency on HAS_DMA to fix this.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Ranjit Waghmode <ranjitw@xilinx.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Since spidev is no more allowed to use in DT and is really loudly warned about
it I'd like to add this compatible value.
(Geert Uytterhoeven wrote: "Add the compatible value for your device to the
spidev_dt_ids[] array in drivers/spi/spidev.c.")
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
legacy setcrtc ioctl does take a 32 bit value which might indeed
overflow
the checks of crtc_req->x > INT_MAX and crtc_req->y > INT_MAX aren't
needed any more with this
v2: -polish the annotation according to Daniel's comment
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Zhao Junwang <zhjwpku@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Rather than (incompletely [0]) re-implementing drm_gem_mmap() and
drm_gem_mmap_obj() helpers, call them directly from the rockchip mmap
routines.
Once the core functions return successfully, the rockchip mmap routines
can still use dma_mmap_attrs() to simply mmap the entire buffer.
[0] Previously, we were performing the mmap() without first taking a
reference on the underlying gem buffer. This could leak ptes if the gem
object is destroyed while userspace is still holding the mapping.
Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
|
|
This more accurately reflects what these drivers actually do.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
The s390-specific virtio drivers have probably more to do with virtio
than with kvm today; let's move them out into a separate section to
reflect this and to be able to add relevant mailing lists.
CC: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
We already have VIRTIO_PCI_CAP_PCI_CFG, let's define the structure that
goes with it.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
The virtio_ring.h header is used in userspace programs (ie. QEMU),
too. Here we can not assume that sizeof(pointer) is the same as
sizeof(long), e.g. when compiling for Windows, so the typecast in
vring_init() should be done with (uintptr_t) instead of (unsigned long).
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
drivers/scsi/virtio_scsi.c: In function 'virtscsi_probe':
drivers/scsi/virtio_scsi.c:952:11: warning: unused variable 'host_prot' [-Wunused-variable]
int err, host_prot;
^
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
If dev_pm_attach_wake_irq() fails, the device's power.wakeirq field
should not be set to point to the struct wake_irq passed to that
function, as that object will be freed going forward.
For this reason, make dev_pm_attach_wake_irq() first call
device_wakeup_attach_irq() and only set the device's power.wakeirq
field if that's successful.
That requires device_wakeup_attach_irq() to be called under the
device's power.lock lock, but since dev_pm_attach_wake_irq() is
the only caller of it, the requisite changes are easy to make.
Fixes: 4990d4fe327b (PM / Wakeirq: Add automated device wake IRQ handling)
Reported-by: Felipe Balbi <balbi@ti.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
irq_data is protected by irq_desc->lock, so retrieving the irq chip
from irq_data outside the lock is racy vs. an concurrent update. Move
it into the lock held region.
While at it add a comment why the vector walk does not require
vector_lock.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: xiao jin <jin.xiao@intel.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Link: http://lkml.kernel.org/r/20150705171102.331320612@linutronix.de
|
|
It's unsafe to examine fields in the irq descriptor w/o holding the
descriptor lock. Add proper locking.
While at it add a comment why the vector check can run lock less
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: xiao jin <jin.xiao@intel.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Link: http://lkml.kernel.org/r/20150705171102.236544164@linutronix.de
|
|
Jin debugged a nasty cpu hotplug race which results in leaking a irq
vector on the newly hotplugged cpu.
cpu N cpu M
native_cpu_up device_shutdown
do_boot_cpu free_msi_irqs
start_secondary arch_teardown_msi_irqs
smp_callin default_teardown_msi_irqs
setup_vector_irq arch_teardown_msi_irq
__setup_vector_irq native_teardown_msi_irq
lock(vector_lock) destroy_irq
install vectors
unlock(vector_lock)
lock(vector_lock)
---> __clear_irq_vector
unlock(vector_lock)
lock(vector_lock)
set_cpu_online
unlock(vector_lock)
This leaves the irq vector(s) which are torn down on CPU M stale in
the vector array of CPU N, because CPU M does not see CPU N online
yet. There is a similar issue with concurrent newly setup interrupts.
The alloc/free protection of irq descriptors does not prevent the
above race, because it merily prevents interrupt descriptors from
going away or changing concurrently.
Prevent this by moving the call to setup_vector_irq() into the
vector_lock held region which protects set_cpu_online():
cpu N cpu M
native_cpu_up device_shutdown
do_boot_cpu free_msi_irqs
start_secondary arch_teardown_msi_irqs
smp_callin default_teardown_msi_irqs
lock(vector_lock) arch_teardown_msi_irq
setup_vector_irq()
__setup_vector_irq native_teardown_msi_irq
install vectors destroy_irq
set_cpu_online
unlock(vector_lock)
lock(vector_lock)
__clear_irq_vector
unlock(vector_lock)
So cpu M either sees the cpu N online before clearing the vector or
cpu N installs the vectors after cpu M has cleared it.
Reported-by: xiao jin <jin.xiao@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Link: http://lkml.kernel.org/r/20150705171102.141898931@linutronix.de
|
|
Currently the kernel API AFU dev refcounting is done on context start and stop.
This patch moves this refcounting to context init and release, bringing it
inline with how the userspace API does it.
Without this we've seen the refcounting on the AFU get out of whack between the
user and kernel API usage. This causes the AFU structures to be freed when
they are actually still in use.
This fixes some kref warnings we've been seeing and spurious ErrIVTE IRQs.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
an RCU assertion
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/e8bdc4ed0193fb2fd130f3d6b7b8023e2ec1ab62.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
SCHEDULE_USER is no longer used, and asm/context-tracking.h
contained nothing else. Remove the header entirely.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/854e9b45f69af20e26c47099eb236321563ebcee.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
On 64-bit kernels, we don't need it any more: we handle context
tracking directly on entry from user mode and exit to user mode.
On 32-bit kernels, we don't support context tracking at all, so
these callbacks had no effect.
Note: this doesn't change do_page_fault(). Before we do that,
we need to make sure that there is no code that can page fault
from kernel mode with CONTEXT_USER. The 32-bit fast system call
stack argument code is the only offender I'm aware of right now.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/ae22f4dfebd799c916574089964592be218151f9.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
assembly code
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/60e90901eee611e59e958bfdbbe39969b4f88fe5.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
There's no need for both RSI and RDI to point to the original stack.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/3a0481f809dd340c7d3f54ce3fd6d66ef2a578cd.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
To prepare for the big rewrite of the error and interrupt exit
paths, we will need pt_regs completely filled in.
It's already completely filled in when error_exit runs, so rearrange
interrupt handling to match it. This will slow down interrupt
handling very slightly (eight instructions), but the
simplification it enables will be more than worth it.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/d8a766a7f558b30e6e01352854628a2d9943460c.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
and remove old assembly code
These need to be migrated together, as the compat case used to
jump into the middle of the 64-bit exit code.
Remove the old assembly code.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/d4d1d70de08ac3640badf50048a9e8f18fe2497f.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
In 539f51136500 ("x86/asm/entry/64: Disentangle error_entry/exit
gsbase/ebx/usermode code"), I arranged the code slightly wrong
-- IRET faults would skip the code path that was intended to
execute on all error entries from user mode. Fix it up.
While we're at it, make all the labels in error_entry local.
This does not fix a bug, but we'll need it, and it slightly
shrinks the code.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/91e17891e49fa3d61357eadc451529ad48143ee1.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The current x86 entry and exit code, written in a mixture of assembly and
C code, is incomprehensible due to being open-coded in a lot of places
without coherent documentation.
It appears to work primary by luck and duct tape: i.e. obvious runtime
failures were fixed on-demand, without re-thinking the design.
Due to those reasons our confidence level in that code is low, and it is
very difficult to incrementally improve.
Add new code written in C, in preparation for simply deleting the old
entry code.
prepare_exit_to_usermode() is a new function that will handle all
slow path exits to user mode. It is called with IRQs disabled
and it leaves us in a state in which it is safe to immediately
return to user mode. IRQs must not be re-enabled at any point
after prepare_exit_to_usermode() returns and user mode is actually
entered. (We can, of course, fail to enter user mode and treat
that failure as a fresh entry to kernel mode.)
All callers of do_notify_resume() will be migrated to call
prepare_exit_to_usermode() instead; prepare_exit_to_usermode() needs
to do everything that do_notify_resume() does today, but it also
takes care of scheduling and context tracking. Unlike
do_notify_resume(), it does not need to be called in a loop.
syscall_return_slowpath() is exactly what it sounds like: it will
be called on any syscall exit slow path. It will replace
syscall_trace_leave() and it calls prepare_exit_to_usermode() on the
way out.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/c57c8b87661a4152801d7d3786eac2d1a2f209dd.1435952415.git.luto@kernel.org
[ Improved the changelog a bit. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Changing the x86 context tracking hooks is dangerous because
there are no good checks that we track our context correctly.
Add a helper to check that we're actually in CONTEXT_USER when
we enter from user mode and wire it up for syscall entries.
Subsequent patches will wire this up for all non-NMI entries as
well. NMIs are their own special beast and cannot currently
switch overall context tracking state. Instead, they have their
own special RCU hooks.
This is a tiny speedup if !CONFIG_CONTEXT_TRACKING (removes a
branch) and a tiny slowdown if CONFIG_CONTEXT_TRACING (adds a
layer of indirection). Eventually, we should fix up the core
context tracking code to supply a function that does what we
want (and can be much simpler than user_exit), which will enable
us to get rid of the extra call.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/853b42420066ec3fb856779cdc223a6dcb5d355b.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
exception entries
Other than the super-atomic exception entries, all exception
entries are supposed to switch our context tracking state to
CONTEXT_KERNEL. Assert that they do. These assertions appear
trivial at this point, as exception_enter() is the function
responsible for switching context, but I'm planning on reworking
x86's exception context tracking, and these assertions will help
make sure that all of this code keeps working.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/20fa1ee2d943233a184aaf96ff75394d3b34dfba.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The entry and exit C helpers were confusingly scattered between
ptrace.c and signal.c, even though they aren't specific to
ptrace or signal handling. Move them together in a new file.
This change just moves code around. It doesn't change anything.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/324d686821266544d8572423cc281f961da445f4.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Low-level arch entries often call notify_die(), and it's easy for
arch code to fail to exit an RCU quiescent state first. Assert
that we're not quiescent in notify_die().
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1f5fe6c23d5b432a23267102f2d72b787d80fdd8.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
This will let us sprinkle sanity checks around the kernel
without making too much of a mess.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/5da41fb2ceb29eac671f427c67040401ba2a1fa0.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Once x86 exports its do_signal(), the prototypes will clash.
Fix the clash and also improve the code a bit: remove the
unnecessary kern_do_signal() indirection. This allows
interrupt_end() to share the 'regs' parameter calculation.
Also remove the unused return code to match x86.
Minimally build and boot tested.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard.weinberger@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/67c57eac09a589bac3c6c5ff22f9623ec55a184a.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
If user code does SYSCALL32 or SYSENTER without a valid stack,
then our attempt to determine the syscall args will result in a
failed uaccess fault. Previously, we would try to recover by
jumping to the syscall exit code, but we'd run the syscall exit
work even though we never made it to the syscall entry work.
Clean it up by treating the failure path as a non-syscall entry
and exit pair.
This fixes strace's output when running the syscall_arg_fault
test. Without this fix, strace would get out of sync and would
fail to associate syscall entries with syscall exits.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/903010762c07a3d67df914fea2da84b52b0f8f1d.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|