Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Two fixes:
- a timer pause event notification was garbled upon the recent
hardening work; corrected now
- HD-audio runtime PM regression fix due to the incorrect return
type"
* tag 'sound-4.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix runtime PM
ALSA: timer: Fix pause event notification
|
|
Commit d5c435df4a890 ("intel_th: msu: Use the real device in case of IOMMU
domain allocation") changes dma buffer allocation to use the actual
underlying device, but forgets to change the deallocation path, which leads
to (if you've got CAP_SYS_RAWIO):
> # echo 0,0 > /sys/bus/intel_th/devices/0-msc0/nr_pages
> ------------[ cut here ]------------
> kernel BUG at ../linux/drivers/iommu/intel-iommu.c:3670!
> CPU: 3 PID: 231 Comm: sh Not tainted 4.17.0-rc1+ #2729
> RIP: 0010:intel_unmap+0x11e/0x130
...
> Call Trace:
> intel_free_coherent+0x3e/0x60
> msc_buffer_win_free+0x100/0x160 [intel_th_msu]
This patch fixes the buffer deallocation code to use the correct device.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Fixes: d5c435df4a890 ("intel_th: msu: Use the real device in case of IOMMU domain allocation")
Reported-by: Baofeng Tian <baofeng.tian@intel.com>
CC: stable@vger.kernel.org # v4.14+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Fengguang is running into a warning from the buddy allocator:
> swapper/0: page allocation failure: order:9, mode:0x14040c0(GFP_KERNEL|__GFP_COMP), nodemask=(null)
> CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.17.0-rc1 #262
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> Call Trace:
...
> __kmalloc+0x14b/0x180: ____cache_alloc at mm/slab.c:3127
> stm_register_device+0xf3/0x5c0: stm_register_device at drivers/hwtracing/stm/core.c:695
...
Which is basically a result of the stm class trying to allocate ~512kB
for the dummy_stm with its default parameters. There's no reason, however,
for it not to be vmalloc()ed instead, which is what this patch does.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
CC: stable@vger.kernel.org # v4.4+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kishon/linux-phy into usb-next
Kishon writes:
phy: for 4.18
*) Add PHY driver for the ATH79 USB PHY
*) Add USB3 PHY driver for Mediatek XS-PHY
*) Add QUSB/QMP V3 USB3 PHY Support for Qualcomm's SDM845
*) Add runtime PM support for mapphone PHY driver
*) Allow phy_pm_runtime_xxx API calls to accept NULL
*) Other minor cleanups and fixes
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
|
|
Commit f8df13e0a9 ("tty: Clean console safely") added code to clear
both the scrollback buffer and the screen with "\e[3J", then execution
falls through into the code to simply clear the screen. This means
scr_memsetw() and the console driver update callback are called twice
on the whole screen buffer. Let's reorganize the code so the same work
is not performed twice needlessly.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
UCR4_OREN is (depending on the configuration) enabled in startup,
but is never disabled. Fix this by disabling it in shutdown.
Reported-by: Nandor Han <nandor.han@ge.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
According to Documentation/serial/driver the shutdown function should
not disable RTS, so drop it.
Suggested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Casting a pointer to a 64-bit type causes a warning on 32-bit targets:
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c:473:24: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
lower_32_bits((uint64_t)wptr));
^
drivers/gpu/drm/amd/amdgpu/amdgpu.h:1701:53: note: in definition of macro 'WREG32'
#define WREG32(reg, v) amdgpu_mm_wreg(adev, (reg), (v), 0)
^
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c:473:10: note: in expansion of macro 'lower_32_bits'
lower_32_bits((uint64_t)wptr));
^~~~~~~~~~~~~
The correct method is to cast to 'uintptr_t'.
Fixes: d5a114a6c5f7 ("drm/amdgpu: Add GFXv9 kfd2kgd interface functions")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
- The description of 'blocking' is missing in null_blk.txt
- The 'lightnvm' parameter has been removed in null_blk.c
This updates both in null_blk.txt.
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If nvme_get_effects_log() failed the 'id' buffer from the previous
nvme_identify_ctrl() call will never be freed.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
The host nqn actually is smaller than the space reserved for it,
so we should be using strlcpy to keep KASAN happy.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Use blk_rq_nr_phys_segments() instead of blk_rq_payload_bytes() to check
if a command contains data to me mapped. This fixes the case where
a struct requests contains LBAs, but no data will actually be send,
e.g. the pending Write Zeroes support.
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Todays limit on concurrent LS's is very small - 4 buffers. With large
subsystem counts or large numbers of initiators connecting, the limit
may be exceeded.
Raise the LS buffer count to 256.
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
This patch adds simple file backed namespace support for NVMeOF target.
The new file io-cmd-file.c is responsible for handling the code for I/O
commands when ns is file backed. Also, we introduce mempools based slow
path using sync I/Os for file backed ns to ensure forward progress under
reclaim.
The old block device based implementation is moved to io-cmd-bdev.c and
use a "nvmet_bdev_" symbol prefix. The enable/disable calls are also
move into the respective files.
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
[hch: updated changelog, fixed double req->ns lookup in bdev case]
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Remove the duplicate NULL initialization for req->ns. req->ns is always
initialized to NULL in nvmet_req_init(), so there is no need to reset
it later on failures unless we have previously assigned a value to it.
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
"nvmet_check_ctrl_status()" is called from admin-cmd.c along
with io-cmd.c, make the error message more generic.
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
The whole point of the discovery controller is that it can accept
multiple connections. Additionally the cmic field is not even defined for
the discovery controller identify page.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
When connecting to the discovery controller we have certain defaults
to observe, so centralize them to avoid inconsistencies due to argument
ordering.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
After creating the nvme controller, nvmf_create_ctrl() validates
the newly created subsysnqn vs the one specified by the options.
In general, this is an unnecessary check as the Connect message
should implicitly ensure this value matches.
With the change to the FC transport to do an asynchronous connect
for the first association create, the transport will return to
nvmf_create_ctrl() before that first association has been established,
thus the subnqn will not yet be set.
Remove the unnecessary validation.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Current code will set DNR if the controller is deleting or there is
an error during controller init. None of this is necessary.
Remove the code that sets DNR
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
For any failure after nvme_rdma_start_queue in
nvme_rdma_configure_admin_queue, the admin queue will be freed with the
NVME_RDMA_Q_LIVE flag still set. Once nvme_rdma_stop_queue is invoked,
that will cause a use-after-free.
BUG: KASAN: use-after-free in rdma_disconnect+0x1f/0xe0 [rdma_cm]
To fix it, call nvme_rdma_stop_queue for all the failed cases after
nvme_rdma_start_queue.
Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Suggested-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
The nvme timeout handling doesn't do anything if the pci channel is
offline, which is the case when recovering from PCI error event, so it
was a bad idea to sync the controller reset in this state. This patch
flushes the reset work in the error_resume callback instead when the
channel is back to online. This keeps AER handling serialized and
can recover from timeouts.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199757
Fixes: cc1d5e749a2e ("nvme/pci: Sync controller reset for AER slot_reset")
Reported-by: Alex Gagniuc <mr.nuke.me@gmail.com>
Tested-by: Alex Gagniuc <mr.nuke.me@gmail.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Set cq_vector after alloc cq/sq, otherwise nvme_suspend_queue will invoke
free_irq for it and cause a 'Trying to free already-free IRQ xxx'
warning if the create CQ/SQ command times out.
Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
[hch: fixed to pass a s16 and clean up the comment]
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
When a system is under memory presure (high usage with fragments),
the original 256KB ICM chunk allocations will likely trigger kernel
memory management to enter slow path doing memory compact/migration
ops in order to complete high order memory allocations.
When that happens, user processes calling uverb APIs may get stuck
for more than 120s easily even though there are a lot of free pages
in smaller chunks available in the system.
Syslog:
...
Dec 10 09:04:51 slcc03db02 kernel: [397078.572732] INFO: task
oracle_205573_e:205573 blocked for more than 120 seconds.
...
With 4KB ICM chunk size on x86_64 arch, the above issue is fixed.
However in order to support smaller ICM chunk size, we need to fix
another issue in large size kcalloc allocations.
E.g.
Setting log_num_mtt=30 requires 1G mtt entries. With the 4KB ICM chunk
size, each ICM chunk can only hold 512 mtt entries (8 bytes for each mtt
entry). So we need a 16MB allocation for a table->icm pointer array to
hold 2M pointers which can easily cause kcalloc to fail.
The solution is to use kvzalloc to replace kcalloc which will fall back
to vmalloc automatically if kmalloc fails.
Signed-off-by: Qing Huang <qing.huang@oracle.com>
Acked-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Huaisheng Ye <yehs1@lenovo.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
|
|
On platforms where the Low Power S0 Idle _DSM interface is used,
on wakeup from suspend-to-idle, when it is known that the ACPI SCI
has triggered while suspended, dispatch the EC GPE in order to catch
all EC events that may have triggered the wakeup before carrying out
the noirq phase of device resume.
That is needed to handle power button wakeup on some platforms where
the EC goes into a low-power mode during suspend-to-idle and while in
that mode it will discard events after a timeout. If that timeout is
shorter than the time it takes to complete the noirq resume of
devices, looking for EC events after the latter is too late.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reported-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Wendy Wang <wendy.wang@intel.com>
|
|
Introduce acpi_dispatch_gpe() as a wrapper around acpi_ev_detect_gpe()
for checking if the given GPE (as represented by a GPE device handle
and a GPE number) is currently active and dispatching it (if that's
the case) outside of interrupt context.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Provide __dasd_cancel_req that is called with the ccw device lock
held to simplify the locking in dasd_times_out. Also this removes
the following sparse warning:
context imbalance in 'dasd_times_out' - different lock contexts for basic block
Note: with this change dasd_schedule_device_bh is now called (via
dasd_cancel_req) with the ccw device lock held. But is is already
the case for other codepaths.
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Reviewed-by: Stefan Haberland <sth@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Add the trivial owner_on_cpu() helper for rwsem_can_spin_on_owner() and
rwsem_spin_on_owner(), it also allows to make rwsem_can_spin_on_owner()
a bit more clear.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Waiman Long <longman@redhat.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jan Kara <jack@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Theodore Y. Ts'o <tytso@mit.edu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180518165534.GA22348@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Store user space frame-pointer value (BP register) into the perf trace
on a sample for a process so the value becomes available when
unwinding call stacks for functions gaining event samples.
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/311d4a34-f81b-5535-3385-01427ac73b41@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
PERF_EVENT_IOC_MODIFY_ATTRIBUTES
Since pointer size is different in compat, and switching in _perf_ioctl
is done using exact ioctl numbers, all new ioctl numbers that use pointer
should be added to perf_compat_ioctl for _IOC_SIZE fixup before passing
to perf_ioctl routine (this shouldn't be needed if semantics of the size
argument of _IO* macros was honored).
Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20180521123420.GA24291@asgard.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
As Miklos reported and suggested:
"This pattern repeats two times in trace_uprobe.c and in
kernel/events/core.c as well:
ret = kern_path(filename, LOOKUP_FOLLOW, &path);
if (ret)
goto fail_address_parse;
inode = igrab(d_inode(path.dentry));
path_put(&path);
And it's wrong. You can only hold a reference to the inode if you
have an active ref to the superblock as well (which is normally
through path.mnt) or holding s_umount.
This way unmounting the containing filesystem while the tracepoint is
active will give you the "VFS: Busy inodes after unmount..." message
and a crash when the inode is finally put.
Solution: store path instead of inode."
This patch fixes the issue in kernel/event/core.c.
Reviewed-and-tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reported-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <kernel-team@fb.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: 375637bc5249 ("perf/core: Introduce address range filtering")
Link: http://lkml.kernel.org/r/20180418062907.3210386-2-songliubraving@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
When hw and sw events are mixed in the same group, they are all attached
to the hw perf_event_context. This sometimes requires moving group of
perf_event to a different context.
We found a bug in how the kernel handles this, for example if we do:
perf stat -e '{faults,ref-cycles,faults}' -I 1000
1.005591180 1,297 faults
1.005591180 457,476,576 ref-cycles
1.005591180 <not supported> faults
First, sw event "faults" is attached to the sw context, and becomes the
group leader. Then, hw event "ref-cycles" is attached, so both events
are moved to the hw context. Last, another sw "faults" tries to attach,
but it fails because of mismatch between the new target ctx (from sw
pmu) and the group_leader's ctx (hw context, same as ref-cycles).
The broken condition is:
group_leader is sw event;
group_leader is on hw context;
add a sw event to the group.
Fix this scenario by checking group_leader's context (instead of just
event type). If group_leader is on hw context, use the ->pmu of this
context to look up context for the new event.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <kernel-team@fb.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: b04243ef7006 ("perf: Complete software pmu grouping")
Link: http://lkml.kernel.org/r/20180503194716.162815-1-songliubraving@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
When a task is enqueued the estimated utilization of a CPU is updated
to better support the selection of the required frequency.
However, schedutil is (implicitly) updated by update_load_avg() which
always happens before util_est_{en,de}queue(), thus potentially
introducing a latency between estimated utilization updates and
frequency selections.
Let's update util_est at the beginning of enqueue_task_fair(),
which will ensure that all schedutil updates will see the most
updated estimated utilization value for a CPU.
Reported-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
Cc: Steve Muckle <smuckle@google.com>
Fixes: 7f65ea42eb00 ("sched/fair: Add util_est on top of PELT")
Link: http://lkml.kernel.org/r/20180524141023.13765-3-patrick.bellasi@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
utilization
Since the refactoring introduced by:
commit 8f111bc357aa ("cpufreq/schedutil: Rewrite CPUFREQ_RT support")
we aggregate FAIR utilization only if this class has runnable tasks.
This was mainly due to avoid the risk to stay on an high frequency just
because of the blocked utilization of a CPU not being properly decayed
while the CPU was idle.
However, since:
commit 31e77c93e432 ("sched/fair: Update blocked load when newly idle")
the FAIR blocked utilization is properly decayed also for IDLE CPUs.
This allows us to use the FAIR blocked utilization as a safe mechanism
to gracefully reduce the frequency only if no FAIR tasks show up on a
CPU for a reasonable period of time.
Moreover, we also reduce the frequency drops of CPUs running periodic
tasks which, depending on the task periodicity and the time required
for a frequency switch, was increasing the chances to introduce some
undesirable performance variations.
Reported-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
Cc: Steve Muckle <smuckle@google.com>
Link: http://lkml.kernel.org/r/20180524141023.13765-2-patrick.bellasi@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Since the following commit:
b91473ff6e97 ("sched,tracing: Update trace_sched_pi_setprio()")
the sched_pi_setprio trace point shows the "newprio" during a deboost:
|futex sched_pi_setprio: comm=futex_requeue_p pid"34 oldprio newprio=3D98
|futex sched_switch: prev_comm=futex_requeue_p prev_pid"34 prev_prio=120
This patch open codes __rt_effective_prio() in the tracepoint as the
'newprio' to get the old behaviour back / the correct priority:
|futex sched_pi_setprio: comm=futex_requeue_p pid"20 oldprio newprio=3D120
|futex sched_switch: prev_comm=futex_requeue_p prev_pid"20 prev_prio=120
Peter suggested to open code the new priority so people using tracehook
could get the deadline data out.
Reported-by: Mansky Christian <man@keba.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: b91473ff6e97 ("sched,tracing: Update trace_sched_pi_setprio()")
Link: http://lkml.kernel.org/r/20180524132647.gg6ziuogczdmjjzu@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The following commit:
85f1abe0019f ("kthread, sched/wait: Fix kthread_parkme() completion issue")
added a WARN() in the case where we call kthread_park() on an already
parked thread, because the old code wasn't doing the right thing there
and it wasn't at all clear that would happen.
It turns out, this does in fact happen, so we have to deal with it.
Instead of potentially returning early, also wait for the completion.
This does however mean we have to use complete_all() and re-initialize
the completion on re-use.
Reported-by: LKP <lkp@01.org>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel test robot <lkp@intel.com>
Cc: wfg@linux.intel.com
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 85f1abe0019f ("kthread, sched/wait: Fix kthread_parkme() completion issue")
Link: http://lkml.kernel.org/r/20180504091142.GI12235@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
When scheduler debug is enabled, building scheduling domains outputs
information about how the domains are laid out and to which root domain
each CPU (or sets of CPUs) belongs, e.g.:
CPU0 attaching sched-domain(s):
domain-0: span=0-5 level=MC
groups: 0:{ span=0 }, 1:{ span=1 }, 2:{ span=2 }, 3:{ span=3 }, 4:{ span=4 }, 5:{ span=5 }
CPU1 attaching sched-domain(s):
domain-0: span=0-5 level=MC
groups: 1:{ span=1 }, 2:{ span=2 }, 3:{ span=3 }, 4:{ span=4 }, 5:{ span=5 }, 0:{ span=0 }
[...]
span: 0-5 (max cpu_capacity = 1024)
The fact that latest line refers to CPUs 0-5 root domain doesn't however look
immediately obvious to me: one might wonder why span 0-5 is reported "again".
Make it more clear by adding "root domain" to it, as to end with the
following:
CPU0 attaching sched-domain(s):
domain-0: span=0-5 level=MC
groups: 0:{ span=0 }, 1:{ span=1 }, 2:{ span=2 }, 3:{ span=3 }, 4:{ span=4 }, 5:{ span=5 }
CPU1 attaching sched-domain(s):
domain-0: span=0-5 level=MC
groups: 1:{ span=1 }, 2:{ span=2 }, 3:{ span=3 }, 4:{ span=4 }, 5:{ span=5 }, 0:{ span=0 }
[...]
root domain span: 0-5 (max cpu_capacity = 1024)
Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Patrick Bellasi <patrick.bellasi@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180524152936.17611-1-juri.lelli@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
"id" needs to be signed for the error handling to work.
Fixes: 7a2d5c77c558 ("drm/exynos: fimc: Convert driver to IPP v2 core API")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
|
|
drivers/gpu/drm/exynos/exynos_drm_scaler.c:402 scaler_task_done()
warn: signedness bug returning '(-22)'
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
|
|
qcom_scm_call_atomic1() can crash with a NULL pointer dereference at
qcom_scm_call_atomic1+0x30/0x48.
disassembly of qcom_scm_call_atomic1():
...
<0xc08d73b0 <+12>: ldr r3, [r12]
... (no instruction explicitly modifies r12)
0xc08d73cc <+40>: smc 0
... (no instruction explicitly modifies r12)
0xc08d73d4 <+48>: ldr r3, [r12] <- crashing instruction
...
Since the first ldr is successful, and since r12 isn't explicitly
modified by any instruction between the first and the second ldr,
it must have been modified by the smc call, which is ok,
since r12 is caller save according to the AAPCS.
Add r12 to the clobber list so that the compiler knows that the
callee potentially overwrites the value in r12.
Clobber descriptions may not in any way overlap with an input or
output operand.
Signed-off-by: Niklas Cassel <niklas.cassel@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
|
|
In commit 624dbf55a359b ("driver/net: enic: Try DMA 64 first, then
failover to DMA") DMA mask was changed from 40 bits to 64 bits.
Hardware actually supports only 47 bits.
Fixes: 624dbf55a359b ("driver/net: enic: Try DMA 64 first, then failover to DMA")
Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The PPPIOCDETACH ioctl effectively tries to "close" the given ppp file
before f_count has reached 0, which is fundamentally a bad idea. It
does check 'f_count < 2', which excludes concurrent operations on the
file since they would only be possible with a shared fd table, in which
case each fdget() would take a file reference. However, it fails to
account for the fact that even with 'f_count == 1' the file can still be
linked into epoll instances. As reported by syzbot, this can trivially
be used to cause a use-after-free.
Yet, the only known user of PPPIOCDETACH is pppd versions older than
ppp-2.4.2, which was released almost 15 years ago (November 2003).
Also, PPPIOCDETACH apparently stopped working reliably at around the
same time, when the f_count check was added to the kernel, e.g. see
https://lkml.org/lkml/2002/12/31/83. Also, the current 'f_count < 2'
check makes PPPIOCDETACH only work in single-threaded applications; it
always fails if called from a multithreaded application.
All pppd versions released in the last 15 years just close() the file
descriptor instead.
Therefore, instead of hacking around this bug by exporting epoll
internals to modules, and probably missing other related bugs, just
remove the PPPIOCDETACH ioctl and see if anyone actually notices. Leave
a stub in place that prints a one-time warning and returns EINVAL.
Reported-by: syzbot+16363c99d4134717c05b@syzkaller.appspotmail.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Guillaume Nault <g.nault@alphalink.fr>
Tested-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A precondition check in ip_recv_error triggered on an otherwise benign
race. Remove the warning.
The warning triggers when passing an ipv6 socket to this ipv4 error
handling function. RaceFuzzer was able to trigger it due to a race
in setsockopt IPV6_ADDRFORM.
---
CPU0
do_ipv6_setsockopt
sk->sk_socket->ops = &inet_dgram_ops;
---
CPU1
sk->sk_prot->recvmsg
udp_recvmsg
ip_recv_error
WARN_ON_ONCE(sk->sk_family == AF_INET6);
---
CPU0
do_ipv6_setsockopt
sk->sk_family = PF_INET;
This socket option converts a v6 socket that is connected to a v4 peer
to an v4 socket. It updates the socket on the fly, changing fields in
sk as well as other structs. This is inherently non-atomic. It races
with the lockless udp_recvmsg path.
No other code makes an assumption that these fields are updated
atomically. It is benign here, too, as ip_recv_error cares only about
the protocol of the skbs enqueued on the error queue, for which
sk_family is not a precise predictor (thanks to another isue with
IPV6_ADDRFORM).
Link: http://lkml.kernel.org/r/20180518120826.GA19515@dragonet.kaist.ac.kr
Fixes: 7ce875e5ecb8 ("ipv4: warn once on passing AF_INET6 socket to ip_recv_error")
Reported-by: DaeRyong Jeong <threeearcat@gmail.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When dealing with ingress rule on a netdev, if we did fine through the
conventional path, there's no need to continue into the egdev route,
and we can stop right there.
Not doing so may cause a 2nd rule to be added by the cls api layer
with the ingress being the egdev.
For example, under sriov switchdev scheme, a user rule of VFR A --> VFR B
will end up with two HW rules (1) VF A --> VF B and (2) uplink --> VF B
Fixes: 208c0f4b5237 ('net: sched: use tc_setup_cb_call to call per-block callbacks')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
DaeRyong Jeong reports a race between vhost_dev_cleanup() and
vhost_process_iotlb_msg():
Thread interleaving:
CPU0 (vhost_process_iotlb_msg) CPU1 (vhost_dev_cleanup)
(In the case of both VHOST_IOTLB_UPDATE and
VHOST_IOTLB_INVALIDATE)
===== =====
vhost_umem_clean(dev->iotlb);
if (!dev->iotlb) {
ret = -EFAULT;
break;
}
dev->iotlb = NULL;
The reason is we don't synchronize between them, fixing by protecting
vhost_process_iotlb_msg() with dev mutex.
Reported-by: DaeRyong Jeong <threeearcat@gmail.com>
Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|