Age | Commit message (Collapse) | Author |
|
The warning was intended to spot complete_all() users from hardirq
context on PREEMPT_RT. The warning as-is will also trigger in interrupt
handlers, which are threaded on PREEMPT_RT, which was not intended.
Use lockdep_assert_RT_in_threaded_ctx() which triggers in non-preemptive
context on PREEMPT_RT.
Fixes: a5c6234e1028 ("completion: Use simple wait queues")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200323152019.4qjwluldohuh3by5@linutronix.de
|
|
Make it so that CEPH_MSG_DATA_PAGES data item can own pages,
fixing a bunch of memory leaks for a page vector allocated in
alloc_msg_with_page_vector(). Currently, only watch-notify
messages trigger this allocation, and normally the page vector
is freed either in handle_watch_notify() or by the caller of
ceph_osdc_notify(). But if the message is freed before that
(e.g. if the session faults while reading in the message or
if the notify is stale), we leak the page vector.
This was supposed to be fixed by switching to a message-owned
pagelist, but that never happened.
Fixes: 1907920324f1 ("libceph: support for sending notifies")
Reported-by: Roman Penyaev <rpenyaev@suse.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Roman Penyaev <rpenyaev@suse.de>
|
|
CEPH_OSDMAP_FULL/NEARFULL aren't set since mimic, so we need to consult
per-pool flags as well. Unfortunately the backwards compatibility here
is lacking:
- the change that deprecated OSDMAP_FULL/NEARFULL went into mimic, but
was guarded by require_osd_release >= RELEASE_LUMINOUS
- it was subsequently backported to luminous in v12.2.2, but that makes
no difference to clients that only check OSDMAP_FULL/NEARFULL because
require_osd_release is not client-facing -- it is for OSDs
Since all kernels are affected, the best we can do here is just start
checking both map flags and pool flags and send that to stable.
These checks are best effort, so take osdc->lock and look up pool flags
just once. Remove the FIXME, since filesystem quotas are checked above
and RADOS quotas are reflected in POOL_FLAG_FULL: when the pool reaches
its quota, both POOL_FLAG_FULL and POOL_FLAG_FULL_QUOTA are set.
Cc: stable@vger.kernel.org
Reported-by: Yanhu Cao <gmayyyha@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Sage Weil <sage@redhat.com>
|
|
Don't let non-letters inside a literal block without escaping it, as
the toolchain would mis-interpret it:
./include/linux/i2c.h:518: WARNING: Inline strong start-string without end-string.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
We need the USB fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We need the staging/iio fixes in here as well
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Merge misc fixes from Andrew Morton:
"10 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
x86/mm: split vmalloc_sync_all()
mm, slub: prevent kmalloc_node crashes and memory leaks
mm/mmu_notifier: silence PROVE_RCU_LIST warnings
epoll: fix possible lost wakeup on epoll_ctl() path
mm: do not allow MADV_PAGEOUT for CoW pages
mm, memcg: throttle allocators based on ancestral memory.high
mm, memcg: fix corruption on 64-bit divisor in memory.high throttling
page-flags: fix a crash at SetPageError(THP_SWAP)
mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case
memcg: fix NULL pointer dereference in __mem_cgroup_usage_unregister_event
|
|
Commit 3f8fd02b1bf1 ("mm/vmalloc: Sync unmappings in
__purge_vmap_area_lazy()") introduced a call to vmalloc_sync_all() in
the vunmap() code-path. While this change was necessary to maintain
correctness on x86-32-pae kernels, it also adds additional cycles for
architectures that don't need it.
Specifically on x86-64 with CONFIG_VMAP_STACK=y some people reported
severe performance regressions in micro-benchmarks because it now also
calls the x86-64 implementation of vmalloc_sync_all() on vunmap(). But
the vmalloc_sync_all() implementation on x86-64 is only needed for newly
created mappings.
To avoid the unnecessary work on x86-64 and to gain the performance
back, split up vmalloc_sync_all() into two functions:
* vmalloc_sync_mappings(), and
* vmalloc_sync_unmappings()
Most call-sites to vmalloc_sync_all() only care about new mappings being
synchronized. The only exception is the new call-site added in the
above mentioned commit.
Shile Zhang directed us to a report of an 80% regression in reaim
throughput.
Fixes: 3f8fd02b1bf1 ("mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()")
Reported-by: kernel test robot <oliver.sang@intel.com>
Reported-by: Shile Zhang <shile.zhang@linux.alibaba.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Borislav Petkov <bp@suse.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [GHES]
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20191009124418.8286-1-joro@8bytes.org
Link: https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/4D3JPPHBNOSPFK2KEPC6KGKS6J25AIDB/
Link: http://lkml.kernel.org/r/20191113095530.228959-1-shile.zhang@linux.alibaba.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit bd4c82c22c36 ("mm, THP, swap: delay splitting THP after swapped
out") supported writing THP to a swap device but forgot to upgrade an
older commit df8c94d13c7e ("page-flags: define behavior of FS/IO-related
flags on compound pages") which could trigger a crash during THP
swapping out with DEBUG_VM_PGFLAGS=y,
kernel BUG at include/linux/page-flags.h:317!
page dumped because: VM_BUG_ON_PAGE(1 && PageCompound(page))
page:fffff3b2ec3a8000 refcount:512 mapcount:0 mapping:000000009eb0338c index:0x7f6e58200 head:fffff3b2ec3a8000 order:9 compound_mapcount:0 compound_pincount:0
anon flags: 0x45fffe0000d8454(uptodate|lru|workingset|owner_priv_1|writeback|head|reclaim|swapbacked)
end_swap_bio_write()
SetPageError(page)
VM_BUG_ON_PAGE(1 && PageCompound(page))
<IRQ>
bio_endio+0x297/0x560
dec_pending+0x218/0x430 [dm_mod]
clone_endio+0xe4/0x2c0 [dm_mod]
bio_endio+0x297/0x560
blk_update_request+0x201/0x920
scsi_end_request+0x6b/0x4b0
scsi_io_completion+0x509/0x7e0
scsi_finish_command+0x1ed/0x2a0
scsi_softirq_done+0x1c9/0x1d0
__blk_mqnterrupt+0xf/0x20
</IRQ>
Fix by checking PF_NO_TAIL in those places instead.
Fixes: bd4c82c22c36 ("mm, THP, swap: delay splitting THP after swapped out")
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: "Huang, Ying" <ying.huang@intel.com>
Acked-by: Rafael Aquini <aquini@redhat.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200310235846.1319-1-cai@lca.pw
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
'kfree_rcu.2020.02.20a', 'locktorture.2020.02.20a', 'ovld.2020.02.20a', 'rcu-tasks.2020.02.20a', 'srcu.2020.02.20a' and 'torture.2020.02.20a' into HEAD
doc.2020.02.27a: Documentation updates.
fixes.2020.03.21a: Miscellaneous fixes.
kfree_rcu.2020.02.20a: Updates to kfree_rcu().
locktorture.2020.02.20a: Lock torture-test updates.
ovld.2020.02.20a: Updates to callback-overload handling.
rcu-tasks.2020.02.20a: RCU-tasks updates.
srcu.2020.02.20a: SRCU updates.
torture.2020.02.20a: Torture-test updates.
|
|
Currently, rcu_barrier() ignores offline CPUs, However, it is possible
for an offline no-CBs CPU to have callbacks queued, and rcu_barrier()
must wait for those callbacks. This commit therefore makes rcu_barrier()
directly invoke the rcu_barrier_func() with interrupts disabled for such
CPUs. This requires passing the CPU number into this function so that
it can entrain the rcu_barrier() callback onto the correct CPU's callback
list, given that the code must instead execute on the current CPU.
While in the area, this commit fixes a bug where the first CPU's callback
might have been invoked before rcu_segcblist_entrain() returned, which
would also result in an early wakeup.
Fixes: 5d6742b37727 ("rcu/nocb: Use rcu_segcblist for no-CBs CPUs")
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
[ paulmck: Apply optimization feedback from Boqun Feng. ]
Cc: <stable@vger.kernel.org> # 5.5.x
|
|
Commit bbbdeb4720a0 ("io_uring: dual license io_uring.h uapi header")
uses a nested SPDX-License-Identifier to dual license the header.
Since then, ./scripts/spdxcheck.py complains:
include/uapi/linux/io_uring.h: 1:60 Missing parentheses: OR
Add parentheses to make spdxcheck.py happy.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Pull io_uring fixes from Jens Axboe:
"Two different fixes in here:
- Fix for a potential NULL pointer deref for links with async or
drain marked (Pavel)
- Fix for not properly checking RLIMIT_NOFILE for async punted
operations.
This affects openat/openat2, which were added this cycle, and
accept4. I did a full audit of other cases where we might check
current->signal->rlim[] and found only RLIMIT_FSIZE for buffered
writes and fallocate. That one is fixed and queued for 5.7 and
marked stable"
* tag 'io_uring-5.6-20200320' of git://git.kernel.dk/linux-block:
io_uring: make sure accept honor rlimit nofile
io_uring: make sure openat/openat2 honor rlimit nofile
io_uring: NULL-deref for IOSQE_{ASYNC,DRAIN}
|
|
The futex UAPI changed back in commit 76b81e2b0e22 ("[PATCH] lightweight
robust futexes updates 2"), which landed in v2.6.17: FUTEX_TID_MASK is now
0x3fffffff instead of 0x1fffffff. Update the corresponding comment in
include/linux/threads.h.
Documentation mentions that only the lower 29 bits are available for TID
storage, but as the UAPI header released the bit already via
FUTEX_TID_MASK, this is moot as well. Fix it up.
[ tglx: Fixed up documentation ]
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200302112939.8068-1-jannh@google.com
|
|
Splitting run_posix_cpu_timers() into two parts is work in progress which
is stuck on other entry code related problems. The heavy lifting which
involves locking of sighand lock will be moved into task context so the
necessary execution time is burdened on the task and not on interrupt
context.
Until this work completes lockdep with the spinlock nesting rules enabled
would emit warnings for this known context.
Prevent it by setting "->irq_config = 1" for the invocation of
run_posix_cpu_timers() so lockdep does not complain when sighand lock is
acquried. This will be removed once the split is completed.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200321113242.751182723@linutronix.de
|
|
Mark irq_work items with IRQ_WORK_HARD_IRQ which should be invoked in
hardirq context even on PREEMPT_RT. IRQ_WORK without this flag will be
invoked in softirq context on PREEMPT_RT.
Set ->irq_config to 1 for the IRQ_WORK items which are invoked in softirq
context so lockdep knows that these can safely acquire a spinlock_t.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200321113242.643576700@linutronix.de
|
|
Set current->irq_config = 1 for hrtimers which are not marked to expire in
hard interrupt context during hrtimer_init(). These timers will expire in
softirq context on PREEMPT_RT.
Setting this allows lockdep to differentiate these timers. If a timer is
marked to expire in hard interrupt context then the timer callback is not
supposed to acquire a regular spinlock instead of a raw_spinlock in the
expiry callback.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200321113242.534508206@linutronix.de
|
|
Extend lockdep to validate lock wait-type context.
The current wait-types are:
LD_WAIT_FREE, /* wait free, rcu etc.. */
LD_WAIT_SPIN, /* spin loops, raw_spinlock_t etc.. */
LD_WAIT_CONFIG, /* CONFIG_PREEMPT_LOCK, spinlock_t etc.. */
LD_WAIT_SLEEP, /* sleeping locks, mutex_t etc.. */
Where lockdep validates that the current lock (the one being acquired)
fits in the current wait-context (as generated by the held stack).
This ensures that there is no attempt to acquire mutexes while holding
spinlocks, to acquire spinlocks while holding raw_spinlocks and so on. In
other words, its a more fancy might_sleep().
Obviously RCU made the entire ordeal more complex than a simple single
value test because RCU can be acquired in (pretty much) any context and
while it presents a context to nested locks it is not the same as it
got acquired in.
Therefore its necessary to split the wait_type into two values, one
representing the acquire (outer) and one representing the nested context
(inner). For most 'normal' locks these two are the same.
[ To make static initialization easier we have the rule that:
.outer == INV means .outer == .inner; because INV == 0. ]
It further means that its required to find the minimal .inner of the held
stack to compare against the outer of the new lock; because while 'normal'
RCU presents a CONFIG type to nested locks, if it is taken while already
holding a SPIN type it obviously doesn't relax the rules.
Below is an example output generated by the trivial test code:
raw_spin_lock(&foo);
spin_lock(&bar);
spin_unlock(&bar);
raw_spin_unlock(&foo);
[ BUG: Invalid wait context ]
-----------------------------
swapper/0/1 is trying to lock:
ffffc90000013f20 (&bar){....}-{3:3}, at: kernel_init+0xdb/0x187
other info that might help us debug this:
1 lock held by swapper/0/1:
#0: ffffc90000013ee0 (&foo){+.+.}-{2:2}, at: kernel_init+0xd1/0x187
The way to read it is to look at the new -{n,m} part in the lock
description; -{3:3} for the attempted lock, and try and match that up to
the held locks, which in this case is the one: -{2,2}.
This tells that the acquiring lock requires a more relaxed environment than
presented by the lock stack.
Currently only the normal locks and RCU are converted, the rest of the
lockdep users defaults to .inner = INV which is ignored. More conversions
can be done when desired.
The check for spinlock_t nesting is not enabled by default. It's a separate
config option for now as there are known problems which are currently
addressed. The config option allows to identify these problems and to
verify that the solutions found are indeed solving them.
The config switch will be removed and the checks will permanently enabled
once the vast majority of issues has been addressed.
[ bigeasy: Move LD_WAIT_FREE,… out of CONFIG_LOCKDEP to avoid compile
failure with CONFIG_DEBUG_SPINLOCK + !CONFIG_LOCKDEP]
[ tglx: Add the config option ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200321113242.427089655@linutronix.de
|
|
completion uses a wait_queue_head_t to enqueue waiters.
wait_queue_head_t contains a spinlock_t to protect the list of waiters
which excludes it from being used in truly atomic context on a PREEMPT_RT
enabled kernel.
The spinlock in the wait queue head cannot be replaced by a raw_spinlock
because:
- wait queues can have custom wakeup callbacks, which acquire other
spinlock_t locks and have potentially long execution times
- wake_up() walks an unbounded number of list entries during the wake up
and may wake an unbounded number of waiters.
For simplicity and performance reasons complete() should be usable on
PREEMPT_RT enabled kernels.
completions do not use custom wakeup callbacks and are usually single
waiter, except for a few corner cases.
Replace the wait queue in the completion with a simple wait queue (swait),
which uses a raw_spinlock_t for protecting the waiter list and therefore is
safe to use inside truly atomic regions on PREEMPT_RT.
There is no semantical or functional change:
- completions use the exclusive wait mode which is what swait provides
- complete() wakes one exclusive waiter
- complete_all() wakes all waiters while holding the lock which protects
the wait queue against newly incoming waiters. The conversion to swait
preserves this behaviour.
complete_all() might cause unbound latencies with a large number of waiters
being woken at once, but most complete_all() usage sites are either in
testing or initialization code or have only a really small number of
concurrent waiters which for now does not cause a latency problem. Keep it
simple for now.
The fixup of the warning check in the USB gadget driver is just a straight
forward conversion of the lockless waiter check from one waitqueue type to
the other.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lkml.kernel.org/r/20200321113242.317954042@linutronix.de
|
|
Extend rcuwait_wait_event() with a state variable so that it is not
restricted to UNINTERRUPTIBLE waits.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200321113241.824030968@linutronix.de
|
|
In order to avoid future header hell, remove the inclusion of
proc_fs.h from acpi_bus.h. All it needs is a forward declaration of a
struct.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lkml.kernel.org/r/20200321113241.246190285@linutronix.de
|
|
Just like commit 4022e7af86be, this fixes the fact that
IORING_OP_ACCEPT ends up using get_unused_fd_flags(), which checks
current->signal->rlim[] for limits.
Add an extra argument to __sys_accept4_file() that allows us to pass
in the proper nofile limit, and grab it at request prep time.
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Dmitry reports that a test case shows that io_uring isn't honoring a
modified rlimit nofile setting. get_unused_fd_flags() checks the task
signal->rlimi[] for the limits. As this isn't easily inheritable,
provide a __get_unused_fd_flags() that takes the value instead. Then we
can grab it when the request is prepared (from the original task), and
pass that in when we do the async part part of the open.
Reported-by: Dmitry Kadashev <dkadashev@gmail.com>
Tested-by: Dmitry Kadashev <dkadashev@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Sofar we have been unable to get permission from the vendors to put the
firmware for touchscreens listed in touchscreen_dmi in linux-firmware.
Some of the tablets with such a touchscreen have a touchscreen driver, and
thus a copy of the firmware, as part of their EFI code.
This commit adds the necessary info for the new EFI embedded-firmware code
to extract these firmwares, making the touchscreen work OOTB without the
user needing to manually add the firmware.
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20200115163554.101315-10-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
In some cases the platform's main firmware (e.g. the UEFI fw) may contain
an embedded copy of device firmware which needs to be (re)loaded into the
peripheral. Normally such firmware would be part of linux-firmware, but in
some cases this is not feasible, for 2 reasons:
1) The firmware is customized for a specific use-case of the chipset / use
with a specific hardware model, so we cannot have a single firmware file
for the chipset. E.g. touchscreen controller firmwares are compiled
specifically for the hardware model they are used with, as they are
calibrated for a specific model digitizer.
2) Despite repeated attempts we have failed to get permission to
redistribute the firmware. This is especially a problem with customized
firmwares, these get created by the chip vendor for a specific ODM and the
copyright may partially belong with the ODM, so the chip vendor cannot
give a blanket permission to distribute these.
This commit adds a new platform fallback mechanism to the firmware loader
which will try to lookup a device fw copy embedded in the platform's main
firmware if direct filesystem lookup fails.
Drivers which need such embedded fw copies can enable this fallback
mechanism by using the new firmware_request_platform() function.
Note that for now this is only supported on EFI platforms and even on
these platforms firmware_fallback_platform() only works if
CONFIG_EFI_EMBEDDED_FIRMWARE is enabled (this gets selected by drivers
which need this), in all other cases firmware_fallback_platform() simply
always returns -ENOENT.
Reported-by: Dave Olsthoorn <dave@bewaar.me>
Suggested-by: Peter Jones <pjones@redhat.com>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20200115163554.101315-5-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi into driver-core-next
Ard writes:
Stable shared branch between EFI and driver tree
Stable shared branch to ease the integration of Hans's series to support
device firmware loaded from EFI boot service memory regions.
[PATCH v12 00/10] efi/firmware/platform-x86: Add EFI embedded fw support
https://lore.kernel.org/linux-efi/20200115163554.101315-1-hdegoede@redhat.com/
* tag 'stable-shared-branch-for-driver-tree' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efi: Add embedded peripheral firmware support
efi: Export boot-services code and data as debugfs-blobs
|
|
The task->flags is a 32-bits flag, in which 31 bits have already been
consumed. So it is hardly to introduce other new per process flag.
Currently there're still enough spaces in the bit-field section of
task_struct, so we can define the memstall state as a single bit in
task_struct instead.
This patch also removes an out-of-date comment pointed by Matthew.
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Link: https://lkml.kernel.org/r/1584408485-1921-1-git-send-email-laoar.shao@gmail.com
|
|
When switching tasks running on a CPU, the psi state of a cgroup
containing both of these tasks does not change. Right now, we don't
exploit that, and can perform many unnecessary state changes in nested
hierarchies, especially when most activity comes from one leaf cgroup.
This patch implements an optimization where we only update cgroups
whose state actually changes during a task switch. These are all
cgroups that contain one task but not the other, up to the first
shared ancestor. When both tasks are in the same group, we don't need
to update anything at all.
We can identify the first shared ancestor by walking the groups of the
incoming task until we see TSK_ONCPU set on the local CPU; that's the
first group that also contains the outgoing task.
The new psi_task_switch() is similar to psi_task_change(). To allow
code reuse, move the task flag maintenance code into a new function
and the poll/avg worker wakeups into the shared psi_group_change().
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200316191333.115523-3-hannes@cmpxchg.org
|
|
For simplicity, cpu pressure is defined as having more than one
runnable task on a given CPU. This works on the system-level, but it
has limitations in a cgrouped reality: When cpu.max is in use, it
doesn't capture the time in which a task is not executing on the CPU
due to throttling. Likewise, it doesn't capture the time in which a
competing cgroup is occupying the CPU - meaning it only reflects
cgroup-internal competitive pressure, not outside pressure.
Enable tracking of currently executing tasks, and then change the
definition of cpu pressure in a cgroup from
NR_RUNNING > 1
to
NR_RUNNING > ON_CPU
which will capture the effects of cpu.max as well as competition from
outside the cgroup.
After this patch, a cgroup running `stress -c 1` with a cpu.max
setting of 5000 10000 shows ~50% continuous CPU pressure.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200316191333.115523-2-hannes@cmpxchg.org
|
|
Currently, when updating the affinity of tasks via either cpusets.cpus,
or, sched_setaffinity(); tasks not currently running within the newly
specified mask will be arbitrarily assigned to the first CPU within the
mask.
This (particularly in the case that we are restricting masks) can
result in many tasks being assigned to the first CPUs of their new
masks.
This:
1) Can induce scheduling delays while the load-balancer has a chance to
spread them between their new CPUs.
2) Can antogonize a poor load-balancer behavior where it has a
difficult time recognizing that a cross-socket imbalance has been
forced by an affinity mask.
This change adds a new cpumask interface to allow iterated calls to
distribute within the intersection of the provided masks.
The cases that this mainly affects are:
- modifying cpuset.cpus
- when tasks join a cpuset
- when modifying a task's affinity via sched_setaffinity(2)
Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Josh Don <joshdon@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Qais Yousef <qais.yousef@arm.com>
Tested-by: Qais Yousef <qais.yousef@arm.com>
Link: https://lkml.kernel.org/r/20200311010113.136465-1-joshdon@google.com
|
|
This function is no longer used by other drivers, so it can be
made static.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
|
|
This argument refers to a stable name for an HDMI port, mostly i915
(ACPI) specific. Since we'll be introducing a more generic 'name' argument
as well later, rename this now to avoid confusion.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
|
|
The code is called MEDIA_BUS_FMT_Y14_1X14 and behaves just like
MEDIA_BUS_FMT_Y12_1X12 with two more bits.
Signed-off-by: Daniel Glöckner <dg@emlix.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
|
|
The new format is called V4L2_PIX_FMT_Y14. Like V4L2_PIX_FMT_Y10 and
V4L2_PIX_FMT_Y12 it is stored in two bytes per pixel but has only two
unused bits at the top.
Signed-off-by: Daniel Glöckner <dg@emlix.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
|
|
The formats added by this patch are:
V4L2_PIX_FMT_SBGGR14
V4L2_PIX_FMT_SGBRG14
V4L2_PIX_FMT_SGRBG14
V4L2_PIX_FMT_SRGGB14
Signed-off-by: Jouni Ukkonen <jouni.ukkonen@intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
[dg@emlix.com: rebased onto current media_tree]
Signed-off-by: Daniel Glöckner <dg@emlix.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc, afs: Interruptibility fixes
Here are a number of fixes for AF_RXRPC and AFS that make AFS system calls
less interruptible and so less likely to leave the filesystem in an
uncertain state. There's also a miscellaneous patch to make tracing
consistent.
(1) Firstly, abstract out the Tx space calculation in sendmsg. Much the
same code is replicated in a number of places that subsequent patches
are going to alter, including adding another copy.
(2) Fix Tx interruptibility by allowing a kernel service, such as AFS, to
request that a call be interruptible only when waiting for a call slot
to become available (ie. the call has not taken place yet) or that a
call be not interruptible at all (e.g. when we want to do writeback
and don't want a signal interrupting a VM-induced writeback).
(3) Increase the minimum delay on MSG_WAITALL for userspace sendmsg() when
waiting for Tx buffer space as a 2*RTT delay is really small over 10G
ethernet and a 1 jiffy timeout might be essentially 0 if at the end of
the jiffy period.
(4) Fix some tracing output in AFS to make it consistent with rxrpc.
(5) Make sure aborted asynchronous AFS operations are tidied up properly
so we don't end up with stuck rxrpc calls.
(6) Make AFS client calls uninterruptible in the Rx phase. If we don't
wait for the reply to be fully gathered, we can't update the local VFS
state and we end up in an indeterminate state with respect to the
server.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Tegra XUSB host, device mode driver requires the USB 3 companion port
number for corresponding USB 2 port. Add API to retrieve the same.
Signed-off-by: Nagarjuna Kristam <nkristam@nvidia.com>
Reviewed-by: JC Kuo <jckuo@nvidia.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
The toolchain produces a warning on this driver when building
the docs:
./include/linux/regulator/driver.h:284: WARNING: Unknown target name: "regulator_regmap_x_voltage".
While fixing it, we notices that there's no function names
with the above pattern. It seems that some previous patch
renamed it to regulator_map_* instead.
So, change the function name, replacing "x" by "*", with is
a more used way to add a wildcard, and escape those with
``literal`` markup, in order to avoid the toolchain to think
that this is a link to some existing document chapter.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/b9f5687bcf981a88c9d1fd04d759a540fda53a99.1584456635.git.mchehab+huawei@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Allow block/genhd to notify user space (via udev) about disk size changes
using a new helper set_capacity_revalidate_and_notify(), which is a wrapper
on top of set_capacity(). set_capacity_revalidate_and_notify() will only
notify via udev if the current capacity or the target capacity is not zero
and iff the capacity changes.
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Someswarudu Sangaraju <ssomesh@amazon.com>
Signed-off-by: Balbir Singh <sblbir@amazon.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
No one checks the return value of debugfs_create_file_size, as it's not
needed, so make the return value void, so that no one tries to do so in
the future.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20200309163640.237984-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next
Jonathan writes:
First set of new IIO device support, fatures and cleanups for the 5.7 cycle
Includes changes for the counter subsystem
Core Feature
* Explicitly handle sysfs values in dB, including correctly handling the
needed postfix dB.
* Add a TODO to suggest suitable activities for new contributors to IIO now
the vast majority of drivers are out of staging (and the remaining ones
there are 'hard'). Also update the TODO in staging to remove stale entries.
Staging graduations
* ad7192 ADC.
New device support
* ad5770r
- New driver for this 6 channel DAC including DT bindings.
* ad8366
- Add supprot for the hmc1119 attenuator.
* al3010
- New driver supporting this Dyna-image light sensors.
- Power management and DT bindings added in additional patches.
* atlas-sensor
- Add support for atlas DO-SM device. Reads disolved oxygen in a solution.
* gpap002x00f
- New driver and bindings to support the GP2AP002A00F and GP2AP002S00F light
and proximity sensors. There is some limited existing support in
input. The intent is to drop this driver once IIO driver is in place.
* hmc425a
- New driver for this attenuator.
* icp10100
- New driver for this presure sensor.
* ltc2632
- Add support for the ltc2636 8 channel DAC. Includes bindings and some
tidying up of the driver.
* inv_mpu6050
- Support IAM20680, ICM20609, ICM20689 and ICM20690.
Includes related tidy up and rework of low pass filter bandwidth
handling to give suitable values for all chips.
Binding conversions to yaml or missing bindings docs.
* atlas-sensor, including consolidation of previous 3 separate docs into 1.
* ad7923, previously no doc.
* max1363, split into max1238 and max1363 to simplify yaml.
* stm32-adc
Features
* (counter) 104-quad-8
- Support a filter clock prescaler.
- Support reporting of encoder cable status.
* ad7124
- Low pass filter support.
- Debugfs interface to access registers directly.
* ad8366
- Support control of hardware gain.
* inv_mpu6050
- Runtime pm with autosuspend.
* npcm adc
- Add reset support. This is a breaking change if DT is not in sync,
however this device is a BMC so the ecosystem is closed enought that
this should not be a problem.
* srf04
- Add power management with DT bindings for the GPIO.
* stm32-timer-trigger
- Power management.
* (counter) stm32-timer-cnt
- Power management.
* vcnl4000
- Enable runtime PM for devices that don't use on demand measurement.
Cleanups and minor fixes
* core
- Avoid double read when using debugfs. Whilst we provide no guarantees
on lack of side effects using the debugfs interfaces, this one is
generate unexpected results so let us tidy it up.
* dac/Kconfig
- Alphabetic order.
* ad5755
- Grammar and minor other fixes.
* ad7124
- Fail probe if get_voltage fails as no meaningful readings can be had
without knowing the external reference.
- Switch to selection between different channel attributes rather than
building the arrays at runtime.
- Remove the spi_device_id table as the driver cannot be probled without
more information that can be provided without dt.
- Update sysfs docs to provide more inormation and bring remaining docs for
this part out of staging.
* ad9292
- Use new SPI transfer delay structure.
* adis library
- Add unlocked version of adis_initial_startup and refactor the function.
- Add a product ID santiy check.
- Add support for different self test registers.
- Use new SPI delay structure.
- Add new docs and tidy up existing.
* adis16136
- Initialize adis_data statically.
* adis16400
- Initialize adis_data statically.
* adis16460
- Use core __adis_initial_Startup now it supports everything needed.
* adis16480
- Initialize adis_data statically.
- Use core __adis_initial_startup now it supports everything needed.
* al3320a
- Add missing DT binding docs.
- Tidy up code formatting.
- Simplify error paths using devm_add_action_or_reset.
- Ensure autoloading works by adding the of_match_table.
* atlas-sensor
- Drop false requirement for interrupt line, the value can be polled using
a sysfs or hrtimer type trigger.
* exynos-adc
- Silence warning message on deferring probe.
* gp2ap002
- Greatly simplify the Lux LUT.
- Reorder actions around buffer setup and tear down as part of a sub-system
wide standardization of these.
* inv_mpu6050
- Various lttle tidyups.
- Simpliy I2C aux MUX handling by enabling it only at startup. It never
needs to be disabled.
- Simplify polling rate when magnetometer enabled by putting only under
control of userspace.
- Always execute full reset on devices supporting spi. It does no harm
when using i2c and makes for simpler code.
- Reduce over the top sleep times for vddio regulator power up.
- Greatly simplify power and engine management.
- Fix some delays in polled reads (only visible due to other changes)
- Stop preventing sampling rate changes whilst running as there is no
adverse consequence of doing so.
- Prevent attempting to read the temperature if neither accel nor
gyro is enabled.
* lmp9100
- Reorder actions around buffer setup and tear down as part of a sub-system
wide standardization of these.
* max1118
- Use new SPI transfer delay structure.
* mcp320x
- Use new SPI transfer delay structure.
* si1133
- Read full 24 bit signed integer instead o dropping last 8 bits of value.
Not a critical fix as just adds precision.
* st_sensors
- Use st_sensors_dev_name_probe instead of open coded version in st_accel
- Handle potential memory allocation failure.
* st_lsm6dsx
- Fix some wrong structure element naming in documentation.
- Add missing return value check.
* stm32_timer_cnt
- Drop some unused left over IIO headers from this count subsystem driver.
- Ensure the clock is enabled in master mode. Theoretical issue rather
than one known to happen in the wild.
* tlc4541
- Use new SPI delay structure.
* tag 'iio-5.7a' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (98 commits)
iio: dac: Kconfig: sort symbols alphabetically
iio: light: gp2ap020a00f: fix iio_triggered_buffer_{predisable,postenable} positions
iio: potentiostat: lmp9100: fix iio_triggered_buffer_{predisable,postenable} positions
iio: trigger: stm32-timer: add power management support
iio: trigger: stm32-timer: rename enabled flag
iio: add a TODO
counter: 104-quad-8: Support Differential Encoder Cable Status
counter: 104-quad-8: Support Filter Clock Prescaler
iio: pressure: icp10100: add driver for InvenSense ICP-101xx
iio: industrialio-core: Fix debugfs read
iio: imu: adis: add a note better explaining state_lock
iio: imu: adis: update 'adis_data' struct doc-string
iio: imu: adis: add doc-string for 'adis' struct
iio: imu: adis_buffer: Use new structure for SPI transfer delays
iio: adc: ti-tlc4541: Use new structure for SPI transfer delays
iio: adc: mcp320x: Use new structure for SPI transfer delays
iio: adc: max1118: Use new structure for SPI transfer delays
iio: adc: ad9292: Use new structure for SPI transfer delays
iio: adc: exynos: Silence warning about regulators during deferred probe
staging: iio: update TODO
...
|
|
New Chrome OS keyboards have a "snip" key that is basically a selective
screenshot (allows a user to select an area of screen to be copied).
Allocate a keycode for it.
Signed-off-by: Rajat Jain <rajatja@google.com>
Reviewed-by: Harry Cutts <hcutts@chromium.org>
Link: https://lore.kernel.org/r/20200313180333.75011-1-rajatja@google.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
The bpf_struct_ops tcp-cc name should be sanitized in order to
avoid problematic chars (e.g. whitespaces).
This patch reuses the bpf_obj_name_cpy() for accepting the same set
of characters in order to keep a consistent bpf programming experience.
A "size" param is added. Also, the strlen is returned on success so
that the caller (like the bpf_tcp_ca here) can error out on empty name.
The existing callers of the bpf_obj_name_cpy() only need to change the
testing statement to "if (err < 0)". For all these existing callers,
the err will be overwritten later, so no extra change is needed
for the new strlen return value.
v3:
- reverse xmas tree style
v2:
- Save the orig_src to avoid "end - size" (Andrii)
Fixes: 0baf26b0fcd7 ("bpf: tcp: Support tcp_congestion_ops in bpf")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200314010209.1131542-1-kafai@fb.com
|
|
struct pnp_driver has name set as char* instead of const char* like platform_driver, pci_driver, usb_driver, etc...
Let's unify a bit by setting name as const char*.
Furthermore, all users of this structures set name from already const
data.
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Mark usb_raw_io_flags_valid() and usb_raw_io_flags_zero() as inline to
fix the following warnings:
./usr/include/linux/usb/raw_gadget.h:69:12: warning: unused function 'usb_raw_io_flags_valid' [-Wunused-function]
./usr/include/linux/usb/raw_gadget.h:74:12: warning: unused function 'usb_raw_io_flags_zero' [-Wunused-function]
Reported-by: kernelci.org bot <bot@kernelci.org>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Link: https://lore.kernel.org/r/6206b80b3810f95bfe1d452de45596609a07b6ea.1584456779.git.andreyknvl@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
FDC registers FD_STATUS, FD_DATA, FD_DOR, FD_DIR and FD_DCR used to be
defined relative to FD_IOPORT, which is the FDC's base address, itself
a macro depending on the "fdc" local or global variable.
This patch changes this so that the register macros above now only
reference the address offset, and that the FDC's address is explicitly
passed in each call to fd_inb() and fd_outb(), thus removing the macro.
With this change there is no more implicit usage of the local/global
"fdc" variable.
One place in the ARM code used to check if the port was equal to FD_DOR,
this was changed to testing the register by applying a mask to the port,
as was already done in the sparc code.
There are still occurrences of fd_inb() and fd_outb() in the PARISC
code and these ones remain unaffected since they already used to work
with a base address and a register offset.
The sparc, m68k and parisc code could now be slightly cleaned up to
benefit from the macro definitions above instead of the equivalent
hard-coded values.
Link: https://lore.kernel.org/r/20200301195555.11154-6-w@1wt.eu
Cc: Ian Molton <spyro@f2s.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Similar to existing nl_set_extack_cookie_u64(), add new helper
nl_set_extack_cookie_u32() which sets extack cookie to a u32 value.
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-next
Felipe writes:
USB: changes for v5.7 merge window
Lots of changes on dwc3 this time, most of them from Thinh fixing a
bunch of really old mishaps on the driver.
DWC2 got support for STM32MP15 and a couple RockChip SoCs while DWC3
learned about Amlogic A1 family.
Apart from these, we have a few spelling fixes and other minor
non-critical fixes all over the place.
Signed-off-by: Felipe Balbi <balbi@kernel.org>
* tag 'usb-for-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb: (41 commits)
dt-bindings: usb: add documentation for aspeed usb-vhub
ARM: dts: aspeed-g4: add vhub port and endpoint properties
ARM: dts: aspeed-g5: add vhub port and endpoint properties
ARM: dts: aspeed-g6: add usb functions
usb: gadget: aspeed: add ast2600 vhub support
usb: gadget: aspeed: read vhub properties from device tree
usb: gadget: aspeed: support per-vhub usb descriptors
usb: gadget: f_phonet: Replace zero-length array with flexible-array member
usb: gadget: composite: Inform controller driver of self-powered
usb: gadget: amd5536udc: fix spelling mistake "reserverd" -> "reserved"
udc: s3c-hsudc: Silence warning about supplies during deferred probe
usb: dwc2: Silence warning about supplies during deferred probe
dt-bindings: usb: dwc2: add compatible property for rk3368 usb
dt-bindings: usb: dwc2: add compatible property for rk3328 usb
usb: gadget: add raw-gadget interface
usb: dwc2: Implement set_selfpowered()
usb: dwc3: qcom: Replace <linux/clk-provider.h> by <linux/of_clk.h>
usb: dwc3: core: don't do suspend for device mode if already suspended
usb: dwc3: Rework resets initialization to be more flexible
usb: dwc3: Rework clock initialization to be more flexible
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull futex fix from Thomas Gleixner:
"Fix for yet another subtle futex issue.
The futex code used ihold() to prevent inodes from vanishing, but
ihold() does not guarantee inode persistence. Replace the inode
pointer with a per boot, machine wide, unique inode identifier.
The second commit fixes the breakage of the hash mechanism which
causes a 100% performance regression"
* tag 'locking-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Unbreak futex hashing
futex: Fix inode life-time issue
|