Age | Commit message (Collapse) | Author |
|
These have no clear purpose. This is effectively a revert of commit
bb7462b6fd64 ("vfs: use helpers for calling f_op->{read,write}_iter()").
The patch was created with the help of a coccinelle script.
Fixes: bb7462b6fd64 ("vfs: use helpers for calling f_op->{read,write}_iter()")
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
always equal to ->dentry->d_inode of the path argument these
days.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
no users outside that...
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
both callers are happier that way...
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
With the previous change, struct dqs->stall_thrs will be in the hot path
(at queue side), even if DQS is disabled.
The other fields accessed in this function (last_obj_cnt and num_queued)
are in the first cache line, let's move this field (stall_thrs) to the
very first cache line, since there is a hole there.
This does not change the structure size, since it moves an short (2
bytes) to 4-bytes whole in the first cache line.
This is the new structure format now:
struct dql {
unsigned int num_queued;
unsigned int last_obj_cnt;
...
short unsigned int stall_thrs;
/* XXX 2 bytes hole, try to pack */
...
/* --- cacheline 1 boundary (64 bytes) --- */
...
/* Longest stall detected, reported to user */
short unsigned int stall_max;
/* XXX 2 bytes hole, try to pack */
};
Also, read the stall_thrs (now in the very first cache line) earlier,
together with dql->num_queued (also in the first cache line).
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20240411192241.2498631-5-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When Dynamic Queue Limit (DQL) is set, it always populate stall
information through dql_queue_stall(). However, this information is
only necessary if a stall threshold is set, stored in struct
dql->stall_thrs.
dql_queue_stall() is cheap, but not free, since it does have memory
barriers and so forth.
Do not call dql_queue_stall() if there is no stall threshold set, and
save some CPU cycles.
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20240411192241.2498631-4-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The dql_queued() function currently handles both queuing object counts
and populating bitmaps for reporting stalls.
This commit splits the bitmap population into a separate function,
allowing for conditional invocation in scenarios where the feature is
disabled.
This refactor maintains functionality while improving code
organization.
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20240411192241.2498631-3-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If the dql_queued() function receives an invalid argument, WARN about it
and continue, instead of crashing the kernel.
This was raised by checkpatch, when I am refactoring this code (see
following patch/commit)
WARNING: Do not crash the kernel unless it is absolutely unavoidable--use WARN_ON_ONCE() plus recovery code (if feasible) instead of BUG() or variants
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20240411192241.2498631-2-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
"ARCH_HAVE_EXTRA_ELF_NOTES" enables an extra note section in the
core dump. Kconfig variable is preferred over ARCH_HAVE_* macro.
Co-developed-by: Jini Susan George <jinisusan.george@amd.com>
Signed-off-by: Jini Susan George <jinisusan.george@amd.com>
Signed-off-by: Vignesh Balasubramanian <vigbalas@amd.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20240412062138.1132841-2-vigbalas@amd.com
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
Add an rcu_sr_normal() trace event. It takes three arguments
first one is the name of RCU flavour, second one is a user id
which triggeres synchronize_rcu_normal() and last one is an
event.
There are two traces in the synchronize_rcu_normal(). On entry,
when a new request is registered and on exit point when request
is completed.
Please note, CONFIG_RCU_TRACE=y is required to activate traces.
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
|
|
When using "guard(rcu)();" sparse will complain, because even
though it now understands the cleanup attribute, it doesn't
evaluate the calls from it at function exit, and thus doesn't
count the context correctly.
Given that there's a conditional in the resulting code:
static inline void class_rcu_destructor(class_rcu_t *_T)
{
if (_T->lock) {
rcu_read_unlock();
}
}
it seems that even trying to teach sparse to evalulate the
cleanup attribute function it'd still be difficult to really
make it understand the full context here.
Suppress the sparse warning by just releasing the context in
the acquisition part of the function, after all we know it's
safe with the guard, that's the whole point of it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
|
|
There is no point in compiling in the list and mutex operations which are
only used from the dma-buf debugfs code, if debugfs is not compiled in.
Put the code in questions behind some kconfig guards and so save some text
and maybe even a pointer per object at runtime when not enabled.
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Christian König <christian.koenig@amd.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-kernel@vger.kernel.org
Cc: kernel-dev@igalia.com
Reviewed-by: T.J. Mercier <tjmercier@google.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240328145323.68872-1-tursulin@igalia.com
|
|
On version 11 of the "drm/panic: Add a drm panic handler" series [1], Thomas
suggested to change the name of the function `drm_panic_gem_get_scanout_buffer`
to `drm_fb_dma_get_scanout_buffer` and this request was applied on version 12,
which is the version that landed [2]. Although the name of the function
changed on the C file, it didn't changed on the header file, leading to a
compilation error as such:
drivers/gpu/drm/imx/ipuv3/ipuv3-plane.c:780:24: error: use of undeclared
identifier 'drm_fb_dma_get_scanout_buffer'; did you mean 'drm_panic_gem_get_scanout_buffer'?
780 | .get_scanout_buffer = drm_fb_dma_get_scanout_buffer,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| drm_panic_gem_get_scanout_buffer
./include/drm/drm_fb_dma_helper.h:23:5: note: 'drm_panic_gem_get_scanout_buffer'
declared here
23 | int drm_panic_gem_get_scanout_buffer(struct drm_plane *plane,
| ^
1 error generated.
Fix the compilation error by changing `drm_panic_gem_get_scanout_buffer` to
`drm_fb_dma_get_scanout_buffer` on the header file.
Link: https://lore.kernel.org/dri-devel/20240328120638.468738-1-jfalempe@redhat.com/ [1]
Link: https://lore.kernel.org/dri-devel/aea2aa01-7f03-453b-8b30-8f4d90b1b47f@redhat.com/ [2]
Fixes: 879b3b6511fe ("drm/fb_dma: Add generic get_scanout_buffer() for drm_panic")
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240415151013.3210278-1-mcanal@igalia.com
|
|
The #if condition controlling the rcu_preempt_sleep_check() definition
has a redundant check for CONFIG_PREEMPT_RCU, which is already checked
for by an enclosing #ifndef. This commit therefore removes this redundant
condition from the inner #if.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
|
|
No swap support -- no swapfiles possible.
Signed-off-by: Alexey Dobriyan (Yandex) <adobriyan@gmail.com>
Link: https://lore.kernel.org/r/2391c7f5-0f83-4188-ae56-4ec7ccbf2576@p183
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
This was initialy done for imx6, but should work on most drivers
using drm_fb_dma_helper.
v8:
* Replace get_scanout_buffer() logic with drm_panic_set_buffer()
(Thomas Zimmermann)
v9:
* go back to get_scanout_buffer()
* move get_scanout_buffer() to plane helper functions
v12:
* Rename drm_panic_gem_get_scanout_buffer to drm_fb_dma_get_scanout_buffer
(Thomas Zimmermann)
* Remove the #ifdef CONFIG_DRM_PANIC, and build it unconditionnaly, as
it's a small function. (Thomas Zimmermann)
Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240409163432.352518-6-jfalempe@redhat.com
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
This module displays a user friendly message when a kernel panic
occurs. It currently doesn't contain any debug information,
but that can be added later.
v2
* Use get_scanout_buffer() instead of the drm client API.
(Thomas Zimmermann)
* Add the panic reason to the panic message (Nerdopolis)
* Add an exclamation mark (Nerdopolis)
v3
* Rework the drawing functions, to write the pixels line by line and
to use the drm conversion helper to support other formats.
(Thomas Zimmermann)
v4
* Use drm_fb_r1_to_32bit for fonts (Thomas Zimmermann)
* Remove the default y to DRM_PANIC config option (Thomas Zimmermann)
* Add foreground/background color config option
* Fix the bottom lines not painted if the framebuffer height
is not a multiple of the font height.
* Automatically register the device to drm_panic, if the function
get_scanout_buffer exists. (Thomas Zimmermann)
v5
* Change the drawing API, use drm_fb_blit_from_r1() to draw the font.
* Also add drm_fb_fill() to fill area with background color.
* Add draw_pixel_xy() API for drivers that can't provide a linear buffer.
* Add a flush() callback for drivers that needs to synchronize the buffer.
* Add a void *private field, so drivers can pass private data to
draw_pixel_xy() and flush().
v6
* Fix sparse warning for panic_msg and logo.
v7
* Add select DRM_KMS_HELPER for the color conversion functions.
v8
* Register directly each plane to the panic notifier (Sima)
* Add raw_spinlock to properly handle concurrency (Sima)
* Register plane instead of device, to avoid looping through plane
list, and simplify code.
* Replace get_scanout_buffer() logic with drm_panic_set_buffer()
(Thomas Zimmermann)
* Removed the draw_pixel_xy() API, will see later if it can be added back.
v9
* Revert to using get_scanout_buffer() (Sima)
* Move get_scanout_buffer() and panic_flush() to the plane helper
functions (Thomas Zimmermann)
* Register all planes with get_scanout_buffer() to the panic notifier
* Use drm_panic_lock() to protect against race (Sima)
v10
* Move blit and fill functions back in drm_panic (Thomas Zimmermann).
* Simplify the text drawing functions.
* Use kmsg_dumper instead of panic_notifier (Sima).
v12
* Use array for map and pitch in struct drm_scanout_buffer
to support multi-planar format later. (Thomas Zimmermann)
* Better indent struct drm_scanout_buffer declaration. (Thomas Zimmermann)
Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240409163432.352518-3-jfalempe@redhat.com
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Rough sketch for the locking of drm panic printing code. The upshot of
this approach is that we can pretty much entirely rely on the atomic
commit flow, with the pair of raw_spin_lock/unlock providing any
barriers we need, without having to create really big critical
sections in code.
This also avoids the need that drivers must explicitly update the
panic handler state, which they might forget to do, or not do
consistently, and then we blow up in the worst possible times.
It is somewhat racy against a concurrent atomic update, and we might
write into a buffer which the hardware will never display. But there's
fundamentally no way to avoid that - if we do the panic state update
explicitly after writing to the hardware, we might instead write to an
old buffer that the user will barely ever see.
Note that an rcu protected deference of plane->state would give us the
the same guarantees, but it has the downside that we then need to
protect the plane state freeing functions with call_rcu too. Which
would very widely impact a lot of code and therefore doesn't seem
worth the complexity compared to a raw spinlock with very tiny
critical sections. Plus rcu cannot be used to protect access to
peek/poke registers anyway, so we'd still need it for those cases.
Peek/poke registers for vram access (or a gart pte reserved just for
panic code) are also the reason I've gone with a per-device and not
per-plane spinlock, since usually these things are global for the
entire display. Going with per-plane locks would mean drivers for such
hardware would need additional locks, which we don't want, since it
deviates from the per-console takeoverlocks design.
Longer term it might be useful if the panic notifiers grow a bit more
structure than just the absolute bare
EXPORT_SYMBOL(panic_notifier_list) - somewhat aside, why is that not
EXPORT_SYMBOL_GPL ... If panic notifiers would be more like console
drivers with proper register/unregister interfaces we could perhaps
reuse the very fancy console lock with all it's check and takeover
semantics that John Ogness is developing to fix the console_lock mess.
But for the initial cut of a drm panic printing support I don't think
we need that, because the critical sections are extremely small and
only happen once per display refresh. So generally just 60 tiny locked
sections per second, which is nothing compared to a serial console
running a 115kbaud doing really slow mmio writes for each byte. So for
now the raw spintrylock in drm panic notifier callback should be good
enough.
Another benefit of making panic notifiers more like full blown
consoles (that are used in panics only) would be that we get the two
stage design, where first all the safe outputs are used. And then the
dangerous takeover tricks are deployed (where for display drivers we
also might try to intercept any in-flight display buffer flips, which
if we race and misprogram fifos and watermarks can hang the memory
controller on some hw).
For context the actual implementation on the drm side is by Jocelyn
and this patch is meant to be combined with the overall approach in
v7 (v8 is a bit less flexible, which I think is the wrong direction):
https://lore.kernel.org/dri-devel/20240104160301.185915-1-jfalempe@redhat.com/
Note that the locking is very much not correct there, hence this
separate rfc.
Starting from v10, I (Jocelyn) have included this patch in the drm_panic
series, and done the corresponding changes.
v2:
- fix authorship, this was all my typing
- some typo oopsies
- link to the drm panic work by Jocelyn for context
v10:
- Use spinlock_irqsave/restore (John Ogness)
v11:
- Use macro instead of inline functions for drm_panic_lock/unlock (John Ogness)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jocelyn Falempe <jfalempe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240409163432.352518-2-jfalempe@redhat.com
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
We're exporting some io_uring bits to networking, e.g. for implementing
a net callback for io_uring cmds, but we don't want to expose more than
needed. Add a separate header for networking.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://lore.kernel.org/r/20240409210554.1878789-1-dw@davidwei.uk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_req_complete_post() was a sole user of ->locked_free_list, but
since we just gutted the function, the cache is not used anymore and
can be removed.
->locked_free_list served as an asynhronous counterpart of the main
request (i.e. struct io_kiocb) cache for all unlocked cases like io-wq.
Now they're all forced to be completed into the main cache directly,
off of the normal completion path or via io_free_req().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/7bffccd213e370abd4de480e739d8b08ab6c1326.1712331455.git.asml.silence@gmail.com
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Rather than use remap_pfn_range() for this and manually free later,
switch to using vm_insert_page() and have it Just Work.
This requires a bit of effort on the mmap lookup side, as the ctx
uring_lock isn't held, which otherwise protects buffer_lists from being
torn down, and it's not safe to grab from mmap context that would
introduce an ABBA deadlock between the mmap lock and the ctx uring_lock.
Instead, lookup the buffer_list under RCU, as the the list is RCU freed
already. Use the existing reference count to determine whether it's
possible to safely grab a reference to it (eg if it's not zero already),
and drop that reference when done with the mapping. If the mmap
reference is the last one, the buffer_list and the associated memory can
go away, since the vma insertion has references to the inserted pages at
that point.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
While valid C, anonymous enums confuse Cython (Python to C translator),
as reported by Ritesh (YoSTEALTH) [1] . Since people rely on it when
building against liburing and we want to keep this header in sync with
the library version, let's name the existing enums in the uapi header.
[1] https://github.com/cython/cython/issues/3240
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
Link: https://lore.kernel.org/r/20240328210935.25640-1-krisman@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently lists are being used to manage this, but best practice is
usually to have these in an array instead as that it cheaper to manage.
Outside of that detail, games are also played with KASAN as the list
is inside the cached entry itself.
Finally, all users of this need a struct io_cache_entry embedded in
their struct, which is union'ized with something else in there that
isn't used across the free -> realloc cycle.
Get rid of all of that, and simply have it be an array. This will not
change the memory used, as we're just trading an 8-byte member entry
for the per-elem array size.
This reduces the overhead of the recycled allocations, and it reduces
the amount of code code needed to support recycling to about half of
what it currently is.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Basic conversion ensuring async_data is allocated off the prep path. Adds
a basic alloc cache as well, as passthrough IO can be quite high in rate.
Tested-by: Anuj Gupta <anuj20.g@samsung.com>
Reviewed-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
read/write requests try to put everything on the stack, and then alloc
and copy if a retry is needed. This necessitates a bunch of nasty code
that deals with intermediate state.
Get rid of this, and have the prep side setup everything that is needed
upfront, which greatly simplifies the opcode handlers.
This includes adding an alloc cache for io_async_rw, to make it cheap
to handle.
In terms of cost, this should be basically free and transparent. For
the worst case of {READ,WRITE}_FIXED which didn't need it before,
performance is unaffected in the normal peak workload that is being
used to test that. Still runs at 122M IOPS.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_post_aux_cqe(), which is used for multishot requests, delays
completions by putting CQEs into a temporary array for the purpose
completion lock/flush batching.
DEFER_TASKRUN doesn't need any locking, so for it we can put completions
directly into the CQ and defer post completion handling with a flag.
That leaves !DEFER_TASKRUN, which is not that interesting / hot for
multishot requests, so have conditional locking with deferred flush
for them.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/b1d05a81fd27aaa2a07f9860af13059e7ad7a890.1710799188.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
ctx is always locked for task_work now, so get rid of struct
io_tw_state::locked. Note I'm stopping one step before removing
io_tw_state altogether, which is not empty, because it still serves the
purpose of indicating which function is a tw callback and forcing users
not to invoke them carelessly out of a wrong context. The removal can
always be done later.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/e95e1ea116d0bfa54b656076e6a977bc221392a4.1710799188.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
NVMe is making up issue_flags, which is a no-no in general, and to make
matters worse, they are completely the wrong ones. For a pure polled
request, which it does check for, we're already inside the
ctx->uring_lock when the completions are run off io_do_iopoll(). Hence
the correct flag would be '0' rather than IO_URING_F_UNLOCKED.
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add comments warning users that they're only allowed to pass issue_flags
that were given from io_uring.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/82ff8a45f2c3eb5f3a04a33f0692e5e4a1320455.1710799188.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Some laptops have a key to switch platform profiles.
Add a platform_profile_cycle() function to cycle between the enabled
profiles.
Signed-off-by: Gergo Koteles <soyer@irl.hu>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/5a97deddf72aa5e764d881eb39a7ba35c01a903e.1712597199.git.soyer@irl.hu
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
Newer laptops have FnLock LED.
Add a define for this very common function.
Signed-off-by: Gergo Koteles <soyer@irl.hu>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Lee Jones <lee@kernel.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/8ac95e85a53dc0b8cce1e27fc1cab6d19221543b.1712063200.git.soyer@irl.hu
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
Pass the con_id instead of property so that callers won't repeat
the GPIO suffixes to try.
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
|
|
Add a function to print a decoded EDID vendor and product id to a drm
printer, optionally with the raw data.
v2:
- refactor date printing
- use seq_buf to avoid kasprintf() (Ville)
- handle week == 0 (Ville)
- use be16_to_cpu() on manufacturer_name
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Melissa Wen <mwen@igalia.com> # v1
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/32bbc83ee6557809ef6d7a5edb1bc8ef4d56d10f.1712655867.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Add a struct drm_edid based function to get the vendor and product ID
from an EDID. Add a separate struct for defining this part of the EDID,
with defined byte order for manufacturer name, product code and serial
number.
v2: Define manufacturer_name as __be16 instead of u8[2] (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/df0e7dedbf7f2c190039d6e6eae3e126eba113c9.1712655867.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
In order to be able to limit the amount of memory that is allocated
by IOMMU subsystem, the memory must be accounted.
Account IOMMU as part of the secondary pagetables as it was discussed
at LPC.
The value of SecPageTables now contains mmeory allocation by IOMMU
and KVM.
There is a difference between GFP_ACCOUNT and what NR_IOMMU_PAGES shows.
GFP_ACCOUNT is set only where it makes sense to charge to user
processes, i.e. IOMMU Page Tables, but there more IOMMU shared data
that should not really be charged to a specific process.
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: David Rientjes <rientjes@google.com>
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://lore.kernel.org/r/20240413002522.1101315-12-pasha.tatashin@soleen.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Add NR_IOMMU_PAGES into node_stat_item that counts number of pages
that are allocated by the IOMMU subsystem.
The allocations can be view per-node via:
/sys/devices/system/node/nodeN/vmstat.
For example:
$ grep iommu /sys/devices/system/node/node*/vmstat
/sys/devices/system/node/node0/vmstat:nr_iommu_pages 106025
/sys/devices/system/node/node1/vmstat:nr_iommu_pages 3464
The value is in page-count, therefore, in the above example
the iommu allocations amount to ~428M.
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: David Rientjes <rientjes@google.com>
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://lore.kernel.org/r/20240413002522.1101315-11-pasha.tatashin@soleen.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The structure is packed, which requires that all its fields need to be
also packed.
./include/uapi/linux/videodev2.h:1810:2: warning: field within 'struct v4l2_ext_control' is less aligned than 'union v4l2_ext_control::(anonymous at ./include/uapi/linux/videodev2.h:1810:2)' and is usually due to 'struct v4l2_ext_control' being packed, which can lead to unaligned accesses [-Wunaligned-access]
Explicitly set the inner union as packed.
Marking the inner union as 'packed' does not change the layout, since the
whole struct is already packed, it just silences the clang warning. See
also this llvm discussion:
https://github.com/llvm/llvm-project/issues/55520
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
|
|
The structure is packed, which requires that all its fields need to be
also packed.
./include/uapi/linux/dvb/frontend.h:854:2: warning: field within 'struct dtv_stats' is less aligned than 'union dtv_stats::(anonymous at ./include/uapi/linux/dvb/frontend.h:854:2)' and is usually due to 'struct dtv_stats' being packed, which can lead to unaligned accesses [-Wunaligned-access]
Explicitly set the inner union as packed.
Marking the inner union as 'packed' does not change the layout, since the
whole struct is already packed, it just silences the clang warning. See
also this llvm discussion:
https://github.com/llvm/llvm-project/issues/55520
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
|
|
These helpers aim to help drivers, with checking
for the presence of unsupported control flags.
For drivers supporting at least one control flag:
flow_rule_is_supp_control_flags()
For drivers using flow_rule_match_control(), but not using flags:
flow_rule_has_control_flags()
For drivers not using flow_rule_match_control():
flow_rule_match_has_control_flags()
While primarily aimed at FLOW_DISSECTOR_KEY_CONTROL
and flow_rule_match_control(), then the first two
can also be used with FLOW_DISSECTOR_KEY_ENC_CONTROL
and flow_rule_match_enc_control().
These helpers mirrors the existing check done in sfc:
drivers/net/ethernet/sfc/tc.c +276
Only compile-tested.
Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Because Tiny SRCU is used only in kernels built with either
CONFIG_PREEMPT_NONE=y or CONFIG_PREEMPT_VOLUNTARY=y, there has not
been any need for TINY SRCU to explicitly disable preemption. However,
the prospect of lazy preemption changes that, and the lazy-preemption
patches do result in rcutorture runs finding both too-short grace periods
and grace-period hangs for Tiny SRCU.
This commit therefore adds the needed preempt_disable() and
preempt_enable() calls to Tiny SRCU.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Ankur Arora <ankur.a.arora@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
|
|
With Ankur's lazy-/auto-preemption patches applied and with a
lazy-preemptible kernel in combination with a non-preemptible RCU,
lockdep sometimes complains about context switches within RCU read-side
critical sections. This is a false positive due to rcu_read_unlock()
updating lockdep state too late:
__release(RCU);
__rcu_read_unlock();
// Context switch here results in lockdep false positive!!!
rcu_lock_release(&rcu_lock_map); /* Keep acq info for rls diags. */
Although this complaint could also happen with preemptible RCU
in a preemptible kernel, the odds of that happening aer quite low.
In constrast, with non-preemptible RCU, a long critical section has a
high probability of performing a context switch from the preempt_enable()
in __rcu_read_unlock().
The fix is straightforward, just move the rcu_lock_release()
within rcu_read_unlock() to obtain the reverse order from that of
rcu_read_lock():
rcu_lock_release(&rcu_lock_map); /* Keep acq info for rls diags. */
__release(RCU);
__rcu_read_unlock();
This commit makes this change.
Co-developed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Co-developed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Co-developed-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Ankur Arora <ankur.a.arora@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
|
|
This will allow it to be called from perf_output_wakeup().
Signed-off-by: Kyle Huey <khuey@kylehuey.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240413141618.4160-2-khuey@kylehuey.com
|
|
Pick up perf/urgent fixes that are upstream already, but not
yet in the perf/core development branch.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Ingo Molnar:
- Address a (valid) W=1 build warning
- Fix timer self-tests
- Annotate a KCSAN warning wrt. accesses to the tick_do_timer_cpu
global variable
- Address a !CONFIG_BUG build warning
* tag 'timers-urgent-2024-04-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
selftests: kselftest: Fix build failure with NOLIBC
selftests: timers: Fix abs() warning in posix_timers test
selftests: kselftest: Mark functions that unconditionally call exit() as __noreturn
selftests: timers: Fix posix_timers ksft_print_msg() warning
selftests: timers: Fix valid-adjtimex signed left-shift undefined behavior
bug: Fix no-return-statement warning with !CONFIG_BUG
timekeeping: Use READ/WRITE_ONCE() for tick_do_timer_cpu
selftests/timers/posix_timers: Reimplement check_timer_distribution()
irqflags: Explicitly ignore lockdep_hrtimer_exit() argument
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fix from Ingo Molnar:
"Fix a PREEMPT_RT build bug"
* tag 'locking-urgent-2024-04-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking: Make rwsem_assert_held_write_nolockdep() build with PREEMPT_RT=y
|
|
Pull virtio bugfixes from Michael Tsirkin:
"Some small, obvious (in hindsight) bugfixes:
- new ioctl in vhost-vdpa has a wrong # - not too late to fix
- vhost has apparently been lacking an smp_rmb() - due to code
duplication :( The duplication will be fixed in the next merge
cycle, this is a minimal fix
- an error message in vhost talks about guest moving used index -
which of course never happens, guest only ever moves the available
index
- i2c-virtio didn't set the driver owner so it did not get refcounted
correctly"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vhost: correct misleading printing information
vhost-vdpa: change ioctl # for VDPA_GET_VRING_SIZE
virtio: store owner from modules with register_virtio_driver()
vhost: Add smp_rmb() in vhost_enable_notify()
vhost: Add smp_rmb() in vhost_vq_avail_empty()
|
|
The commit fc8b2a619469
("net: more strict VIRTIO_NET_HDR_GSO_UDP_L4 validation")
adds check of potential number of UDP segments vs
UDP_MAX_SEGMENTS in linux/virtio_net.h.
After this change certification test of USO guest-to-guest
transmit on Windows driver for virtio-net device fails,
for example with packet size of ~64K and mss of 536 bytes.
In general the USO should not be more restrictive than TSO.
Indeed, in case of unreasonably small mss a lot of segments
can cause queue overflow and packet loss on the destination.
Limit of 128 segments is good for any practical purpose,
with minimal meaningful mss of 536 the maximal UDP packet will
be divided to ~120 segments.
The number of segments for UDP packets is validated vs
UDP_MAX_SEGMENTS also in udp.c (v4,v6), this does not affect
quest-to-guest path but does affect packets sent to host, for
example.
It is important to mention that UDP_MAX_SEGMENTS is kernel-only
define and not available to user mode socket applications.
In order to request MSS smaller than MTU the applications
just uses setsockopt with SOL_UDP and UDP_SEGMENT and there is
no limitations on socket API level.
Fixes: fc8b2a619469 ("net: more strict VIRTIO_NET_HDR_GSO_UDP_L4 validation")
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
On the time to free xbc memory in xbc_exit(), memblock may has handed
over memory to buddy allocator. So it doesn't make sense to free memory
back to memblock. memblock_free() called by xbc_exit() even causes UAF bugs
on architectures with CONFIG_ARCH_KEEP_MEMBLOCK disabled like x86.
Following KASAN logs shows this case.
This patch fixes the xbc memory free problem by calling memblock_free()
in early xbc init error rewind path and calling memblock_free_late() in
xbc exit path to free memory to buddy allocator.
[ 9.410890] ==================================================================
[ 9.418962] BUG: KASAN: use-after-free in memblock_isolate_range+0x12d/0x260
[ 9.426850] Read of size 8 at addr ffff88845dd30000 by task swapper/0/1
[ 9.435901] CPU: 9 PID: 1 Comm: swapper/0 Tainted: G U 6.9.0-rc3-00208-g586b5dfb51b9 #5
[ 9.446403] Hardware name: Intel Corporation RPLP LP5 (CPU:RaptorLake)/RPLP LP5 (ID:13), BIOS IRPPN02.01.01.00.00.19.015.D-00000000 Dec 28 2023
[ 9.460789] Call Trace:
[ 9.463518] <TASK>
[ 9.465859] dump_stack_lvl+0x53/0x70
[ 9.469949] print_report+0xce/0x610
[ 9.473944] ? __virt_addr_valid+0xf5/0x1b0
[ 9.478619] ? memblock_isolate_range+0x12d/0x260
[ 9.483877] kasan_report+0xc6/0x100
[ 9.487870] ? memblock_isolate_range+0x12d/0x260
[ 9.493125] memblock_isolate_range+0x12d/0x260
[ 9.498187] memblock_phys_free+0xb4/0x160
[ 9.502762] ? __pfx_memblock_phys_free+0x10/0x10
[ 9.508021] ? mutex_unlock+0x7e/0xd0
[ 9.512111] ? __pfx_mutex_unlock+0x10/0x10
[ 9.516786] ? kernel_init_freeable+0x2d4/0x430
[ 9.521850] ? __pfx_kernel_init+0x10/0x10
[ 9.526426] xbc_exit+0x17/0x70
[ 9.529935] kernel_init+0x38/0x1e0
[ 9.533829] ? _raw_spin_unlock_irq+0xd/0x30
[ 9.538601] ret_from_fork+0x2c/0x50
[ 9.542596] ? __pfx_kernel_init+0x10/0x10
[ 9.547170] ret_from_fork_asm+0x1a/0x30
[ 9.551552] </TASK>
[ 9.555649] The buggy address belongs to the physical page:
[ 9.561875] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x45dd30
[ 9.570821] flags: 0x200000000000000(node=0|zone=2)
[ 9.576271] page_type: 0xffffffff()
[ 9.580167] raw: 0200000000000000 ffffea0011774c48 ffffea0012ba1848 0000000000000000
[ 9.588823] raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
[ 9.597476] page dumped because: kasan: bad access detected
[ 9.605362] Memory state around the buggy address:
[ 9.610714] ffff88845dd2ff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 9.618786] ffff88845dd2ff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 9.626857] >ffff88845dd30000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 9.634930] ^
[ 9.638534] ffff88845dd30080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 9.646605] ffff88845dd30100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 9.654675] ==================================================================
Link: https://lore.kernel.org/all/20240414114944.1012359-1-qiang4.zhang@linux.intel.com/
Fixes: 40caa127f3c7 ("init: bootconfig: Remove all bootconfig data when the init memory is removed")
Cc: Stable@vger.kernel.org
Signed-off-by: Qiang Zhang <qiang4.zhang@intel.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wbg/counter into char-misc-next
William writes:
Counter updates for 6.10
Three key updates of note herein:
- Introduction of the COUNTER_COMP_FREQUENCY() macro to simplify
creation of "frequency" Counter extensions
- Three additional Signals (Clock, Channel 3, and Channel 4) are
supported for the stm32-timer-cnt
- Counter events support added for the stm32-timer-cnt
There are also some miscellaneous cleanups and improvements, such as
constifying Counter structures, resolving a kernel-doc description
warning, and converting platform_driver remove callbacks to remove_new.
* tag 'counter-updates-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/wbg/counter:
counter: ti-ecap-capture: Utilize COUNTER_COMP_FREQUENCY macro
counter: ti-eqep: Convert to platform remove callback returning void
counter: ti-ecap-capture: Convert to platform remove callback returning void
MAINTAINERS: Update email addresses for William Breathitt Gray
counter: stm32-timer-cnt: add support for capture events
counter: stm32-timer-cnt: add support for overflow events
counter: stm32-timer-cnt: probe number of channels from registers
counter: stm32-timer-cnt: introduce channels
counter: stm32-timer-cnt: add checks on quadrature encoder capability
counter: stm32-timer-cnt: add counter prescaler extension
counter: stm32-timer-cnt: introduce clock signal
counter: stm32-timer-cnt: adopt signal definitions
counter: stm32-timer-cnt: rename counter
counter: stm32-timer-cnt: rename quadrature signal
counter: Introduce the COUNTER_COMP_FREQUENCY() macro
counter: constify the struct device_type usage
counter: make counter_bus_type const
counter: linux/counter.h: fix Excess kernel-doc description warning
|
|
"The definition of insanity is doing the same thing over and over
again and expecting different results”
We've tried to do this before, most recently with commit bb2314b47996
("fs: Allow unprivileged linkat(..., AT_EMPTY_PATH) aka flink") about a
decade ago.
But the effort goes back even further than that, eg this thread back
from 1998 that is so old that we don't even have it archived in lore:
https://lkml.org/lkml/1998/3/10/108
which also points out some of the reasons why it's dangerous.
Or, how about then in 2003:
https://lkml.org/lkml/2003/4/6/112
where we went through some of the same arguments, just wirh different
people involved.
In particular, having access to a file descriptor does not necessarily
mean that you have access to the path that was used for lookup, and
there may be very good reasons why you absolutely must not have access
to a path to said file.
For example, if we were passed a file descriptor from the outside into
some limited environment (think chroot, but also user namespaces etc) a
'flink()' system call could now make that file visible inside a context
where it's not supposed to be visible.
In the process the user may also be able to re-open it with permissions
that the original file descriptor did not have (eg a read-only file
descriptor may be associated with an underlying file that is writable).
Another variation on this is if somebody else (typically root) opens a
file in a directory that is not accessible to others, and passes the
file descriptor on as a read-only file. Again, the access to the file
descriptor does not imply that you should have access to a path to the
file in the filesystem.
So while we have tried this several times in the past, it never works.
The last time we did this, that commit bb2314b47996 quickly got reverted
again in commit f0cc6ffb8ce8 (Revert "fs: Allow unprivileged linkat(...,
AT_EMPTY_PATH) aka flink"), with a note saying "We may re-do this once
the whole discussion about the interface is done".
Well, the discussion is long done, and didn't come to any resolution.
There's no question that 'flink()' would be a useful operation, but it's
a dangerous one.
However, it does turn out that since 2008 (commit d76b0d9b2d87: "CRED:
Use creds in file structs") we have had a fairly straightforward way to
check whether the file descriptor was opened by the same credentials as
the credentials of the flink().
That allows the most common patterns that people want to use, which tend
to be to either open the source carefully (ie using the openat2()
RESOLVE_xyz flags, and/or checking ownership with fstat() before
linking), or to use O_TMPFILE and fill in the file contents before it's
exposed to the world with linkat().
But it also means that if the file descriptor was opened by somebody
else, or we've gone through a credentials change since, the operation no
longer works (unless we have CAP_DAC_READ_SEARCH capabilities in the
opener's user namespace, as before).
Note that the credential equality check is done by using pointer
equality, which means that it's not enough that you have effectively the
same user - they have to be literally identical, since our credentials
are using copy-on-write semantics.
So you can't change your credentials to something else and try to change
it back to the same ones between the open() and the linkat(). This is
not meant to be some kind of generic permission check, this is literally
meant as a "the open and link calls are 'atomic' wrt user credentials"
check.
It also means that you can't just move things between namespaces,
because the credentials aren't just a list of uid's and gid's: they
includes the pointer to the user_ns that the capabilities are relative
to.
So let's try this one more time and see if maybe this approach ends up
being workable after all.
Cc: Andrew Lutomirski <luto@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20240411001012.12513-1-torvalds@linux-foundation.org
[brauner: relax capability check to opener of the file]
Link: https://lore.kernel.org/all/20231113-undenkbar-gediegen-efde5f1c34bc@brauner
Signed-off-by: Christian Brauner <brauner@kernel.org>
|