Age | Commit message (Collapse) | Author |
|
In order to better organize the platform/Kconfig, place
stm32-specific config stuff on a separate Kconfig file.
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
|
|
In order to better organize the platform/Kconfig, place
hva-specific config stuff on a separate Kconfig file.
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
|
|
In order to better organize the platform/Kconfig, place
s5p-g2d-specific config stuff on a separate Kconfig file.
Acked-by: Łukasz Stelmach <l.stelmach@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-next
- Do not re-enable PSR after it was marked as not reliable (Jose)
- Add missing boundary check in vm_access to avoid out-of-bounds access (Mastan)
- Naming fix for HPD short pulse handling for eDP (Jose)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YjLnofpe5sMHX7Pt@jlahtine-mobl.ger.corp.intel.com
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
* drm/imx: Don't test bus flags in atomic check
* drm/mgag200: Fix PLL setup on some models
* drm/panel: Fix bpp settings on Innolux G070Y2-L01; Fix DRM_PANEL_EDP
Kconfig dependencies
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YjMNcqOuDFDoe+EN@linux-uq9g
|
|
Jiri Olsa says:
====================
hi,
this patchset adds new link type BPF_TRACE_KPROBE_MULTI that attaches
kprobe program through fprobe API [1] instroduced by Masami.
The fprobe API allows to attach probe on multiple functions at once very
fast, because it works on top of ftrace. On the other hand this limits
the probe point to the function entry or return.
With bpftrace support I see following attach speed:
# perf stat --null -r 5 ./src/bpftrace -e 'kprobe:x* { } i:ms:1 { exit(); } '
Attaching 2 probes...
Attaching 3342 functions
...
1.4960 +- 0.0285 seconds time elapsed ( +- 1.91% )
v3 changes:
- based on latest fprobe post from Masami [2]
- add acks
- add extra comment to kprobe_multi_link_handler wrt entry ip setup [Masami]
- keep swap_words_64 static and swap values directly in
bpf_kprobe_multi_cookie_swap [Andrii]
- rearrange locking/migrate setup in kprobe_multi_link_prog_run [Andrii]
- move uapi fields [Andrii]
- add bpf_program__attach_kprobe_multi_opts function [Andrii]
- many small test changes [Andrii]
- added tests for bpf_program__attach_kprobe_multi_opts
- make kallsyms_lookup_name check for empty string [Andrii]
v2 changes:
- based on latest fprobe changes [1]
- renaming the uapi interface to kprobe multi
- adding support for sort_r to pass user pointer for swap functions
and using that in cookie support to keep just single functions array
- moving new link to kernel/trace/bpf_trace.c file
- using single fprobe callback function for entry and exit
- using kvzalloc, libbpf_ensure_mem functions
- adding new k[ret]probe.multi sections instead of using current kprobe
- used glob_match from test_progs.c, added '?' matching
- move bpf_get_func_ip verifier inline change to seprate change
- couple of other minor fixes
Also available at:
https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
bpf/kprobe_multi
thanks,
jirka
[1] https://lore.kernel.org/bpf/164458044634.586276.3261555265565111183.stgit@devnote2/
[2] https://lore.kernel.org/bpf/164735281449.1084943.12438881786173547153.stgit@devnote2/
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Adding bpf_cookie test for programs attached by
bpf_program__attach_kprobe_multi_opts API.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-14-jolsa@kernel.org
|
|
Adding tests for bpf_program__attach_kprobe_multi_opts function,
that test attach with pattern, symbols and addrs.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-13-jolsa@kernel.org
|
|
Adding bpf_cookie test for programs attached by kprobe_multi links.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-12-jolsa@kernel.org
|
|
Adding kprobe_multi attach test that uses new fprobe interface to
attach kprobe program to multiple functions.
The test is attaching programs to bpf_fentry_test* functions and
uses single trampoline program bpf_prog_test_run to trigger
bpf_fentry_test* functions.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-11-jolsa@kernel.org
|
|
Adding bpf_program__attach_kprobe_multi_opts function for attaching
kprobe program to multiple functions.
struct bpf_link *
bpf_program__attach_kprobe_multi_opts(const struct bpf_program *prog,
const char *pattern,
const struct bpf_kprobe_multi_opts *opts);
User can specify functions to attach with 'pattern' argument that
allows wildcards (*?' supported) or provide symbols or addresses
directly through opts argument. These 3 options are mutually
exclusive.
When using symbols or addresses, user can also provide cookie value
for each symbol/address that can be retrieved later in bpf program
with bpf_get_attach_cookie helper.
struct bpf_kprobe_multi_opts {
size_t sz;
const char **syms;
const unsigned long *addrs;
const __u64 *cookies;
size_t cnt;
bool retprobe;
size_t :0;
};
Symbols, addresses and cookies are provided through opts object
(syms/addrs/cookies) as array pointers with specified count (cnt).
Each cookie value is paired with provided function address or symbol
with the same array index.
The program can be also attached as return probe if 'retprobe' is set.
For quick usage with NULL opts argument, like:
bpf_program__attach_kprobe_multi_opts(prog, "ksys_*", NULL)
the 'prog' will be attached as kprobe to 'ksys_*' functions.
Also adding new program sections for automatic attachment:
kprobe.multi/<symbol_pattern>
kretprobe.multi/<symbol_pattern>
The symbol_pattern is used as 'pattern' argument in
bpf_program__attach_kprobe_multi_opts function.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-10-jolsa@kernel.org
|
|
Adding new kprobe_multi struct to bpf_link_create_opts object
to pass multiple kprobe data to link_create attr uapi.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-9-jolsa@kernel.org
|
|
Move the kallsyms parsing in internal libbpf_kallsyms_parse
function, so it can be used from other places.
It will be used in following changes.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-8-jolsa@kernel.org
|
|
Adding support to call bpf_get_attach_cookie helper from
kprobe programs attached with kprobe multi link.
The cookie is provided by array of u64 values, where each
value is paired with provided function address or symbol
with the same array index.
When cookie array is provided it's sorted together with
addresses (check bpf_kprobe_multi_cookie_swap). This way
we can find cookie based on the address in
bpf_get_attach_cookie helper.
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-7-jolsa@kernel.org
|
|
Adding support to inline it on x86, because it's single
load instruction.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-6-jolsa@kernel.org
|
|
Adding support to call bpf_get_func_ip helper from kprobe
programs attached by multi kprobe link.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-5-jolsa@kernel.org
|
|
Masami Hiramatsu says:
====================
Hi,
Here is the 12th version of fprobe. This version fixes a possible gcc-11 issue which
was reported as kretprobes on arm issue, and also I updated the fprobe document.
The previous version (v11) is here[1];
[1] https://lore.kernel.org/all/164701432038.268462.3329725152949938527.stgit@devnote2/T/#u
This series introduces the fprobe, the function entry/exit probe
with multiple probe point support for x86, arm64 and powerpc64le.
This also introduces the rethook for hooking function return as same as
the kretprobe does. This abstraction will help us to generalize the fgraph
tracer, because we can just switch to it from the rethook in fprobe,
depending on the kernel configuration.
The patch [1/12] is from Jiri's series[2].
[2] https://lore.kernel.org/all/20220104080943.113249-1-jolsa@kernel.org/T/#u
And the patch [9/10] adds the FPROBE_FL_KPROBE_SHARED flag for the case
if user wants to share the same code (or share a same resource) on the
fprobe and the kprobes.
I forcibly updated my kprobes/fprobe branch, you can pull this series
from:
https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git kprobes/fprobe
Thank you,
---
Jiri Olsa (1):
ftrace: Add ftrace_set_filter_ips function
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Adding new link type BPF_LINK_TYPE_KPROBE_MULTI that attaches kprobe
program through fprobe API.
The fprobe API allows to attach probe on multiple functions at once
very fast, because it works on top of ftrace. On the other hand this
limits the probe point to the function entry or return.
The kprobe program gets the same pt_regs input ctx as when it's attached
through the perf API.
Adding new attach type BPF_TRACE_KPROBE_MULTI that allows attachment
kprobe to multiple function with new link.
User provides array of addresses or symbols with count to attach the
kprobe program to. The new link_create uapi interface looks like:
struct {
__u32 flags;
__u32 cnt;
__aligned_u64 syms;
__aligned_u64 addrs;
} kprobe_multi;
The flags field allows single BPF_TRACE_KPROBE_MULTI bit to create
return multi kprobe.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-4-jolsa@kernel.org
|
|
When kallsyms_lookup_name is called with empty string,
it will do futile search for it through all the symbols.
Skipping the search for empty string.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-3-jolsa@kernel.org
|
|
Adding support to have priv pointer in swap callback function.
Following the initial change on cmp callback functions [1]
and adding SWAP_WRAPPER macro to identify sort call of sort_r.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-2-jolsa@kernel.org
[1] 4333fb96ca10 ("media: lib/sort.c: implement sort() variant taking context argument")
|
|
Add a KUnit based selftest for fprobe interface.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/164735295554.1084943.18347620679928750960.stgit@devnote2
|
|
Add a documentation of fprobe for the user who needs
this interface.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/164735294272.1084943.12372175959382037397.stgit@devnote2
|
|
Introduce FPROBE_FL_KPROBE_SHARED flag for sharing fprobe callback with
kprobes safely from the viewpoint of recursion.
Since the recursion safety of the fprobe (and ftrace) is a bit different
from the kprobes, this may cause an issue if user wants to run the same
code from the fprobe and the kprobes.
The kprobes has per-cpu 'current_kprobe' variable which protects the
kprobe handler from recursion in any case. On the other hand, the fprobe
uses only ftrace_test_recursion_trylock(), which will allow interrupt
context calls another (or same) fprobe during the fprobe user handler is
running.
This is not a matter in cases if the common callback shared among the
kprobes and the fprobe has its own recursion detection, or it can handle
the recursion in the different contexts (normal/interrupt/NMI.)
But if it relies on the 'current_kprobe' recursion lock, it has to check
kprobe_running() and use kprobe_busy_*() APIs.
Fprobe has FPROBE_FL_KPROBE_SHARED flag to do this. If your common callback
code will be shared with kprobes, please set FPROBE_FL_KPROBE_SHARED
*before* registering the fprobe, like;
fprobe.flags = FPROBE_FL_KPROBE_SHARED;
register_fprobe(&fprobe, "func*", NULL);
This will protect your common callback from the nested call.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/164735293127.1084943.15687374237275817599.stgit@devnote2
|
|
Add a sample program for the fprobe. The sample_fprobe puts a fprobe on
kernel_clone() by default. This dump stack and some called address info
at the function entry and exit.
The sample_fprobe.ko gets 2 parameters.
- symbol: you can specify the comma separated symbols or wildcard symbol
pattern (in this case you can not use comma)
- stackdump: a bool value to enable or disable stack dump in the fprobe
handler.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/164735291987.1084943.4449670993752806840.stgit@devnote2
|
|
Add exit_handler to fprobe. fprobe + rethook allows us to hook the kernel
function return. The rethook will be enabled only if the
fprobe::exit_handler is set.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/164735290790.1084943.10601965782208052202.stgit@devnote2
|
|
Add rethook arm implementation. Most of the code has been copied from
kretprobes on arm.
Since the arm's ftrace implementation is a bit special, this needs a
special care using from fprobe.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/164735289643.1084943.15184590256680485720.stgit@devnote2
|
|
Add rethook powerpc64 implementation. Most of the code has been copied from
kretprobes on powerpc64.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/164735288495.1084943.539630613772422267.stgit@devnote2
|
|
Add rethook arm64 implementation. Most of the code has been copied from
kretprobes on arm64.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/164735287344.1084943.9787335632585653418.stgit@devnote2
|
|
Add rethook for x86 implementation. Most of the code has been copied from
kretprobes on x86.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/164735286243.1084943.7477055110527046644.stgit@devnote2
|
|
Add a return hook framework which hooks the function return. Most of the
logic came from the kretprobe, but this is independent from kretprobe.
Note that this is expected to be used with other function entry hooking
feature, like ftrace, fprobe, adn kprobes. Eventually this will replace
the kretprobe (e.g. kprobe + rethook = kretprobe), but at this moment,
this is just an additional hook.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/164735285066.1084943.9259661137330166643.stgit@devnote2
|
|
The fprobe is a wrapper API for ftrace function tracer.
Unlike kprobes, this probes only supports the function entry, but this
can probe multiple functions by one fprobe. The usage is similar, user
will set their callback to fprobe::entry_handler and call
register_fprobe*() with probed functions.
There are 3 registration interfaces,
- register_fprobe() takes filtering patterns of the functin names.
- register_fprobe_ips() takes an array of ftrace-location addresses.
- register_fprobe_syms() takes an array of function names.
The registered fprobes can be unregistered with unregister_fprobe().
e.g.
struct fprobe fp = { .entry_handler = user_handler };
const char *targets[] = { "func1", "func2", "func3"};
...
ret = register_fprobe_syms(&fp, targets, ARRAY_SIZE(targets));
...
unregister_fprobe(&fp);
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/164735283857.1084943.1154436951479395551.stgit@devnote2
|
|
Adding ftrace_set_filter_ips function to be able to set filter on
multiple ip addresses at once.
With the kprobe multi attach interface we have cases where we need to
initialize ftrace_ops object with thousands of functions, so having
single function diving into ftrace_hash_move_and_update_ops with
ftrace_lock is faster.
The functions ips are passed as unsigned long array with count.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/164735282673.1084943.18310504594134769804.stgit@devnote2
|
|
for-5.18/drivers
Pull NVMe updates from Christoph:
"Second round of nvme updates for Linux 5.18
- add lockdep annotations for in-kernel sockets (Chris Leech)
- use vmalloc for ANA log buffer (Hannes Reinecke)
- kerneldoc fixes (Chaitanya Kulkarni)
- cleanups (Guoqing Jiang, Chaitanya Kulkarni, me)
- warn about shared namespaces without multipathing (me)"
* tag 'nvme-5.18-2022-03-17' of git://git.infradead.org/nvme:
nvme: warn about shared namespaces without CONFIG_NVME_MULTIPATH
nvme: remove nvme_alloc_request and nvme_alloc_request_qid
nvme: cleanup how disk->disk_name is assigned
nvmet: move the call to nvmet_ns_changed out of nvmet_ns_revalidate
nvmet: use snprintf() with PAGE_SIZE in configfs
nvmet: don't fold lines
nvmet-rdma: fix kernel-doc warning for nvmet_rdma_device_removal
nvmet-fc: fix kernel-doc warning for nvmet_fc_unregister_targetport
nvmet-fc: fix kernel-doc warning for nvmet_fc_register_targetport
nvme-tcp: lockdep: annotate in-kernel sockets
nvme-tcp: don't fold the line
nvme-tcp: don't initialize ret variable
nvme-multipath: call bio_io_error in nvme_ns_head_submit_bio
nvme-multipath: use vmalloc for ANA log buffer
|
|
When IO requests are made continuously and the target block device
handles requests faster than request arrival, the request dispatch loop
keeps on repeating to dispatch the arriving requests very long time,
more than a minute. Since the loop runs as a workqueue worker task, the
very long loop duration triggers workqueue watchdog timeout and BUG [1].
To avoid the very long loop duration, break the loop periodically. When
opportunity to dispatch requests still exists, check need_resched(). If
need_resched() returns true, the dispatch loop already consumed its time
slice, then reschedule the dispatch work and break the loop. With heavy
IO load, need_resched() does not return true for 20~30 seconds. To cover
such case, check time spent in the dispatch loop with jiffies. If more
than 1 second is spent, reschedule the dispatch work and break the loop.
[1]
[ 609.691437] BUG: workqueue lockup - pool cpus=10 node=1 flags=0x0 nice=-20 stuck for 35s!
[ 609.701820] Showing busy workqueues and worker pools:
[ 609.707915] workqueue events: flags=0x0
[ 609.712615] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[ 609.712626] pending: drm_fb_helper_damage_work [drm_kms_helper]
[ 609.712687] workqueue events_freezable: flags=0x4
[ 609.732943] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[ 609.732952] pending: pci_pme_list_scan
[ 609.732968] workqueue events_power_efficient: flags=0x80
[ 609.751947] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[ 609.751955] pending: neigh_managed_work
[ 609.752018] workqueue kblockd: flags=0x18
[ 609.769480] pwq 21: cpus=10 node=1 flags=0x0 nice=-20 active=3/256 refcnt=4
[ 609.769488] in-flight: 1020:blk_mq_run_work_fn
[ 609.769498] pending: blk_mq_timeout_work, blk_mq_run_work_fn
[ 609.769744] pool 21: cpus=10 node=1 flags=0x0 nice=-20 hung=35s workers=2 idle: 67
[ 639.899730] BUG: workqueue lockup - pool cpus=10 node=1 flags=0x0 nice=-20 stuck for 66s!
[ 639.909513] Showing busy workqueues and worker pools:
[ 639.915404] workqueue events: flags=0x0
[ 639.920197] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[ 639.920215] pending: drm_fb_helper_damage_work [drm_kms_helper]
[ 639.920365] workqueue kblockd: flags=0x18
[ 639.939932] pwq 21: cpus=10 node=1 flags=0x0 nice=-20 active=3/256 refcnt=4
[ 639.939942] in-flight: 1020:blk_mq_run_work_fn
[ 639.939955] pending: blk_mq_timeout_work, blk_mq_run_work_fn
[ 639.940212] pool 21: cpus=10 node=1 flags=0x0 nice=-20 hung=66s workers=2 idle: 67
Fixes: 6e6fcbc27e778 ("blk-mq: support batching dispatch in case of io")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Cc: stable@vger.kernel.org # v5.10+
Link: https://lore.kernel.org/linux-block/20220310091649.zypaem5lkyfadymg@shindev/
Link: https://lore.kernel.org/r/20220318022641.133484-1-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Vladimir Oltean says:
====================
Mirroring for Ocelot switches
This series adds support for tc-matchall (port-based) and tc-flower
(flow-based) offloading of the tc-mirred action. Support has been added
for both the ocelot switchdev driver and felix DSA driver.
====================
Link: https://lore.kernel.org/r/20220316204144.2679277-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Gain support for port mirroring using tc-matchall by forwarding the
calls to the ocelot switch library.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Drivers might have error messages to propagate to user space, most
common being that they support a single mirror port.
Propagate the netlink extack so that they can inform user space in a
verbal way of their limitations.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Per-flow mirroring with the VCAP IS2 TCAM (in itself handled as an
offload for tc-flower) is done by setting the MIRROR_ENA bit from the
action vector of the filter. The packet is mirrored to the port mask
configured in the ANA:ANA:MIRRORPORTS register (the same port mask as
the destinations for port-based mirroring).
Functionality was tested with:
tc qdisc add dev swp3 clsact
tc filter add dev swp3 ingress protocol ip \
flower skip_sw ip_proto icmp \
action mirred egress mirror dev swp1
and pinging through swp3, while seeing that the ICMP replies are
mirrored towards swp1.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Some VCAP filters utilize resources which are global to the switch, like
for example VCAP IS2 policers take an index into a global policer pool.
In commit c9a7fe1238e5 ("net: mscc: ocelot: add action of police on
vcap_is2"), Xiaoliang expressed this by hooking into the low-level
ocelot_vcap_filter_add_to_block() and ocelot_vcap_block_remove_filter()
functions, and allocating/freeing the policers from there.
Evaluating the code, there probably isn't a better place, but we'll need
to do something similar for the mirror ports, and the code will start to
look even more hacked up than it is right now.
Create two ocelot_vcap_filter_{add,del}_aux_resources() functions to
contain the madness, and pollute less the body of other functions such
as ocelot_vcap_filter_add_to_block().
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Ocelot switches perform port-based ingress mirroring if
ANA:PORT:PORT_CFG field SRC_MIRROR_ENA is set, and egress mirroring if
the port is in ANA:ANA:EMIRRORPORTS.
Both ingress-mirrored and egress-mirrored frames are copied to the port
mask from ANA:ANA:MIRRORPORTS.
So the choice of limiting to a single mirror port via ocelot_mirror_get()
and ocelot_mirror_put() may seem bizarre, but the hardware model doesn't
map very well to the user space model. If the user wants to mirror the
ingress of swp1 towards swp2 and the ingress of swp3 towards swp4, we'd
have to program ANA:ANA:MIRRORPORTS with BIT(2) | BIT(4), and that would
make swp1 be mirrored towards swp4 too, and swp3 towards swp2. But there
are no tc-matchall rules to describe those actions.
Now, we could offload a matchall rule with multiple mirred actions, one
per desired mirror port, and force the user to stick to the multi-action
rule format for subsequent matchall filters. But both DSA and ocelot
have the flow_offload_has_one_action() check for the matchall offload,
plus the fact that it will get cumbersome to cross-check matchall
mirrors with flower mirrors (which will be added in the next patch).
As a result, we limit the configuration to a single mirror port, with
the possibility of lifting the restriction in the future.
Frames injected from the CPU don't get egress-mirrored, since they are
sent with the BYPASS bit in the injection frame header, and this
bypasses the analyzer module (effectively also the mirroring logic).
I don't know what to do/say about this.
Functionality was tested with:
tc qdisc add dev swp3 clsact
tc filter add dev swp3 ingress \
matchall skip_sw \
action mirred egress mirror dev swp1
and pinging through swp3, while seeing that the ICMP replies are
mirrored towards swp1.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In preparation for adding port mirroring support to the ocelot driver,
the dispatching function ocelot_setup_tc_cls_matchall() must be free of
action-specific code. Move port policer creation and deletion to
separate functions.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
An earlier patch mistakenly changed these variables from u32 to u16,
leading to unintended truncation. Restore the original logic.
Fixes: a509a7c61e3b ("ptp: ocp: Add support for selectable SMA directions.")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Link: https://lore.kernel.org/r/20220316165347.599154-1-jonathan.lemon@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
According to erratum described in DS80000687C[1]: "Module 2: Link drops with
some EEE link partners.", we need to "Disable the EEE next page
exchange in EEE Global Register 2"
1 - https://ww1.microchip.com/downloads/en/DeviceDoc/KSZ87xx-Errata-DS80000687C.pdf
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20220316125529.1489045-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Tobias Waldekranz says:
====================
net: bridge: Multiple Spanning Trees
The bridge has had per-VLAN STP support for a while now, since:
https://lore.kernel.org/netdev/20200124114022.10883-1-nikolay@cumulusnetworks.com/
The current implementation has some problems:
- The mapping from VLAN to STP state is fixed as 1:1, i.e. each VLAN
is managed independently. This is awkward from an MSTP (802.1Q-2018,
Clause 13.5) point of view, where the model is that multiple VLANs
are grouped into MST instances.
Because of the way that the standard is written, presumably, this is
also reflected in hardware implementations. It is not uncommon for a
switch to support the full 4k range of VIDs, but that the pool of
MST instances is much smaller. Some examples:
Marvell LinkStreet (mv88e6xxx): 4k VLANs, but only 64 MSTIs
Marvell Prestera: 4k VLANs, but only 128 MSTIs
Microchip SparX-5i: 4k VLANs, but only 128 MSTIs
- By default, the feature is enabled, and there is no way to disable
it. This makes it hard to add offloading in a backwards compatible
way, since any underlying switchdevs have no way to refuse the
function if the hardware does not support it
- The port-global STP state has precedence over per-VLAN states. In
MSTP, as far as I understand it, all VLANs will use the common
spanning tree (CST) by default - through traffic engineering you can
then optimize your network to group subsets of VLANs to use
different trees (MSTI). To my understanding, the way this is
typically managed in silicon is roughly:
Incoming packet:
.----.----.--------------.----.-------------
| DA | SA | 802.1Q VID=X | ET | Payload ...
'----'----'--------------'----'-------------
|
'->|\ .----------------------------.
| +--> | VID | Members | ... | MSTI |
PVID -->|/ |-----|---------|-----|------|
| 1 | 0001001 | ... | 0 |
| 2 | 0001010 | ... | 10 |
| 3 | 0001100 | ... | 10 |
'----------------------------'
|
.-----------------------------'
| .------------------------.
'->| MSTI | Fwding | Lrning |
|------|--------|--------|
| 0 | 111110 | 111110 |
| 10 | 110111 | 110111 |
'------------------------'
What this is trying to show is that the STP state (whether MSTP is
used, or ye olde STP) is always accessed via the VLAN table. If STP
is running, all MSTI pointers in that table will reference the same
index in the STP stable - if MSTP is running, some VLANs may point
to other trees (like in this example).
The fact that in the Linux bridge, the global state (think: index 0
in most hardware implementations) is supposed to override the
per-VLAN state, is very awkward to offload. In effect, this means
that when the global state changes to blocking, drivers will have to
iterate over all MSTIs in use, and alter them all to match. This
also means that you have to cache whether the hardware state is
currently tracking the global state or the per-VLAN state. In the
first case, you also have to cache the per-VLAN state so that you
can restore it if the global state transitions back to forwarding.
This series adds a new mst_enable bridge setting (as suggested by Nik)
that can only be changed when no VLANs are configured on the
bridge. Enabling this mode has the following effect:
- The port-global STP state is used to represent the CST (Common
Spanning Tree) (1/15)
- Ingress STP filtering is deferred until the frame's VLAN has been
resolved (1/15)
- The preexisting per-VLAN states can no longer be controlled directly
(1/15). They are instead placed under the MST module's control,
which is managed using a new netlink interface (described in 3/15)
- VLANs can br mapped to MSTIs in an arbitrary M:N fashion, using a
new global VLAN option (2/15)
Switchdev notifications are added so that a driver can track:
- MST enabled state
- VID to MSTI mappings
- MST port states
An offloading implementation is this provided for mv88e6xxx.
====================
Link: https://lore.kernel.org/r/20220316150857.2442916-1-tobias@waldekranz.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Allocate a SID in the STU for each MSTID in use by a bridge and handle
the mapping of MSTIDs to VLANs using the SID field of each VTU entry.
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Export the raw STU data in a devlink region so that it can be
inspected from userspace and compared to the current bridge
configuration.
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In early LinkStreet silicon (e.g. 6095/6185), the per-VLAN STP states
were kept in the VTU - there was no concept of a SID. Later, the
information was split into two tables, where the VTU only tracked
memberships and deferred the STP state tracking to the STU via a
pointer (SID). This meant that a group of VLANs could share the same
STU entry. Most likely, this was done to align with MSTP (802.1Q-2018,
Clause 13), which is built on this principle.
While the VTU is still 4k lines on most devices, the STU is capped at
64 entries. This means that the current stategy, updating STU info
whenever a VTU entry is updated, can not easily support MSTP because:
- The maximum number of VIDs would also be capped at 64, as we would
have to allocate one SID for every VTU entry - even if many VLANs
would effectively share the same MST.
- MSTP updates would be unnecessarily slow as you would have to
iterate over all VLANs that share the same MST.
In order to support MSTP offloading in the future, manage the STU as a
separate entity from the VTU.
Only add support for newer hardware with separate VTU and
STU. VTU-only devices can also be supported, but essentially this
requires a software implementation of an STU (fanning out state
changed to all VLANs tied to the same MST).
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add the usual trampoline functionality from the generic DSA layer down
to the drivers for MST state changes.
When a state changes to disabled/blocking/listening, make sure to fast
age any dynamic entries in the affected VLANs (those controlled by the
MSTI in question).
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add the usual trampoline functionality from the generic DSA layer down
to the drivers for VLAN MSTI migrations.
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When joining a bridge where MST is enabled, we validate that the
proper offloading support is in place, otherwise we fallback to
software bridging.
When then mode is changed on a bridge in which we are members, we
refuse the change if offloading is not supported.
At the moment we only check for configurable learning, but this will
be further restricted as we support more MST related switchdev events.
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|