|
After the refactoring in the previous patches, it is clear that the
propagation logic inside the for loop in "propagate_liveness" can be
factored out into a common function, "propagate_liveness_reg".
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Accesses to reg states were not factored out. As a consequence, the code
for dereferencing them is long, which makes the indentation hard to read.
This patch factors out that code so the core code in the loop is easier
to follow.
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Propagation for registers and stack slots is done in separate for loops,
while they fit well into a single loop.
This also lets them share some common variables in later patches.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
In the review of 0b34eb004347 ("ipv6: Refactor __ip6_route_redirect"),
Martin noted that the flowi6_oif compare is moved to the new helper and
should be removed from __ip6_route_redirect. Fix the oversight.
Fixes: 0b34eb004347 ("ipv6: Refactor __ip6_route_redirect")
Reported-by: Martin Lau <kafai@fb.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
David Howells says:
====================
rxrpc: Fixes
Here is a collection of fixes for rxrpc:
(1) rxrpc_error_report() needs to call sock_error() to clear the error
code from the UDP transport socket, lest it be unexpectedly revisited
on the next kernel_sendmsg() call. This has been causing all sorts of
weird effects in AFS as the effects have typically been felt by the
wrong RxRPC call.
(2) Allow a kernel user of AF_RXRPC to easily detect if an rxrpc call has
completed.
(3) Allow errors incurred by attempting to transmit data through the UDP
socket to get back up the stack to AFS.
(4) Make AFS use (2) to abort the synchronous-mode call waiting loop if
the rxrpc-level call completed.
(5) Add a missing tracepoint case for tracing abort reception.
(6) Fix detection and handling of out-of-order ACKs.
====================
Tested-by: Jonathan Billings <jsbillin@umich.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The rxrpc packet serial number cannot be safely used to compute out of
order ack packets for several reasons:
1. The allocation of serial numbers cannot be assumed to imply the order
by which acks are populated and transmitted. In some rxrpc
implementations, delayed acks and ping acks are transmitted
asynchronously to the receipt of data packets and so may be transmitted
out of order. As a result, they can race with idle acks.
2. Serial numbers are allocated by the rxrpc connection and not the call
and as such may wrap independently if multiple channels are in use.
In any case, what matters is whether the ack packet provides new
information relating to the bounds of the window (the firstPacket and
previousPacket in the ACK data).
Fix this by discarding packets that appear to wind back the window bounds
rather than on serial number procession.
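
The window-bounds check described above can be sketched in user-space C
as follows. This is a minimal illustration and not the rxrpc code: the
struct, the function names and the decision to record the bounds inside
the check are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-call record of the last accepted window bounds. */
struct ack_window {
	uint32_t first_seq; /* firstPacket from the last accepted ACK */
	uint32_t prev_seq;  /* previousPacket from the last accepted ACK */
};

/*
 * Wrap-safe "a is earlier than b" for 32-bit sequence numbers.
 * Relies on the usual two's complement conversion to int32_t.
 */
static bool seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/*
 * Accept an ACK only if neither bound moves backwards. The serial
 * number of the ACK packet is deliberately not consulted.
 */
static bool ack_window_accept(struct ack_window *w,
			      uint32_t first_pkt, uint32_t prev_pkt)
{
	if (seq_before(first_pkt, w->first_seq) ||
	    seq_before(prev_pkt, w->prev_seq))
		return false; /* would wind the window back: discard */

	w->first_seq = first_pkt;
	w->prev_seq = prev_pkt;
	return true;
}
```

An ACK that repeats the current bounds is accepted by this sketch, since
it does not wind anything back.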
Fixes: 298bc15b2079 ("rxrpc: Only take the rwind and mtu values from latest ACK")
Signed-off-by: Jeffrey Altman <jaltman@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Trace received calls that are aborted due to a connection abort, typically
because of authentication failure. Without this, connection aborts don't
show up in the trace log.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Check the state of the rxrpc call backing an afs call in each iteration of
the call wait loop in case the rxrpc call has already been terminated at
the rxrpc layer.
Interrupt the wait loop and mark the afs call as complete if the rxrpc
layer call is complete.
There were cases where rxrpc errors were not passed up to afs, which could
result in this loop waiting forever for an afs call to transition to
AFS_CALL_COMPLETE while the rx call was already complete.
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Change rxrpc_queue_packet()'s signature so that it can return any error
code it may encounter when trying to send the packet.
This allows the caller to eventually do something in case of error - though
it should be noted that the packet has been queued and a resend is
scheduled.
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Make rxrpc_kernel_check_life() pass back the life counter through the
argument list and return true if the call has not yet completed.
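
The calling convention described above (life counter passed back through
an argument, boolean return for "not yet completed") might look roughly
like this in user-space C. The types and names below are hypothetical
stand-ins and do not reproduce the rxrpc structures or the real function
signature.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified call states for the sketch. */
enum sketch_call_state { SKETCH_CALL_ACTIVE, SKETCH_CALL_COMPLETE };

/* Hypothetical stand-in for a call object. */
struct sketch_call {
	enum sketch_call_state state;
	uint32_t life; /* counter that advances while the peer is alive */
};

/*
 * Pass the life counter back through *life and return true while the
 * call has not yet completed.
 */
static bool sketch_check_life(const struct sketch_call *call, uint32_t *life)
{
	*life = call->life;
	return call->state != SKETCH_CALL_COMPLETE;
}
```

A waiting loop can then stop as soon as the function returns false,
instead of inferring completion from the counter alone.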
Suggested-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When an ICMP or ICMPV6 error is received, the error will be attached
to the socket (sk_err) and the report function will get called.
Clear any pending error here by calling sock_error().
Otherwise, the next attempt to use the socket would fail with
the error code stored by the ICMP error, resulting in unexpected errors
with various side effects depending on the context.
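
The "read and clear" behaviour relied on here can be sketched in
user-space C. The struct and function below are hypothetical stand-ins.
The kernel helper performs the exchange atomically, while this sketch
uses a plain int for brevity.

```c
#include <errno.h>

/* Hypothetical stand-in for a socket carrying a pending error. */
struct sketch_sock {
	int err; /* positive errno value, 0 when nothing is pending */
};

/*
 * Return the pending error as a negative errno and clear it, so that
 * the next user of the socket does not trip over a stale error.
 */
static int sketch_sock_error(struct sketch_sock *sk)
{
	int err = sk->err;

	sk->err = 0;
	return -err;
}
```

Calling it a second time returns 0, which is the point of clearing the
error in the report function.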
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Jonathan Billings <jsbillin@umich.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The err2 error return path calls qede_ptp_disable that cleans up
on an error and frees ptp. After this, the free'd ptp is dereferenced
when ptp->clock is set to NULL and the code falls-through to error
path err1 that frees ptp again.
Fix this by calling qede_ptp_disable and exiting via an error
return path that does not set ptp->clock or kfree ptp.
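
The shape of the fix can be sketched in user-space C as follows. This is
an illustration of the error-path layout only; the names, the fail_stage
parameter and the free counter are assumptions made for the example and
are not taken from the qede driver.

```c
#include <errno.h>
#include <stdlib.h>

struct sketch_ptp {
	void *clock;
};

static int sketch_free_count; /* counts frees so the sketch can be checked */

/* Stand-in for a disable routine that also frees the ptp object. */
static void sketch_ptp_disable(struct sketch_ptp *ptp)
{
	ptp->clock = NULL;
	free(ptp);
	sketch_free_count++;
}

/* fail_stage: 0 = success, 1 = fail before setup, 2 = fail after setup */
static int sketch_ptp_enable(int fail_stage, struct sketch_ptp **out)
{
	struct sketch_ptp *ptp = calloc(1, sizeof(*ptp));
	int rc;

	*out = NULL;
	if (!ptp)
		return -ENOMEM;

	if (fail_stage == 1) {
		rc = -EIO;
		goto err_free;
	}
	if (fail_stage == 2) {
		rc = -EIO;
		goto err_disable;
	}

	*out = ptp;
	return 0;

err_disable:
	/* The disable routine frees ptp, so return without touching it. */
	sketch_ptp_disable(ptp);
	return rc;
err_free:
	free(ptp);
	sketch_free_count++;
	return rc;
}
```

Each failure path frees the object exactly once, and the path that goes
through the disable routine never dereferences the pointer afterwards.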
Addresses-Coverity: ("Write to pointer after free")
Fixes: 035744975aec ("qede: Add support for PTP resource locking.")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, if a PCI DMA mapping failure is detected, a free'd
memblock address is returned rather than NULL (which indicates
an error). Fix this by ensuring NULL is returned in this error case.
Addresses-Coverity: ("Use after free")
Fixes: 528f727279ae ("vxge: code cleanup and reorganization")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Jiri Pirko says:
====================
netdevsim: Mostly cleanup in sdev/bpf iface area
This patchset mainly shuffles internal netdevsim code. Nothing
serious, just small changes to help readability and preparations for
future work.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In order to improve readability and prepare for future code changes,
move sdev specific init/uninit code into separate functions.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The offload dev is stored in the sdev struct. However, the first netdevsim
instance is used as a priv. Change the priv to be sdev, as it is shared
among multiple netdevsim instances.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Some netdevsim bpf debugfs files are per-sdev, yet they are defined per
netdevsim instance. Move them under sdev directory.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
To make code easier to read, move shared dev bits into a separate file.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch also performs some minor adjustments such as numbering for
the receive path sequence, conversion of keywords to inline literals and
adding an index page so it looks better in the output of 'make htmldocs'.
Signed-off-by: Ioana Ciornei <ciorneiioana@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For reporting the common set of SW timestamping capabilities, use
ethtool_op_get_ts_info() instead of re-implementing it.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For reporting the common set of SW timestamping capabilities, use
ethtool_op_get_ts_info() instead of re-implementing it.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For reporting the common set of SW timestamping capabilities, use
ethtool_op_get_ts_info() instead of re-implementing it.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix a new warning reported by kbuild for make ARCH=i386:
In file included from kernel/bpf/cgroup.c:11:0:
kernel/bpf/cgroup.c: In function '__cgroup_bpf_run_filter_sysctl':
include/linux/kernel.h:827:29: warning: comparison of distinct pointer types lacks a cast
(!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
^
include/linux/kernel.h:841:4: note: in expansion of macro '__typecheck'
(__typecheck(x, y) && __no_side_effects(x, y))
^~~~~~~~~~~
include/linux/kernel.h:851:24: note: in expansion of macro '__safe_cmp'
__builtin_choose_expr(__safe_cmp(x, y), \
^~~~~~~~~~
include/linux/kernel.h:860:19: note: in expansion of macro '__careful_cmp'
#define min(x, y) __careful_cmp(x, y, <)
^~~~~~~~~~~~~
>> kernel/bpf/cgroup.c:837:17: note: in expansion of macro 'min'
ctx.new_len = min(PAGE_SIZE, *pcount);
^~~
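
The warning comes from comparing two operands of distinct types: on
i386, size_t is typically unsigned int, while PAGE_SIZE is an unsigned
long constant. One common way to avoid this class of warning is to force
both operands to a single type before comparing (the kernel offers
min_t() for that). The user-space sketch below shows the idea with a
hypothetical helper and a made-up page size constant. It is not claimed
to be the change made by this patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for PAGE_SIZE: an unsigned long constant. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Clamp a length to one page. Both operands are converted to size_t
 * first, so the comparison is between values of the same type.
 */
static size_t sketch_clamp_len(size_t count)
{
	size_t page = (size_t)SKETCH_PAGE_SIZE;

	return count < page ? count : page;
}
```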
Fixes: 4e63acdff864 ("bpf: Introduce bpf_sysctl_{get,set}_new_value helpers")
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The code which initializes "clk_init_data.ops" checks pll->rate_table
before that field is ever assigned, so it always picks
"clk_pll1416x_min_ops".
This breaks dynamic rate rounding for features such as cpufreq.
Fix by checking pll_clk->rate_table instead. Here pll_clk refers to
the constant initialization data coming from the per-SoC clk driver.
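
The ordering problem can be sketched in user-space C. All types and
names below are hypothetical and only mirror the roles described above:
a constant config supplied by the SoC driver, and a runtime object whose
fields are filled in during setup.

```c
#include <stddef.h>

struct sketch_rate { unsigned long rate; };
struct sketch_ops { const char *name; };

static const struct sketch_ops sketch_full_ops = { "full" };
static const struct sketch_ops sketch_min_ops = { "min" };

/* Constant initialization data (the role of pll_clk above). */
struct sketch_pll_cfg {
	const struct sketch_rate *rate_table;
	size_t rate_count;
};

/* Runtime object (the role of pll above). */
struct sketch_pll {
	const struct sketch_ops *ops;
	const struct sketch_rate *rate_table;
	size_t rate_count;
};

static void sketch_pll_setup(struct sketch_pll *pll,
			     const struct sketch_pll_cfg *cfg)
{
	/*
	 * Choose the ops from the config: pll->rate_table has not been
	 * assigned yet at this point, so testing it would always pick
	 * the minimal ops.
	 */
	pll->ops = cfg->rate_table ? &sketch_full_ops : &sketch_min_ops;

	pll->rate_table = cfg->rate_table;
	pll->rate_count = cfg->rate_count;
}
```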
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Fixes: 8646d4dcc7fb ("clk: imx: Add PLLs driver for imx8mm soc")
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Andrey Ignatov says:
====================
v2->v3:
- simplify C based selftests by relying on variable offset stack access.
v1->v2:
- add fs/proc/proc_sysctl.c maintainers to Cc:.
The patch set introduces new BPF hook for sysctl.
It adds new program type BPF_PROG_TYPE_CGROUP_SYSCTL and attach type
BPF_CGROUP_SYSCTL.
BPF_CGROUP_SYSCTL hook is placed before calling to sysctl's proc_handler so
that accesses (read/write) to sysctl can be controlled for specific cgroup
and either allowed or denied, or traced.
The hook has access to sysctl name, current sysctl value and (on write
only) to new sysctl value via corresponding helpers. New sysctl value can
be overridden by program. Both name and values (current/new) are
represented as strings same way they're visible in /proc/sys/. It is up to
program to parse these strings.
To help with parsing the most common kind of sysctl value, vector of
integers, two new helpers are provided: bpf_strtol and bpf_strtoul with
semantic similar to user space strtol(3) and strtoul(3).
The hook also provides bpf_sysctl context with two fields:
* @write indicates whether sysctl is being read (= 0) or written (= 1);
* @file_pos is sysctl file position to read from or write to, can be
overridden.
The hook allows better isolation for containerized applications
that are run as root, so that one container can't change a sysctl and
affect all other containers on a host. It also makes changes to allowed
sysctls safer and simplifies sysctl tracing for cgroups.
Patch 1 is preliminary refactoring.
Patch 2 adds new program and attach types.
Patches 3-5 implement helpers to access sysctl name and value.
Patch 6 adds file_pos field to bpf_sysctl context.
Patch 7 updates UAPI in tools.
Patches 8-9 add support for the new hook to libbpf and corresponding test.
Patches 10-14 add selftests for the new hook.
Patch 15 adds support for new arg types to verifier: pointer to integer.
Patch 16 adds bpf_strto{l,ul} helpers to parse integers from sysctl value.
Patch 17 updates UAPI in tools.
Patch 18 updates bpf_helpers.h.
Patch 19 adds selftests for pointer to integer in verifier.
Patches 20-21 add selftests for bpf_strto{l,ul}, including integration
C based test for sysctl value parsing.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add C based test for a few bpf_sysctl_* helpers and bpf_strtoul.
Make sure that sysctl can be identified by name and that multiple
integers can be parsed from sysctl value with bpf_strtoul.
net/ipv4/tcp_mem is chosen as a testing sysctl, it contains 3 unsigned
longs, they all are parsed and compared (val[0] < val[1] < val[2]).
Example of output:
# ./test_sysctl
...
Test case: C prog: deny all writes .. [PASS]
Test case: C prog: deny access by name .. [PASS]
Test case: C prog: read tcp_mem .. [PASS]
Summary: 39 PASSED, 0 FAILED
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Test that bpf_strtol and bpf_strtoul helpers can be used to convert
provided buffer to long or unsigned long correspondingly and return both
correct result and number of consumed bytes, or proper errno.
Example of output:
# ./test_sysctl
..
Test case: bpf_strtoul one number string .. [PASS]
Test case: bpf_strtoul multi number string .. [PASS]
Test case: bpf_strtoul buf_len = 0, reject .. [PASS]
Test case: bpf_strtoul supported base, ok .. [PASS]
Test case: bpf_strtoul unsupported base, EINVAL .. [PASS]
Test case: bpf_strtoul buf with spaces only, EINVAL .. [PASS]
Test case: bpf_strtoul negative number, EINVAL .. [PASS]
Test case: bpf_strtol negative number, ok .. [PASS]
Test case: bpf_strtol hex number, ok .. [PASS]
Test case: bpf_strtol max long .. [PASS]
Test case: bpf_strtol overflow, ERANGE .. [PASS]
Summary: 36 PASSED, 0 FAILED
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Test that verifier handles new argument types properly, including
uninitialized or partially initialized value, misaligned stack access,
etc.
Example of output:
#456/p ARG_PTR_TO_LONG uninitialized OK
#457/p ARG_PTR_TO_LONG half-uninitialized OK
#458/p ARG_PTR_TO_LONG misaligned OK
#459/p ARG_PTR_TO_LONG size < sizeof(long) OK
#460/p ARG_PTR_TO_LONG initialized OK
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add bpf_sysctl_* and bpf_strtoX helpers to bpf_helpers.h.
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Sync bpf_strtoX related bpf UAPI changes to tools/.
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add bpf_strtol and bpf_strtoul to convert a string to long and unsigned
long correspondingly. It's similar to user space strtol(3) and
strtoul(3) with a few changes to the API:
* instead of NUL-terminated C string the helpers expect buffer and
buffer length;
* resulting long or unsigned long is returned in a separate
result-argument;
* return value is used to indicate success or failure, on success number
of consumed bytes is returned that can be used to identify position to
read next if the buffer is expected to contain multiple integers;
* instead of *base* argument, *flags* is used that provides base in 5
LSB, other bits are reserved for future use;
* number of supported bases is limited.
Documentation for the new helpers is provided in bpf.h UAPI.
The helpers are made available to BPF_PROG_TYPE_CGROUP_SYSCTL programs to
be able to convert string input to e.g. "ulongvec" output.
E.g. "net/ipv4/tcp_mem" consists of three ulong integers. They can be
parsed by calling to bpf_strtoul three times.
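
The calling pattern listed above (buffer plus length in, value through a
result argument, consumed byte count as the return value) can be
illustrated with a small user-space analogue. This is a sketch and not
the helper itself: the function names are hypothetical, only base 10 is
handled, and the exact error codes are assumptions modelled on the
description.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/*
 * Parse one decimal unsigned long from buf[0..len). Leading whitespace
 * is skipped. Returns the number of bytes consumed, or a negative errno.
 */
static int sketch_strtoul(const char *buf, size_t len, unsigned long *res)
{
	unsigned long val = 0;
	size_t i = 0, digits = 0;

	while (i < len && isspace((unsigned char)buf[i]))
		i++;

	while (i < len && isdigit((unsigned char)buf[i])) {
		unsigned long d = (unsigned long)(buf[i] - '0');

		if (val > (ULONG_MAX - d) / 10)
			return -ERANGE; /* value does not fit */
		val = val * 10 + d;
		i++;
		digits++;
	}

	if (digits == 0)
		return -EINVAL; /* no number found */

	*res = val;
	return (int)i;
}

/* Parse n integers by advancing past the bytes each call consumed. */
static int sketch_parse_vec(const char *buf, size_t len,
			    unsigned long *out, size_t n)
{
	size_t off = 0;

	for (size_t k = 0; k < n; k++) {
		int used = sketch_strtoul(buf + off, len - off, &out[k]);

		if (used < 0)
			return used;
		off += (size_t)used;
	}
	return 0;
}
```

Parsing a three-value string such as the contents of tcp_mem then takes
three calls, each starting where the previous one stopped.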
Implementation notes:
Implementation includes "../../lib/kstrtox.h" to reuse integer parsing
functions. It's done exactly the same way as fs/proc/base.c already does.
Unfortunately existing kstrtoX function can't be used directly since
they fail if any invalid character is present right after integer in the
string. Existing simple_strtoX functions can't be used either since
they're obsolete and don't handle overflow properly.
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Currently the way to pass result from BPF helper to BPF program is to
provide memory area defined by pointer and size: func(void *, size_t).
It works great for generic use-case, but for simple types, such as int,
it's overkill and consumes two arguments when it could use just one.
Introduce new argument types ARG_PTR_TO_INT and ARG_PTR_TO_LONG to be
able to pass result from helper to program via pointer to int and long
correspondingly: func(int *) or func(long *).
New argument types are similar to ARG_PTR_TO_MEM with the following
differences:
* they don't require corresponding ARG_CONST_SIZE argument, predefined
access sizes are used instead (32bit for int, 64bit for long);
* it's possible to use more than one such argument in a helper;
* provided pointers have to be aligned.
It's easy to introduce similar ARG_PTR_TO_CHAR and ARG_PTR_TO_SHORT
argument types. It's not done due to lack of use-case though.
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Test access to file_pos field of bpf_sysctl context, both read (incl.
narrow read) and write.
# ./test_sysctl
...
Test case: ctx:file_pos sysctl:read read ok .. [PASS]
Test case: ctx:file_pos sysctl:read read ok narrow .. [PASS]
Test case: ctx:file_pos sysctl:read write ok .. [PASS]
...
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Test that new value provided by user space on sysctl write can be read
by bpf_sysctl_get_new_value and overridden by bpf_sysctl_set_new_value.
# ./test_sysctl
...
Test case: sysctl_get_new_value sysctl:read EINVAL .. [PASS]
Test case: sysctl_get_new_value sysctl:write ok .. [PASS]
Test case: sysctl_get_new_value sysctl:write ok long .. [PASS]
Test case: sysctl_get_new_value sysctl:write E2BIG .. [PASS]
Test case: sysctl_set_new_value sysctl:read EINVAL .. [PASS]
Test case: sysctl_set_new_value sysctl:write ok .. [PASS]
Summary: 22 PASSED, 0 FAILED
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Test sysctl_get_current_value on sysctl read and write, buffers with
enough space and too small buffers to get E2BIG and truncated result,
etc.
# ./test_sysctl
...
Test case: sysctl_get_current_value sysctl:read ok, gt .. [PASS]
Test case: sysctl_get_current_value sysctl:read ok, eq .. [PASS]
Test case: sysctl_get_current_value sysctl:read E2BIG truncated .. [PASS]
Test case: sysctl_get_current_value sysctl:read EINVAL .. [PASS]
Test case: sysctl_get_current_value sysctl:write ok .. [PASS]
Summary: 16 PASSED, 0 FAILED
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Test w/ and w/o BPF_F_SYSCTL_BASE_NAME, buffers with enough space and
too small buffers to get E2BIG and truncated result, etc.
# ./test_sysctl
...
Test case: sysctl_get_name sysctl_value:base ok .. [PASS]
Test case: sysctl_get_name sysctl_value:base E2BIG truncated .. [PASS]
Test case: sysctl_get_name sysctl:full ok .. [PASS]
Test case: sysctl_get_name sysctl:full E2BIG truncated .. [PASS]
Test case: sysctl_get_name sysctl:full E2BIG truncated small .. [PASS]
Summary: 11 PASSED, 0 FAILED
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add unit test for BPF_PROG_TYPE_CGROUP_SYSCTL program type.
Test that program can allow/deny access.
Test both valid and invalid accesses to ctx->write.
Example of output:
# ./test_sysctl
Test case: sysctl wrong attach_type .. [PASS]
Test case: sysctl:read allow all .. [PASS]
Test case: sysctl:read deny all .. [PASS]
Test case: ctx:write sysctl:read read ok .. [PASS]
Test case: ctx:write sysctl:write read ok .. [PASS]
Test case: ctx:write sysctl:read write reject .. [PASS]
Summary: 6 PASSED, 0 FAILED
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add unit test to verify that program and attach types are properly
identified for "cgroup/sysctl" section name.
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Support BPF_PROG_TYPE_CGROUP_SYSCTL program in libbpf: identifying
program and attach types by section name, probe.
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Sync BPF_PROG_TYPE_CGROUP_SYSCTL related bpf UAPI changes to tools/.
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add file_pos field to bpf_sysctl context to read and write sysctl file
position at which sysctl is being accessed (read or written).
The field can be used to e.g. override whole sysctl value on write to
sysctl even when sys_write is called by user space with file_pos > 0. Or
BPF program may reject such accesses.
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add helpers to work with new value being written to sysctl by user
space.
bpf_sysctl_get_new_value() copies value being written to sysctl into
provided buffer.
bpf_sysctl_set_new_value() overrides new value being written by user
space with one from the provided buffer. Buffer should contain string
representation of the value, similar to what can be seen in /proc/sys/.
Both helpers can be used only on sysctl write.
File position matters and can be managed by an interface that will be
introduced separately. E.g. if user space calls sys_write to a file in
/proc/sys/ at file position = X, where X > 0, then the value set by
bpf_sysctl_set_new_value() will be written starting from X. If program
wants to override whole value with specified buffer, file position has
to be set to zero.
Documentation for the new helpers is provided in bpf.h UAPI.
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add bpf_sysctl_get_current_value() helper to copy current sysctl value
into a buffer provided by the BPF_PROG_TYPE_CGROUP_SYSCTL program.
It provides same string as user space can see by reading corresponding
file in /proc/sys/, including new line, etc.
Documentation for the new helper is provided in bpf.h UAPI.
Since current value is kept in ctl_table->data in a parsed form,
ctl_table->proc_handler() with write=0 is called to read that data and
convert it to a string. Such a string can later be parsed by a program
using helpers that will be introduced separately.
Unfortunately it's not trivial to provide API to access parsed data due to
variety of data representations (string, intvec, uintvec, ulongvec,
custom structures, even NULL, etc). Instead it's assumed that the user knows
how to handle specific sysctl they're interested in and appropriate
helpers can be used.
Since ctl_table->proc_handler() expects __user buffer, conversion to
__user happens for kernel allocated one where the value is stored.
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add bpf_sysctl_get_name() helper to copy sysctl name (/proc/sys/ entry)
into a buffer provided by the BPF_PROG_TYPE_CGROUP_SYSCTL program.
By default full name (w/o /proc/sys/) is copied, e.g. "net/ipv4/tcp_mem".
If BPF_F_SYSCTL_BASE_NAME flag is set, only base name will be copied,
e.g. "tcp_mem".
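
The full-name versus base-name behaviour can be sketched in user-space C
as below. The function and the flag constant are hypothetical stand-ins.
The truncate-and-return-E2BIG behaviour is an assumption modelled on the
test case names elsewhere in this series, not a copy of the helper.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for BPF_F_SYSCTL_BASE_NAME. */
#define SKETCH_F_BASE_NAME 1u

/*
 * Copy the sysctl name into buf, NUL-terminated. With the base-name
 * flag only the last path component is copied. Returns the string
 * length on success, or -E2BIG if the name had to be truncated.
 */
static int sketch_get_name(const char *full, unsigned int flags,
			   char *buf, size_t buf_len)
{
	const char *src = full;
	size_t len;

	if (buf_len == 0)
		return -E2BIG;

	if (flags & SKETCH_F_BASE_NAME) {
		const char *slash = strrchr(full, '/');

		if (slash)
			src = slash + 1;
	}

	len = strlen(src);
	if (len >= buf_len) {
		memcpy(buf, src, buf_len - 1);
		buf[buf_len - 1] = '\0';
		return -E2BIG;
	}

	memcpy(buf, src, len + 1);
	return (int)len;
}
```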
Documentation for the new helper is provided in bpf.h UAPI.
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Containerized applications may run as root, which may create problems
for the whole host. Specifically, such applications may change a sysctl
and affect applications in other containers.
Furthermore in existing infrastructure it may not be possible to just
completely disable writing to sysctl, instead such a process should be
gradual with ability to log what sysctl are being changed by a
container, investigate, limit the set of writable sysctl to currently
used ones (so that new ones can not be changed) and eventually reduce
this set to zero.
The patch introduces new program type BPF_PROG_TYPE_CGROUP_SYSCTL and
attach type BPF_CGROUP_SYSCTL to solve these problems on cgroup basis.
New program type has access to following minimal context:
struct bpf_sysctl {
__u32 write;
};
Where @write indicates whether sysctl is being read (= 0) or written (=
1).
Helpers to access sysctl name and value will be introduced separately.
BPF_CGROUP_SYSCTL attach point is added to sysctl code right before
passing control to ctl_table->proc_handler so that BPF program can
either allow or deny access to sysctl.
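
The allow/deny decision based on @write can be illustrated with a plain
C function. This is a user-space mock and not a loadable BPF program:
the struct is a stand-in for the context shown above, and the return
convention (1 to allow, 0 to deny) is an assumption of the sketch.

```c
#include <stdint.h>

/* Stand-in for the minimal context: write is 0 on read, 1 on write. */
struct sketch_sysctl_ctx {
	uint32_t write;
};

/*
 * Example policy: allow every read, deny every write.
 * Returns 1 to allow the access and 0 to deny it.
 */
static int sketch_sysctl_read_only(const struct sketch_sysctl_ctx *ctx)
{
	return ctx->write ? 0 : 1;
}
```

A gradual rollout as described above could start with a policy that only
logs writes and always allows them, and tighten it later.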
Suggested-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Currently kernel/bpf/cgroup.c contains only one program type and one
proto function cgroup_dev_func_proto(). It'd be useful to have base
proto function that can be reused for new cgroup-bpf program types
coming soon.
Introduce cgroup_base_func_proto().
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
In most cases, kmalloc() will not be available early in boot when
pci_setup() is called. Thus, the kstrdup() call that was added to fix the
__initdata bug with the disable_acs_redir parameter usually returns NULL,
so the parameter is discarded and has no effect.
To fix this, store the string that's in initdata until an initcall function
can allocate the memory appropriately. This way we don't need any
additional static memory.
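
The two-stage approach can be sketched in user-space C. The names are
hypothetical, and malloc() stands in for the allocator that only becomes
usable later in boot.

```c
#include <stdlib.h>
#include <string.h>

/* Points into boot-time memory that will be released later. */
static const char *sketch_param_early;
/* Heap copy that stays valid after the boot-time memory is gone. */
static char *sketch_param_copy;

/* Early stage: no allocator available, so only remember the pointer. */
static void sketch_early_setup(const char *arg)
{
	sketch_param_early = arg;
}

/* Later stage: the allocator works, so take a private copy now. */
static int sketch_late_init(void)
{
	size_t n;

	if (!sketch_param_early)
		return 0;

	n = strlen(sketch_param_early) + 1;
	sketch_param_copy = malloc(n);
	if (!sketch_param_copy)
		return -1;

	memcpy(sketch_param_copy, sketch_param_early, n);
	sketch_param_early = NULL; /* the early pointer must not be used again */
	return 0;
}
```

Only a pointer is kept between the two stages, which matches the remark
above that no additional static buffer is needed.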
Fixes: d2fd6e81912a ("PCI: Fix __initdata issue with "pci=disable_acs_redir" parameter")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes
Second batch of iwlwifi fixes intended for v5.1
* fix for a potential deadlock in the TX path;
* a fix for offloaded rate-control;
* support new PCI HW IDs which use a new FW;
|
|
Move ieee80211_tx_status_ext() outside of the status_list lock section
in order to avoid a locking dependency and the possible deadlock reported
by LOCKDEP in the warning below.
Also do mt76_tx_status_lock() just before it's needed.
[ 440.224832] WARNING: possible circular locking dependency detected
[ 440.224833] 5.1.0-rc2+ #22 Not tainted
[ 440.224834] ------------------------------------------------------
[ 440.224835] kworker/u16:28/2362 is trying to acquire lock:
[ 440.224836] 0000000089b8cacf (&(&q->lock)->rlock#2){+.-.}, at: mt76_wake_tx_queue+0x4c/0xb0 [mt76]
[ 440.224842]
but task is already holding lock:
[ 440.224842] 000000002cfedc59 (&(&sta->lock)->rlock){+.-.}, at: ieee80211_stop_tx_ba_cb+0x32/0x1f0 [mac80211]
[ 440.224863]
which lock already depends on the new lock.
[ 440.224863]
the existing dependency chain (in reverse order) is:
[ 440.224864]
-> #3 (&(&sta->lock)->rlock){+.-.}:
[ 440.224869] _raw_spin_lock_bh+0x34/0x40
[ 440.224880] ieee80211_start_tx_ba_session+0xe4/0x3d0 [mac80211]
[ 440.224894] minstrel_ht_get_rate+0x45c/0x510 [mac80211]
[ 440.224906] rate_control_get_rate+0xc1/0x140 [mac80211]
[ 440.224918] ieee80211_tx_h_rate_ctrl+0x195/0x3c0 [mac80211]
[ 440.224930] ieee80211_xmit_fast+0x26d/0xa50 [mac80211]
[ 440.224942] __ieee80211_subif_start_xmit+0xfc/0x310 [mac80211]
[ 440.224954] ieee80211_subif_start_xmit+0x38/0x390 [mac80211]
[ 440.224956] dev_hard_start_xmit+0xb8/0x300
[ 440.224957] __dev_queue_xmit+0x7d4/0xbb0
[ 440.224968] ip6_finish_output2+0x246/0x860 [ipv6]
[ 440.224978] mld_sendpack+0x1bd/0x360 [ipv6]
[ 440.224987] mld_ifc_timer_expire+0x1a4/0x2f0 [ipv6]
[ 440.224989] call_timer_fn+0x89/0x2a0
[ 440.224990] run_timer_softirq+0x1bd/0x4d0
[ 440.224992] __do_softirq+0xdb/0x47c
[ 440.224994] irq_exit+0xfa/0x100
[ 440.224996] smp_apic_timer_interrupt+0x9a/0x220
[ 440.224997] apic_timer_interrupt+0xf/0x20
[ 440.224999] cpuidle_enter_state+0xc1/0x470
[ 440.225000] do_idle+0x21a/0x260
[ 440.225001] cpu_startup_entry+0x19/0x20
[ 440.225004] start_secondary+0x135/0x170
[ 440.225006] secondary_startup_64+0xa4/0xb0
[ 440.225007]
-> #2 (&(&sta->rate_ctrl_lock)->rlock){+.-.}:
[ 440.225009] _raw_spin_lock_bh+0x34/0x40
[ 440.225022] rate_control_tx_status+0x4f/0xb0 [mac80211]
[ 440.225031] ieee80211_tx_status_ext+0x142/0x1a0 [mac80211]
[ 440.225035] mt76x02_send_tx_status+0x2e4/0x340 [mt76x02_lib]
[ 440.225037] mt76x02_tx_status_data+0x31/0x40 [mt76x02_lib]
[ 440.225040] mt76u_tx_status_data+0x51/0xa0 [mt76_usb]
[ 440.225042] process_one_work+0x237/0x5d0
[ 440.225043] worker_thread+0x3c/0x390
[ 440.225045] kthread+0x11d/0x140
[ 440.225046] ret_from_fork+0x3a/0x50
[ 440.225047]
-> #1 (&(&list->lock)->rlock#8){+.-.}:
[ 440.225049] _raw_spin_lock_bh+0x34/0x40
[ 440.225052] mt76_tx_status_skb_add+0x51/0x100 [mt76]
[ 440.225054] mt76x02u_tx_prepare_skb+0xbd/0x116 [mt76x02_usb]
[ 440.225056] mt76u_tx_queue_skb+0x5f/0x180 [mt76_usb]
[ 440.225058] mt76_tx+0x93/0x190 [mt76]
[ 440.225070] ieee80211_tx_frags+0x148/0x210 [mac80211]
[ 440.225081] __ieee80211_tx+0x75/0x1b0 [mac80211]
[ 440.225092] ieee80211_tx+0xde/0x110 [mac80211]
[ 440.225105] __ieee80211_tx_skb_tid_band+0x72/0x90 [mac80211]
[ 440.225122] ieee80211_send_auth+0x1f3/0x360 [mac80211]
[ 440.225141] ieee80211_auth.cold.40+0x6c/0x100 [mac80211]
[ 440.225156] ieee80211_mgd_auth.cold.50+0x132/0x15f [mac80211]
[ 440.225171] cfg80211_mlme_auth+0x149/0x360 [cfg80211]
[ 440.225181] nl80211_authenticate+0x273/0x2e0 [cfg80211]
[ 440.225183] genl_family_rcv_msg+0x196/0x3a0
[ 440.225184] genl_rcv_msg+0x47/0x8e
[ 440.225185] netlink_rcv_skb+0x3a/0xf0
[ 440.225187] genl_rcv+0x24/0x40
[ 440.225188] netlink_unicast+0x16d/0x210
[ 440.225189] netlink_sendmsg+0x204/0x3b0
[ 440.225191] sock_sendmsg+0x36/0x40
[ 440.225193] ___sys_sendmsg+0x259/0x2b0
[ 440.225194] __sys_sendmsg+0x47/0x80
[ 440.225196] do_syscall_64+0x60/0x1f0
[ 440.225197] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 440.225198]
-> #0 (&(&q->lock)->rlock#2){+.-.}:
[ 440.225200] lock_acquire+0xb9/0x1a0
[ 440.225202] _raw_spin_lock_bh+0x34/0x40
[ 440.225204] mt76_wake_tx_queue+0x4c/0xb0 [mt76]
[ 440.225215] ieee80211_agg_start_txq+0xe8/0x2b0 [mac80211]
[ 440.225225] ieee80211_stop_tx_ba_cb+0xb8/0x1f0 [mac80211]
[ 440.225235] ieee80211_ba_session_work+0x1c1/0x2f0 [mac80211]
[ 440.225236] process_one_work+0x237/0x5d0
[ 440.225237] worker_thread+0x3c/0x390
[ 440.225239] kthread+0x11d/0x140
[ 440.225240] ret_from_fork+0x3a/0x50
[ 440.225240]
other info that might help us debug this:
[ 440.225241] Chain exists of:
&(&q->lock)->rlock#2 --> &(&sta->rate_ctrl_lock)->rlock --> &(&sta->lock)->rlock
[ 440.225243] Possible unsafe locking scenario:
[ 440.225244] CPU0 CPU1
[ 440.225244] ---- ----
[ 440.225245] lock(&(&sta->lock)->rlock);
[ 440.225245] lock(&(&sta->rate_ctrl_lock)->rlock);
[ 440.225246] lock(&(&sta->lock)->rlock);
[ 440.225247] lock(&(&q->lock)->rlock#2);
[ 440.225248]
*** DEADLOCK ***
[ 440.225249] 5 locks held by kworker/u16:28/2362:
[ 440.225250] #0: 0000000048fcd291 ((wq_completion)phy0){+.+.}, at: process_one_work+0x1b5/0x5d0
[ 440.225252] #1: 00000000f1c6828f ((work_completion)(&sta->ampdu_mlme.work)){+.+.}, at: process_one_work+0x1b5/0x5d0
[ 440.225254] #2: 00000000433d2b2c (&sta->ampdu_mlme.mtx){+.+.}, at: ieee80211_ba_session_work+0x5c/0x2f0 [mac80211]
[ 440.225265] #3: 000000002cfedc59 (&(&sta->lock)->rlock){+.-.}, at: ieee80211_stop_tx_ba_cb+0x32/0x1f0 [mac80211]
[ 440.225276] #4: 000000009d7b9a44 (rcu_read_lock){....}, at: ieee80211_agg_start_txq+0x33/0x2b0 [mac80211]
[ 440.225286]
stack backtrace:
[ 440.225288] CPU: 2 PID: 2362 Comm: kworker/u16:28 Not tainted 5.1.0-rc2+ #22
[ 440.225289] Hardware name: LENOVO 20KGS23S0P/20KGS23S0P, BIOS N23ET55W (1.30 ) 08/31/2018
[ 440.225300] Workqueue: phy0 ieee80211_ba_session_work [mac80211]
[ 440.225301] Call Trace:
[ 440.225304] dump_stack+0x85/0xc0
[ 440.225306] print_circular_bug.isra.38.cold.58+0x15c/0x195
[ 440.225307] check_prev_add.constprop.48+0x5f0/0xc00
[ 440.225309] ? check_prev_add.constprop.48+0x39d/0xc00
[ 440.225311] ? __lock_acquire+0x41d/0x1100
[ 440.225312] __lock_acquire+0xd98/0x1100
[ 440.225313] ? __lock_acquire+0x41d/0x1100
[ 440.225315] lock_acquire+0xb9/0x1a0
[ 440.225317] ? mt76_wake_tx_queue+0x4c/0xb0 [mt76]
[ 440.225319] _raw_spin_lock_bh+0x34/0x40
[ 440.225321] ? mt76_wake_tx_queue+0x4c/0xb0 [mt76]
[ 440.225323] mt76_wake_tx_queue+0x4c/0xb0 [mt76]
[ 440.225334] ieee80211_agg_start_txq+0xe8/0x2b0 [mac80211]
[ 440.225344] ieee80211_stop_tx_ba_cb+0xb8/0x1f0 [mac80211]
[ 440.225354] ieee80211_ba_session_work+0x1c1/0x2f0 [mac80211]
[ 440.225356] process_one_work+0x237/0x5d0
[ 440.225358] worker_thread+0x3c/0x390
[ 440.225359] ? wq_calc_node_cpumask+0x70/0x70
[ 440.225360] kthread+0x11d/0x140
[ 440.225362] ? kthread_create_on_node+0x40/0x40
[ 440.225363] ret_from_fork+0x3a/0x50
Cc: stable@vger.kernel.org
Fixes: 88046b2c9f6d ("mt76: add support for reporting tx status with skb")
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
|
|
Currently rt2x00 devices retransmit management frames with an
incremented sequence number if the hardware is assigning the sequence.
This is a HW bug that is already fixed for non-QoS data frames, but it
should also be fixed for management frames other than beacons.
Without the fix, retransmitted frames have the wrong SN:
AlphaNet_e8:fb:36 Vivotek_52:31:51 Authentication, SN=1648, FN=0, Flags=........C Frame is not being retransmitted 1648 1
AlphaNet_e8:fb:36 Vivotek_52:31:51 Authentication, SN=1649, FN=0, Flags=....R...C Frame is being retransmitted 1649 1
AlphaNet_e8:fb:36 Vivotek_52:31:51 Authentication, SN=1650, FN=0, Flags=....R...C Frame is being retransmitted 1650 1
With the fix SN stays correctly the same:
88:6a:e3:e8:f9:a2 8c:f5:a3:88:76:87 Authentication, SN=1450, FN=0, Flags=........C
88:6a:e3:e8:f9:a2 8c:f5:a3:88:76:87 Authentication, SN=1450, FN=0, Flags=....R...C
88:6a:e3:e8:f9:a2 8c:f5:a3:88:76:87 Authentication, SN=1450, FN=0, Flags=....R...C
Cc: stable@vger.kernel.org
Signed-off-by: Vijayakumar Durai <vijayakumar.durai1@vivint.com>
[sgruszka: simplify code, change comments and changelog]
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
|