Age | Commit message (Collapse) | Author |
|
Register snoc bus layer suspend/resume PM ops and configure
the wakeup source(CE2) for the device.
Testing:
Tested on WCN3990 HW.
Tested FW: WLAN.HL.2.0-01192-QCAHLSWMTPLZ-1.
Signed-off-by: Govind Singh <govinds@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
|
|
Many integrated QCA9984 WiFis in various IPQ806x platform routers
from various vendors (Netgear R7800, ZyXEL NBG6817, TP-LINK C2600,
etc.) have either blank, bogus or non-unique MAC-addresses in
their calibration data.
As a result, OpenWrt utilizes a discouraged binary calibration data
patching method that allows to modify the device's MAC-addresses right
at the source. This is because the ath10k' firmware extracts the MAC
address from the supplied radio/calibration data and issues a response
to the ath10k linux driver. Which was designed to take the main MAC in
ath10k_wmi_event_ready().
Part of the "setting an alternate MAC" issue was already tackled by a
patch from Brian Norris:
commit 9d5804662ce1
("ath10k: retrieve MAC address from system firmware if provided")
by allowing the option to specify an alternate MAC-address with the
established device_get_mac_address() function which extracts the right
address from DeviceTree/fwnode mac-address or local-mac-address
properties and saves it for later.
However, Ben Greear noted that the Qualcomm's ath10k firmware is liable
to not properly calculate its rx-bssid mask in this case. This can cause
issues in the popluar "multiple AP with a single ath10k instance"
configurations.
To improve MAC address handling, Felix Fietkau suggested to call
pdev_set_base_macaddr_cmdid before bringing up the first vif and
use the first vif MAC address there. Which is in ath10k_core_start().
This patch implement Felix Fietkau's request to
"call pdev_set_base_macaddr_cmdid before bringing up the first vif".
The pdev_set_base_macaddr_cmdid is already declared for all devices
and version. The driver just needed the support code for this
function.
Tested on:
QCA9880/CUS223, firmwares: 10.2.4.13-2, 10.2.4.70.44, 10.2.4-1.0-00041
QCA9887/MR33 firmware:10.2.4-1.0-00033
QCA4019/RT-AC58U firmware: 10.4-3.4-00104, 10.4-3.5.3-00057
QCA9984/R7800 firmware: Candela Technologies (CT) Firmware
BugLink: https://lists.openwrt.org/pipermail/openwrt-devel/2018-November/014595.html
Fixes: 9d5804662ce1 ("ath10k: retrieve MAC address from system firmware if provided")
Cc: Brian Norris <briannorris@chromium.org>
Cc: Ben Greear <greearb@candelatech.com>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Mathias Kresin <dev@kresin.me>
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Tested-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
|
|
TLV based firmware ex. QCA6174, WCN3990 expects key cipher value
set to 9 while non-TLV firmware expects key cipher value set to 8
for enabling GCMP and GCMP-256 cipher suites.
To fix this problem, attach the key cipher suite values based on
wmi version.
Tested HW: WCN3990
Tested FW: WLAN.HL.2.0-01188-QCAHLSWMTPLZ-1
Signed-off-by: Abhishek Ambure <aambure@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
|
|
Hostapd uses CCMP, GCMP & GCMP-256 as 'wpa_pairwise' option to run WPA3.
In WCN3990 firmware cipher suite numbers 9 to 11 are for CCMP,
GCMP & GCMP-256.
To enable CCMP, GCMP & GCMP-256 cipher suites in WCN3990 firmware,
host sets 'n_cipher_suites = 11' while initializing hardware parameters.
Tested HW: WCN3990
Tested FW: WLAN.HL.2.0-01188-QCAHLSWMTPLZ-1
Signed-off-by: Abhishek Ambure <aambure@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
|
|
REGULATORY_WIPHY_SELF_MANAGED as set here breaks NL80211_CMD_GET_REG,
because it expects the wiphy to do regulatory management. Since
virt_wifi does not do regulatory management, this triggers a WARN_ON in
NL80211_CMD_GET_REG and fails the netlink command.
Removing REGULATORY_WIPHY_SELF_MANAGED fixes the problem and the virtual
wireless network continues to work.
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Acked-by: Alistair Strachan <astrachan@google.com>
Acked-by: Greg Hartman <ghartman@google.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Fix cfg80211_gen_new_bssid() to not rely on u64 modulo arithmetic,
which isn't needed since we really just want to mask there. Also,
clean it up to calculate the mask only once and use GENMASK_ULL()
instead of open-coding the mask calculation.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Enforce the first argument to be a correct type of a pointer to struct
element and avoid unnecessary typecasts from const to non-const pointers
(the change in validate_ie_attr() is needed to make this part work). In
addition, avoid signed/unsigned comparison within for_each_element() and
mark struct element packed just in case.
Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Lockdep warns about false positive:
[ 11.211460] ------------[ cut here ]------------
[ 11.211936] DEBUG_LOCKS_WARN_ON(depth <= 0)
[ 11.211985] WARNING: CPU: 0 PID: 141 at ../kernel/locking/lockdep.c:3592 lock_release+0x1ad/0x280
[ 11.213134] Modules linked in:
[ 11.214954] RIP: 0010:lock_release+0x1ad/0x280
[ 11.223508] Call Trace:
[ 11.223705] <IRQ>
[ 11.223874] ? __local_bh_enable+0x7a/0x80
[ 11.224199] up_read+0x1c/0xa0
[ 11.224446] do_up_read+0x12/0x20
[ 11.224713] irq_work_run_list+0x43/0x70
[ 11.225030] irq_work_run+0x26/0x50
[ 11.225310] smp_irq_work_interrupt+0x57/0x1f0
[ 11.225662] irq_work_interrupt+0xf/0x20
since rw_semaphore is released in a different task vs task that locked the sema.
It is expected behavior.
Fix the warning with up_read_non_owner() and rwsem_release() annotation.
Fixes: bae77c5eb5b2 ("bpf: enable stackmap with build_id in nmi context")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
When mac80211 requests the low level driver to stop an ongoing
Tx aggregation, the low level driver is expected to call
ieee80211_stop_tx_ba_cb_irqsafe() to indicate that it is ready
to stop the session. The callback in turn schedules a worker
to complete the session tear down, which in turn also handles
the relevant state for the intermediate Tx queue.
However, as this flow in asynchronous, the intermediate queue
should be stopped and not continue servicing frames, as in
such a case frames that are dequeued would be marked as part
of an aggregation, although the aggregation is already been
stopped.
Fix this by stopping the intermediate Tx queue, before
calling the low level driver to stop the Tx aggregation.
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
It's possible that the caller of cfg80211_classify8021d() uses the
value to index an array, like mac80211 in ieee80211_downgrade_queue().
Prevent speculation on the return value.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Without recording the netlink port ID, we cannot return the
results or complete messages to userspace, nor will we be
able to abort if the socket is closed, so clearly we need
to fill the value.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Fix FTM per burst maximum value from 15 to 31
(The maximal bits that represents that number in the frame
is 5 hence a maximal value of 31)
Signed-off-by: Aviya Erenfeld <aviya.erenfeld@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
If a driver does any significant activity in its ibss_join method,
then it will very well expect that to be called during restart,
before any stations are added. Do that.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Jouni reports that in some cases it is possible that getting
disconnected (or stopping AP, after previous patches) results
in further operations hitting the window within the regulatory
core restoring the regdomain to the defaults. The reason for
this is that we have to call out to CRDA or otherwise do some
asynchronous work, and thus can't do the restore atomically.
However, we've previously seen all the data we need to do the
restore, so we can hang on to that data and use it later for
the restore. This makes the whole thing happen within a single
locked section and thus atomic.
However, we can't *always* do this - there are unfortunately
cases where the restore needs to re-request, because this is
also used (abused?) as an error recovery process, so make the
new behaviour optional and only use it when doing a regular
restore as described above.
Reported-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
When facility.76 MSAX3 is present for the guest we must issue a validity
interception if the CRYCBD is not valid.
The bit CRYCBD.31 is an effective field and tested at each guest level
and has for effect to mask the facility.76
It follows that if CRYCBD.31 is clear and AP is not in use we do not
have to test the CRYCBD validatity even if facility.76 is present in the
host.
Fixes: 6ee74098201b ("KVM: s390: vsie: allow CRYCB FORMAT-0")
Cc: stable@vger.kernel.org
Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Reported-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <1549876849-32680-1-git-send-email-pmorel@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
|
|
Vince (and later on Ravi) reported crashes in the BTS code during
fuzzing with the following backtrace:
general protection fault: 0000 [#1] SMP PTI
...
RIP: 0010:perf_prepare_sample+0x8f/0x510
...
Call Trace:
<IRQ>
? intel_pmu_drain_bts_buffer+0x194/0x230
intel_pmu_drain_bts_buffer+0x160/0x230
? tick_nohz_irq_exit+0x31/0x40
? smp_call_function_single_interrupt+0x48/0xe0
? call_function_single_interrupt+0xf/0x20
? call_function_single_interrupt+0xa/0x20
? x86_schedule_events+0x1a0/0x2f0
? x86_pmu_commit_txn+0xb4/0x100
? find_busiest_group+0x47/0x5d0
? perf_event_set_state.part.42+0x12/0x50
? perf_mux_hrtimer_restart+0x40/0xb0
intel_pmu_disable_event+0xae/0x100
? intel_pmu_disable_event+0xae/0x100
x86_pmu_stop+0x7a/0xb0
x86_pmu_del+0x57/0x120
event_sched_out.isra.101+0x83/0x180
group_sched_out.part.103+0x57/0xe0
ctx_sched_out+0x188/0x240
ctx_resched+0xa8/0xd0
__perf_event_enable+0x193/0x1e0
event_function+0x8e/0xc0
remote_function+0x41/0x50
flush_smp_call_function_queue+0x68/0x100
generic_smp_call_function_single_interrupt+0x13/0x30
smp_call_function_single_interrupt+0x3e/0xe0
call_function_single_interrupt+0xf/0x20
</IRQ>
The reason is that while event init code does several checks
for BTS events and prevents several unwanted config bits for
BTS event (like precise_ip), the PERF_EVENT_IOC_PERIOD allows
to create BTS event without those checks being done.
Following sequence will cause the crash:
If we create an 'almost' BTS event with precise_ip and callchains,
and it into a BTS event it will crash the perf_prepare_sample()
function because precise_ip events are expected to come
in with callchain data initialized, but that's not the
case for intel_pmu_drain_bts_buffer() caller.
Adding a check_period callback to be called before the period
is changed via PERF_EVENT_IOC_PERIOD. It will deny the change
if the event would become BTS. Plus adding also the limit_period
check as well.
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20190204123532.GA4794@krava
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The SDC controls live in the south tile, not the north one. Correct this
so that we program the right registers.
Cc: stable@vger.kernel.org
Fixes: 22eb8301dbc1 ("pinctrl: qcom: Add qcs404 pinctrl driver")
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Fix page fault handling code to fixup r16-r18 registers.
Before the patch code had off-by-two registers bug.
This bug caused overwriting of ps,pc,gp registers instead
of fixing intended r16,r17,r18 (see `struct pt_regs`).
More details:
Initially Dmitry noticed a kernel bug as a failure
on strace test suite. Test passes unmapped userspace
pointer to io_submit:
```c
#include <err.h>
#include <unistd.h>
#include <sys/mman.h>
#include <asm/unistd.h>
int main(void)
{
unsigned long ctx = 0;
if (syscall(__NR_io_setup, 1, &ctx))
err(1, "io_setup");
const size_t page_size = sysconf(_SC_PAGESIZE);
const size_t size = page_size * 2;
void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (MAP_FAILED == ptr)
err(1, "mmap(%zu)", size);
if (munmap(ptr, size))
err(1, "munmap");
syscall(__NR_io_submit, ctx, 1, ptr + page_size);
syscall(__NR_io_destroy, ctx);
return 0;
}
```
Running this test causes kernel to crash when handling page fault:
```
Unable to handle kernel paging request at virtual address ffffffffffff9468
CPU 3
aio(26027): Oops 0
pc = [<fffffc00004eddf8>] ra = [<fffffc00004edd5c>] ps = 0000 Not tainted
pc is at sys_io_submit+0x108/0x200
ra is at sys_io_submit+0x6c/0x200
v0 = fffffc00c58e6300 t0 = fffffffffffffff2 t1 = 000002000025e000
t2 = fffffc01f159fef8 t3 = fffffc0001009640 t4 = fffffc0000e0f6e0
t5 = 0000020001002e9e t6 = 4c41564e49452031 t7 = fffffc01f159c000
s0 = 0000000000000002 s1 = 000002000025e000 s2 = 0000000000000000
s3 = 0000000000000000 s4 = 0000000000000000 s5 = fffffffffffffff2
s6 = fffffc00c58e6300
a0 = fffffc00c58e6300 a1 = 0000000000000000 a2 = 000002000025e000
a3 = 00000200001ac260 a4 = 00000200001ac1e8 a5 = 0000000000000001
t8 = 0000000000000008 t9 = 000000011f8bce30 t10= 00000200001ac440
t11= 0000000000000000 pv = fffffc00006fd320 at = 0000000000000000
gp = 0000000000000000 sp = 00000000265fd174
Disabling lock debugging due to kernel taint
Trace:
[<fffffc0000311404>] entSys+0xa4/0xc0
```
Here `gp` has invalid value. `gp is s overwritten by a fixup for the
following page fault handler in `io_submit` syscall handler:
```
__se_sys_io_submit
...
ldq a1,0(t1)
bne t0,4280 <__se_sys_io_submit+0x180>
```
After a page fault `t0` should contain -EFALUT and `a1` is 0.
Instead `gp` was overwritten in place of `a1`.
This happens due to a off-by-two bug in `dpf_reg()` for `r16-r18`
(aka `a0-a2`).
I think the bug went unnoticed for a long time as `gp` is one
of scratch registers. Any kernel function call would re-calculate `gp`.
Dmitry tracked down the bug origin back to 2.1.32 kernel version
where trap_a{0,1,2} fields were inserted into struct pt_regs.
And even before that `dpf_reg()` contained off-by-one error.
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: linux-alpha@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reported-and-reviewed-by: "Dmitry V. Levin" <ldv@altlinux.org>
Cc: stable@vger.kernel.org # v2.1.32+
Bug: https://bugs.gentoo.org/672040
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Signed-off-by: Matt Turner <mattst88@gmail.com>
|
|
Eiger machine vector definition has nr_irqs 128, and working 2.6.26
boot shows SCSI getting IRQ-s 64 and 65. Current kernel boot fails
because Symbios SCSI fails to request IRQ-s and does not find the disks.
It has been broken at least since 3.18 - the earliest I could test with
my gcc-5.
The headers have moved around and possibly another order of defines has
worked in the past - but since 128 seems to be correct and used, fix
arch/alpha/include/asm/irq.h to have NR_IRQS=128 for Eiger.
This fixes 4.19-rc7 boot on my Force Flexor A264 (Eiger subarch).
Cc: stable@vger.kernel.org # v3.18+
Signed-off-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Matt Turner <mattst88@gmail.com>
|
|
All the setup code in AF_XDP is protected by a mutex with the
exception of the mmap code that cannot use it. To make sure that a
process banging on the mmap call at the same time as another process
is setting up the socket, smp_wmb() calls were added in the umem
registration code and the queue creation code, so that the published
structures that xsk_mmap needs would be consistent. However, the
corresponding smp_rmb() calls were not added to the xsk_mmap
code. This patch adds these calls.
Fixes: 37b076933a8e3 ("xsk: add missing write- and data-dependency barrier")
Fixes: c0c77d8fb787c ("xsk: add user memory registration support sockopt")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
bpf_skb_change_proto and bpf_skb_adjust_room change skb header length.
For GSO packets they adjust gso_size to maintain the same MTU.
The gso size can only be safely adjusted on bytestream protocols.
Commit d02f51cbcf12 ("bpf: fix bpf_skb_adjust_net/bpf_skb_proto_xlat
to deal with gso sctp skbs") excluded SKB_GSO_SCTP.
Since then type SKB_GSO_UDP_L4 has been added, whose contents are one
gso_size unit per datagram. Also exclude these.
Move from a blacklist to a whitelist check to future proof against
additional such new GSO types, e.g., for fraglist based GRO.
Fixes: bec1f6f69736 ("udp: generate gso with UDP_SEGMENT")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Martin KaFai Lau says:
====================
This series adds __sk_buff->sk, "struct bpf_tcp_sock",
BPF_FUNC_sk_fullsock and BPF_FUNC_tcp_sock. Together, they provide
a common way to expose the members of "struct tcp_sock" and
"struct bpf_sock" for the bpf_prog to access.
The patch series first adds a bpf_sock pointer to __sk_buff
and a new helper BPF_FUNC_sk_fullsock.
It then adds BPF_FUNC_tcp_sock to get a bpf_tcp_sock
pointer from a bpf_sock pointer.
The current use case is to allow a cg_skb_bpf_prog to provide
per cgroup traffic policing/shaping.
Please see individual patch for details.
v2:
- Patch 1 depends on
commit d623876646be ("bpf: Fix narrow load on a bpf_sock returned from sk_lookup()")
in the bpf branch.
- Add sk_to_full_sk() to bpf_sk_fullsock() and bpf_tcp_sock()
such that there is a way to access the listener's sk and tcp_sk
when __sk_buff->sk is a request_sock.
The comments in the uapi bpf.h is updated accordingly.
- bpf_ctx_range_till() is used in bpf_sock_common_is_valid_access()
in patch 1. Saved a few lines.
- Patch 2 is new in v2 and it adds "state", "dst_ip4", "dst_ip6" and
"dst_port" to the bpf_sock. Narrow load is allowed on them.
The "state" (i.e. sk_state) has already been used in
INET_DIAG (e.g. ss -t) and getsockopt(TCP_INFO).
- While at it in the new patch 2, also allow narrow load on some
existing fields of the bpf_sock, which are "family", "type", "protocol"
and "src_port". Only allow loading from first byte for now.
i.e. does not allow narrow load starting from the 2nd byte.
- Add some narrow load tests to the test_verifier's sock.c
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch adds a C program to show the usage on
skb->sk and bpf_tcp_sock.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch tests accessing the skb->sk and the new helpers,
bpf_sk_fullsock and bpf_tcp_sock.
The errstr of some existing "reference tracking" tests is changed
with s/bpf_sock/sock/ and s/socket/sock/ where "sock" is from the
verifier's reg_type_str[].
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch sync the uapi bpf.h to tools/.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch adds a helper function BPF_FUNC_tcp_sock and it
is currently available for cg_skb and sched_(cls|act):
struct bpf_tcp_sock *bpf_tcp_sock(struct bpf_sock *sk);
int cg_skb_foo(struct __sk_buff *skb) {
struct bpf_tcp_sock *tp;
struct bpf_sock *sk;
__u32 snd_cwnd;
sk = skb->sk;
if (!sk)
return 1;
tp = bpf_tcp_sock(sk);
if (!tp)
return 1;
snd_cwnd = tp->snd_cwnd;
/* ... */
return 1;
}
A 'struct bpf_tcp_sock' is also added to the uapi bpf.h to provide
read-only access. bpf_tcp_sock has all the existing tcp_sock's fields
that has already been exposed by the bpf_sock_ops.
i.e. no new tcp_sock's fields are exposed in bpf.h.
This helper returns a pointer to the tcp_sock. If it is not a tcp_sock
or it cannot be traced back to a tcp_sock by sk_to_full_sk(), it
returns NULL. Hence, the caller needs to check for NULL before
accessing it.
The current use case is to expose members from tcp_sock
to allow a cg_skb_bpf_prog to provide per cgroup traffic
policing/shaping.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The next patch will introduce a new "struct bpf_tcp_sock" which
exposes the same tcp_sock's fields already exposed in
"struct bpf_sock_ops".
This patch refactor the existing convert_ctx_access() codes for
"struct bpf_sock_ops" to get them ready to be reused for
"struct bpf_tcp_sock". The "rtt_min" is not refactored
in this patch because its handling is different from other
fields.
The SOCK_OPS_GET_TCP_SOCK_FIELD is new. All other SOCK_OPS_XXX_FIELD
changes are code move only.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch adds "state", "dst_ip4", "dst_ip6" and "dst_port" to the
bpf_sock. The userspace has already been using "state",
e.g. inet_diag (ss -t) and getsockopt(TCP_INFO).
This patch also allows narrow load on the following existing fields:
"family", "type", "protocol" and "src_port". Unlike IP address,
the load offset is resticted to the first byte for them but it
can be relaxed later if there is a use case.
This patch also folds __sock_filter_check_size() into
bpf_sock_is_valid_access() since it is not called
by any where else. All bpf_sock checking is in
one place.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
In kernel, it is common to check "skb->sk && sk_fullsock(skb->sk)"
before accessing the fields in sock. For example, in __netdev_pick_tx:
static u16 __netdev_pick_tx(struct net_device *dev, struct sk_buff *skb,
struct net_device *sb_dev)
{
/* ... */
struct sock *sk = skb->sk;
if (queue_index != new_index && sk &&
sk_fullsock(sk) &&
rcu_access_pointer(sk->sk_dst_cache))
sk_tx_queue_set(sk, new_index);
/* ... */
return queue_index;
}
This patch adds a "struct bpf_sock *sk" pointer to the "struct __sk_buff"
where a few of the convert_ctx_access() in filter.c has already been
accessing the skb->sk sock_common's fields,
e.g. sock_ops_convert_ctx_access().
"__sk_buff->sk" is a PTR_TO_SOCK_COMMON_OR_NULL in the verifier.
Some of the fileds in "bpf_sock" will not be directly
accessible through the "__sk_buff->sk" pointer. It is limited
by the new "bpf_sock_common_is_valid_access()".
e.g. The existing "type", "protocol", "mark" and "priority" in bpf_sock
are not allowed.
The newly added "struct bpf_sock *bpf_sk_fullsock(struct bpf_sock *sk)"
can be used to get a sk with all accessible fields in "bpf_sock".
This helper is added to both cg_skb and sched_(cls|act).
int cg_skb_foo(struct __sk_buff *skb) {
struct bpf_sock *sk;
sk = skb->sk;
if (!sk)
return 1;
sk = bpf_sk_fullsock(sk);
if (!sk)
return 1;
if (sk->family != AF_INET6 || sk->protocol != IPPROTO_TCP)
return 1;
/* some_traffic_shaping(); */
return 1;
}
(1) The sk is read only
(2) There is no new "struct bpf_sock_common" introduced.
(3) Future kernel sock's members could be added to bpf_sock only
instead of repeatedly adding at multiple places like currently
in bpf_sock_ops_md, bpf_sock_addr_md, sk_reuseport_md...etc.
(4) After "sk = skb->sk", the reg holding sk is in type
PTR_TO_SOCK_COMMON_OR_NULL.
(5) After bpf_sk_fullsock(), the return type will be in type
PTR_TO_SOCKET_OR_NULL which is the same as the return type of
bpf_sk_lookup_xxx().
However, bpf_sk_fullsock() does not take refcnt. The
acquire_reference_state() is only depending on the return type now.
To avoid it, a new is_acquire_function() is checked before calling
acquire_reference_state().
(6) The WARN_ON in "release_reference_state()" is no longer an
internal verifier bug.
When reg->id is not found in state->refs[], it means the
bpf_prog does something wrong like
"bpf_sk_release(bpf_sk_fullsock(skb->sk))" where reference has
never been acquired by calling "bpf_sk_fullsock(skb->sk)".
A -EINVAL and a verbose are done instead of WARN_ON. A test is
added to the test_verifier in a later patch.
Since the WARN_ON in "release_reference_state()" is no longer
needed, "__release_reference_state()" is folded into
"release_reference_state()" also.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
By adding this test to test_verifier:
{
"reference tracking: access sk->src_ip4 (narrow load)",
.insns = {
BPF_SK_LOOKUP,
BPF_MOV64_REG(BPF_REG_6, BPF_REG_0),
BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 3),
BPF_LDX_MEM(BPF_H, BPF_REG_2, BPF_REG_0, offsetof(struct bpf_sock, src_ip4) + 2),
BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
BPF_EMIT_CALL(BPF_FUNC_sk_release),
BPF_EXIT_INSN(),
},
.prog_type = BPF_PROG_TYPE_SCHED_CLS,
.result = ACCEPT,
},
The above test loads 2 bytes from sk->src_ip4 where
sk is obtained by bpf_sk_lookup_tcp().
It hits an internal verifier error from convert_ctx_accesses():
[root@arch-fb-vm1 bpf]# ./test_verifier 665 665
Failed to load prog 'Invalid argument'!
0: (b7) r2 = 0
1: (63) *(u32 *)(r10 -8) = r2
2: (7b) *(u64 *)(r10 -16) = r2
3: (7b) *(u64 *)(r10 -24) = r2
4: (7b) *(u64 *)(r10 -32) = r2
5: (7b) *(u64 *)(r10 -40) = r2
6: (7b) *(u64 *)(r10 -48) = r2
7: (bf) r2 = r10
8: (07) r2 += -48
9: (b7) r3 = 36
10: (b7) r4 = 0
11: (b7) r5 = 0
12: (85) call bpf_sk_lookup_tcp#84
13: (bf) r6 = r0
14: (15) if r0 == 0x0 goto pc+3
R0=sock(id=1,off=0,imm=0) R6=sock(id=1,off=0,imm=0) R10=fp0,call_-1 fp-8=????0000 fp-16=0000mmmm fp-24=mmmmmmmm fp-32=mmmmmmmm fp-40=mmmmmmmm fp-48=mmmmmmmm refs=1
15: (69) r2 = *(u16 *)(r0 +26)
16: (bf) r1 = r6
17: (85) call bpf_sk_release#86
18: (95) exit
from 14 to 18: safe
processed 20 insns (limit 131072), stack depth 48
bpf verifier is misconfigured
Summary: 0 PASSED, 0 SKIPPED, 1 FAILED
The bpf_sock_is_valid_access() is expecting src_ip4 can be narrowly
loaded (meaning load any 1 or 2 bytes of the src_ip4) by
marking info->ctx_field_size. However, this marked
ctx_field_size is not used. This patch fixes it.
Due to the recent refactoring in test_verifier,
this new test will be added to the bpf-next branch
(together with the bpf_tcp_sock patchset)
to avoid merge conflict.
Fixes: c64b7983288e ("bpf: Add PTR_TO_SOCKET verifier type")
Cc: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Cc: stable@vger.kernel.org # v4.18+
Signed-off-by: Bob Tracy <rct@frus.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
|
|
|
|
Heiner Kallweit says:
====================
r8169: revert two commits due to a regression
Sander reported a regression (kernel panic, see[1]), therefore let's
revert these commits. Removal of the barriers doesn't seem to
contribute to the issue, the patch just overlaps with the problematic
one and only reverting both patches was tested.
[1] https://marc.info/?t=154965066400001&r=1&w=2
v2:
- improve commit message
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356.
Sander reported a regression causing a kernel panic[1],
therefore let's revert this commit.
[1] https://marc.info/?t=154965066400001&r=1&w=2
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3.
There doesn't seem to be anything wrong with this patch,
it's just reverted to get a stable baseline again.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Heiner Kallweit says:
====================
net: phy: add and use register modifying helpers returning 1 on change
Add and use register modifying helpers returning 1 on change.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use phy_modify_changed() to simplify the code.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When replacing mv3310_modify() with phy_modify_mmd() we missed that
they behave differently, mv3310_modify() returns 1 on a changed
register value whilst phy_modify_mmd() returns 0. Fix this by replacing
phy_modify_mmd() with phy_modify_mmd_changed() where needed.
Fixes: b52c018ddccf ("net: phy: make use of new MMD accessors")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When modifying registers there are scenarios where we need to know
whether the register content actually changed. This patch adds
new helpers to not break users of the current ones, phy_modify() etc.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine fixes from Vinod Koul:
- Fix in at_xdmac fr wrongful channel state
- Fix for imx driver for wrong callback invocation
- Fix to bcm driver for interrupt race & transaction abort.
- Fix in dmatest to abort in mapping error
* tag 'dmaengine-fix-5.0-rc6' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: dmatest: Abort test in case of mapping error
dmaengine: bcm2835: Fix abort of transactions
dmaengine: bcm2835: Fix interrupt race on RT
dmaengine: imx-dma: fix wrong callback invoke
dmaengine: at_xdmac: Fix wrongfull report of a channel as in use
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"A handful of fixes:
- Fix an MCE corner case bug/crash found via MCE injection testing
- Fix 5-level paging boot crash
- Fix MCE recovery cache invalidation bug
- Fix regression on Xen guests caused by a recent PMD level mremap
speedup optimization"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Make set_pmd_at() paravirt aware
x86/mm/cpa: Fix set_mce_nospec()
x86/boot/compressed/64: Do not corrupt EDX on EFER.LME=1 setting
x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Ingo Molnar:
"irqchip driver fixes: most of them are race fixes for ARM GIC (General
Interrupt Controller) variants, but also a fix for the ARM MMP
(Marvell PXA168 et al) irqchip affecting OLPC keyboards"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic-v3-its: Fix ITT_entry_size accessor
irqchip/mmp: Only touch the PJ4 IRQ & FIQ bits on enable/disable
irqchip/gic-v3-its: Gracefully fail on LPI exhaustion
irqchip/gic-v3-its: Plug allocation race for devices sharing a DevID
irqchip/gic-v4: Fix occasional VLPI drop
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"A couple of kernel side fixes:
- Fix the Intel uncore driver on certain hardware configurations
- Fix a CPU hotplug related memory allocation bug
- Remove a spurious WARN()
... plus also a handful of perf tooling fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf script python: Add Python3 support to tests/attr.py
perf trace: Support multiple "vfs_getname" probes
perf symbols: Filter out hidden symbols from labels
perf symbols: Add fallback definitions for GELF_ST_VISIBILITY()
tools headers uapi: Sync linux/in.h copy from the kernel sources
perf clang: Do not use 'return std::move(something)'
perf mem/c2c: Fix perf_mem_events to support powerpc
perf tests evsel-tp-sched: Fix bitwise operator
perf/core: Don't WARN() for impossible ring-buffer sizes
perf/x86/intel: Delay memory deallocation until x86_pmu_dead_cpu()
perf/x86/intel/uncore: Add Node ID mask
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
"An rtmutex (PI-futex) deadlock scenario fix, plus a locking
documentation fix"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Handle early deadlock return correctly
futex: Fix barrier comment
|
|
Eli Cohen says:
====================
Change tc action identifiers to be more consistent
This two patch series modifies TC actions identifiers to be more consistent and
also puts them in one place so new identifiers numbers can be chosen more
easily.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Modify the kernel users of the TCA_ACT_* macros to use TCA_ID_*. For
example, use TCA_ID_GACT instead of TCA_ACT_GACT. This will align with
TCA_ID_POLICE and also differentiates these identifier, used in struct
tc_action_ops type field, from other macros starting with TCA_ACT_.
To make things clearer, we name the enum defining the TCA_ID_*
identifiers and also change the "type" field of struct tc_action to
id.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Move all the TC identifiers to one place, to the same enum that defines
the identifier of police action. This makes it easier choose numbers for
new actions since they are now defined in one place. We preserve the
original values for binary compatibility. New IDs should be added inside
the enum.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fixes license inconsistent related to the VKMS driver and remove the
redundant boilerplate comment.
Fixes: 854502fa0a38 ("drm/vkms: Add basic CRTC initialization")
Cc: stable@vger.kernel.org
Signed-off-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190206140116.7qvy2lpwbcd7wds6@smtp.gmail.com
|
|
set_pmd_at() calls native_set_pmd() unconditionally on x86. This was
fine as long as only huge page entries were written via set_pmd_at(),
as Xen pv guests don't support those.
Commit 2c91bd4a4e2e53 ("mm: speed up mremap by 20x on large regions")
introduced a usage of set_pmd_at() possible on pv guests, leading to
failures like:
BUG: unable to handle kernel paging request at ffff888023e26778
#PF error: [PROT] [WRITE]
RIP: e030:move_page_tables+0x7c1/0xae0
move_vma.isra.3+0xd1/0x2d0
__se_sys_mremap+0x3c6/0x5b0
do_syscall_64+0x49/0x100
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Make set_pmd_at() paravirt aware by just letting it use set_pmd().
Fixes: 2c91bd4a4e2e53 ("mm: speed up mremap by 20x on large regions")
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel@lists.xenproject.org
Cc: boris.ostrovsky@oracle.com
Cc: sstabellini@kernel.org
Cc: hpa@zytor.com
Cc: bp@alien8.de
Cc: torvalds@linux-foundation.org
Link: https://lkml.kernel.org/r/20190210074056.11842-1-jgross@suse.com
|
|
By adding this test to test_verifier:
{
"reference tracking: access sk->src_ip4 (narrow load)",
.insns = {
BPF_SK_LOOKUP,
BPF_MOV64_REG(BPF_REG_6, BPF_REG_0),
BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 3),
BPF_LDX_MEM(BPF_H, BPF_REG_2, BPF_REG_0, offsetof(struct bpf_sock, src_ip4) + 2),
BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
BPF_EMIT_CALL(BPF_FUNC_sk_release),
BPF_EXIT_INSN(),
},
.prog_type = BPF_PROG_TYPE_SCHED_CLS,
.result = ACCEPT,
},
The above test loads 2 bytes from sk->src_ip4 where
sk is obtained by bpf_sk_lookup_tcp().
It hits an internal verifier error from convert_ctx_accesses():
[root@arch-fb-vm1 bpf]# ./test_verifier 665 665
Failed to load prog 'Invalid argument'!
0: (b7) r2 = 0
1: (63) *(u32 *)(r10 -8) = r2
2: (7b) *(u64 *)(r10 -16) = r2
3: (7b) *(u64 *)(r10 -24) = r2
4: (7b) *(u64 *)(r10 -32) = r2
5: (7b) *(u64 *)(r10 -40) = r2
6: (7b) *(u64 *)(r10 -48) = r2
7: (bf) r2 = r10
8: (07) r2 += -48
9: (b7) r3 = 36
10: (b7) r4 = 0
11: (b7) r5 = 0
12: (85) call bpf_sk_lookup_tcp#84
13: (bf) r6 = r0
14: (15) if r0 == 0x0 goto pc+3
R0=sock(id=1,off=0,imm=0) R6=sock(id=1,off=0,imm=0) R10=fp0,call_-1 fp-8=????0000 fp-16=0000mmmm fp-24=mmmmmmmm fp-32=mmmmmmmm fp-40=mmmmmmmm fp-48=mmmmmmmm refs=1
15: (69) r2 = *(u16 *)(r0 +26)
16: (bf) r1 = r6
17: (85) call bpf_sk_release#86
18: (95) exit
from 14 to 18: safe
processed 20 insns (limit 131072), stack depth 48
bpf verifier is misconfigured
Summary: 0 PASSED, 0 SKIPPED, 1 FAILED
The bpf_sock_is_valid_access() is expecting src_ip4 can be narrowly
loaded (meaning load any 1 or 2 bytes of the src_ip4) by
marking info->ctx_field_size. However, this marked
ctx_field_size is not used. This patch fixes it.
Due to the recent refactoring in test_verifier,
this new test will be added to the bpf-next branch
(together with the bpf_tcp_sock patchset)
to avoid merge conflict.
Fixes: c64b7983288e ("bpf: Add PTR_TO_SOCKET verifier type")
Cc: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|