Age | Commit message (Collapse) | Author |
|
The "return 0" instruction follows other return instruction
and it makes it impossible to execute, hence remove it.
Fixes: 00fc0c51e35b ("rocker: Change world_ops API and implementation to be switchdev independant")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We don't need to call unmap_mapping_range() prior to calling
nfs_sync_mapping().
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
When ORC support was added for the ftrace_64.S code, an ENDPROC
for function_hook() was missed. This results in the following warning:
arch/x86/kernel/ftrace_64.o: warning: objtool: .entry.text+0x0: unreachable instruction
Fixes: e2ac83d74a4d ("x86/ftrace: Fix ORC unwinding from ftrace handlers")
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lkml.kernel.org/r/20180128022150.dqierscqmt3uwwsr@treble
|
|
As explained by Michal Marek at https://lkml.org/lkml/2011/8/31/189
silentoldconfig has become a misnomer. It has become an internal interface
so remove it from "make help" and Documentation/ to stop confusing people
using it as seen for instance at
https://chromium-review.googlesource.com/835632 Don't remove it from
kconfig/Makefile yet not to break any (other) tool using it.
On the other hand, correct and expand its description in the help of
the (internal) scripts/kconfig/conf.c
Signed-off-by: Marc Herbert <marc.herbert@intel.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
Make it all a function which does the WRMSR instead of having a hairy
inline asm.
[dwmw2: export it, fix CONFIG_RETPOLINE issues]
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ak@linux.intel.com
Cc: dave.hansen@intel.com
Cc: karahmed@amazon.de
Cc: arjan@linux.intel.com
Cc: torvalds@linux-foundation.org
Cc: peterz@infradead.org
Cc: bp@alien8.de
Cc: pbonzini@redhat.com
Cc: tim.c.chen@linux.intel.com
Cc: gregkh@linux-foundation.org
Link: https://lkml.kernel.org/r/1517070274-12128-4-git-send-email-dwmw@amazon.co.uk
|
|
Simplify it to call an asm-function instead of pasting 41 insn bytes at
every call site. Also, add alignment to the macro as suggested here:
https://support.google.com/faqs/answer/7625886
[dwmw2: Clean up comments, let it clobber %ebx and just tell the compiler]
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ak@linux.intel.com
Cc: dave.hansen@intel.com
Cc: karahmed@amazon.de
Cc: arjan@linux.intel.com
Cc: torvalds@linux-foundation.org
Cc: peterz@infradead.org
Cc: bp@alien8.de
Cc: pbonzini@redhat.com
Cc: tim.c.chen@linux.intel.com
Cc: gregkh@linux-foundation.org
Link: https://lkml.kernel.org/r/1517070274-12128-3-git-send-email-dwmw@amazon.co.uk
|
|
We want to expose the hardware features simply in /proc/cpuinfo as "ibrs",
"ibpb" and "stibp". Since AMD has separate CPUID bits for those, use them
as the user-visible bits.
When the Intel SPEC_CTRL bit is set which indicates both IBRS and IBPB
capability, set those (AMD) bits accordingly. Likewise if the Intel STIBP
bit is set, set the AMD STIBP that's used for the generic hardware
capability.
Hide the rest from /proc/cpuinfo by putting "" in the comments. Including
RETPOLINE and RETPOLINE_AMD which shouldn't be visible there. There are
patches to make the sysfs vulnerabilities information non-readable by
non-root, and the same should apply to all information about which
mitigations are actually in use. Those *shouldn't* appear in /proc/cpuinfo.
The feature bit for whether IBPB is actually used, which is needed for
ALTERNATIVEs, is renamed to X86_FEATURE_USE_IBPB.
Originally-by: Borislav Petkov <bp@suse.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ak@linux.intel.com
Cc: dave.hansen@intel.com
Cc: karahmed@amazon.de
Cc: arjan@linux.intel.com
Cc: torvalds@linux-foundation.org
Cc: peterz@infradead.org
Cc: bp@alien8.de
Cc: pbonzini@redhat.com
Cc: tim.c.chen@linux.intel.com
Cc: gregkh@linux-foundation.org
Link: https://lkml.kernel.org/r/1517070274-12128-2-git-send-email-dwmw@amazon.co.uk
|
|
Calling fan related SMM functions implemented by Dell BIOS firmware on Dell
Vostro 3360 freeze kernel for about 500ms.
Unfortunately, it is unlikely for Dell to fix this since the machine
is pretty old, so this commit just disables fan support to make the
system usable again.
Via "force" module param fan support can be enabled.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=195751
Link: http://lkml.iu.edu/hypermail/linux/kernel/1711.2/06083.html
Reviewed-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Calling fan related SMM functions implemented by Dell BIOS firmware on Dell
Inspiron 7720 freeze kernel for about 500ms. Until Dell fixes it we need to
disable fan support for Dell Inspiron 7720 as it makes system unusable.
Via "force" module param fan support can be enabled.
Reported-by: vova7890@mail.ru
Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=195751
Cc: stable@vger.kernel.org # v4.0+, will need backport
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Some Dell machines are broken and some functionality is disabled. Show
warning into dmesg about this fact and allow user via "force" module param
to override brokenness and enable broken functionality.
Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
If sysfs is disabled and RETPOLINE not defined:
arch/x86/kernel/cpu/bugs.c:97:13: warning: ‘spectre_v2_bad_module’ defined but not used
[-Wunused-variable]
static bool spectre_v2_bad_module;
Hide it.
Fixes: caf7501a1b4e ("module/retpoline: Warn about missing retpoline in module")
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
|
|
Pick up urgent bug fix and resolve the conflict.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The hrtimer interrupt code contains a hang detection and mitigation
mechanism, which prevents that a long delayed hrtimer interrupt causes a
continous retriggering of interrupts which prevent the system from making
progress. If a hang is detected then the timer hardware is programmed with
a certain delay into the future and a flag is set in the hrtimer cpu base
which prevents newly enqueued timers from reprogramming the timer hardware
prior to the chosen delay. The subsequent hrtimer interrupt after the delay
clears the flag and resumes normal operation.
If such a hang happens in the last hrtimer interrupt before a CPU is
unplugged then the hang_detected flag is set and stays that way when the
CPU is plugged in again. At that point the timer hardware is not armed and
it cannot be armed because the hang_detected flag is still active, so
nothing clears that flag. As a consequence the CPU does not receive hrtimer
interrupts and no timers expire on that CPU which results in RCU stalls and
other malfunctions.
Clear the flag along with some other less critical members of the hrtimer
cpu base to ensure starting from a clean state when a CPU is plugged in.
Thanks to Paul, Sebastian and Anna-Maria for their help to get down to the
root cause of that hard to reproduce heisenbug. Once understood it's
trivial and certainly justifies a brown paperbag.
Fixes: 41d2e4949377 ("hrtimer: Tune hrtimer_interrupt hang logic")
Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Sewior <bigeasy@linutronix.de>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801261447590.2067@nanos
|
|
Due to some unfortunate events, I have not been directly involved in
the x86 kernel patch flow for a while now. I have also not been able
to ramp back up by now like I had hoped to, and after reviewing what I
will need to work on both internally at Intel and elsewhere in the near
term, it is clear that I am not going to be able to ramp back up until
late 2018 at the very earliest.
It is not acceptable to not recognize that this load is currently
taken by Ingo and Thomas without my direct participation, so I mark
myself as R: (designated reviewer) rather than M: (maintainer) until
further notice. This is in fact recognizing the de facto situation
for the past few years.
I have obviously no intention of going away, and I will do everything
within my power to improve Linux on x86 and x86 for Linux. This,
however, puts credit where it is due and reflects a change of focus.
This patch also removes stale entries for portions of the x86
architecture which have not been maintained separately from arch/x86
for a long time. If there is a reason to re-introduce them then that
can happen later.
Signed-off-by: H. Peter Anvin <h.peter.anvin@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180125195934.5253-1-hpa@zytor.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Yonghong Song says:
====================
A kernel page fault which happens in lpm map trie_get_next_key is reported
by syzbot and Eric. The issue was introduced by commit b471f2f1de8b
("bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE map").
Patch #1 fixed the issue in the kernel and patch #2 adds a multithreaded
test case in tools/testing/selftests/bpf/test_lpm_map.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The new test will spawn four threads, doing map update, delete, lookup
and get_next_key in parallel. It is able to reproduce the issue in the
previous commit found by syzbot and Eric Dumazet.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Commit b471f2f1de8b ("bpf: implement MAP_GET_NEXT_KEY command
for LPM_TRIE map") introduces a bug likes below:
if (!rcu_dereference(trie->root))
return -ENOENT;
if (!key || key->prefixlen > trie->max_prefixlen) {
root = &trie->root;
goto find_leftmost;
}
......
find_leftmost:
for (node = rcu_dereference(*root); node;) {
In the code after label find_leftmost, it is assumed
that *root should not be NULL, but it is not true as
it is possbile trie->root is changed to NULL by an
asynchronous delete operation.
The issue is reported by syzbot and Eric Dumazet with the
below error log:
......
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 8033 Comm: syz-executor3 Not tainted 4.15.0-rc8+ #4
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:trie_get_next_key+0x3c2/0xf10 kernel/bpf/lpm_trie.c:682
......
This patch fixed the issue by use local rcu_dereferenced
pointer instead of *(&trie->root) later on.
Fixes: b471f2f1de8b ("bpf: implement MAP_GET_NEXT_KEY command or LPM_TRIE map")
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Daniel Borkmann says:
====================
This set contains a small cleanup in cBPF prologue generation and
otherwise fixes an outstanding issue related to BPF to BPF calls
and exception handling. For details please see related patches.
Last but not least, BPF selftests is extended with several new
test cases.
Thanks!
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Update selftests to relfect recent changes and add various new
test cases.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Since we've changed div/mod exception handling for src_reg in
eBPF verifier itself, remove the leftovers from arm32 JIT.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Shubham Bansal <illusionist.neo@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The verifier in both cBPF and eBPF reject div/mod by 0 imm,
so this can never load. Remove emitting such test and reject
it from being JITed instead (the latter is actually also not
needed, but given practice in sparc64, ppc64 today, so
doesn't hurt to add it here either).
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Daney <david.daney@cavium.com>
Reviewed-by: David Daney <david.daney@cavium.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Since we've changed div/mod exception handling for src_reg in
eBPF verifier itself, remove the leftovers from mips64 JIT.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Daney <david.daney@cavium.com>
Reviewed-by: David Daney <david.daney@cavium.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Since we've changed div/mod exception handling for src_reg in
eBPF verifier itself, remove the leftovers from sparc64 JIT.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Since we've changed div/mod exception handling for src_reg in
eBPF verifier itself, remove the leftovers from ppc64 JIT.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Since we've changed div/mod exception handling for src_reg in
eBPF verifier itself, remove the leftovers from s390x JIT.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Since we've changed div/mod exception handling for src_reg in
eBPF verifier itself, remove the leftovers from arm64 JIT.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Since we've changed div/mod exception handling for src_reg in
eBPF verifier itself, remove the leftovers from x86_64 JIT.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
One of the ugly leftovers from the early eBPF days is that div/mod
operations based on registers have a hard-coded src_reg == 0 test
in the interpreter as well as in JIT code generators that would
return from the BPF program with exit code 0. This was basically
adopted from cBPF interpreter for historical reasons.
There are multiple reasons why this is very suboptimal and prone
to bugs. To name one: the return code mapping for such abnormal
program exit of 0 does not always match with a suitable program
type's exit code mapping. For example, '0' in tc means action 'ok'
where the packet gets passed further up the stack, which is just
undesirable for such cases (e.g. when implementing policy) and
also does not match with other program types.
While trying to work out an exception handling scheme, I also
noticed that programs crafted like the following will currently
pass the verifier:
0: (bf) r6 = r1
1: (85) call pc+8
caller:
R6=ctx(id=0,off=0,imm=0) R10=fp0,call_-1
callee:
frame1: R1=ctx(id=0,off=0,imm=0) R10=fp0,call_1
10: (b4) (u32) r2 = (u32) 0
11: (b4) (u32) r3 = (u32) 1
12: (3c) (u32) r3 /= (u32) r2
13: (61) r0 = *(u32 *)(r1 +76)
14: (95) exit
returning from callee:
frame1: R0_w=pkt(id=0,off=0,r=0,imm=0)
R1=ctx(id=0,off=0,imm=0) R2_w=inv0
R3_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff))
R10=fp0,call_1
to caller at 2:
R0_w=pkt(id=0,off=0,r=0,imm=0) R6=ctx(id=0,off=0,imm=0)
R10=fp0,call_-1
from 14 to 2: R0=pkt(id=0,off=0,r=0,imm=0)
R6=ctx(id=0,off=0,imm=0) R10=fp0,call_-1
2: (bf) r1 = r6
3: (61) r1 = *(u32 *)(r1 +80)
4: (bf) r2 = r0
5: (07) r2 += 8
6: (2d) if r2 > r1 goto pc+1
R0=pkt(id=0,off=0,r=8,imm=0) R1=pkt_end(id=0,off=0,imm=0)
R2=pkt(id=0,off=8,r=8,imm=0) R6=ctx(id=0,off=0,imm=0)
R10=fp0,call_-1
7: (71) r0 = *(u8 *)(r0 +0)
8: (b7) r0 = 1
9: (95) exit
from 6 to 8: safe
processed 16 insns (limit 131072), stack depth 0+0
Basically what happens is that in the subprog we make use of a
div/mod by 0 exception and in the 'normal' subprog's exit path
we just return skb->data back to the main prog. This has the
implication that the verifier thinks we always get a pkt pointer
in R0 while we still have the implicit 'return 0' from the div
as an alternative unconditional return path earlier. Thus, R0
then contains 0, meaning back in the parent prog we get the
address range of [0x0, skb->data_end] as read and writeable.
Similar can be crafted with other pointer register types.
Since i) BPF_ABS/IND is not allowed in programs that contain
BPF to BPF calls (and generally it's also disadvised to use in
native eBPF context), ii) unknown opcodes don't return zero
anymore, iii) we don't return an exception code in dead branches,
the only last missing case affected and to fix is the div/mod
handling.
What we would really need is some infrastructure to propagate
exceptions all the way to the original prog unwinding the
current stack and returning that code to the caller of the
BPF program. In user space such exception handling for similar
runtimes is typically implemented with setjmp(3) and longjmp(3)
as one possibility which is not available in the kernel,
though (kgdb used to implement it in kernel long time ago). I
implemented a PoC exception handling mechanism into the BPF
interpreter with porting setjmp()/longjmp() into x86_64 and
adding a new internal BPF_ABRT opcode that can use a program
specific exception code for all exception cases we have (e.g.
div/mod by 0, unknown opcodes, etc). While this seems to work
in the constrained BPF environment (meaning, here, we don't
need to deal with state e.g. from memory allocations that we
would need to undo before going into exception state), it still
has various drawbacks: i) we would need to implement the
setjmp()/longjmp() for every arch supported in the kernel and
for x86_64, arm64, sparc64 JITs currently supporting calls,
ii) it has unconditional additional cost on main program
entry to store CPU register state in initial setjmp() call,
and we would need some way to pass the jmp_buf down into
___bpf_prog_run() for main prog and all subprogs, but also
storing on stack is not really nice (other option would be
per-cpu storage for this, but it also has the drawback that
we need to disable preemption for every BPF program types).
All in all this approach would add a lot of complexity.
Another poor-man's solution would be to have some sort of
additional shared register or scratch buffer to hold state
for exceptions, and test that after every call return to
chain returns and pass R0 all the way down to BPF prog caller.
This is also problematic in various ways: i) an additional
register doesn't map well into JITs, and some other scratch
space could only be on per-cpu storage, which, again has the
side-effect that this only works when we disable preemption,
or somewhere in the input context which is not available
everywhere either, and ii) this adds significant runtime
overhead by putting conditionals after each and every call,
as well as implementation complexity.
Yet another option is to teach verifier that div/mod can
return an integer, which however is also complex to implement
as verifier would need to walk such fake 'mov r0,<code>; exit;'
sequeuence and there would still be no guarantee for having
propagation of this further down to the BPF caller as proper
exception code. For parent prog, it is also is not distinguishable
from a normal return of a constant scalar value.
The approach taken here is a completely different one with
little complexity and no additional overhead involved in
that we make use of the fact that a div/mod by 0 is undefined
behavior. Instead of bailing out, we adapt the same behavior
as on some major archs like ARMv8 [0] into eBPF as well:
X div 0 results in 0, and X mod 0 results in X. aarch64 and
aarch32 ISA do not generate any traps or otherwise aborts
of program execution for unsigned divides. I verified this
also with a test program compiled by gcc and clang, and the
behavior matches with the spec. Going forward we adapt the
eBPF verifier to emit such rewrites once div/mod by register
was seen. cBPF is not touched and will keep existing 'return 0'
semantics. Given the options, it seems the most suitable from
all of them, also since major archs have similar schemes in
place. Given this is all in the realm of undefined behavior,
we still have the option to adapt if deemed necessary and
this way we would also have the option of more flexibility
from LLVM code generation side (which is then fully visible
to verifier). Thus, this patch i) fixes the panic seen in
above program and ii) doesn't bypass the verifier observations.
[0] ARM Architecture Reference Manual, ARMv8 [ARM DDI 0487B.b]
http://infocenter.arm.com/help/topic/com.arm.doc.ddi0487b.b/DDI0487B_b_armv8_arm.pdf
1) aarch64 instruction set: section C3.4.7 and C6.2.279 (UDIV)
"A division by zero results in a zero being written to
the destination register, without any indication that
the division by zero occurred."
2) aarch32 instruction set: section F1.4.8 and F5.1.263 (UDIV)
"For the SDIV and UDIV instructions, division by zero
always returns a zero result."
Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Recent findings by syzcaller fixed in 7891a87efc71 ("bpf: arsh is
not supported in 32 bit alu thus reject it") triggered a warning
in the interpreter due to unknown opcode not being rejected by
the verifier. The 'return 0' for an unknown opcode is really not
optimal, since with BPF to BPF calls, this would go untracked by
the verifier.
Do two things here to improve the situation: i) perform basic insn
sanity check early on in the verification phase and reject every
non-uapi insn right there. The bpf_opcode_in_insntable() table
reuses the same mapping as the jumptable in ___bpf_prog_run() sans
the non-public mappings. And ii) in ___bpf_prog_run() we do need
to BUG in the case where the verifier would ever create an unknown
opcode due to some rewrites.
Note that JITs do not have such issues since they would punt to
interpreter in these situations. Moreover, the BPF_JIT_ALWAYS_ON
would also help to avoid such unknown opcodes in the first place.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Given we recently had c131187db2d3 ("bpf: fix branch pruning
logic") and 95a762e2c8c9 ("bpf: fix incorrect sign extension in
check_alu_op()") in particular where before verifier skipped
verification of the wrongly assumed dead branch, we should not
just replace the dead code parts with nops (mov r0,r0). If there
is a bug such as fixed in 95a762e2c8c9 in future again, where
runtime could execute those insns, then one of the potential
issues with the current setting would be that given the nops
would be at the end of the program, we could execute out of
bounds at some point.
The best in such case would be to just exit the BPF program
altogether and return an exception code. However, given this
would require two instructions, and such a dead code gap could
just be a single insn long, we would need to place 'r0 = X; ret'
snippet at the very end after the user program or at the start
before the program (where we'd skip that region on prog entry),
and then place unconditional ja's into the dead code gap.
While more complex but possible, there's still another block
in the road that currently prevents from this, namely BPF to
BPF calls. The issue here is that such exception could be
returned from a callee, but the caller would not know that
it's an exception that needs to be propagated further down.
Alternative that has little complexity is to just use a ja-1
code for now which will trap the execution here instead of
silently doing bad things if we ever get there due to bugs.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Very minor optimization; saves 1 byte per program in x86_64
JIT in cBPF prologue.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux
Pull RISC-V update from Palmer Dabbelt:
"RISC-V: We have a new mailing list and git repo!
Sorry to send something essentially as late as possible (Friday after
an rc9), but we managed to get a mailing list for the RISC-V Linux
port. We've been using patches@groups.riscv.org for a while, but that
list has some problems (it's Google Groups and it's shared over all
RISC-V software projects). The new infaread.org list is much better.
We just got it on Wednesday but I used it a bit on Thursday to shake
out all the configuration problems and it appears to be in working
order.
When I updated the mailing list I noticed that the MAINTAINERS file
was pointing to our github repo, but now that we have a kernel.org
repo I'd like to point to that instead so I changed that as well.
We'll be centralizing all RISC-V Linux related development here as
that seems to be the saner way to go about it.
I can understand if it's too late to get this into 4.15, but given
that it's not a code change I was hoping it'd still be OK. It would be
nice to have the new mailing list and git repo in the release tarballs
so when people start to find bugs they'll get to the right place"
* tag 'riscv-for-linus-4.15-maintainers' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
Update the RISC-V MAINTAINERS file
|
|
To version 2.11
Signed-off-by: Steve French <smfrench@gmail.com>
|
|
add splice_write support in cifs vfs using iter_file_splice_write
Signed-off-by: Andrés Souto <kai670@gmail.com>
Signed-off-by: Steve French <smfrench@gmail.com>
|
|
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
|
|
Quiet minor sparse warnings in new SMB3 rdma patch series
("symbol was not declared ...") by moving these externs to smbdirect.h
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
|
|
also replaces memset()+kfree() by kzfree().
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Cc: <stable@vger.kernel.org>
|
|
Minor cleanup of some sparse warnings (including a few misc
endian fixes for the new smb3 rdma code)
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
|
|
dump it as first share with an "IPC: " prefix.
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
|
|
Since IPC now has a tcon object, the caller can just pass it. This
allows domain-based DFS requests to work with smb2+.
Link: https://bugzilla.samba.org/show_bug.cgi?id=12917
Fixes: 9d49640a21bf ("CIFS: implement get_dfs_refer for SMB2+")
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
|
|
* Remove ses->ipc_tid.
* Make IPC$ regular tcon.
* Add a direct pointer to it in ses->tcon_ipc.
* Distinguish PIPE tcon from IPC tcon by adding a tcon->pipe flag. All
IPC tcons are pipes but not all pipes are IPC.
* All TreeConnect functions now cannot take a NULL tcon object.
The IPC tcon has the same lifetime as the session it belongs to. It is
created when the session is created and destroyed when the session is
destroyed.
Since no mounts directly refer to the IPC tcon, its refcount should
always be set to initialisation value (1). Thus we make sure
cifs_put_tcon() skips it.
If the mount request resulting in a new session being created requires
encryption, try to require it too for IPC.
* set SERVER_NAME_LENGTH to serverName actual size
The maximum length of an ipv6 string representation is defined in
INET6_ADDRSTRLEN as 45+1 for null but lets keep what we know works.
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
|
|
Do not build lib/bpf/bpf.o with this Makefile but use the one from the
library directory. This avoid making a buggy bpf.o file (e.g. missing
symbols).
This patch is useful if some code (e.g. Landlock tests) needs both the
bpf.o (from tools/lib/bpf) and the bpf_load.o (from samples/bpf).
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Removed commented lines from test_tcpbpf_kern.c
Fixes: d6d4f60c3a09 bpf: add selftest for tcpbpf
Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
This patch suppresses the message about invalid TC mapping and wrong
selected TX queue. The root cause of this bug was setting too many
TC queue pairs on huge multiprocessor machines. When quantity of the
TC queue pairs is exceeding MSI-X vectors count then TX queue number
can be selected beyond actual TX queues amount.
Signed-off-by: Paweł Jabłoński <pawel.jablonski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
The drivers for i40e and i40evf had a reg_idx value stored in the q_vector
that was going completely unused. I can only assume this was copied over
from ixgbe and nobody knew how to use it.
I'm going to make use of the value to avoid having to compute the vector
and thus the register index for multiple paths throughout the drivers.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
In commit 36777d9fa24c ("i40e: check current configured input set when
adding ntuple filters") some code was added to report the input set
mask for a given filter when reporting it to the user.
This code is necessary so that the reported filter correctly displays
that it is or is not masking certain fields.
Unfortunately the code was incorrect. Development error accidentally
swapped the mask values for the IPv4 addresses with the L4 port numbers.
The port numbers are only 16bits wide while IPv4 addresses are 32 bits.
Unfortunately we assigned only 16 bits to the IPv4 address masks.
Additionally we assigned 32bit value 0xFFFFFFF to the TCP port numbers.
This second part does not matter as the value would be truncated to
16bits regardless, but it is unnecessary.
Fix the reported masks to properly report that the entire field is
masked.
Fixes: 36777d9fa24c ("i40e: check current configured input set when adding ntuple filters")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Our hardware does not allow situations where two filters might conflict
when matching. Essentially hardware only programs one filter for each
set of matching criteria. We don't support filters with overlapping
input sets, because each flow type can only use a single input set.
Additionally, different flow types will never have overlapping matches,
because of how the hardware parses the flow type before checking
matching criteria.
For this reason, we do not need or use the location number when
programming filters to hardware.
In order to avoid confusing scenarios with filters that match the same
criteria but program the flow to different queues, do not allow multiple
filters that match identical criteria to be programmed.
This ensures that we avoid odd scenarios when deleting filters, and when
programming new filters that match the same criteria.
Instead, users that wish to update the criteria for a filter must use
the same location id, or must delete all the matching filters first.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
When implementing support for IP_USER_FLOW filters, we correctly
programmed a filter for both the non fragmented IPv4/Other filter, as
well as the fragmented IPv4 filters. However, we did not properly
program the input set for fragmented IPv4 PCTYPE. This meant that the
filters would almost certainly not match, unless the user specified all
of the flow types.
Add support to program the fragmented IPv4 filter input set. Since we
always program these filters together, we'll assume that the two input
sets must match, and will thus always program the input sets to the same
value.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
kdump fails in the system when used in conjunction with Ethernet driver
X722/X710. This is mainly because when we are resource constrained i.e.
when we have just one online_cpus, we are enabling VMDq and iWARP. It
doesn't make sense to enable them with just one CPU and starve kdump
for lack of IRQs.
So don't enable VMDq or iWARP when we just have a single CPU.
Signed-off-by: Avinash Dayanand <avinash.dayanand@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Clean up unnecessary parenthesis.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
|