Age | Commit message (Collapse) | Author |
|
The "336996 Speculative Execution Side Channel Mitigations" from
May defines this as SSB_NO, hence lets sync-up.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The mesh_neighbour_update() function, queued via beacon rx, can race with
userspace creating the same station. If the station already exists by the
time mesh_neighbour_update() is called, the function wrongly assumes rate
control has been initialized and calls rate_control_rate_update(), which
in turn calls into the driver.
Updating the rate control before it has been initialized can cause a
crash in some drivers, for example this firmware crash in ath10k due
to sta->rx_nss being 0:
[ 3078.088247] mesh0: Inserted STA 5c:e2:8c:f1:ab:ba
[ 3078.258407] ath10k_pci 0000:0d:00.0: firmware crashed! (uuid d6ed5961-93cc-4d61-803f-5eda55bb8643)
[ 3078.258421] ath10k_pci 0000:0d:00.0: qca988x hw2.0 target 0x4100016c chip_id 0x043202ff sub 0000:0000
[ 3078.258426] ath10k_pci 0000:0d:00.0: kconfig debug 1 debugfs 1 tracing 1 dfs 0 testmode 0
[ 3078.258608] ath10k_pci 0000:0d:00.0: firmware ver 10.2.4.70.59-2 api 5 features no-p2p,raw-mode,mfp crc32 4159f498
[ 3078.258613] ath10k_pci 0000:0d:00.0: board_file api 1 bmi_id N/A crc32 bebc7c08
[ 3078.258617] ath10k_pci 0000:0d:00.0: htt-ver 2.1 wmi-op 5 htt-op 2 cal otp max-sta 128 raw 0 hwcrypto 1
[ 3078.260627] ath10k_pci 0000:0d:00.0: firmware register dump:
[ 3078.260640] ath10k_pci 0000:0d:00.0: [00]: 0x4100016C 0x000015B3 0x009A31BB 0x00955B31
[ 3078.260647] ath10k_pci 0000:0d:00.0: [04]: 0x009A31BB 0x00060130 0x00000008 0x00000007
[ 3078.260652] ath10k_pci 0000:0d:00.0: [08]: 0x00000000 0x00955B31 0x00000000 0x0040F89E
[ 3078.260656] ath10k_pci 0000:0d:00.0: [12]: 0x00000009 0xFFFFFFFF 0x009580F5 0x00958117
[ 3078.260660] ath10k_pci 0000:0d:00.0: [16]: 0x00958080 0x0094085D 0x00000000 0x00000000
[ 3078.260664] ath10k_pci 0000:0d:00.0: [20]: 0x409A31BB 0x0040AA84 0x00000002 0x00000001
[ 3078.260669] ath10k_pci 0000:0d:00.0: [24]: 0x809A2B8D 0x0040AAE4 0x00000088 0xC09A31BB
[ 3078.260673] ath10k_pci 0000:0d:00.0: [28]: 0x809898C8 0x0040AB04 0x0043F91C 0x009C6458
[ 3078.260677] ath10k_pci 0000:0d:00.0: [32]: 0x809B66AC 0x0040AB34 0x009C6458 0x0043F91C
[ 3078.260686] ath10k_pci 0000:0d:00.0: [36]: 0x809B2824 0x0040ADA4 0x00400000 0x00416EB4
[ 3078.260692] ath10k_pci 0000:0d:00.0: [40]: 0x809C07D9 0x0040ADE4 0x0040AE08 0x00412028
[ 3078.260696] ath10k_pci 0000:0d:00.0: [44]: 0x809486FA 0x0040AE04 0x00000001 0x00000000
[ 3078.260700] ath10k_pci 0000:0d:00.0: [48]: 0x80948E2C 0x0040AEA4 0x0041F4F0 0x00412634
[ 3078.260704] ath10k_pci 0000:0d:00.0: [52]: 0x809BFC39 0x0040AEC4 0x0041F4F0 0x00000001
[ 3078.260709] ath10k_pci 0000:0d:00.0: [56]: 0x80940F18 0x0040AF14 0x00000010 0x00403AC0
[ 3078.284130] ath10k_pci 0000:0d:00.0: failed to to request monitor vdev 1 stop: -108
Fix this by checking whether the sta has already initialized rate control
using the flag for that purpose. We can also drop the unnecessary insert
parameter here.
Signed-off-by: Bob Copeland <bobcopeland@fb.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Allocation size of nlmsg in cfg80211_ft_event is based on ric_ies_len
and doesn't take into account ies_len. This leads to
NL80211_CMD_FT_EVENT message construction failure in case ft_event
contains large enough ies buffer.
Add ies_len to the nlmsg allocation size.
Signed-off-by: Dedy Lansky <dlansky@codeaurora.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
wiphy names were recently limited to 128 bytes by commit a7cfebcb7594
("cfg80211: limit wiphy names to 128 bytes"). As it turns out though,
this isn't sufficient because dev_vprintk_emit() needs the syslog header
string "SUBSYSTEM=ieee80211\0DEVICE=+ieee80211:$devname" to fit into 128
bytes. This triggered the "device/subsystem name too long" WARN when
the device name was >= 90 bytes. As before, this was reproduced by
syzbot by sending an HWSIM_CMD_NEW_RADIO command to the MAC80211_HWSIM
generic netlink family.
Fix it by further limiting wiphy names to 64 bytes.
Reported-by: syzbot+e64565577af34b3768dc@syzkaller.appspotmail.com
Fixes: a7cfebcb7594 ("cfg80211: limit wiphy names to 128 bytes")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Since non atomic readq() and writeq() were added some of the drivers
would like to use it in a manner of:
#include <io-64-nonatomic-lo-hi.h>
...
pr_debug("Debug value of some register: %016llx\n", readq(addr));
However, lo_hi_readq() always returns __u64 data, while readq()
on x86_64 defines it as unsigned long. and thus compiler warns
about type mismatch, although they are both 64-bit on x86_64.
Convert readq() and writeq() on x86 to operate on deterministic
64-bit type. The most of architectures in the kernel already are using
either unsigned long long, or u64 type for readq() / writeq().
This change propagates consistency in that sense.
While this is not an issue per se, though if someone wants to address it,
the anchor could be the commit:
797a796a13df ("asm-generic: architecture independent readq/writeq for 32bit environment")
where non-atomic variants had been introduced.
Note, there are only few users of above pattern and they will not be
affected because they do cast returned value. The actual warning has
been issued on not-yet-upstreamed code.
Potentially we might get a new warnings if some 64-bit only code
assigns returned value to unsigned long type of variable. This is
assumed to be addressed on case-by-case basis.
Reported-by: lkp <lkp@intel.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180515115211.55050-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The 'tip' prefix probably referred to the -tip tree and is not required,
remove it.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180515165328.24899-1-bigeasy@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Since the grub_reclaim() function can be made static, make it so.
Silences the following GCC warning (W=1):
kernel/sched/deadline.c:1120:5: warning: no previous prototype for ‘grub_reclaim’ [-Wmissing-prototypes]
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180516200902.959-1-malat@debian.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
kernel/sched/sched.h
In the following commit:
6b55c9654fcc ("sched/debug: Move print_cfs_rq() declaration to kernel/sched/sched.h")
the print_cfs_rq() prototype was added to <kernel/sched/sched.h>,
right next to the prototypes for print_cfs_stats(), print_rt_stats()
and print_dl_stats().
Finish this previous commit and also move related prototypes for
print_rt_rq() and print_dl_rq().
Remove existing extern declarations now that they not needed anymore.
Silences the following GCC warning, triggered by W=1:
kernel/sched/debug.c:573:6: warning: no previous prototype for ‘print_rt_rq’ [-Wmissing-prototypes]
kernel/sched/debug.c:603:6: warning: no previous prototype for ‘print_dl_rq’ [-Wmissing-prototypes]
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180516195348.30426-1-malat@debian.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Commit f65e0d299807 ("ALSA: timer: Call notifier in the same spinlock")
combined the start/continue and stop/pause functions, and in doing so
changed the event code for the pause case to SNDRV_TIMER_EVENT_CONTINUE.
Change it back to SNDRV_TIMER_EVENT_PAUSE.
Fixes: f65e0d299807 ("ALSA: timer: Call notifier in the same spinlock")
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Clear the PCR (Processor Compatibility Register) on boot to ensure we
are not running in a compatibility mode.
We've seen this cause problems when a crash (and kdump) occurs while
running compat mode guests. The kdump kernel then runs with the PCR
set and causes problems. The symptom in the kdump kernel (also seen in
petitboot after fast-reboot) is early userspace programs taking
sigills on newer instructions (seen in libc).
Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: stable@vger.kernel.org
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
New compilers use the floating-point registers as spill registers when
there is high register pressure. In the purgatory however, the afp control
bit is not set. This leads to an exception whenever a floating-point
instruction is used, which again causes an interrupt loop.
Forbid the compiler to use floating-point instructions by adding
-msoft-float to KBUILD_CFLAGS.
Signed-off-by: Philipp Rudo <prudo@linux.ibm.com>
Fixes: 840798a1f529 (s390/kexec_file: Add purgatory)
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Daniel Borkmann says:
====================
pull-request: bpf 2018-05-18
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix two bugs in sockmap, a use after free in sockmap's error path
from sock_map_ctx_update_elem() where we mistakenly drop a reference
we didn't take prior to that, and in the same function fix a race
in bpf_prog_inc_not_zero() where we didn't use the progs from prior
READ_ONCE(), from John.
2) Reject program expansions once we figure out that their jump target
which crosses patchlet boundaries could otherwise get truncated in
insn->off space, from Daniel.
3) Check the return value of fopen() in BPF selftest's test_verifier
where we determine whether unpriv BPF is disabled, and iff we do
fail there then just assume it is disabled. This fixes a segfault
when used with older kernels, from Jesper.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use path_equal() to detect whether we're already in root.
Signed-off-by: Danilo Krummrich <danilokrummrich@dk-develop.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
The __dentry_open function was removed in
commit <2a027e7a18738>("fold __dentry_open() into its sole caller").
Signed-off-by: Li Qiang <liq3ea@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Userptr IOCTL zero size check (Matt)
- Two hardware quirk fixes (Michel & Chris)
* tag 'drm-intel-fixes-2018-05-17' of git://anongit.freedesktop.org/drm/drm-intel:
drm/i915/gen9: Add WaClearHIZ_WM_CHICKEN3 for bxt and glk
drm/i915/execlists: Use rmb() to order CSB reads
drm/i915/userptr: reject zero user_size
|
|
Fixes: 2cd1f0ddbb56 ("isdn: replace ->proc_fops with ->proc_show")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Recently during testing, I ran into the following panic:
[ 207.892422] Internal error: Accessing user space memory outside uaccess.h routines: 96000004 [#1] SMP
[ 207.901637] Modules linked in: binfmt_misc [...]
[ 207.966530] CPU: 45 PID: 2256 Comm: test_verifier Tainted: G W 4.17.0-rc3+ #7
[ 207.974956] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB18A 03/31/2017
[ 207.982428] pstate: 60400005 (nZCv daif +PAN -UAO)
[ 207.987214] pc : bpf_skb_load_helper_8_no_cache+0x34/0xc0
[ 207.992603] lr : 0xffff000000bdb754
[ 207.996080] sp : ffff000013703ca0
[ 207.999384] x29: ffff000013703ca0 x28: 0000000000000001
[ 208.004688] x27: 0000000000000001 x26: 0000000000000000
[ 208.009992] x25: ffff000013703ce0 x24: ffff800fb4afcb00
[ 208.015295] x23: ffff00007d2f5038 x22: ffff00007d2f5000
[ 208.020599] x21: fffffffffeff2a6f x20: 000000000000000a
[ 208.025903] x19: ffff000009578000 x18: 0000000000000a03
[ 208.031206] x17: 0000000000000000 x16: 0000000000000000
[ 208.036510] x15: 0000ffff9de83000 x14: 0000000000000000
[ 208.041813] x13: 0000000000000000 x12: 0000000000000000
[ 208.047116] x11: 0000000000000001 x10: ffff0000089e7f18
[ 208.052419] x9 : fffffffffeff2a6f x8 : 0000000000000000
[ 208.057723] x7 : 000000000000000a x6 : 00280c6160000000
[ 208.063026] x5 : 0000000000000018 x4 : 0000000000007db6
[ 208.068329] x3 : 000000000008647a x2 : 19868179b1484500
[ 208.073632] x1 : 0000000000000000 x0 : ffff000009578c08
[ 208.078938] Process test_verifier (pid: 2256, stack limit = 0x0000000049ca7974)
[ 208.086235] Call trace:
[ 208.088672] bpf_skb_load_helper_8_no_cache+0x34/0xc0
[ 208.093713] 0xffff000000bdb754
[ 208.096845] bpf_test_run+0x78/0xf8
[ 208.100324] bpf_prog_test_run_skb+0x148/0x230
[ 208.104758] sys_bpf+0x314/0x1198
[ 208.108064] el0_svc_naked+0x30/0x34
[ 208.111632] Code: 91302260 f9400001 f9001fa1 d2800001 (29500680)
[ 208.117717] ---[ end trace 263cb8a59b5bf29f ]---
The program itself which caused this had a long jump over the whole
instruction sequence where all of the inner instructions required
heavy expansions into multiple BPF instructions. Additionally, I also
had BPF hardening enabled which requires once more rewrites of all
constant values in order to blind them. Each time we rewrite insns,
bpf_adj_branches() would need to potentially adjust branch targets
which cross the patchlet boundary to accommodate for the additional
delta. Eventually that lead to the case where the target offset could
not fit into insn->off's upper 0x7fff limit anymore where then offset
wraps around becoming negative (in s16 universe), or vice versa
depending on the jump direction.
Therefore it becomes necessary to detect and reject any such occasions
in a generic way for native eBPF and cBPF to eBPF migrations. For
the latter we can simply check bounds in the bpf_convert_filter()'s
BPF_EMIT_JMP helper macro and bail out once we surpass limits. The
bpf_patch_insn_single() for native eBPF (and cBPF to eBPF in case
of subsequent hardening) is a bit more complex in that we need to
detect such truncations before hitting the bpf_prog_realloc(). Thus
the latter is split into an extra pass to probe problematic offsets
on the original program in order to fail early. With that in place
and carefully tested I no longer hit the panic and the rewrites are
rejected properly. The above example panic I've seen on bpf-next,
though the issue itself is generic in that a guard against this issue
in bpf seems more appropriate in this case.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
"Two k10temp fixes:
- fix race condition when accessing System Management Network
registers
- fix reading critical temperatures on F15h M60h and M70h
Also add PCI ID's for the AMD Raven Ridge root bridge"
* tag 'hwmon-for-linus-v4.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (k10temp) Use API function to access System Management Network
x86/amd_nb: Add support for Raven Ridge CPUs
hwmon: (k10temp) Fix reading critical temperature register
|
|
In the sockmap design BPF programs (SK_SKB_STREAM_PARSER,
SK_SKB_STREAM_VERDICT and SK_MSG_VERDICT) are attached to the sockmap
map type and when a sock is added to the map the programs are used by
the socket. However, sockmap updates from both userspace and BPF
programs can happen concurrently with the attach and detach of these
programs.
To resolve this we use the bpf_prog_inc_not_zero and a READ_ONCE()
primitive to ensure the program pointer is not refeched and
possibly NULL'd before the refcnt increment. This happens inside
a RCU critical section so although the pointer reference in the map
object may be NULL (by a concurrent detach operation) the reference
from READ_ONCE will not be free'd until after grace period. This
ensures the object returned by READ_ONCE() is valid through the
RCU criticl section and safe to use as long as we "know" it may
be free'd shortly.
Daniel spotted a case in the sock update API where instead of using
the READ_ONCE() program reference we used the pointer from the
original map, stab->bpf_{verdict|parse|txmsg}. The problem with this
is the logic checks the object returned from the READ_ONCE() is not
NULL and then tries to reference the object again but using the
above map pointer, which may have already been NULL'd by a parallel
detach operation. If this happened bpf_porg_inc_not_zero could
dereference a NULL pointer.
Fix this by using variable returned by READ_ONCE() that is checked
for NULL.
Fixes: 2f857d04601a ("bpf: sockmap, remove STRPARSER map_flags and add multi-map support")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
If the user were to only attach one of the parse or verdict programs
then it is possible a subsequent sockmap update could incorrectly
decrement the refcnt on the program. This happens because in the
rollback logic, after an error, we have to decrement the program
reference count when its been incremented. However, we only increment
the program reference count if the user has both a verdict and a
parse program. The reason for this is because, at least at the
moment, both are required for any one to be meaningful. The problem
fixed here is in the rollback path we decrement the program refcnt
even if only one existing. But we never incremented the refcnt in
the first place creating an imbalance.
This patch fixes the error path to handle this case.
Fixes: 2f857d04601a ("bpf: sockmap, remove STRPARSER map_flags and add multi-map support")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Device features may change during transmission. In particular with
corking, a device may toggle scatter-gather in between allocating
and writing to an skb.
Do not unconditionally assume that !NETIF_F_SG at write time implies
that the same held at alloc time and thus the skb has sufficient
tailroom.
This issue predates git history.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Petr Machata says:
====================
net: ip6_gre: Fixes in headroom handling
This series mends some problems in headroom management in ip6_gre
module. The current code base has the following three closely-related
problems:
- ip6gretap tunnels neglect to ensure there's enough writable headroom
before pushing GRE headers.
- ip6erspan does this, but assumes that dev->needed_headroom is primed.
But that doesn't happen until ip6_tnl_xmit() is called later. Thus for
the first packet, ip6erspan actually behaves like ip6gretap above.
- ip6erspan shares some of the code with ip6gretap, including
calculations of needed header length. While there is custom
ERSPAN-specific code for calculating the headroom, the computed
values are overwritten by the ip6gretap code.
The first two issues lead to a kernel panic in situations where a packet
is mirrored from a veth device to the device in question. They are
fixed, respectively, in patches #1 and #2, which include the full panic
trace and a reproducer.
The rest of the patchset deals with the last issue. In patches #3 to #6,
several functions are split up into reusable parts. Finally in patch #7
these blocks are used to compose ERSPAN-specific callbacks where
necessary to fix the hlen calculation.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Even though ip6erspan_tap_init() sets up hlen and tun_hlen according to
what ERSPAN needs, it goes ahead to call ip6gre_tnl_link_config() which
overwrites these settings with GRE-specific ones.
Similarly for changelink callbacks, which are handled by
ip6gre_changelink() calls ip6gre_tnl_change() calls
ip6gre_tnl_link_config() as well.
The difference ends up being 12 vs. 20 bytes, and this is generally not
a problem, because a 12-byte request likely ends up allocating more and
the extra 8 bytes are thus available. However correct it is not.
So replace the newlink and changelink callbacks with an ERSPAN-specific
ones, reusing the newly-introduced _common() functions.
Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Extract from ip6gre_changelink() a reusable function
ip6gre_changelink_common(). This will allow introduction of
ERSPAN-specific _changelink() function with not a lot of code
duplication.
Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Extract from ip6gre_newlink() a reusable function
ip6gre_newlink_common(). The ip6gre_tnl_link_config() call needs to be
made customizable for ERSPAN, thus reorder it with calls to
ip6_tnl_change_mtu() and dev_hold(), and extract the whole tail to the
caller, ip6gre_newlink(). Thus enable an ERSPAN-specific _newlink()
function without a lot of duplicity.
Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Split a reusable function ip6gre_tnl_copy_tnl_parm() from
ip6gre_tnl_change(). This will allow ERSPAN-specific code to
reuse the common parts while customizing the behavior for ERSPAN.
Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The function ip6gre_tnl_link_config() is used for setting up
configuration of both ip6gretap and ip6erspan tunnels. Split the
function into the common part and the route-lookup part. The latter then
takes the calculated header length as an argument. This split will allow
the patches down the line to sneak in a custom header length computation
for the ERSPAN tunnel.
Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
dev->needed_headroom is not primed until ip6_tnl_xmit(), so it starts
out zero. Thus the call to skb_cow_head() fails to actually make sure
there's enough headroom to push the ERSPAN headers to. That can lead to
the panic cited below. (Reproducer below that).
Fix by requesting either needed_headroom if already primed, or just the
bare minimum needed for the header otherwise.
[ 190.703567] kernel BUG at net/core/skbuff.c:104!
[ 190.708384] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
[ 190.714007] Modules linked in: act_mirred cls_matchall ip6_gre ip6_tunnel tunnel6 gre sch_ingress vrf veth x86_pkg_temp_thermal mlx_platform nfsd e1000e leds_mlxcpld
[ 190.728975] CPU: 1 PID: 959 Comm: kworker/1:2 Not tainted 4.17.0-rc4-net_master-custom-139 #10
[ 190.737647] Hardware name: Mellanox Technologies Ltd. "MSN2410-CB2F"/"SA000874", BIOS 4.6.5 03/08/2016
[ 190.747006] Workqueue: ipv6_addrconf addrconf_dad_work
[ 190.752222] RIP: 0010:skb_panic+0xc3/0x100
[ 190.756358] RSP: 0018:ffff8801d54072f0 EFLAGS: 00010282
[ 190.761629] RAX: 0000000000000085 RBX: ffff8801c1a8ecc0 RCX: 0000000000000000
[ 190.768830] RDX: 0000000000000085 RSI: dffffc0000000000 RDI: ffffed003aa80e54
[ 190.776025] RBP: ffff8801bd1ec5a0 R08: ffffed003aabce19 R09: ffffed003aabce19
[ 190.783226] R10: 0000000000000001 R11: ffffed003aabce18 R12: ffff8801bf695dbe
[ 190.790418] R13: 0000000000000084 R14: 00000000000006c0 R15: ffff8801bf695dc8
[ 190.797621] FS: 0000000000000000(0000) GS:ffff8801d5400000(0000) knlGS:0000000000000000
[ 190.805786] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 190.811582] CR2: 000055fa929aced0 CR3: 0000000003228004 CR4: 00000000001606e0
[ 190.818790] Call Trace:
[ 190.821264] <IRQ>
[ 190.823314] ? ip6erspan_tunnel_xmit+0x5e4/0x1982 [ip6_gre]
[ 190.828940] ? ip6erspan_tunnel_xmit+0x5e4/0x1982 [ip6_gre]
[ 190.834562] skb_push+0x78/0x90
[ 190.837749] ip6erspan_tunnel_xmit+0x5e4/0x1982 [ip6_gre]
[ 190.843219] ? ip6gre_tunnel_ioctl+0xd90/0xd90 [ip6_gre]
[ 190.848577] ? debug_check_no_locks_freed+0x210/0x210
[ 190.853679] ? debug_check_no_locks_freed+0x210/0x210
[ 190.858783] ? print_irqtrace_events+0x120/0x120
[ 190.863451] ? sched_clock_cpu+0x18/0x210
[ 190.867496] ? cyc2ns_read_end+0x10/0x10
[ 190.871474] ? skb_network_protocol+0x76/0x200
[ 190.875977] dev_hard_start_xmit+0x137/0x770
[ 190.880317] ? do_raw_spin_trylock+0x6d/0xa0
[ 190.884624] sch_direct_xmit+0x2ef/0x5d0
[ 190.888589] ? pfifo_fast_dequeue+0x3fa/0x670
[ 190.892994] ? pfifo_fast_change_tx_queue_len+0x810/0x810
[ 190.898455] ? __lock_is_held+0xa0/0x160
[ 190.902422] __qdisc_run+0x39e/0xfc0
[ 190.906041] ? _raw_spin_unlock+0x29/0x40
[ 190.910090] ? pfifo_fast_enqueue+0x24b/0x3e0
[ 190.914501] ? sch_direct_xmit+0x5d0/0x5d0
[ 190.918658] ? pfifo_fast_dequeue+0x670/0x670
[ 190.923047] ? __dev_queue_xmit+0x172/0x1770
[ 190.927365] ? preempt_count_sub+0xf/0xd0
[ 190.931421] __dev_queue_xmit+0x410/0x1770
[ 190.935553] ? ___slab_alloc+0x605/0x930
[ 190.939524] ? print_irqtrace_events+0x120/0x120
[ 190.944186] ? memcpy+0x34/0x50
[ 190.947364] ? netdev_pick_tx+0x1c0/0x1c0
[ 190.951428] ? __skb_clone+0x2fd/0x3d0
[ 190.955218] ? __copy_skb_header+0x270/0x270
[ 190.959537] ? rcu_read_lock_sched_held+0x93/0xa0
[ 190.964282] ? kmem_cache_alloc+0x344/0x4d0
[ 190.968520] ? cyc2ns_read_end+0x10/0x10
[ 190.972495] ? skb_clone+0x123/0x230
[ 190.976112] ? skb_split+0x820/0x820
[ 190.979747] ? tcf_mirred+0x554/0x930 [act_mirred]
[ 190.984582] tcf_mirred+0x554/0x930 [act_mirred]
[ 190.989252] ? tcf_mirred_act_wants_ingress.part.2+0x10/0x10 [act_mirred]
[ 190.996109] ? __lock_acquire+0x706/0x26e0
[ 191.000239] ? sched_clock_cpu+0x18/0x210
[ 191.004294] tcf_action_exec+0xcf/0x2a0
[ 191.008179] tcf_classify+0xfa/0x340
[ 191.011794] __netif_receive_skb_core+0x8e1/0x1c60
[ 191.016630] ? debug_check_no_locks_freed+0x210/0x210
[ 191.021732] ? nf_ingress+0x500/0x500
[ 191.025458] ? process_backlog+0x347/0x4b0
[ 191.029619] ? print_irqtrace_events+0x120/0x120
[ 191.034302] ? lock_acquire+0xd8/0x320
[ 191.038089] ? process_backlog+0x1b6/0x4b0
[ 191.042246] ? process_backlog+0xc2/0x4b0
[ 191.046303] process_backlog+0xc2/0x4b0
[ 191.050189] net_rx_action+0x5cc/0x980
[ 191.053991] ? napi_complete_done+0x2c0/0x2c0
[ 191.058386] ? mark_lock+0x13d/0xb40
[ 191.062001] ? clockevents_program_event+0x6b/0x1d0
[ 191.066922] ? print_irqtrace_events+0x120/0x120
[ 191.071593] ? __lock_is_held+0xa0/0x160
[ 191.075566] __do_softirq+0x1d4/0x9d2
[ 191.079282] ? ip6_finish_output2+0x524/0x1460
[ 191.083771] do_softirq_own_stack+0x2a/0x40
[ 191.087994] </IRQ>
[ 191.090130] do_softirq.part.13+0x38/0x40
[ 191.094178] __local_bh_enable_ip+0x135/0x190
[ 191.098591] ip6_finish_output2+0x54d/0x1460
[ 191.102916] ? ip6_forward_finish+0x2f0/0x2f0
[ 191.107314] ? ip6_mtu+0x3c/0x2c0
[ 191.110674] ? ip6_finish_output+0x2f8/0x650
[ 191.114992] ? ip6_output+0x12a/0x500
[ 191.118696] ip6_output+0x12a/0x500
[ 191.122223] ? ip6_route_dev_notify+0x5b0/0x5b0
[ 191.126807] ? ip6_finish_output+0x650/0x650
[ 191.131120] ? ip6_fragment+0x1a60/0x1a60
[ 191.135182] ? icmp6_dst_alloc+0x26e/0x470
[ 191.139317] mld_sendpack+0x672/0x830
[ 191.143021] ? igmp6_mcf_seq_next+0x2f0/0x2f0
[ 191.147429] ? __local_bh_enable_ip+0x77/0x190
[ 191.151913] ipv6_mc_dad_complete+0x47/0x90
[ 191.156144] addrconf_dad_completed+0x561/0x720
[ 191.160731] ? addrconf_rs_timer+0x3a0/0x3a0
[ 191.165036] ? mark_held_locks+0xc9/0x140
[ 191.169095] ? __local_bh_enable_ip+0x77/0x190
[ 191.173570] ? addrconf_dad_work+0x50d/0xa20
[ 191.177886] ? addrconf_dad_work+0x529/0xa20
[ 191.182194] addrconf_dad_work+0x529/0xa20
[ 191.186342] ? addrconf_dad_completed+0x720/0x720
[ 191.191088] ? __lock_is_held+0xa0/0x160
[ 191.195059] ? process_one_work+0x45d/0xe20
[ 191.199302] ? process_one_work+0x51e/0xe20
[ 191.203531] ? rcu_read_lock_sched_held+0x93/0xa0
[ 191.208279] process_one_work+0x51e/0xe20
[ 191.212340] ? pwq_dec_nr_in_flight+0x200/0x200
[ 191.216912] ? get_lock_stats+0x4b/0xf0
[ 191.220788] ? preempt_count_sub+0xf/0xd0
[ 191.224844] ? worker_thread+0x219/0x860
[ 191.228823] ? do_raw_spin_trylock+0x6d/0xa0
[ 191.233142] worker_thread+0xeb/0x860
[ 191.236848] ? process_one_work+0xe20/0xe20
[ 191.241095] kthread+0x206/0x300
[ 191.244352] ? process_one_work+0xe20/0xe20
[ 191.248587] ? kthread_stop+0x570/0x570
[ 191.252459] ret_from_fork+0x3a/0x50
[ 191.256082] Code: 14 3e ff 8b 4b 78 55 4d 89 f9 41 56 41 55 48 c7 c7 a0 cf db 82 41 54 44 8b 44 24 2c 48 8b 54 24 30 48 8b 74 24 20 e8 16 94 13 ff <0f> 0b 48 c7 c7 60 8e 1f 85 48 83 c4 20 e8 55 ef a6 ff 89 74 24
[ 191.275327] RIP: skb_panic+0xc3/0x100 RSP: ffff8801d54072f0
[ 191.281024] ---[ end trace 7ea51094e099e006 ]---
[ 191.285724] Kernel panic - not syncing: Fatal exception in interrupt
[ 191.292168] Kernel Offset: disabled
[ 191.295697] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
Reproducer:
ip link add h1 type veth peer name swp1
ip link add h3 type veth peer name swp3
ip link set dev h1 up
ip address add 192.0.2.1/28 dev h1
ip link add dev vh3 type vrf table 20
ip link set dev h3 master vh3
ip link set dev vh3 up
ip link set dev h3 up
ip link set dev swp3 up
ip address add dev swp3 2001:db8:2::1/64
ip link set dev swp1 up
tc qdisc add dev swp1 clsact
ip link add name gt6 type ip6erspan \
local 2001:db8:2::1 remote 2001:db8:2::2 oseq okey 123
ip link set dev gt6 up
sleep 1
tc filter add dev swp1 ingress pref 1000 matchall skip_hw \
action mirred egress mirror dev gt6
ping -I h1 192.0.2.2
Fixes: e41c7c68ea77 ("ip6erspan: make sure enough headroom at xmit.")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
__gre6_xmit() pushes GRE headers before handing over to ip6_tnl_xmit()
for generic IP-in-IP processing. However it doesn't make sure that there
is enough headroom to push the header to. That can lead to the panic
cited below. (Reproducer below that).
Fix by requesting either needed_headroom if already primed, or just the
bare minimum needed for the header otherwise.
[ 158.576725] kernel BUG at net/core/skbuff.c:104!
[ 158.581510] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
[ 158.587174] Modules linked in: act_mirred cls_matchall ip6_gre ip6_tunnel tunnel6 gre sch_ingress vrf veth x86_pkg_temp_thermal mlx_platform nfsd e1000e leds_mlxcpld
[ 158.602268] CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 4.17.0-rc4-net_master-custom-139 #10
[ 158.610938] Hardware name: Mellanox Technologies Ltd. "MSN2410-CB2F"/"SA000874", BIOS 4.6.5 03/08/2016
[ 158.620426] RIP: 0010:skb_panic+0xc3/0x100
[ 158.624586] RSP: 0018:ffff8801d3f27110 EFLAGS: 00010286
[ 158.629882] RAX: 0000000000000082 RBX: ffff8801c02cc040 RCX: 0000000000000000
[ 158.637127] RDX: 0000000000000082 RSI: dffffc0000000000 RDI: ffffed003a7e4e18
[ 158.644366] RBP: ffff8801bfec8020 R08: ffffed003aabce19 R09: ffffed003aabce19
[ 158.651574] R10: 000000000000000b R11: ffffed003aabce18 R12: ffff8801c364de66
[ 158.658786] R13: 000000000000002c R14: 00000000000000c0 R15: ffff8801c364de68
[ 158.666007] FS: 0000000000000000(0000) GS:ffff8801d5400000(0000) knlGS:0000000000000000
[ 158.674212] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 158.680036] CR2: 00007f4b3702dcd0 CR3: 0000000003228002 CR4: 00000000001606e0
[ 158.687228] Call Trace:
[ 158.689752] ? __gre6_xmit+0x246/0xd80 [ip6_gre]
[ 158.694475] ? __gre6_xmit+0x246/0xd80 [ip6_gre]
[ 158.699141] skb_push+0x78/0x90
[ 158.702344] __gre6_xmit+0x246/0xd80 [ip6_gre]
[ 158.706872] ip6gre_tunnel_xmit+0x3bc/0x610 [ip6_gre]
[ 158.711992] ? __gre6_xmit+0xd80/0xd80 [ip6_gre]
[ 158.716668] ? debug_check_no_locks_freed+0x210/0x210
[ 158.721761] ? print_irqtrace_events+0x120/0x120
[ 158.726461] ? sched_clock_cpu+0x18/0x210
[ 158.730572] ? sched_clock_cpu+0x18/0x210
[ 158.734692] ? cyc2ns_read_end+0x10/0x10
[ 158.738705] ? skb_network_protocol+0x76/0x200
[ 158.743216] ? netif_skb_features+0x1b2/0x550
[ 158.747648] dev_hard_start_xmit+0x137/0x770
[ 158.752010] sch_direct_xmit+0x2ef/0x5d0
[ 158.755992] ? pfifo_fast_dequeue+0x3fa/0x670
[ 158.760460] ? pfifo_fast_change_tx_queue_len+0x810/0x810
[ 158.765975] ? __lock_is_held+0xa0/0x160
[ 158.770002] __qdisc_run+0x39e/0xfc0
[ 158.773673] ? _raw_spin_unlock+0x29/0x40
[ 158.777781] ? pfifo_fast_enqueue+0x24b/0x3e0
[ 158.782191] ? sch_direct_xmit+0x5d0/0x5d0
[ 158.786372] ? pfifo_fast_dequeue+0x670/0x670
[ 158.790818] ? __dev_queue_xmit+0x172/0x1770
[ 158.795195] ? preempt_count_sub+0xf/0xd0
[ 158.799313] __dev_queue_xmit+0x410/0x1770
[ 158.803512] ? ___slab_alloc+0x605/0x930
[ 158.807525] ? ___slab_alloc+0x605/0x930
[ 158.811540] ? memcpy+0x34/0x50
[ 158.814768] ? netdev_pick_tx+0x1c0/0x1c0
[ 158.818895] ? __skb_clone+0x2fd/0x3d0
[ 158.822712] ? __copy_skb_header+0x270/0x270
[ 158.827079] ? rcu_read_lock_sched_held+0x93/0xa0
[ 158.831903] ? kmem_cache_alloc+0x344/0x4d0
[ 158.836199] ? skb_clone+0x123/0x230
[ 158.839869] ? skb_split+0x820/0x820
[ 158.843521] ? tcf_mirred+0x554/0x930 [act_mirred]
[ 158.848407] tcf_mirred+0x554/0x930 [act_mirred]
[ 158.853104] ? tcf_mirred_act_wants_ingress.part.2+0x10/0x10 [act_mirred]
[ 158.860005] ? __lock_acquire+0x706/0x26e0
[ 158.864162] ? mark_lock+0x13d/0xb40
[ 158.867832] tcf_action_exec+0xcf/0x2a0
[ 158.871736] tcf_classify+0xfa/0x340
[ 158.875402] __netif_receive_skb_core+0x8e1/0x1c60
[ 158.880334] ? nf_ingress+0x500/0x500
[ 158.884059] ? process_backlog+0x347/0x4b0
[ 158.888241] ? lock_acquire+0xd8/0x320
[ 158.892050] ? process_backlog+0x1b6/0x4b0
[ 158.896228] ? process_backlog+0xc2/0x4b0
[ 158.900291] process_backlog+0xc2/0x4b0
[ 158.904210] net_rx_action+0x5cc/0x980
[ 158.908047] ? napi_complete_done+0x2c0/0x2c0
[ 158.912525] ? rcu_read_unlock+0x80/0x80
[ 158.916534] ? __lock_is_held+0x34/0x160
[ 158.920541] __do_softirq+0x1d4/0x9d2
[ 158.924308] ? trace_event_raw_event_irq_handler_exit+0x140/0x140
[ 158.930515] run_ksoftirqd+0x1d/0x40
[ 158.934152] smpboot_thread_fn+0x32b/0x690
[ 158.938299] ? sort_range+0x20/0x20
[ 158.941842] ? preempt_count_sub+0xf/0xd0
[ 158.945940] ? schedule+0x5b/0x140
[ 158.949412] kthread+0x206/0x300
[ 158.952689] ? sort_range+0x20/0x20
[ 158.956249] ? kthread_stop+0x570/0x570
[ 158.960164] ret_from_fork+0x3a/0x50
[ 158.963823] Code: 14 3e ff 8b 4b 78 55 4d 89 f9 41 56 41 55 48 c7 c7 a0 cf db 82 41 54 44 8b 44 24 2c 48 8b 54 24 30 48 8b 74 24 20 e8 16 94 13 ff <0f> 0b 48 c7 c7 60 8e 1f 85 48 83 c4 20 e8 55 ef a6 ff 89 74 24
[ 158.983235] RIP: skb_panic+0xc3/0x100 RSP: ffff8801d3f27110
[ 158.988935] ---[ end trace 5af56ee845aa6cc8 ]---
[ 158.993641] Kernel panic - not syncing: Fatal exception in interrupt
[ 159.000176] Kernel Offset: disabled
[ 159.003767] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
Reproducer:
ip link add h1 type veth peer name swp1
ip link add h3 type veth peer name swp3
ip link set dev h1 up
ip address add 192.0.2.1/28 dev h1
ip link add dev vh3 type vrf table 20
ip link set dev h3 master vh3
ip link set dev vh3 up
ip link set dev h3 up
ip link set dev swp3 up
ip address add dev swp3 2001:db8:2::1/64
ip link set dev swp1 up
tc qdisc add dev swp1 clsact
ip link add name gt6 type ip6gretap \
local 2001:db8:2::1 remote 2001:db8:2::2
ip link set dev gt6 up
sleep 1
tc filter add dev swp1 ingress pref 1000 matchall skip_hw \
action mirred egress mirror dev gt6
ping -I h1 192.0.2.2
Fixes: c12b395a4664 ("gre: Support GRE over IPv6")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 0a6748740368 ("selftests/bpf: Only run tests if !bpf_disabled")
forgot to check return value of fopen.
This caused some confusion, when running test_verifier (from
tools/testing/selftests/bpf/) on an older kernel (< v4.4) as it will
simply seqfault.
This fix avoids the segfault and prints an error, but allow program to
continue. Given the sysctl was introduced in 1be7f75d1668 ("bpf:
enable non-root eBPF programs"), we know that the running kernel
cannot support unpriv, thus continue with unpriv_disabled = true.
Fixes: 0a6748740368 ("selftests/bpf: Only run tests if !bpf_disabled")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
When perf data is recorded with the call-graph option enabled, the
callchain shown by perf script shows the binary offsets of the symbols
as the ip. This is incorrect for kernel symbols as the ip values are
always off by a fixed offset depending on the architecture. If the
offsets from the start of the symbols are printed, they are also
incorrect for both kernel and userspace symbols.
Without the call-graph option, the callchain shows the virtual addresses
of the symbols rather than their binary offsets. The offsets printed in
this case are also correct.
This fixes the inconsistency in perf script's output.
This can be verified on a powerpc64le system running Fedora 27 as
follows:
# cat /proc/kallsyms | grep sys_write
...
c0000000004025a0 T sys_write
c0000000004025a0 T __se_sys_write
...
# perf probe -a sys_write
Before applying this patch:
# perf record -e probe:sys_write -g ~/test
# perf script -F ip,sym,symoff
4125b0 sys_write+0x8000000000008010
1b9e0 system_call+0x8000000000008058
118234 __GI___libc_write+0xffff0000f52c0024
92c74 _IO_file_write@@GLIBC_2.17+0xffff0000f52c0044
5afbfd8a [unknown]
91a60 new_do_write+0xffff0000f52c0090
94638 _IO_do_write@@GLIBC_2.17+0xffff0000f52c0038
94bbc _IO_file_overflow@@GLIBC_2.17+0xffff0000f52c014c
95a24 __overflow+0xffff0000f52c0064
84548 _IO_puts+0xffff0000f52c0218
440 main+0xffffffffe0000020
236a0 generic_start_main.isra.0+0xffff0000f52c0140
23898 __libc_start_main+0xffff0000f52c00b8
0 [unknown]
...
# perf record -e probe:sys_write ~/test
# perf script -F ip,sym,symoff
c0000000004025b0 sys_write+0x10
...
After applying this patch:
# perf record -e probe:sys_write -g ~/test
# perf script -F ip,sym,symoff
c0000000004025b0 sys_write+0x10
c00000000000b9e0 system_call+0x58
7fffb70d8234 __GI___libc_write+0x24
7fffb7052c74 _IO_file_write@@GLIBC_2.17+0x44
5afc1818 [unknown]
7fffb7051a60 new_do_write+0x90
7fffb7054638 _IO_do_write@@GLIBC_2.17+0x38
7fffb7054bbc _IO_file_overflow@@GLIBC_2.17+0x14c
7fffb7055a24 __overflow+0x64
7fffb7044548 _IO_puts+0x218
10000440 main+0x20
7fffb6fe36a0 generic_start_main.isra.0+0x140
7fffb6fe3898 __libc_start_main+0xb8
0 [unknown]
...
# perf record -e probe:sys_write ~/test
# perf script -F ip,sym,symoff
c0000000004025b0 sys_write+0x10
...
Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lkml.kernel.org/r/20180517063326.6319-1-sandipan@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
ERSPAN only support version 1 and 2. When packets send to an
erspan device which does not have proper version number set,
drop the packet. In real case, we observe multicast packets
sent to the erspan pernet device, erspan0, which does not have
erspan version configured.
Reported-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Let tools that need to have those variables with the sysctl current
values use a function that will read them.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-1ljj3oeo5kpt2n1icfd9vowe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
It is not read as commonly as 'page_size', so it makes sense to read it
lazily, caching its value when it is first read.
Less files open unconditionally at startup.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-35xhrq91u94uc1djtclek1ie@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Rick bisected a regression on large systems which use the x2apic cluster
mode for interrupt delivery to the commit wich reworked the cluster
management.
The problem is caused by a missing initialization of the clusterid field
in the shared cluster data structures. So all structures end up with
cluster ID 0 which only allows sharing between all CPUs which belong to
cluster 0. All other CPUs with a cluster ID > 0 cannot share the data
structure because they cannot find existing data with their cluster
ID. This causes malfunction with IPIs because IPIs are sent to the wrong
cluster and the caller waits for ever that the target CPU handles the IPI.
Add the missing initialization when a upcoming CPU is the first in a
cluster so that the later booting CPUs can find the data and share it for
proper operation.
Fixes: 023a611748fd ("x86/apic/x2apic: Simplify cluster management")
Reported-by: Rick Warner <rick@microway.com>
Bisected-by: Rick Warner <rick@microway.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Rick Warner <rick@microway.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1805171418210.1947@nanos.tec.linutronix.de
|
|
Thomas Falcon says:
====================
ibmvnic: Fix bugs and memory leaks
This is a small patch series fixing up some bugs and memory leaks
in the ibmvnic driver. The first fix frees up previously allocated
memory that should be freed in case of an error. The second fixes
a reset case that was failing due to TX/RX queue IRQ's being
erroneously disabled without being enabled again. The final patch
fixes incorrect reallocated of statistics buffers during a device
reset, resulting in loss of statistics information and a memory leak.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Move initialization of statistics buffers from ibmvnic_init function
into ibmvnic_probe. In the current state, ibmvnic_init will be called
again during a device reset, resulting in the allocation of new
buffers without freeing the old ones.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It is not necessary to disable interrupt lines here during a reset
to handle a non-fatal firmware error. Move that call within the code
block that handles the other cases that do require interrupts to be
disabled and re-enabled.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If the firmware map fails for whatever reason, remember to free
up the memory after.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Updating the FIB tracepoint for the recent change to allow rules using
the protocol and ports exposed a few places where the entries in the flow
struct are not initialized.
For __fib_validate_source add the call to fib4_rules_early_flow_dissect
since it is invoked for the input path. For netfilter, add the memset on
the flow struct to avoid future problems like this. In ip_route_input_slow
need to set the fields if the skb dissection does not happen.
Fixes: bfff4862653b ("net: fib_rules: support for match on ip_proto, sport and dport")
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
scatterlist code expects virt_to_page() to work, which fails with
CONFIG_VMAP_STACK=y.
Fixes: c46234ebb4d1e ("tls: RX path for ktls")
Signed-off-by: Matt Mullins <mmullins@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Adopt it from the kernel sources, will be used soon.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-oubheiqj8edo5rzewt11cbn0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Not anymore accessed outside this library, keep it private.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-wg1m07flfrg1rm06jjzie8si@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
That takes care of using the right call to get the tracing_path
directory, the one that will end up calling tracing_path_set() to figure
out where tracefs is mounted.
One more step in doing just lazy reading of system structures to reduce
the number of operations done unconditionaly at 'perf' start.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-42zzi0f274909bg9mxzl81bu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Instead of accessing the trace_events_path variable directly, that may
not have been properly initialized wrt detecting where tracefs is
mounted.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-id7hzn1ydgkxbumeve5wapqz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When using for_each_event() we needlessly rebuild the whole path to
the tracepoint directory, reuse the dir_path instead, saving some cycles
and reducing the size of the next patch.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-54bcs15n0cp6gwcgpc4hptyc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Pull kvm fixes from Paolo Bonzini:
- ARM/ARM64 locking fixes
- x86 fixes: PCID, UMIP, locking
- improved support for recent Windows version that have a 2048 Hz APIC
timer
- rename KVM_HINTS_DEDICATED CPUID bit to KVM_HINTS_REALTIME
- better behaved selftests
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: rename KVM_HINTS_DEDICATED to KVM_HINTS_REALTIME
KVM: arm/arm64: VGIC/ITS save/restore: protect kvm_read_guest() calls
KVM: arm/arm64: VGIC/ITS: protect kvm_read_guest() calls with SRCU lock
KVM: arm/arm64: VGIC/ITS: Promote irq_lock() in update_affinity
KVM: arm/arm64: Properly protect VGIC locks from IRQs
KVM: X86: Lower the default timer frequency limit to 200us
KVM: vmx: update sec exec controls for UMIP iff emulating UMIP
kvm: x86: Suppress CR3_PCID_INVD bit only when PCIDs are enabled
KVM: selftests: exit with 0 status code when tests cannot be run
KVM: hyperv: idr_find needs RCU protection
x86: Delay skip of emulated hypercall instruction
KVM: Extend MAX_IRQ_ROUTES to 4096 for all archs
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-master
KVM: s390: Fix vsie handling for transactional diagnostic block
vsie (nested KVM) might reject a valid input. Fix it.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"We have a core fix in the compat code for covering a potential race
(double references), but it's a very minor change.
The rest are all small device-specific quirks, as well as a correction
of the new UAC3 support code"
* tag 'sound-4.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: usb-audio: Use Class Specific EP for UAC3 devices.
ALSA: hda/realtek - Clevo P950ER ALC1220 Fixup
ALSA: usb: mixer: volume quirk for CM102-A+/102S+
ALSA: hda: Add Lenovo C50 All in one to the power_save blacklist
ALSA: control: fix a redundant-copy issue
|
|
KVM_HINTS_DEDICATED seems to be somewhat confusing:
Guest doesn't really care whether it's the only task running on a host
CPU as long as it's not preempted.
And there are more reasons for Guest to be preempted than host CPU
sharing, for example, with memory overcommit it can get preempted on a
memory access, post copy migration can cause preemption, etc.
Let's call it KVM_HINTS_REALTIME which seems to better
match what guests expect.
Also, the flag most be set on all vCPUs - current guests assume this.
Note so in the documentation.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|