Age | Commit message (Collapse) | Author |
|
We must zero struct pages for memory that is not backed by physical
memory, or kernel does not have access to.
Recently, there was a change which zeroed all memmap for all holes in
e820. Unfortunately, it introduced a bug that is discussed here:
https://www.spinics.net/lists/linux-mm/msg156764.html
Linus, also saw this bug on his machine, and confirmed that reverting
commit 124049decbb1 ("x86/e820: put !E820_TYPE_RAM regions into
memblock.reserved") fixes the issue.
The problem is that we incorrectly zero some struct pages after they
were setup.
The fix is to zero unavailable struct pages prior to initializing of
struct pages.
A more detailed fix should come later that would avoid double zeroing
cases: one in __init_single_page(), the other one in
zero_resv_unavail().
Fixes: 124049decbb1 ("x86/e820: put !E820_TYPE_RAM regions into memblock.reserved")
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This patch augments the output of bpftool's map dump and map lookup
commands to print data along side btf info, if the correspondin btf
info is available. The outputs for each of map dump and map lookup
commands are augmented in two ways:
1. when neither of -j and -p are supplied, btf-ful map data is printed
whose aim is human readability. This means no commitments for json- or
backward- compatibility.
2. when either -j or -p are supplied, a new json object named
"formatted" is added for each key-value pair. This object contains the
same data as the key-value pair, but with btf info. "formatted" object
promises json- and backward- compatibility. Below is a sample output.
$ bpftool map dump -p id 8
[{
"key": ["0x0f","0x00","0x00","0x00"
],
"value": ["0x03", "0x00", "0x00", "0x00", ...
],
"formatted": {
"key": 15,
"value": {
"int_field": 3,
...
}
}
}
]
This patch calls btf_dumper introduced in previous patch to accomplish
the above. Indeed, btf-ful info is only displayed if btf data for the
given map is available. Otherwise existing output is displayed as-is.
Signed-off-by: Okash Khawaja <osk@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
This consumes functionality exported in the previous patch. It does the
main job of printing with BTF data. This is used in the following patch
to provide a more readable output of a map's dump. It relies on
json_writer to do json printing. Below is sample output where map keys
are ints and values are of type struct A:
typedef int int_type;
enum E {
E0,
E1,
};
struct B {
int x;
int y;
};
struct A {
int m;
unsigned long long n;
char o;
int p[8];
int q[4][8];
enum E r;
void *s;
struct B t;
const int u;
int_type v;
unsigned int w1: 3;
unsigned int w2: 3;
};
$ sudo bpftool map dump id 14
[{
"key": 0,
"value": {
"m": 1,
"n": 2,
"o": "c",
"p": [15,16,17,18,15,16,17,18
],
"q": [[25,26,27,28,25,26,27,28
],[35,36,37,38,35,36,37,38
],[45,46,47,48,45,46,47,48
],[55,56,57,58,55,56,57,58
]
],
"r": 1,
"s": 0x7ffd80531cf8,
"t": {
"x": 5,
"y": 10
},
"u": 100,
"v": 20,
"w1": 0x7,
"w2": 0x3
}
}
]
This patch uses json's {} and [] to imply struct/union and array. More
explicit information can be added later. For example, a command line
option can be introduced to print whether a key or value is struct
or union, name of a struct etc. This will however come at the expense
of duplicating info when, for example, printing an array of structs.
enums are printed as ints without their names.
Signed-off-by: Okash Khawaja <osk@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
This patch introduces btf__resolve_type() function and exports two
existing functions from libbpf. btf__resolve_type follows modifier
types like const and typedef until it hits a type which actually takes
up memory, and then returns it. This function follows similar pattern
to btf__resolve_size but instead of computing size, it just returns
the type.
These functions will be used in the followig patch which parses
information inside array of `struct btf_type *`. btf_name_by_offset is
used for printing variable names.
Signed-off-by: Okash Khawaja <osk@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
platform_get_resource() may fail and return NULL, so we should
better check it's return value to avoid a NULL pointer dereference
a bit later in the code.
This is detected by Coccinelle semantic patch.
@@
expression pdev, res, n, t, e, e1, e2;
@@
res = platform_get_resource(pdev, t, n);
+ if (!res)
+ return -EINVAL;
... when != res == NULL
e = devm_ioremap_nocache(e1, res->start, e2);
Fixes: cc4fa83f66e9 ("pinctrl: nsp: add pinmux driver support for Broadcom NSP SoC")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
The > comparisons should be >= or else we read beyond the end of the
pinctrl->functions[] array.
Fixes: cc4fa83f66e9 ("pinctrl: nsp: add pinmux driver support for Broadcom NSP SoC")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
The datasheet does not document any registers to control drive strength,
and no drive strength registers are for this reason described for this
SoC. The flags indicating that drive strength can be controlled are
however set for some pins in the driver.
This leads to a NULL pointer dereference when the sh-pfc core tries to
access the struct describing the drive strength registers, for example
when reading the sysfs file pinconf-pins.
Fix this by removing the SH_PFC_PIN_CFG_DRIVE_STRENGTH from all pins.
Fixes: b92ac66a1819602b ("pinctrl: sh-pfc: Add R8A77970 PFC support")
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
The .gpio_set_direction() callback was setting inverted direction
for SoCs older than the JZ4770, this restores the correct behaviour.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
When we are explicitly using GPIO hogging mechanism in the pinctrl node,
such as:
&pio {
line_input {
gpio-hog;
gpios = <95 0>, <96 0>, <97 0>;
input;
};
};
A kernel panic happens at dereferencing a NULL pointer: In this case, the
drvdata is still not setup properly yet when it is being accessed.
A better solution for fixing up this issue should be we should obtain the
private data from struct gpio_chip using a specific gpiochip_get_data
instead of a generic dev_get_drvdata.
[ 0.249424] Unable to handle kernel NULL pointer dereference at virtual
address 000000c8
[ 0.257818] Mem abort info:
[ 0.260704] ESR = 0x96000005
[ 0.263869] Exception class = DABT (current EL), IL = 32 bits
[ 0.270011] SET = 0, FnV = 0
[ 0.273167] EA = 0, S1PTW = 0
[ 0.276421] Data abort info:
[ 0.279398] ISV = 0, ISS = 0x00000005
[ 0.283372] CM = 0, WnR = 0
[ 0.286440] [00000000000000c8] user address but active_mm is swapper
[ 0.293027] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[ 0.298795] Modules linked in:
[ 0.301958] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.16.0-rc1+ #389
[ 0.308716] Hardware name: MediaTek MT7622 RFB1 board (DT)
[ 0.314396] pstate: 80000005 (Nzcv daif -PAN -UAO)
[ 0.319362] pc : mtk_hw_pin_field_get+0x28/0x118
[ 0.324140] lr : mtk_hw_set_value+0x30/0x104
[ 0.328557] sp : ffffff800801b6d0
[ 0.331983] x29: ffffff800801b6d0 x28: ffffff80086b7970
[ 0.337484] x27: 0000000000000000 x26: ffffff80087b8000
[ 0.342986] x25: 0000000000000000 x24: ffffffc00324c230
[ 0.348487] x23: 0000000000000003 x22: 0000000000000000
[ 0.353988] x21: ffffff80087b8000 x20: 0000000000000000
[ 0.359489] x19: 0000000000000054 x18: 00000000fffff7c0
[ 0.364990] x17: 0000000000006300 x16: 000000000000003f
[ 0.370492] x15: 000000000000000e x14: ffffffffffffffff
[ 0.375993] x13: 0000000000000000 x12: 0000000000000020
[ 0.381494] x11: 0000000000000006 x10: 0101010101010101
[ 0.386995] x9 : fffffffffffffffa x8 : 0000000000000007
[ 0.392496] x7 : ffffff80085d63f8 x6 : 0000000000000003
[ 0.397997] x5 : 0000000000000054 x4 : ffffffc0031eb800
[ 0.403499] x3 : ffffff800801b728 x2 : 0000000000000003
[ 0.409000] x1 : 0000000000000054 x0 : 0000000000000000
[ 0.414502] Process swapper/0 (pid: 1, stack limit = 0x000000002a913c1c)
[ 0.421441] Call trace:
[ 0.423968] mtk_hw_pin_field_get+0x28/0x118
[ 0.428387] mtk_hw_set_value+0x30/0x104
[ 0.432445] mtk_gpio_set+0x20/0x28
[ 0.436052] mtk_gpio_direction_output+0x18/0x30
[ 0.440833] gpiod_direction_output_raw_commit+0x7c/0xa0
[ 0.446333] gpiod_direction_output+0x104/0x114
[ 0.451022] gpiod_configure_flags+0xbc/0xfc
[ 0.455441] gpiod_hog+0x8c/0x140
[ 0.458869] of_gpiochip_add+0x27c/0x2d4
[ 0.462928] gpiochip_add_data_with_key+0x338/0x5f0
[ 0.467976] mtk_pinctrl_probe+0x388/0x400
[ 0.472217] platform_drv_probe+0x58/0xa4
[ 0.476365] driver_probe_device+0x204/0x44c
[ 0.480783] __device_attach_driver+0xac/0x108
[ 0.485384] bus_for_each_drv+0x7c/0xac
[ 0.489352] __device_attach+0xa0/0x144
[ 0.493320] device_initial_probe+0x10/0x18
[ 0.497647] bus_probe_device+0x2c/0x8c
[ 0.501616] device_add+0x2f8/0x540
[ 0.505226] of_device_add+0x3c/0x44
[ 0.508925] of_platform_device_create_pdata+0x80/0xb8
[ 0.514245] of_platform_bus_create+0x290/0x3e8
[ 0.518933] of_platform_populate+0x78/0x100
[ 0.523352] of_platform_default_populate+0x24/0x2c
[ 0.528403] of_platform_default_populate_init+0x94/0xa4
[ 0.533903] do_one_initcall+0x98/0x130
[ 0.537874] kernel_init_freeable+0x13c/0x1d4
[ 0.542385] kernel_init+0x10/0xf8
[ 0.545903] ret_from_fork+0x10/0x18
[ 0.549603] Code: 900020a1 f9400800 911dcc21 1400001f (f9406401)
[ 0.555916] ---[ end trace de8c34787fdad3b3 ]---
[ 0.560722] Kernel panic - not syncing: Attempted to kill init!
exitcode=0x0000000b
[ 0.560722]
[ 0.570188] SMP: stopping secondary CPUs
[ 0.574253] ---[ end Kernel panic - not syncing: Attempted to kill
init! exitcode=0x0000000b
[ 0.574253]
Cc: stable@vger.kernel.org
Fixes: d6ed93551320 ("pinctrl: mediatek: add pinctrl driver for MT7622 SoC")
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
If the pinctrl node has the gpio-ranges property, the range will be added
by the gpio core and doesn't need to be added by the pinctrl driver.
But for keeping backward compatibility, an explicit pinctrl_add_gpio_range
is still needed to be called when there is a missing gpio-ranges in pinctrl
node in old dts files.
Cc: stable@vger.kernel.org
Fixes: d6ed93551320 ("pinctrl: mediatek: add pinctrl driver for MT7622 SoC")
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
To allow claiming hogs by pinctrl, we cannot enable pinctrl until all
groups and functions are being added done. Also, it's necessary that
the corresponding gpiochip is being added when the pinctrl device is
enabled.
Cc: stable@vger.kernel.org
Fixes: d6ed93551320 ("pinctrl: mediatek: add pinctrl driver for MT7622 SoC")
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Because gpichip applied in the driver must depend on mtk eint to implement
the input data debouncing and the translation between gpio and irq, it's
better to keep logic consistent with mtk eint being built prior to gpiochip
being added.
Cc: stable@vger.kernel.org
Fixes: e6dabd38d8e7 ("pinctrl: mediatek: add EINT support to MT7622 SoC")
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
It should be to return an error code when failing at groups building.
Cc: stable@vger.kernel.org
Fixes: d6ed93551320 ("pinctrl: mediatek: add pinctrl driver for MT7622 SoC")
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
perf propagates its feature check results to libbpf. This means
features for which perf probes must be a superset of libbpf's
required features. perf depends on FEATURE_TESTS_BASIC for its list
of features.
commit 531b014e7a2f ("tools: bpf: make use of reallocarray") added
reallocarray use to libbpf, make perf also perform the reallocarray
feature check.
Fixes: 531b014e7a2f ("tools: bpf: make use of reallocarray")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Fixes: f9358e12a0af ("net: mvpp2: split ingress traffic into multiple flows")
Signed-off-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Yuchung Cheng says:
====================
fix DCTCP delayed ACK
This patch series addresses the issue that sometimes DCTCP
fail to acknowledge the latest sequence and result in sender timeout
if inflight is small.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After fixing the way DCTCP tracking delayed ACKs, the delayed-ACK
related callbacks are no longer needed
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Previously, when a data segment was sent an ACK was piggybacked
on the data segment without generating a CA_EVENT_NON_DELAYED_ACK
event to notify congestion control modules. So the DCTCP
ca->delayed_ack_reserved flag could incorrectly stay set when
in fact there were no delayed ACKs being reserved. This could result
in sending a special ECN notification ACK that carries an older
ACK sequence, when in fact there was no need for such an ACK.
DCTCP keeps track of the delayed ACK status with its own separate
state ca->delayed_ack_reserved. Previously it may accidentally cancel
the delayed ACK without updating this field upon sending a special
ACK that carries a older ACK sequence. This inconsistency would
lead to DCTCP receiver never acknowledging the latest data until the
sender times out and retry in some cases.
Packetdrill script (provided by Larry Brakmo)
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
0.000 setsockopt(3, SOL_TCP, TCP_CONGESTION, "dctcp", 5) = 0
0.000 bind(3, ..., ...) = 0
0.000 listen(3, 1) = 0
0.100 < [ect0] SEW 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
0.100 > SE. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>
0.110 < [ect0] . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
0.200 < [ect0] . 1:1001(1000) ack 1 win 257
0.200 > [ect01] . 1:1(0) ack 1001
0.200 write(4, ..., 1) = 1
0.200 > [ect01] P. 1:2(1) ack 1001
0.200 < [ect0] . 1001:2001(1000) ack 2 win 257
0.200 write(4, ..., 1) = 1
0.200 > [ect01] P. 2:3(1) ack 2001
0.200 < [ect0] . 2001:3001(1000) ack 3 win 257
0.200 < [ect0] . 3001:4001(1000) ack 3 win 257
0.200 > [ect01] . 3:3(0) ack 4001
0.210 < [ce] P. 4001:4501(500) ack 3 win 257
+0.001 read(4, ..., 4500) = 4500
+0 write(4, ..., 1) = 1
+0 > [ect01] PE. 3:4(1) ack 4501
+0.010 < [ect0] W. 4501:5501(1000) ack 4 win 257
// Previously the ACK sequence below would be 4501, causing a long RTO
+0.040~+0.045 > [ect01] . 4:4(0) ack 5501 // delayed ack
+0.311 < [ect0] . 5501:6501(1000) ack 4 win 257 // More data
+0 > [ect01] . 4:4(0) ack 6501 // now acks everything
+0.500 < F. 9501:9501(0) ack 4 win 257
Reported-by: Larry Brakmo <brakmo@fb.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We accidentally left out the error handling for kstrtoul().
Fixes: a520030e326a ("qlcnic: Implement flash sysfs callback for 83xx adapter")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Michael Heimpold <mhei@heimpold.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
By a simple extension of of_phy_get_and_connect() drivers
that have a fixed link on e.g. RGMII can support also
fixed links, so in addition to:
ethernet-port {
phy-mode = "rgmii";
phy-handle = <&foo>;
};
This setup with a fixed-link node and no phy-handle will
now also work just fine:
ethernet-port {
phy-mode = "rgmii";
fixed-link {
speed = <1000>;
full-duplex;
pause;
};
};
This is very helpful for connecting random ethernet ports
to e.g. DSA switches that typically reside on fixed links.
The phy-mode is still there as the fixes link in this case
is still an RGMII link.
Tested on the Cortina Gemini driver with the Vitesse DSA
router chip on a fixed 1Gbit link.
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Extend struct tcf_walker with additional 'cookie' field. It is intended to
be used by classifier walk implementations to continue iteration directly
from particular filter, instead of iterating 'skip' number of times.
Change flower walk implementation to save filter handle in 'cookie'. Each
time flower walk is called, it looks up filter with saved handle directly
with idr, instead of iterating over filter linked list 'skip' number of
times. This change improves complexity of dumping flower classifier from
quadratic to linearithmic. (assuming idr lookup has logarithmic complexity)
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reported-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
People noticed that the code match on IEEE 802.1ad (ETH_P_8021AD) ethertype,
and this implies Q-in-Q or double tagged VLANs. Thus, we better parse
the next VLAN header too. It is even marked as a TODO.
This is relevant for real world use-cases, as XDP cpumap redirect can be
used when the NIC RSS hashing is broken. E.g. the ixgbe driver HW cannot
handle double tagged VLAN packets, and places everything into a single
RX queue. Using cpumap redirect, users can redistribute traffic across
CPUs to solve this, which is faster than the network stacks RPS solution.
It is left as an exerise how to distribute the packets across CPUs. It
would be convenient to use the RX hash, but that is not _yet_ exposed
to XDP programs. For now, users can code their own hash, as I've demonstrated
in the Suricata code (where Q-in-Q is handled correctly).
Reported-by: Florian Maury <florian.maury-cv@x-cli.eu>
Reported-by: Marek Majkowski <marek@cloudflare.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
- I2C core bugfix regarding bus recovery
- driver bugfix for the tegra driver
- typo correction
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: recovery: if possible send STOP with recovery pulses
i2c: tegra: Fix NACK error handling
i2c: stu300: use non-archaic spelling of failes
|
|
Daniel Borkmann says:
====================
pull-request: bpf 2018-07-13
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix AF_XDP TX error reporting before final kernel release such that it
becomes consistent between copy mode and zero-copy, from Magnus.
2) Fix three different syzkaller reported issues: oob due to ld_abs
rewrite with too large offset, another oob in l3 based skb test run
and a bug leaving mangled prog in subprog JITing error path, from Daniel.
3) Fix BTF handling for bitfield extraction on big endian, from Okash.
4) Fix a missing linux/errno.h include in cgroup/BPF found by kbuild bot,
from Roman.
5) Fix xdp2skb_meta.sh sample by using just command names instead of
absolute paths for tc and ip and allow them to be redefined, from Taeung.
6) Fix availability probing for BPF seg6 helpers before final kernel ships
so they can be detected at prog load time, from Mathieu.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 8b7008620b84 ("net: Don't copy pfmemalloc flag in
__copy_skb_header()") introduced a different handling for the
pfmemalloc flag in copy and clone paths.
In __skb_clone(), now, the flag is set only if it was set in the
original skb, but not cleared if it wasn't. This is wrong and
might lead to socket buffers being flagged with pfmemalloc even
if the skb data wasn't allocated from pfmemalloc reserves. Copy
the flag instead of ORing it.
Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Fixes: 8b7008620b84 ("net: Don't copy pfmemalloc flag in __copy_skb_header()")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Tested-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds support for IGMPMSG_WRVIFWHOLE which is used to pass
full packet and real vif id when the incoming interface is wrong.
While the RP and FHR are setting up state we need to be sending the
registers encapsulated with all the data inside otherwise we lose it.
The RP then decapsulates it and forwards it to the interested parties.
Currently with WRONGVIF we can only be sending empty register packets
and will lose that data.
This behaviour can be enabled by using MRT_PIM with
val == IGMPMSG_WRVIFWHOLE. This doesn't prevent IGMPMSG_WRONGVIF from
happening, it happens in addition to it, also it is controlled by the same
throttling parameters as WRONGVIF (i.e. 1 packet per 3 seconds currently).
Both messages are generated to keep backwards compatibily and avoid
breaking someone who was enabling MRT_PIM with val == 4, since any
positive val is accepted and treated the same.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Ingo Molnar:
"A clocksource driver fix and a revert"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource: arm_arch_timer: Set arch_mem_timer cpumask to cpu_possible_mask
Revert "tick: Prefer a lower rating device only if it's CPU local device"
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf tool fixes from Ingo Molnar:
"Misc tooling fixes: python3 related fixes, gcc8 fix, bashism fixes and
some other smaller fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tools: Use python-config --includes rather than --cflags
perf script python: Fix dict reference counting
perf stat: Fix --interval_clear option
perf tools: Fix compilation errors on gcc8
perf test shell: Prevent temporary editor files from being considered test scripts
perf llvm-utils: Remove bashism from kernel include fetch script
perf test shell: Make perf's inet_pton test more portable
perf test shell: Replace '|&' with '2>&1 |' to work with more shells
perf scripts python: Add Python 3 support to EventClass.py
perf scripts python: Add Python 3 support to sched-migration.py
perf scripts python: Add Python 3 support to Util.py
perf scripts python: Add Python 3 support to SchedGui.py
perf scripts python: Add Python 3 support to Core.py
perf tools: Generate a Python script compatible with Python 2 and 3
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fix from Ingo Molnar:
"Fix a UEFI mixed mode (64-bit kernel on 32-bit UEFI) reboot loop
regression"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/x86: Fix mixed mode reboot loop by removing pointless call to PciIo->Attributes()
|
|
Jakub Kicinski says:
====================
This set is adding support for loading driver and offload XDP
at the same time. This enables advanced use cases where some
of the work is offloaded to the NIC and some is done by the host.
Separate netlink attributes are added for each mode of operation.
Driver callbacks for offload are cleaned up a little, including
removal of .prog_attached flag.
====================
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull rseq fixes from Ingo Molnar:
"Various rseq ABI fixes and cleanups: use get_user()/put_user(),
validate parameters and use proper uapi types, etc"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rseq/selftests: cleanup: Update comment above rseq_prepare_unload
rseq: Remove unused types_32_64.h uapi header
rseq: uapi: Declare rseq_cs field as union, update includes
rseq: uapi: Update uapi comments
rseq: Use get_user/put_user rather than __get_user/__put_user
rseq: Use __u64 for rseq_cs fields, validate user inputs
|
|
Pull rdma fixes from Jason Gunthorpe:
"Things have been quite slow, only 6 RC patches have been sent to the
list. Regression, user visible bugs, and crashing fixes:
- cxgb4 could wrongly fail MR creation due to a typo
- various crashes if the wrong QP type is mixed in with APIs that
expect other types
- syzkaller oops
- using ERR_PTR and NULL together cases HFI1 to crash in some cases
- mlx5 memory leak in error unwind"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/mlx5: Fix memory leak in mlx5_ib_create_srq() error path
RDMA/uverbs: Don't fail in creation of multiple flows
IB/hfi1: Fix incorrect mixing of ERR_PTR and NULL return values
RDMA/uverbs: Fix slab-out-of-bounds in ib_uverbs_ex_create_flow
RDMA/uverbs: Protect from attempts to create flows on unsupported QP
iw_cxgb4: correctly enforce the max reg_mr depth
|
|
Pull VFIO fix from Alex Williamson:
"Fix deadlock in mbochs sample driver (Alexey Khoroshilov)"
* tag 'vfio-v4.18-rc5' of git://github.com/awilliam/linux-vfio:
sample: vfio-mdev: avoid deadlock in mdev_access()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- update Kbuild and Kconfig documents
- sanitize -I compiler option handling
- update extract-vmlinux script to recognize LZ4 and ZSTD
- fix tools Makefiles
- update tags.sh to handle __ro_after_init
- suppress warnings in case getconf does not recognize LFS_* parameters
* tag 'kbuild-fixes-v4.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: suppress warnings from 'getconf LFS_*'
scripts/tags.sh: add __ro_after_init
tools: build: Use HOSTLDFLAGS with fixdep
tools: build: Fixup host c flags
tools build: fix # escaping in .cmd files for future Make
scripts: teach extract-vmlinux about LZ4 and ZSTD
kbuild: remove duplicated comments about PHONY
kbuild: .PHONY is not a variable, but PHONY is
kbuild: do not drop -I without parameter
kbuild: document the KBUILD_KCONFIG env. variable
kconfig: update user kconfig tools doc.
kbuild: delete INSTALL_FW_PATH from kbuild documentation
kbuild: update ARCH alias info for sparc
kbuild: update ARCH alias info for sh
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"Catalin's out enjoying the sunshine, so I'm sending the fixes for a
couple of weeks (although there hopefully won't be any more!).
We've got a revert of a previous fix because it broke the build with
some distro toolchains and a preemption fix when detemining whether or
not the SIMD unit is in use.
Summary:
- Revert back to the 'linux' target for LD, as 'elf' breaks some
distributions
- Fix preemption race when testing whether the vector unit is in use
or not"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: neon: Fix function may_use_simd() return error status
Revert "arm64: Use aarch64elf and aarch64elfb emulation mode variants"
|
|
Pull ARM fixes from Russell King:
"A couple of small fixes this time around from Steven for an
interaction between ftrace and kernel read-only protection, and
Vladimir for nommu"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8780/1: ftrace: Only set kernel memory back to read-only after boot
ARM: 8775/1: NOMMU: Use instr_sync instead of plain isb in common code
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixlet from Steven Rostedt:
"Joel Fernandes asked to add a feature in tracing that Android had its
own patch internally for. I took it back in 4.13. Now he realizes that
he had a mistake, and swapped the values from what Android had. This
means that the old Android tools will break when using a new kernel
that has the new feature on it.
The options are:
1. To swap it back to what Android wants.
2. Add a command line option or something to do the swap
3. Just let Android carry a patch that swaps it back
Since it requires setting a tracing option to enable this anyway, I
doubt there are other users of this than Android. Thus, I've decided
to take option 1. If someone else is actually depending on the order
that is in the kernel, then we will have to revert this change and go
to option 2 or 3"
* tag 'trace-v4.18-rc3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Reorder display of TGID to be after PID
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Just a few HD-auio fixes: one fix for a possible mutex deadlock at
HDMI hotplug handling is somewhat subtle and delicate, while the rest
are usual device-specific quirks"
* tag 'sound-4.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/ca0132: Update a pci quirk device name
ALSA: hda/ca0132: Add Recon3Di quirk for Gigabyte G1.Sniper Z97
ALSA: hda/realtek - two more lenovo models need fixup of MIC_LOCATION
ALSA: hda - Handle pm failure during hotplug
|
|
Split handling of offloaded and driver programs completely. Since
offloaded programs always come with XDP_FLAGS_HW_MODE set in reality
there could be no sharing, anyway, programs would only be installed
in driver or in hardware. Splitting the handling allows us to install
programs in HW and in driver at the same time.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Add tests for having an XDP program attached in the driver and
another one attached in HW simultaneously.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Allow netdevsim to accept driver and offload attachment of XDP
BPF programs at the same time.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Split the query of HW-attached program from the software one.
Introduce new .ndo_bpf command to query HW-attached program.
This will allow drivers to install different programs in HW
and SW at the same time. Netlink can now also carry multiple
programs on dump (in which case mode will be set to
XDP_ATTACHED_MULTI and user has to check per-attachment point
attributes, IFLA_XDP_PROG_ID will not be present). We reuse
IFLA_XDP_PROG_ID skb space for second mode, so rtnl_xdp_size()
doesn't need to be updated.
Note that the installation side is still not there, since all
drivers currently reject installing more than one program at
the time.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Basic operations drivers perform during xdp setup and query can
be moved to helpers in the core. Encapsulate program and flags
into a structure and add helpers. Note that the structure is
intended as the "main" program information source in the driver.
Most drivers will additionally place the program pointer in their
fast path or ring structures.
The helpers don't have a huge impact now, but they will
decrease the code duplication when programs can be installed
in HW and driver at the same time. Encapsulating the basic
operations in helpers will hopefully also reduce the number
of changes to drivers which adopt them.
Helpers could really be static inline, but they depend on
definition of struct netdev_bpf which means they'd have
to be placed in netdevice.h, an already 4500 line header.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
prog_attached of struct netdev_bpf should have been superseded
by simply setting prog_id long time ago, but we kept it around
to allow offloading drivers to communicate attachment mode (drv
vs hw). Subsequently drivers were also allowed to report back
attachment flags (prog_flags), and since nowadays only programs
attached will XDP_FLAGS_HW_MODE can get offloaded, we can tell
the attachment mode from the flags driver reports. Remove
prog_attached member.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
In preparation for support of simultaneous driver and hardware XDP
support add per-mode attributes. The catch-all IFLA_XDP_PROG_ID
will still be reported, but user space can now also access the
program ID in a new IFLA_XDP_<mode>_PROG_ID attribute.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dave Jiang:
- ensure that a variable passed in by reference to acpi_nfit_ctl is
always set to a value. An incremental patch is provided due to notice
from testing in -next. The rest of the commits did not exhibit
issues.
- fix a return path in nsio_rw_bytes() that was not returning "bytes
remain" as expected for the function.
- address an issue where applications polling on scrub-completion for
the NVDIMM may falsely wakeup and read the wrong state value and
cause hang.
- change the test unit persistent capability attribute to fix up a
broken assumption in the unit test infrastructure wrt the
'write_cache' attribute
- ratelimit dev_info() in the dax device check_vma() function since
this is easily triggered from userspace
* tag 'libnvdimm-fixes-4.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
nfit: fix unchecked dereference in acpi_nfit_ctl
acpi, nfit: Fix scrub idle detection
tools/testing/nvdimm: advertise a write cache for nfit_test
acpi/nfit: fix cmd_rc for acpi_nfit_ctl to always return a value
dev-dax: check_vma: ratelimit dev_info-s
libnvdimm, pmem: Fix memcpy_mcsafe() return code handling in nsio_rw_bytes()
|
|
Rather than using the index variable stored in vram. If
the device fails to come back online after a resume cycle,
reads from vram will return all 1s which will cause a
segfault. Based on a patch from Thomas Martitz <kugel@rockbox.org>.
This avoids the segfault, but we still need to sort out
why the GPU does not come back online after a resume.
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=105760
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
btrfs_cmp_data_free() puts cmp's src_pages and dst_pages, but leaves
their page address intact. Now, if you hit "goto again" in
btrfs_extent_same_range() and hit some error in
btrfs_cmp_data_prepare(), you'll try to unlock/put already put pages.
This is simple fix to reset the address to avoid use-after-free.
Fixes: 67b07bd4bec5 ("Btrfs: reuse cmp workspace in EXTENT_SAME ioctl")
Signed-off-by: Naohiro Aota <naota@elisp.net>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Magnus Karlsson says:
====================
This patch set adjusts the AF_XDP TX error reporting so that it becomes
consistent between copy mode and zero-copy. First some background:
Copy-mode for TX uses the SKB path in which the action of sending the
packet is performed from process context using the sendmsg
syscall. Completions are usually done asynchronously from NAPI mode by
using a TX interrupt. In this mode, send errors can be returned back
through the syscall.
In zero-copy mode both the sending of the packet and the completions
are done asynchronously from NAPI mode for performance reasons. In
this mode, the sendmsg syscall only makes sure that the TX NAPI loop
will be run that performs both the actions of sending and
completing. In this mode it is therefore not possible to return errors
through the sendmsg syscall as the sending is done from the NAPI
loop. Note that it is possible to implement a synchronous send with
our API, but in our benchmarks that made the TX performance drop by
nearly half due to synchronization requirements and cache line
bouncing. But for some netdevs this might be preferable so let us
leave it up to the implementation to decide.
The problem is that the current code base returns some errors in
copy-mode that are not possible to return in zero-copy mode. This
patch set aligns them so that the two modes always return the same
error code. We achieve this by removing some of the errors returned by
sendmsg in copy-mode (and in one case adding an error message for
zero-copy mode) and offering alternative error detection methods that
are consistent between the two modes.
The structure of the patch set is as follows:
Patch 1: removes the ENXIO return code from copy-mode when someone has
forcefully changed the number of queues on the device so that the
queue bound to the socket is no longer available. Just silently stop
sending anything as in zero-copy mode.
Patch 2: stop returning EAGAIN in copy mode when the completion queue
is full as zero-copy does not do this. Instead this situation can be
detected by comparing the head and tail pointers of the completion
queue in both modes. In any case, EAGAIN was not the correct error code
here since no amount of calling sendmsg will solve the problem. Only
consuming one or more messages on the completion queue will fix this.
Patch 3: Always return ENOBUFS from sendmsg if there is no TX queue
configured. This was not the case for zero-copy mode.
Patch 4: stop returning EMSGSIZE when the size of the packet is larger
than the MTU. Just send it to the device so that it will drop it as in
zero-copy mode.
Note that copy-mode can still return EAGAIN in certain circumstances,
but as these conditions cannot occur in zero-copy mode it is fine for
copy-mode to return them.
====================
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|