Age | Commit message (Collapse) | Author |
|
Alexei Starovoitov says:
====================
pull-request: bpf 2018-12-15
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) fix liveness propagation of callee saved registers, from Jakub.
2) fix overflow in bpf_jit_limit knob, from Daniel.
3) bpf_flow_dissector api fix, from Stanislav.
4) bpf_perf_event api fix on powerpc, from Sandipan.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Salil Mehta says:
====================
net: hns3: Add more commands to Debugfs in HNS3 driver
This patch-set adds few more debugfs commands to HNS3 Ethernet
Driver. Support has been added to query info related to below
items:
1. Packet buffer descriptor ("echo bd info [queue no] [bd index] > cmd")
2. Manager table("echo dump mng tbl > cmd")
3. Dfx status register("echo dump reg ssu [prt id] > cmd")
4. Dcb status register("echo dump reg dcb [port id] > cmd")
5. Queue map ("echo queue map [queue no] > cmd")
6. Tm map ("echo tm map [queue no] > cmd")
NOTE: Above commands are *read-only* and are only intended to
query the information from the SoC(and dump inside the kernel,
for now) and in no way tries to perform write operations for
the purpose of configuration etc.
Change Log:
V1-->V2:
1. Addressed the GCC-8.2 compiler issue reported by David S. Miller.
Link: https://lkml.org/lkml/2018/12/14/1298
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch prints dcb register status information by module.
debugfs command:
root@(none)# echo dump tm map 100 > cmd
queue_id | qset_id | pri_id | tc_id
0100 | 0065 | 08 | 00
root@(none)#
Signed-off-by: liuzhongzhu <liuzhongzhu@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch prints queue map information.
debugfs command:
echo dump queue map > cmd
Sample Command:
root@(none)# echo queue map > cmd
local queue id | global queue id | vector id
0 32 769
1 33 770
2 34 771
3 35 772
4 36 773
5 37 774
6 38 775
7 39 776
8 40 777
9 41 778
10 42 779
11 43 780
12 44 781
13 45 782
14 46 783
15 47 784
root@(none)#
Signed-off-by: liuzhongzhu <liuzhongzhu@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch prints dcb register status information by module.
debugfs command:
root@(none)# echo dump reg dcb > cmd
roce_qset_mask: 0x0
nic_qs_mask: 0x0
qs_shaping_pass: 0x0
qs_bp_sts: 0x0
pri_mask: 0x0
pri_cshaping_pass: 0x0
pri_pshaping_pass: 0x0
root@(none)#
Signed-off-by: liuzhongzhu <liuzhongzhu@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch prints status register information by module.
debugfs command:
echo dump reg [mode name] > cmd
Sample Command:
root@(none)# echo dump reg bios common > cmd
BP_CPU_STATE: 0x0
DFX_MSIX_INFO_NIC_0: 0xc000
DFX_MSIX_INFO_NIC_1: 0xf
DFX_MSIX_INFO_NIC_2: 0x2
DFX_MSIX_INFO_NIC_3: 0x2
DFX_MSIX_INFO_ROC_0: 0xc000
DFX_MSIX_INFO_ROC_1: 0x0
DFX_MSIX_INFO_ROC_2: 0x0
DFX_MSIX_INFO_ROC_3: 0x0
root@(none)#
Signed-off-by: liuzhongzhu <liuzhongzhu@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch prints manager table information.
debugfs command:
echo dump mng tbl > cmd
Sample Command:
root@(none)# echo dump mng tbl > cmd
entry|mac_addr |mask|ether|mask|vlan|mask|i_map|i_dir|e_type
00 |01:00:5e:00:00:01|0 |00000|0 |0000|0 |00 |00 |0
01 |c2:f1:c5:82:68:17|0 |00000|0 |0000|0 |00 |00 |0
root@(none)#
Signed-off-by: liuzhongzhu <liuzhongzhu@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch prints Sending and receiving
package descriptor information.
debugfs command:
echo dump bd info 1 > cmd
Sample Command:
root@(none)# echo bd info 1 > cmd
hns3 0000:7d:00.0: TX Queue Num: 0, BD Index: 0
hns3 0000:7d:00.0: (TX) addr: 0x0
hns3 0000:7d:00.0: (TX)vlan_tag: 0
hns3 0000:7d:00.0: (TX)send_size: 0
hns3 0000:7d:00.0: (TX)vlan_tso: 0
hns3 0000:7d:00.0: (TX)l2_len: 0
hns3 0000:7d:00.0: (TX)l3_len: 0
hns3 0000:7d:00.0: (TX)l4_len: 0
hns3 0000:7d:00.0: (TX)vlan_tag: 0
hns3 0000:7d:00.0: (TX)tv: 0
hns3 0000:7d:00.0: (TX)vlan_msec: 0
hns3 0000:7d:00.0: (TX)ol2_len: 0
hns3 0000:7d:00.0: (TX)ol3_len: 0
hns3 0000:7d:00.0: (TX)ol4_len: 0
hns3 0000:7d:00.0: (TX)paylen: 0
hns3 0000:7d:00.0: (TX)vld_ra_ri: 0
hns3 0000:7d:00.0: (TX)mss: 0
hns3 0000:7d:00.0: RX Queue Num: 0, BD Index: 120
hns3 0000:7d:00.0: (RX)addr: 0xffee7000
hns3 0000:7d:00.0: (RX)pkt_len: 0
hns3 0000:7d:00.0: (RX)size: 0
hns3 0000:7d:00.0: (RX)rss_hash: 0
hns3 0000:7d:00.0: (RX)fd_id: 0
hns3 0000:7d:00.0: (RX)vlan_tag: 0
hns3 0000:7d:00.0: (RX)o_dm_vlan_id_fb: 0
hns3 0000:7d:00.0: (RX)ot_vlan_tag: 0
hns3 0000:7d:00.0: (RX)bd_base_info: 0
Signed-off-by: liuzhongzhu <liuzhongzhu@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use my korg address instead of the bootlin one.
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
|
|
marvell_nfc_wait_op() waits for completion during 'timeout_ms'
milliseconds before throwing an error. While the logic is fine, the
value of 'timeout_ms' is given by the core and actually correspond to
the maximum time the NAND chip will take to complete the
operation. Assuming there is no overhead in the propagation of the
interrupt signal to the the NAND controller (through the Ready/Busy
line), this delay does not take into account the latency of the
operating system. For instance, for a page write, the delay given by
the core is rounded up to 1ms. Hence, when the machine is over loaded,
there is chances that this timeout will be reached.
There are two ways to solve this issue that are not incompatible:
1/ Enlarge the timeout value (if so, how much?).
2/ Check after the waiting method if we did not miss any interrupt
because of the OS latency (an interrupt is still pending). In this
case, we assume the operation exited successfully.
We choose the second approach that is a must in all cases, with the
possibility to also modify the timeout value to be, e.g. at least 1
second in all cases.
Fixes: 02f26ecf8c77 ("mtd: nand: add reworked Marvell NAND controller driver")
Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Boris Brezillon <boris.brezillon@bootlin.com>
|
|
Commit
379d98ddf413 ("x86: vdso: Use $LD instead of $CC to link")
accidentally broke unwinding from userspace, because ld would strip the
.eh_frame sections when linking.
Originally, the compiler would implicitly add --eh-frame-hdr when
invoking the linker, but when this Makefile was converted from invoking
ld via the compiler, to invoking it directly (like vmlinux does),
the flag was missed. (The EH_FRAME section is important for the VDSO
shared libraries, but not for vmlinux.)
Fix the problem by explicitly specifying --eh-frame-hdr, which restores
parity with the old method.
See relevant bug reports for additional info:
https://bugzilla.kernel.org/show_bug.cgi?id=201741
https://bugzilla.redhat.com/show_bug.cgi?id=1659295
Fixes: 379d98ddf413 ("x86: vdso: Use $LD instead of $CC to link")
Reported-by: Florian Weimer <fweimer@redhat.com>
Reported-by: Carlos O'Donell <carlos@redhat.com>
Reported-by: "H. J. Lu" <hjl.tools@gmail.com>
Signed-off-by: Alistair Strachan <astrachan@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Carlos O'Donell <carlos@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: kernel-team@android.com
Cc: Laura Abbott <labbott@redhat.com>
Cc: stable <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: X86 ML <x86@kernel.org>
Link: https://lkml.kernel.org/r/20181214223637.35954-1-astrachan@google.com
|
|
The -Wimplicit-fallthrough option requires that the /* fall through */
comment is placed in the 'case' statement that falls through, rather
than in the following one. Case seems to matter as well.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
Often a new processor gets a new model number, but from a turbostat
point of view, it is the same as a previous model. Support duplicates
with 1-line updates, rather than error-prone scattering of model #'s.
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
When the C-state limit is 8 on Goldmont, PC10 is enabled.
Previously turbostat saw this as "undefined", and thus assumed
it should not show some counters, such as pc3, pc6, pc7.
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Quentin Monnet says:
====================
This series contains several minor fixes for bpftool source and
documentation.
The first patches focus on documentation: addition of an option in the page
for "bpftool prog", clean up and update of the same page, and addition of
an example of prog array map manipulation in "bpftool map" page.
The last two fix warnings susceptible to appear when libbfd is not present
(patch 4), or with additional warning flags passed to the compiler (last
patch).
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Help compiler check arguments for several utility functions used to
print items to the console by adding the "printf" attribute when
declaring those functions.
Also, declare as "static" two functions that are only used in prog.c.
All of them discovered by compiling bpftool with
-Wmissing-format-attribute -Wmissing-declarations.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
The following warning appears when compiling bpftool without BFD
support:
main.h:198:23: warning: 'struct bpf_prog_linfo' declared inside
parameter list will not be visible outside of this definition or
declaration
const struct bpf_prog_linfo *prog_linfo,
Fix it by declaring struct bpf_prog_linfo even in the case BFD is not
supported.
Fixes: b053b439b72a ("bpf: libbpf: bpftool: Print bpf_line_info during prog dump")
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Add an example in map documentation to show how to use bpftool in order
to update the references to programs hold by prog array maps.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Bring various fixes to the manual page for "bpftool prog" set of
commands:
- Fix typos ("dum" -> "dump")
- Harmonise indentation and format for command output
- Update date format for program load time
- Add instruction numbers on program dumps
- Fix JSON format for the example program listing
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
The --mapcompat|-m option has been documented on the main bpftool.rst
page, and on the interactive help. As this option is useful for loading
programs with maps with the "bpftool prog load" command, it should also
appear in the related bpftool-prog.rst documentation page. Let's add it.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Alexei Starovoitov says:
====================
v1->v2:
With optimization suggested by Jakub patch 4 safety check became
cheap enough.
Several improvements to verifier state logic.
Patch 1 - trivial optimization
Patch 3 - significant optimization for stack state equivalence
Patch 4 - safety check for liveness and prep for future state merging
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Introduce REG_LIVE_DONE to check the liveness propagation
and prepare the states for merging.
See algorithm description in clean_live_states().
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
"if (old->allocated_stack > cur->allocated_stack)" check is too conservative.
In some cases explored stack could have allocated more space,
but that stack space was not live.
The test case improves from 19 to 15 processed insns
and improvement on real programs is significant as well:
before after
bpf_lb-DLB_L3.o 1940 1831
bpf_lb-DLB_L4.o 3089 3029
bpf_lb-DUNKNOWN.o 1065 1064
bpf_lxc-DDROP_ALL.o 28052 26309
bpf_lxc-DUNKNOWN.o 35487 33517
bpf_netdev.o 10864 9713
bpf_overlay.o 6643 6184
bpf_lcx_jit.o 38437 37335
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Edward Cree <ecree@solarflare.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Teach test_verifier to parse verifier output for insn processed
and compare with expected number.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Edward Cree <ecree@solarflare.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Don't check the same stack liveness condition 8 times.
once is enough.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Edward Cree <ecree@solarflare.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Peter Oskolkov says:
====================
net: prefer listeners bound to an address
A relatively common use case is to have several IPs configured
on a host, and have different listeners for each of them. We would
like to add a "catch all" listener on addr_any, to match incoming
connections not served by any of the listeners bound to a specific
address.
However, port-only lookups can match addr_any sockets when sockets
listening on specific addresses are present if so_reuseport flag
is set. This patchset eliminates lookups into port-only hashtable,
as lookups by (addr,port) tuple are easily available.
In a future patchset I plan to explore whether it is possible
to remove port-only hashtables completely: additional refactoring
will be required, as some non-lookup code uses the hashtables.
====================
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds a selftest that verifies that a socket listening
on a specific address is chosen in preference over sockets
that listen on any address. The test covers UDP/UDP6/TCP/TCP6.
It is based on, and similar to, reuseport_dualstack.c selftest.
Signed-off-by: Peter Oskolkov <posk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A relatively common use case is to have several IPs configured
on a host, and have different listeners for each of them. We would
like to add a "catch all" listener on addr_any, to match incoming
connections not served by any of the listeners bound to a specific
address.
However, port-only lookups can match addr_any sockets when sockets
listening on specific addresses are present if so_reuseport flag
is set. This patch eliminates lookups into port-only hashtable,
as lookups by (addr,port) tuple are easily available.
In addition, compute_score() is tweaked to _not_ match
addr_any sockets to specific addresses, as hash collisions
could result in the unwanted behavior described above.
Tested: the patch compiles; full test in the last patch in this
patchset. Existing reuseport_* selftests also pass.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Peter Oskolkov <posk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A relatively common use case is to have several IPs configured
on a host, and have different listeners for each of them. We would
like to add a "catch all" listener on addr_any, to match incoming
connections not served by any of the listeners bound to a specific
address.
However, port-only lookups can match addr_any sockets when sockets
listening on specific addresses are present if so_reuseport flag
is set. This patch eliminates lookups into port-only hashtable,
as lookups by (addr,port) tuple are easily available.
In addition, compute_score() is tweaked to _not_ match
addr_any sockets to specific addresses, as hash collisions
could result in the unwanted behavior described above.
Tested: the patch compiles; full test in the last patch in this
patchset. Existing reuseport_* selftests also pass.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Peter Oskolkov <posk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A relatively common use case is to have several IPs configured
on a host, and have different listeners for each of them. We would
like to add a "catch all" listener on addr_any, to match incoming
connections not served by any of the listeners bound to a specific
address.
However, port-only lookups can match addr_any sockets when sockets
listening on specific addresses are present if so_reuseport flag
is set. This patch eliminates lookups into port-only hashtable,
as lookups by (addr,port) tuple are easily available.
In addition, compute_score() is tweaked to _not_ match
addr_any sockets to specific addresses, as hash collisions
could result in the unwanted behavior described above.
Tested: the patch compiles; full test in the last patch in this
patchset. Existing reuseport_* selftests also pass.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Peter Oskolkov <posk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A relatively common use case is to have several IPs configured
on a host, and have different listeners for each of them. We would
like to add a "catch all" listener on addr_any, to match incoming
connections not served by any of the listeners bound to a specific
address.
However, port-only lookups can match addr_any sockets when sockets
listening on specific addresses are present if so_reuseport flag
is set. This patch eliminates lookups into port-only hashtable,
as lookups by (addr,port) tuple are easily available.
In addition, compute_score() is tweaked to _not_ match
addr_any sockets to specific addresses, as hash collisions
could result in the unwanted behavior described above.
Tested: the patch compiles; full test in the last patch in this
patchset. Existing reuseport_* selftests also pass.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Peter Oskolkov <posk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add explainations for some general IP counters, SACK and DSACK related
counters
Signed-off-by: yupeng <yupeng0921@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
tipc_wait_for_cond() drops socket lock before going to sleep,
but tsk->group could be freed right after that release_sock().
So we have to re-check and reload tsk->group after it wakes up.
After this patch, tipc_wait_for_cond() returns -ERESTARTSYS when
tsk->group is NULL, instead of continuing with the assumption of
a non-NULL tsk->group.
(It looks like 'dsts' should be re-checked and reloaded too, but
it is a different bug.)
Similar for tipc_send_group_unicast() and tipc_send_group_anycast().
Reported-by: syzbot+10a9db47c3a0e13eb31c@syzkaller.appspotmail.com
Fixes: b7d42635517f ("tipc: introduce flow control for group broadcast messages")
Fixes: ee106d7f942d ("tipc: introduce group anycast messaging")
Fixes: 27bd9ec027f3 ("tipc: introduce group unicast messaging")
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
David Ahern says:
====================
neighbor: More gc_list changes
More gc_list changes and cleanups.
The first 2 patches are bug fixes from the first gc_list change.
Specifically, fix the locking order to be consistent - table lock
followed by neighbor lock, and then entries in the FAILED state
should always be candidates for forced_gc without waiting for any
time span (return to the eviction logic prior to the separate gc_list).
Patch 3 removes 2 now unnecessary arguments to neigh_del.
Patch 4 moves a helper from a header file to core code in preparation
for Patch 5 which removes NTF_EXT_LEARNED entries from the gc_list.
These entries are already exempt from forced_gc; patch 5 removes them
from consideration and makes them on par with PERMANENT entries given
that they are also managed by userspace.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Externally learned entries are similar to PERMANENT entries in the
sense they are managed by userspace and can not be garbage collected.
As such remove them from the gc_list, remove the flags check from
neigh_forced_gc and skip threshold checks in neigh_alloc. As with
PERMANENT entries, this allows unlimited number of NTF_EXT_LEARNED
entries.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
neigh_update_ext_learned has one caller in neighbour.c so does not need
to be defined in the header. Move it and in the process remove the
intialization of ndm_flags and just set it based on the flags check.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
neigh_del now only has 1 caller, and the state and flags arguments
are both 0. Remove them and simplify neigh_del.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
PERMANENT entries are not on the gc_list so the state check is now
redundant. Also, the move to not purge entries until after 5 seconds
should not apply to FAILED entries; those can be removed immediately
to make way for newer ones. This restores the previous logic prior to
the gc_list.
Fixes: 58956317c8de ("neighbor: Improve garbage collection")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Lock checker noted an inverted lock order between neigh_change_state
(neighbor lock then table lock) and neigh_periodic_work (table lock and
then neighbor lock) resulting in:
[ 121.057652] ======================================================
[ 121.058740] WARNING: possible circular locking dependency detected
[ 121.059861] 4.20.0-rc6+ #43 Not tainted
[ 121.060546] ------------------------------------------------------
[ 121.061630] kworker/0:2/65 is trying to acquire lock:
[ 121.062519] (____ptrval____) (&n->lock){++--}, at: neigh_periodic_work+0x237/0x324
[ 121.063894]
[ 121.063894] but task is already holding lock:
[ 121.064920] (____ptrval____) (&tbl->lock){+.-.}, at: neigh_periodic_work+0x194/0x324
[ 121.066274]
[ 121.066274] which lock already depends on the new lock.
[ 121.066274]
[ 121.067693]
[ 121.067693] the existing dependency chain (in reverse order) is:
...
Fix by renaming neigh_change_state to neigh_update_gc_list, changing
it to only manage whether an entry should be on the gc_list and taking
locks in the same order as neigh_periodic_work. Invoke at the end of
neigh_update only if diff between old or new states has the PERMANENT
flag set.
Fixes: 8cc196d6ef86 ("neighbor: gc_list changes should be protected by table lock")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While most distributions long ago switched to the iproute2 suite
of utilities, which allow class-e (240.0.0.0/4) address assignment,
distributions relying on busybox, toybox and other forms of
ifconfig cannot assign class-e addresses without this kernel patch.
While CIDR has been obsolete for 2 decades, and a survey of all the
open source code in the world shows the IN_whatever macros are also
obsolete... rather than obsolete CIDR from this ioctl entirely, this
patch merely enables class-e assignment, sanely.
Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Merge misc fixes from Andrew Morton:
"11 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
scripts/spdxcheck.py: always open files in binary mode
checkstack.pl: fix for aarch64
userfaultfd: check VM_MAYWRITE was set after verifying the uffd is registered
fs/iomap.c: get/put the page in iomap_page_create/release()
hugetlbfs: call VM_BUG_ON_PAGE earlier in free_huge_page()
memblock: annotate memblock_is_reserved() with __init_memblock
psi: fix reference to kernel commandline enable
arch/sh/include/asm/io.h: provide prototypes for PCI I/O mapping in asm/io.h
mm/sparse: add common helper to mark all memblocks present
mm: introduce common STRUCT_PAGE_MAX_SHIFT define
alpha: fix hang caused by the bootmem removal
|
|
vr.mifi is indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
net/ipv6/ip6mr.c:1845 ip6mr_ioctl() warn: potential spectre issue 'mrt->vif_table' [r] (local cap)
net/ipv6/ip6mr.c:1919 ip6mr_compat_ioctl() warn: potential spectre issue 'mrt->vif_table' [r] (local cap)
Fix this by sanitizing vr.mifi before using it to index mrt->vif_table'
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After commit 69bd48404f25 ("net/sched: Remove egdev mechanism"),
tc_setup_cb_call() is nearly identical to tcf_block_cb_call(),
so we can just fold tcf_block_cb_call() into tc_setup_cb_call()
and remove its unused parameter 'exts'.
Fixes: 69bd48404f25 ("net/sched: Remove egdev mechanism")
Cc: Oz Shlomo <ozsh@mellanox.com>
Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The spdxcheck script currently falls over when confronted with a binary
file (such as Documentation/logo.gif). To avoid that, always open files
in binary mode and decode line-by-line, ignoring encoding errors.
One tricky case is when piping data into the script and reading it from
standard input. By default, standard input will be opened in text mode,
so we need to reopen it in binary mode.
The breakage only happens with python3 and results in a
UnicodeDecodeError (according to Uwe).
Link: http://lkml.kernel.org/r/20181212131210.28024-1-thierry.reding@gmail.com
Fixes: 6f4d29df66ac ("scripts/spdxcheck.py: make python3 compliant")
Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Jeremy Cline <jcline@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Joe Perches <joe@perches.com>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
There is actually a space after "sp," like this,
ffff2000080813c8: a9bb7bfd stp x29, x30, [sp, #-80]!
Right now, checkstack.pl isn't able to print anything on aarch64,
because it won't be able to match the stating objdump line of a function
due to this missing space. Hence, it displays every stack as zero-size.
After this patch, checkpatch.pl is able to match the start of a
function's objdump, and is then able to calculate each function's stack
correctly.
Link: http://lkml.kernel.org/r/20181207195843.38528-1-cai@lca.pw
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Calling UFFDIO_UNREGISTER on virtual ranges not yet registered in uffd
could trigger an harmless false positive WARN_ON. Check the vma is
already registered before checking VM_MAYWRITE to shut off the false
positive warning.
Link: http://lkml.kernel.org/r/20181206212028.18726-2-aarcange@redhat.com
Cc: <stable@vger.kernel.org>
Fixes: 29ec90660d68 ("userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas")
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: syzbot+06c7092e7d71218a2c16@syzkaller.appspotmail.com
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
migrate_page_move_mapping() expects pages with private data set to have
a page_count elevated by 1. This is what used to happen for xfs through
the buffer_heads code before the switch to iomap in commit 82cb14175e7d
("xfs: add support for sub-pagesize writeback without buffer_heads").
Not having the count elevated causes move_pages() to fail on memory
mapped files coming from xfs.
Make iomap compatible with the migrate_page_move_mapping() assumption by
elevating the page count as part of iomap_page_create() and lowering it
in iomap_page_release().
It causes the move_pages() syscall to misbehave on memory mapped files
from xfs. It does not not move any pages, which I suppose is "just" a
perf issue, but it also ends up returning a positive number which is out
of spec for the syscall. Talking to Michal Hocko, it sounds like
returning positive numbers might be a necessary update to move_pages()
anyway though
(https://lkml.kernel.org/r/20181116114955.GJ14706@dhcp22.suse.cz).
I only hit this in tests that verify that move_pages() actually moved
the pages. The test also got confused by the positive return from
move_pages() (it got treated as a success as positive numbers were not
expected and not handled) making it a bit harder to track down what's
going on.
Link: http://lkml.kernel.org/r/20181115184140.1388751-1-pjaroszynski@nvidia.com
Fixes: 82cb14175e7d ("xfs: add support for sub-pagesize writeback without buffer_heads")
Signed-off-by: Piotr Jaroszynski <pjaroszynski@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Brian Foster <bfoster@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
A stack trace was triggered by VM_BUG_ON_PAGE(page_mapcount(page), page)
in free_huge_page(). Unfortunately, the page->mapping field was set to
NULL before this test. This made it more difficult to determine the
root cause of the problem.
Move the VM_BUG_ON_PAGE tests earlier in the function so that if they do
trigger more information is present in the page struct.
Link: http://lkml.kernel.org/r/1543491843-23438-1-git-send-email-nic_w@163.com
Signed-off-by: Yongkai Wu <nic_w@163.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Found warning:
WARNING: EXPORT symbol "gsi_write_channel_scratch" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: vmlinux.o(.text+0x1e0a0): Section mismatch in reference from the function valid_phys_addr_range() to the function .init.text:memblock_is_reserved()
The function valid_phys_addr_range() references
the function __init memblock_is_reserved().
This is often because valid_phys_addr_range lacks a __init
annotation or the annotation of memblock_is_reserved is wrong.
Use __init_memblock instead of __init.
Link: http://lkml.kernel.org/r/BLUPR13MB02893411BF12EACB61888E80DFAE0@BLUPR13MB0289.namprd13.prod.outlook.com
Signed-off-by: Yueyi Li <liyueyi@live.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The kernel commandline parameter named in CONFIG_PSI_DEFAULT_DISABLED
help text contradicts the documentation in kernel-parameters.txt, and
the code. Fix that.
Link: http://lkml.kernel.org/r/20181203213416.GA12627@cmpxchg.org
Fixes: e0c274472d ("psi: make disabling/enabling easier for vendor kernels")
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|