Age | Commit message (Collapse) | Author |
|
This is in preparation for the logic behind MEMORY_DEVICE_DEVDAX also
being used by non DAX devices.
No functional change intended.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Link: https://lore.kernel.org/r/20200901083326.21264-3-roger.pau@citrix.com
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
In order to protect against the header being included multiple times
on the same compilation unit.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20200901083326.21264-2-roger.pau@citrix.com
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
syzbot reports,
WARNING: inconsistent lock state
5.9.0-rc2-syzkaller #0 Not tainted
--------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
syz-executor.0/26715 takes:
(padata_works_lock){+.?.}-{2:2}, at: padata_do_parallel kernel/padata.c:220
{IN-SOFTIRQ-W} state was registered at:
spin_lock include/linux/spinlock.h:354 [inline]
padata_do_parallel kernel/padata.c:220
...
__do_softirq kernel/softirq.c:298
...
sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1091
asm_sysvec_apic_timer_interrupt arch/x86/include/asm/idtentry.h:581
Possible unsafe locking scenario:
CPU0
----
lock(padata_works_lock);
<Interrupt>
lock(padata_works_lock);
padata_do_parallel() takes padata_works_lock with softirqs enabled, so a
deadlock is possible if, on the same CPU, the lock is acquired in
process context and then softirq handling done in an interrupt leads to
the same path.
Fix by leaving softirqs disabled while do_parallel holds
padata_works_lock.
Reported-by: syzbot+f4b9f49e38e25eb4ef52@syzkaller.appspotmail.com
Fixes: 4611ce2246889 ("padata: allocate work structures for parallel jobs from a pool")
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Commit:
cc9aec03e58f ("x86/numa_emulation: Introduce uniform split capability")
uses "-1" as the starting node ID, which causes the strange kernel log as
follows, when "numa=fake=32G" is added to the kernel command line:
Faking node -1 at [mem 0x0000000000000000-0x0000000893ffffff] (35136MB)
Faking node 0 at [mem 0x0000001840000000-0x000000203fffffff] (32768MB)
Faking node 1 at [mem 0x0000000894000000-0x000000183fffffff] (64192MB)
Faking node 2 at [mem 0x0000002040000000-0x000000283fffffff] (32768MB)
Faking node 3 at [mem 0x0000002840000000-0x000000303fffffff] (32768MB)
And finally the kernel crashes:
BUG: Bad page state in process swapper pfn:00011
page:(____ptrval____) refcount:0 mapcount:1 mapping:(____ptrval____) index:0x55cd7e44b270 pfn:0x11
failed to read mapping contents, not a valid kernel address?
flags: 0x5(locked|uptodate)
raw: 0000000000000005 000055cd7e44af30 000055cd7e44af50 0000000100000006
raw: 000055cd7e44b270 000055cd7e44b290 0000000000000000 000055cd7e44b510
page dumped because: page still charged to cgroup
page->mem_cgroup:000055cd7e44b510
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.9.0-rc2 #1
Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
Call Trace:
dump_stack+0x57/0x80
bad_page.cold+0x63/0x94
__free_pages_ok+0x33f/0x360
memblock_free_all+0x127/0x195
mem_init+0x23/0x1f5
start_kernel+0x219/0x4f5
secondary_startup_64+0xb6/0xc0
Fix this bug via using 0 as the starting node ID. This restores the
original behavior before cc9aec03e58f.
[ mingo: Massaged the changelog. ]
Fixes: cc9aec03e58f ("x86/numa_emulation: Introduce uniform split capability")
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200904061047.612950-1-ying.huang@intel.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull more perf tools fixes from Arnaldo Carvalho de Melo:
- Use uintptr_t when casting numbers to pointers
- Keep output expected by 3rd parties: Turn off summary for interval
mode by default.
- BPF is in kernel space, make sure do_validate_kcore_modules() knows
about that.
- Explicitly call out event modifiers in the documentation.
- Fix jevents() allocation of space for regular expressions.
- Address libtraceevent build warnings on 32-bit arches.
- Fix checking of functions returns using ERR_PTR() in 'perf bench'.
* tag 'perf-tools-fixes-for-v5.9-2020-09-03' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf tools: Add bpf image check to __map__is_kmodule
perf record/stat: Explicitly call out event modifiers in the documentation
perf bench: The do_run_multi_threaded() function must use IS_ERR(perf_session__new())
perf stat: Turn off summary for interval mode by default
libtraceevent: Fix build warning on 32-bit arches
perf jevents: Fix suspicious code in fixregex()
perf parse-events: Use uintptr_t when casting numbers to pointers
|
|
Pull networking fixes from David Miller:
1) Use netif_rx_ni() when necessary in batman-adv stack, from Jussi
Kivilinna.
2) Fix loss of RTT samples in rxrpc, from David Howells.
3) Memory leak in hns_nic_dev_probe(), from Dignhao Liu.
4) ravb module cannot be unloaded, fix from Yuusuke Ashizuka.
5) We disable BH for too lokng in sctp_get_port_local(), add a
cond_resched() here as well, from Xin Long.
6) Fix memory leak in st95hf_in_send_cmd, from Dinghao Liu.
7) Out of bound access in bpf_raw_tp_link_fill_link_info(), from
Yonghong Song.
8) Missing of_node_put() in mt7530 DSA driver, from Sumera
Priyadarsini.
9) Fix crash in bnxt_fw_reset_task(), from Michael Chan.
10) Fix geneve tunnel checksumming bug in hns3, from Yi Li.
11) Memory leak in rxkad_verify_response, from Dinghao Liu.
12) In tipc, don't use smp_processor_id() in preemptible context. From
Tuong Lien.
13) Fix signedness issue in mlx4 memory allocation, from Shung-Hsi Yu.
14) Missing clk_disable_prepare() in gemini driver, from Dan Carpenter.
15) Fix ABI mismatch between driver and firmware in nfp, from Louis
Peens.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (110 commits)
net/smc: fix sock refcounting in case of termination
net/smc: reset sndbuf_desc if freed
net/smc: set rx_off for SMCR explicitly
net/smc: fix toleration of fake add_link messages
tg3: Fix soft lockup when tg3_reset_task() fails.
doc: net: dsa: Fix typo in config code sample
net: dp83867: Fix WoL SecureOn password
nfp: flower: fix ABI mismatch between driver and firmware
tipc: fix shutdown() of connectionless socket
ipv6: Fix sysctl max for fib_multipath_hash_policy
drivers/net/wan/hdlc: Change the default of hard_header_len to 0
net: gemini: Fix another missing clk_disable_unprepare() in probe
net: bcmgenet: fix mask check in bcmgenet_validate_flow()
amd-xgbe: Add support for new port mode
net: usb: dm9601: Add USB ID of Keenetic Plus DSL
vhost: fix typo in error message
net: ethernet: mlx4: Fix memory allocation in mlx4_buddy_init()
pktgen: fix error message with wrong function name
net: ethernet: ti: am65-cpsw: fix rmii 100Mbit link mode
cxgb4: fix thermal zone device registration
...
|
|
Merge gate page refcount fix from Dave Hansen:
"During the conversion over to pin_user_pages(), gate pages were missed.
The fix is pretty simple, and is accompanied by a new test from Andy
which probably would have caught this earlier"
* emailed patches from Dave Hansen <dave.hansen@linux.intel.com>:
selftests/x86/test_vsyscall: Improve the process_vm_readv() test
mm: fix pin vs. gup mismatch with gate pages
|
|
The existing code accepted process_vm_readv() success or failure as long
as it didn't return garbage. This is too weak: if the vsyscall page is
readable, then process_vm_readv() should succeed and, if the page is not
readable, then it should fail.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Gate pages were missed when converting from get to pin_user_pages().
This can lead to refcount imbalances. This is reliably and quickly
reproducible running the x86 selftests when vsyscall=emulate is enabled
(the default). Fix by using try_grab_page() with appropriate flags
passed.
The long story:
Today, pin_user_pages() and get_user_pages() are similar interfaces for
manipulating page reference counts. However, "pins" use a "bias" value
and manipulate the actual reference count by 1024 instead of 1 used by
plain "gets".
That means that pin_user_pages() must be matched with unpin_user_pages()
and can't be mixed with a plain put_user_pages() or put_page().
Enter gate pages, like the vsyscall page. They are pages usually in the
kernel image, but which are mapped to userspace. Userspace is allowed
access to them, including interfaces using get/pin_user_pages(). The
refcount of these kernel pages is manipulated just like a normal user
page on the get/pin side so that the put/unpin side can work the same
for normal user pages or gate pages.
get_gate_page() uses try_get_page() which only bumps the refcount by
1, not 1024, even if called in the pin_user_pages() path. If someone
pins a gate page, this happens:
pin_user_pages()
get_gate_page()
try_get_page() // bump refcount +1
... some time later
unpin_user_pages()
page_ref_sub_and_test(page, 1024))
... and boom, we get a refcount off by 1023. This is reliably and
quickly reproducible running the x86 selftests when booted with
vsyscall=emulate (the default). The selftests use ptrace(), but I
suspect anything using pin_user_pages() on gate pages could hit this.
To fix it, simply use try_grab_page() instead of try_get_page(), and
pass 'gup_flags' in so that FOLL_PIN can be respected.
This bug traces back to the very beginning of the FOLL_PIN support in
commit 3faa52c03f44 ("mm/gup: track FOLL_PIN pages"), which showed up in
the 5.7 release.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Fixes: 3faa52c03f44 ("mm/gup: track FOLL_PIN pages")
Reported-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: x86@kernel.org
Cc: Jann Horn <jannh@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
A couple of minor fixes to the display changes that went in for 5.9.
The most important of which is a workaround for a HW bug that was
exposed by better push buffer space management, leading to
random(ish...) display engine hangs.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Ben Skeggs <skeggsb@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ <CACAvsv5QDxyMihrxbPk+-sORnaYtjR6_dbM68gEhb2wxht_G1w@mail.gmail.com
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.9-rc4:
- Clang build warning fix
- HDCP fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87sgbz2pnx.fsf@intel.com
|
|
git://people.freedesktop.org/~agd5f/linux into drm-fixes
amd-drm-fixes-5.9-2020-09-03:
amdgpu:
- Fix for 32bit systems
- SW CTF fix
- Update for Sienna Cichlid
- CIK bug fixes
radeon:
- PLL fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200903050022.3960-1-alexander.deucher@amd.com
|
|
Yonghong Song says:
====================
Currently, the bpf hashmap iterator takes a bucket_lock, a spin_lock,
before visiting each element in the bucket. This will cause a deadlock
if a map update/delete operates on an element with the same
bucket id of the visited map.
To avoid the deadlock, let us just use rcu_read_lock instead of
bucket_lock. This may result in visiting stale elements, missing some elements,
or repeating some elements, if concurrent map delete/update happens for the
same map. I think using rcu_read_lock is a reasonable compromise.
For users caring stale/missing/repeating element issues, bpf map batch
access syscall interface can be used.
Note that another approach is during bpf_iter link stage, we check
whether the iter program might be able to do update/delete to the visited
map. If it is, reject the link_create. Verifier needs to record whether
an update/delete operation happens for each map for this approach.
I just feel this checking is too specialized, hence still prefer
rcu_read_lock approach.
Patch #1 has the kernel implementation and Patch #2 added a selftest
which can trigger deadlock without Patch #1.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Added bpf_{updata,delete}_map_elem to the very map element the
iter program is visiting. Due to rcu protection, the visited map
elements, although stale, should still contain correct values.
$ ./test_progs -n 4/18
#4/18 bpf_hash_map:OK
#4 bpf_iter:OK
Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200902235341.2001534-1-yhs@fb.com
|
|
Currently, for hashmap, the bpf iterator will grab a bucket lock, a
spinlock, before traversing the elements in the bucket. This can ensure
all bpf visted elements are valid. But this mechanism may cause
deadlock if update/deletion happens to the same bucket of the
visited map in the program. For example, if we added bpf_map_update_elem()
call to the same visited element in selftests bpf_iter_bpf_hash_map.c,
we will have the following deadlock:
============================================
WARNING: possible recursive locking detected
5.9.0-rc1+ #841 Not tainted
--------------------------------------------
test_progs/1750 is trying to acquire lock:
ffff9a5bb73c5e70 (&htab->buckets[i].raw_lock){....}-{2:2}, at: htab_map_update_elem+0x1cf/0x410
but task is already holding lock:
ffff9a5bb73c5e20 (&htab->buckets[i].raw_lock){....}-{2:2}, at: bpf_hash_map_seq_find_next+0x94/0x120
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&htab->buckets[i].raw_lock);
lock(&htab->buckets[i].raw_lock);
*** DEADLOCK ***
...
Call Trace:
dump_stack+0x78/0xa0
__lock_acquire.cold.74+0x209/0x2e3
lock_acquire+0xba/0x380
? htab_map_update_elem+0x1cf/0x410
? __lock_acquire+0x639/0x20c0
_raw_spin_lock_irqsave+0x3b/0x80
? htab_map_update_elem+0x1cf/0x410
htab_map_update_elem+0x1cf/0x410
? lock_acquire+0xba/0x380
bpf_prog_ad6dab10433b135d_dump_bpf_hash_map+0x88/0xa9c
? find_held_lock+0x34/0xa0
bpf_iter_run_prog+0x81/0x16e
__bpf_hash_map_seq_show+0x145/0x180
bpf_seq_read+0xff/0x3d0
vfs_read+0xad/0x1c0
ksys_read+0x5f/0xe0
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9
...
The bucket_lock first grabbed in seq_ops->next() called by bpf_seq_read(),
and then grabbed again in htab_map_update_elem() in the bpf program, causing
deadlocks.
Actually, we do not need bucket_lock here, we can just use rcu_read_lock()
similar to netlink iterator where the rcu_read_{lock,unlock} likes below:
seq_ops->start():
rcu_read_lock();
seq_ops->next():
rcu_read_unlock();
/* next element */
rcu_read_lock();
seq_ops->stop();
rcu_read_unlock();
Compared to old bucket_lock mechanism, if concurrent updata/delete happens,
we may visit stale elements, miss some elements, or repeat some elements.
I think this is a reasonable compromise. For users wanting to avoid
stale, missing/repeated accesses, bpf_map batch access syscall interface
can be used.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200902235340.2001375-1-yhs@fb.com
|
|
The returned value of bpf_object__open_file() should be checked with
libbpf_get_error() rather than NULL. This fix prevents test_progs from
crash when test_global_data.o is not present.
Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200903200528.747884-1-haoluo@google.com
|
|
Andrii Nakryiko says:
====================
Currently, libbpf supports a limited form of BPF-to-BPF subprogram calls. The
restriction is that entry-point BPF program should use *all* of defined
sub-programs in BPF .o file. If any of the subprograms is not used, such
entry-point BPF program will be rejected by verifier as containing unreachable
dead code. This is not a big limitation for cases with single entry-point BPF
programs, but is quite a heavy restriction for multi-programs that use only
partially overlapping set of subprograms.
This patch set removes all such restrictions and adds complete support for
using BPF sub-program calls on BPF side. This is achieved through libbpf
tracking subprograms individually and detecting which subprograms are used by
any given entry-point BPF program, and subsequently only appending and
relocating code for just those used subprograms.
In addition, libbpf now also supports multiple entry-point BPF programs within
the same ELF section. This allows to structure code so that there are few
variants of BPF programs of the same type and attaching to the same target
(e.g., for tracepoints and kprobes) without the need to worry about ELF
section name clashes.
This patch set opens way for more wider adoption of BPF subprogram calls,
especially for real-world production use-cases with complicated net of
subprograms. This will allow to further scale BPF verification process through
good use of global functions, which can be verified independently. This is
also important prerequisite for static linking which allows static BPF
libraries to not worry about naming clashes for section names, as well as use
static non-inlined functions (subprograms) without worries of verifier
rejecting program due to dead code.
Patch set is structured as follows:
- patched 1-6 contain all the libbpf changes necessary to support multi-prog
sections and bpf2bpf subcalls;
- patch 7 adds dedicated selftests validating all combinations of possible
sub-calls (within and across sections, static vs global functions);
- patch 8 deprecated bpf_program__title() in favor of
bpf_program__section_name(). The intent was to also deprecate
bpf_object__find_program_by_title() as it's now non-sensical with multiple
programs per section. But there were too many selftests uses of this and
I didn't want to delay this patches further and make it even bigger, so left
it for a follow up cleanup;
- patches 9-10 remove uses for title-related APIs from bpftool and
bpf_program__title() use from selftests;
- patch 11 is converting fexit_bpf2bpf to have explicit subtest (it does
contain 4 subtests, which are not handled as sub-tests);
- patches 12-14 convert few complicated BPF selftests to use __noinline
functions to further validate correctness of libbpf's bpf2bpf processing
logic.
v2->v3:
- explained subprog relocation algorithm in more details (Alexei);
- pyperf, strobelight and cls_redirect got new subprog variants, leaving
other modes intact (Alexei);
v1->v2:
- rename DEPRECATED to LIBBPF_DEPRECATED to avoid name clashes;
- fix test_subprogs build;
- convert a bunch of complicated selftests to __noinline (Alexei).
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
As one of the most complicated and close-to-real-world programs, cls_redirect
is a good candidate to exercise libbpf's logic of handling bpf2bpf calls. So
add variant with using explicit __noinline for majority of functions except
few most basic ones. If those few functions are inlined, verifier starts to
complain about program instruction limit of 1mln instructions being exceeded,
most probably due to instruction overhead of doing a sub-program call.
Convert user-space part of selftest to have to sub-tests: with and without
inlining.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: Lorenz Bauer <lmb@cloudflare.com>
Link: https://lore.kernel.org/bpf/20200903203542.15944-15-andriin@fb.com
|
|
Update xdp_noinline to use BPF skeleton and force __noinline on helper
sub-programs. Also, split existing logic into v4- and v6-only to complicate
sub-program calling patterns (partially overlapped sets of functions for
entry-level BPF programs).
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200903203542.15944-14-andriin@fb.com
|
|
Add use of non-inlined subprogs to few bigger selftests to excercise libbpf's
bpf2bpf handling logic. Also split l4lb_all selftest into two sub-tests.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200903203542.15944-13-andriin@fb.com
|
|
There are clearly 4 subtests, so make it official.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200903203542.15944-12-andriin@fb.com
|
|
BPF program title is ambigious and misleading term. It is ELF section name, so
let's just call it that and deprecate bpf_program__title() API in favor of
bpf_program__section_name().
Additionally, using bpf_object__find_program_by_title() is now inherently
dangerous and ambiguous, as multiple BPF program can have the same section
name. So deprecate this API as well and recommend to switch to non-ambiguous
bpf_object__find_program_by_name().
Internally, clean up usage and mis-usage of BPF program section name for
denoting BPF program name. Shorten the field name to prog->sec_name to be
consistent with all other prog->sec_* variables.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200903203542.15944-11-andriin@fb.com
|
|
Remove all uses of bpf_program__title() and
bpf_program__find_program_by_title().
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200903203542.15944-10-andriin@fb.com
|
|
bpf_program__title() is deprecated, switch to bpf_program__section_name() and
avoid compilation warnings.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200903203542.15944-9-andriin@fb.com
|
|
Add a selftest excercising bpf-to-bpf subprogram calls, as well as multiple
entry-point BPF programs per section. Also make sure that BPF CO-RE works for
such set ups both for sub-programs and for multi-entry sections.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200903203542.15944-8-andriin@fb.com
|
|
Adjust struct_ops handling code to work with multi-program ELF sections
properly.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200903203542.15944-7-andriin@fb.com
|
|
Complete multi-prog sections and multi sub-prog support in libbpf by properly
adjusting .BTF.ext's line and function information. Mark exposed
btf_ext__reloc_func_info() and btf_ext__reloc_func_info() APIs as deprecated.
These APIs have simplistic assumption that all sub-programs are going to be
appended to all main BPF programs, which doesn't hold in real life. It's
unlikely there are any users of this API, as it's very libbpf
internals-specific.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200903203542.15944-6-andriin@fb.com
|
|
This patch implements general and correct logic for bpf-to-bpf sub-program
calls. Only sub-programs used (called into) from entry-point (main) BPF
program are going to be appended at the end of main BPF program. This ensures
that BPF verifier won't encounter any dead code due to copying unreferenced
sub-program. This change means that each entry-point (main) BPF program might
have a different set of sub-programs appended to it and potentially in
different order. This has implications on how sub-program call relocations
need to be handled, described below.
All relocations are now split into two categores: data references (maps and
global variables) and code references (sub-program calls). This distinction is
important because data references need to be relocated just once per each BPF
program and sub-program. These relocation are agnostic to instruction
locations, because they are not code-relative and they are relocating against
static targets (maps, variables with fixes offsets, etc).
Sub-program RELO_CALL relocations, on the other hand, are highly-dependent on
code position, because they are recorded as instruction-relative offset. So
BPF sub-programs (those that do calls into other sub-programs) can't be
relocated once, they need to be relocated each time such a sub-program is
appended at the end of the main entry-point BPF program. As mentioned above,
each main BPF program might have different subset and differen order of
sub-programs, so call relocations can't be done just once. Splitting data
reference and calls relocations as described above allows to do this
efficiently and cleanly.
bpf_object__find_program_by_name() will now ignore non-entry BPF programs.
Previously one could have looked up '.text' fake BPF program, but the
existence of such BPF program was always an implementation detail and you
can't do much useful with it. Now, though, all non-entry sub-programs get
their own BPF program with name corresponding to a function name, so there is
no more '.text' name for BPF program. This means there is no regression,
effectively, w.r.t. API behavior. But this is important aspect to highlight,
because it's going to be critical once libbpf implements static linking of BPF
programs. Non-entry static BPF programs will be allowed to have conflicting
names, but global and main-entry BPF program names should be unique. Just like
with normal user-space linking process. So it's important to restrict this
aspect right now, keep static and non-entry functions as internal
implementation details, and not have to deal with regressions in behavior
later.
This patch leaves .BTF.ext adjustment as is until next patch.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200903203542.15944-5-andriin@fb.com
|
|
Fix up CO-RE relocation code to handle relocations against ELF sections
containing multiple BPF programs. This requires lookup of a BPF program by its
section name and instruction index it contains. While it could have been done
as a simple loop, it could run into performance issues pretty quickly, as
number of CO-RE relocations can be quite large in real-world applications, and
each CO-RE relocation incurs BPF program look up now. So instead of simple
loop, implement a binary search by section name + insn offset.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200903203542.15944-4-andriin@fb.com
|
|
Teach libbpf how to parse code sections into potentially multiple bpf_program
instances, based on ELF FUNC symbols. Each BPF program will keep track of its
position within containing ELF section for translating section instruction
offsets into program instruction offsets: regardless of BPF program's location
in ELF section, it's first instruction is always at local instruction offset
0, so when libbpf is working with relocations (which use section-based
instruction offsets) this is critical to make proper translations.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200903203542.15944-3-andriin@fb.com
|
|
libbpf ELF parsing logic might need symbols available before ELF parsing is
completed, so we need to make sure that symbols table section is found in
a separate pass before all the subsequent sections are processed.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200903203542.15944-2-andriin@fb.com
|
|
The wrappers in include/linux/pci-dma-compat.h should go away.
The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.
When memory is allocated in 'smsc9420_probe()', GFP_KERNEL can be used
because it is a probe function and no lock is acquired.
While at it, rewrite the size passed to 'dma_alloc_coherent()' the same way
as the one passed to 'dma_free_coherent()'. This form is less verbose:
sizeof(struct smsc9420_dma_desc) * RX_RING_SIZE +
sizeof(struct smsc9420_dma_desc) * TX_RING_SIZE,
vs
sizeof(struct smsc9420_dma_desc) * (RX_RING_SIZE + TX_RING_SIZE)
@@
@@
- PCI_DMA_BIDIRECTIONAL
+ DMA_BIDIRECTIONAL
@@
@@
- PCI_DMA_TODEVICE
+ DMA_TO_DEVICE
@@
@@
- PCI_DMA_FROMDEVICE
+ DMA_FROM_DEVICE
@@
@@
- PCI_DMA_NONE
+ DMA_NONE
@@
expression e1, e2, e3;
@@
- pci_alloc_consistent(e1, e2, e3)
+ dma_alloc_coherent(&e1->dev, e2, e3, GFP_)
@@
expression e1, e2, e3;
@@
- pci_zalloc_consistent(e1, e2, e3)
+ dma_alloc_coherent(&e1->dev, e2, e3, GFP_)
@@
expression e1, e2, e3, e4;
@@
- pci_free_consistent(e1, e2, e3, e4)
+ dma_free_coherent(&e1->dev, e2, e3, e4)
@@
expression e1, e2, e3, e4;
@@
- pci_map_single(e1, e2, e3, e4)
+ dma_map_single(&e1->dev, e2, e3, e4)
@@
expression e1, e2, e3, e4;
@@
- pci_unmap_single(e1, e2, e3, e4)
+ dma_unmap_single(&e1->dev, e2, e3, e4)
@@
expression e1, e2, e3, e4, e5;
@@
- pci_map_page(e1, e2, e3, e4, e5)
+ dma_map_page(&e1->dev, e2, e3, e4, e5)
@@
expression e1, e2, e3, e4;
@@
- pci_unmap_page(e1, e2, e3, e4)
+ dma_unmap_page(&e1->dev, e2, e3, e4)
@@
expression e1, e2, e3, e4;
@@
- pci_map_sg(e1, e2, e3, e4)
+ dma_map_sg(&e1->dev, e2, e3, e4)
@@
expression e1, e2, e3, e4;
@@
- pci_unmap_sg(e1, e2, e3, e4)
+ dma_unmap_sg(&e1->dev, e2, e3, e4)
@@
expression e1, e2, e3, e4;
@@
- pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+ dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)
@@
expression e1, e2, e3, e4;
@@
- pci_dma_sync_single_for_device(e1, e2, e3, e4)
+ dma_sync_single_for_device(&e1->dev, e2, e3, e4)
@@
expression e1, e2, e3, e4;
@@
- pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+ dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)
@@
expression e1, e2, e3, e4;
@@
- pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+ dma_sync_sg_for_device(&e1->dev, e2, e3, e4)
@@
expression e1, e2;
@@
- pci_dma_mapping_error(e1, e2)
+ dma_mapping_error(&e1->dev, e2)
@@
expression e1, e2;
@@
- pci_set_dma_mask(e1, e2)
+ dma_set_mask(&e1->dev, e2)
@@
expression e1, e2;
@@
- pci_set_consistent_dma_mask(e1, e2)
+ dma_set_coherent_mask(&e1->dev, e2)
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The wrappers in include/linux/pci-dma-compat.h should go away.
The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.
When memory is allocated in 'epic_init_one()', GFP_KERNEL can be used
because it is a probe function and no lock is acquired.
@@
@@
- PCI_DMA_BIDIRECTIONAL
+ DMA_BIDIRECTIONAL
@@
@@
- PCI_DMA_TODEVICE
+ DMA_TO_DEVICE
@@
@@
- PCI_DMA_FROMDEVICE
+ DMA_FROM_DEVICE
@@
@@
- PCI_DMA_NONE
+ DMA_NONE
@@
expression e1, e2, e3;
@@
- pci_alloc_consistent(e1, e2, e3)
+ dma_alloc_coherent(&e1->dev, e2, e3, GFP_)
@@
expression e1, e2, e3;
@@
- pci_zalloc_consistent(e1, e2, e3)
+ dma_alloc_coherent(&e1->dev, e2, e3, GFP_)
@@
expression e1, e2, e3, e4;
@@
- pci_free_consistent(e1, e2, e3, e4)
+ dma_free_coherent(&e1->dev, e2, e3, e4)
@@
expression e1, e2, e3, e4;
@@
- pci_map_single(e1, e2, e3, e4)
+ dma_map_single(&e1->dev, e2, e3, e4)
@@
expression e1, e2, e3, e4;
@@
- pci_unmap_single(e1, e2, e3, e4)
+ dma_unmap_single(&e1->dev, e2, e3, e4)
@@
expression e1, e2, e3, e4, e5;
@@
- pci_map_page(e1, e2, e3, e4, e5)
+ dma_map_page(&e1->dev, e2, e3, e4, e5)
@@
expression e1, e2, e3, e4;
@@
- pci_unmap_page(e1, e2, e3, e4)
+ dma_unmap_page(&e1->dev, e2, e3, e4)
@@
expression e1, e2, e3, e4;
@@
- pci_map_sg(e1, e2, e3, e4)
+ dma_map_sg(&e1->dev, e2, e3, e4)
@@
expression e1, e2, e3, e4;
@@
- pci_unmap_sg(e1, e2, e3, e4)
+ dma_unmap_sg(&e1->dev, e2, e3, e4)
@@
expression e1, e2, e3, e4;
@@
- pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+ dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)
@@
expression e1, e2, e3, e4;
@@
- pci_dma_sync_single_for_device(e1, e2, e3, e4)
+ dma_sync_single_for_device(&e1->dev, e2, e3, e4)
@@
expression e1, e2, e3, e4;
@@
- pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+ dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)
@@
expression e1, e2, e3, e4;
@@
- pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+ dma_sync_sg_for_device(&e1->dev, e2, e3, e4)
@@
expression e1, e2;
@@
- pci_dma_mapping_error(e1, e2)
+ dma_mapping_error(&e1->dev, e2)
@@
expression e1, e2;
@@
- pci_set_dma_mask(e1, e2)
+ dma_set_mask(&e1->dev, e2)
@@
expression e1, e2;
@@
- pci_set_consistent_dma_mask(e1, e2)
+ dma_set_coherent_mask(&e1->dev, e2)
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Karsten Graul says:
====================
net/smc: fixes 2020-09-03
Please apply the following patch series for smc to netdev's net tree.
Patch 1 fixes the toleration of older SMC implementations. Patch 2
takes care of a problem that happens when SMCR is used after SMCD
initialization failed. Patch 3 fixes a problem with freed send buffers,
and patch 4 corrects refcounting when SMC terminates due to device
removal.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When an ISM device is removed, all its linkgroups are terminated,
i.e. all the corresponding connections are killed.
Connection killing invokes smc_close_active_abort(), which decreases
the sock refcount for certain states to simulate passive closing.
And it cancels the close worker and has to give up the sock lock for
this timeframe. This opens the door for a passive close worker or a
socket close to run in between. In this case smc_close_active_abort() and
passive close worker resp. smc_release() might do a sock_put for passive
closing. This causes:
[ 1323.315943] refcount_t: underflow; use-after-free.
[ 1323.316055] WARNING: CPU: 3 PID: 54469 at lib/refcount.c:28 refcount_warn_saturate+0xe8/0x130
[ 1323.316069] Kernel panic - not syncing: panic_on_warn set ...
[ 1323.316084] CPU: 3 PID: 54469 Comm: uperf Not tainted 5.9.0-20200826.rc2.git0.46328853ed20.300.fc32.s390x+debug #1
[ 1323.316096] Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0)
[ 1323.316108] Call Trace:
[ 1323.316125] [<00000000c0d4aae8>] show_stack+0x90/0xf8
[ 1323.316143] [<00000000c15989b0>] dump_stack+0xa8/0xe8
[ 1323.316158] [<00000000c0d8344e>] panic+0x11e/0x288
[ 1323.316173] [<00000000c0d83144>] __warn+0xac/0x158
[ 1323.316187] [<00000000c1597a7a>] report_bug+0xb2/0x130
[ 1323.316201] [<00000000c0d36424>] monitor_event_exception+0x44/0xc0
[ 1323.316219] [<00000000c195c716>] pgm_check_handler+0x1da/0x238
[ 1323.316234] [<00000000c151844c>] refcount_warn_saturate+0xec/0x130
[ 1323.316280] ([<00000000c1518448>] refcount_warn_saturate+0xe8/0x130)
[ 1323.316310] [<000003ff801f2e2a>] smc_release+0x192/0x1c8 [smc]
[ 1323.316323] [<00000000c169f1fa>] __sock_release+0x5a/0xe0
[ 1323.316334] [<00000000c169f2ac>] sock_close+0x2c/0x40
[ 1323.316350] [<00000000c1086de0>] __fput+0xb8/0x278
[ 1323.316362] [<00000000c0db1e0e>] task_work_run+0x76/0xb8
[ 1323.316393] [<00000000c0d8ab84>] do_exit+0x26c/0x520
[ 1323.316408] [<00000000c0d8af08>] do_group_exit+0x48/0xc0
[ 1323.316421] [<00000000c0d8afa8>] __s390x_sys_exit_group+0x28/0x38
[ 1323.316433] [<00000000c195c32c>] system_call+0xe0/0x2b4
[ 1323.316446] 1 lock held by uperf/54469:
[ 1323.316456] #0: 0000000044125e60 (&sb->s_type->i_mutex_key#9){+.+.}-{3:3}, at: __sock_release+0x44/0xe0
The patch rechecks sock state in smc_close_active_abort() after
smc_close_cancel_work() to avoid duplicate decrease of sock
refcount for the same purpose.
Fixes: 611b63a12732 ("net/smc: cancel tx worker in case of socket aborts")
Reviewed-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When an SMC connection is created, and there is a problem to
create an RMB or DMB, the previously created send buffer is
thrown away as well including buffer descriptor freeing.
Make sure the connection no longer references the freed
buffer descriptor, otherwise bugs like this are possible:
[71556.835148] =============================================================================
[71556.835168] BUG kmalloc-128 (Tainted: G B OE ): Poison overwritten
[71556.835172] -----------------------------------------------------------------------------
[71556.835179] INFO: 0x00000000d20894be-0x00000000aaef63e9 @offset=2724. First byte 0x0 instead of 0x6b
[71556.835215] INFO: Allocated in __smc_buf_create+0x184/0x578 [smc] age=0 cpu=5 pid=46726
[71556.835234] ___slab_alloc+0x5a4/0x690
[71556.835239] __slab_alloc.constprop.0+0x70/0xb0
[71556.835243] kmem_cache_alloc_trace+0x38e/0x3f8
[71556.835250] __smc_buf_create+0x184/0x578 [smc]
[71556.835257] smc_buf_create+0x2e/0xe8 [smc]
[71556.835264] smc_listen_work+0x516/0x6a0 [smc]
[71556.835275] process_one_work+0x280/0x478
[71556.835280] worker_thread+0x66/0x368
[71556.835287] kthread+0x17a/0x1a0
[71556.835294] ret_from_fork+0x28/0x2c
[71556.835301] INFO: Freed in smc_buf_create+0xd8/0xe8 [smc] age=0 cpu=5 pid=46726
[71556.835307] __slab_free+0x246/0x560
[71556.835311] kfree+0x398/0x3f8
[71556.835318] smc_buf_create+0xd8/0xe8 [smc]
[71556.835324] smc_listen_work+0x516/0x6a0 [smc]
[71556.835328] process_one_work+0x280/0x478
[71556.835332] worker_thread+0x66/0x368
[71556.835337] kthread+0x17a/0x1a0
[71556.835344] ret_from_fork+0x28/0x2c
[71556.835348] INFO: Slab 0x00000000a0744551 objects=51 used=51 fp=0x0000000000000000 flags=0x1ffff00000010200
[71556.835352] INFO: Object 0x00000000563480a1 @offset=2688 fp=0x00000000289567b2
[71556.835359] Redzone 000000006783cde2: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[71556.835363] Redzone 00000000e35b876e: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[71556.835367] Redzone 0000000023074562: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[71556.835372] Redzone 00000000b9564b8c: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[71556.835376] Redzone 00000000810c6362: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[71556.835380] Redzone 0000000065ef52c3: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[71556.835384] Redzone 00000000c5dd6984: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[71556.835388] Redzone 000000004c480f8f: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[71556.835392] Object 00000000563480a1: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[71556.835397] Object 000000009c479d06: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[71556.835401] Object 000000006e1dce92: 6b 6b 6b 6b 00 00 00 00 6b 6b 6b 6b 6b 6b 6b 6b kkkk....kkkkkkkk
[71556.835405] Object 00000000227f7cf8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[71556.835410] Object 000000009a701215: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[71556.835414] Object 000000003731ce76: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[71556.835418] Object 00000000f7085967: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[71556.835422] Object 0000000007f99927: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk.
[71556.835427] Redzone 00000000579c4913: bb bb bb bb bb bb bb bb ........
[71556.835431] Padding 00000000305aef82: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
[71556.835435] Padding 00000000b1cdd722: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
[71556.835438] Padding 00000000c7568199: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
[71556.835442] Padding 00000000fad4c4d4: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
[71556.835451] CPU: 0 PID: 47939 Comm: kworker/0:15 Tainted: G B OE 5.9.0-rc1uschi+ #54
[71556.835456] Hardware name: IBM 3906 M03 703 (LPAR)
[71556.835464] Workqueue: events smc_listen_work [smc]
[71556.835470] Call Trace:
[71556.835478] [<00000000d5eaeb10>] show_stack+0x90/0xf8
[71556.835493] [<00000000d66fc0f8>] dump_stack+0xa8/0xe8
[71556.835499] [<00000000d61a511c>] check_bytes_and_report+0x104/0x130
[71556.835504] [<00000000d61a57b2>] check_object+0x26a/0x2e0
[71556.835509] [<00000000d61a59bc>] alloc_debug_processing+0x194/0x238
[71556.835514] [<00000000d61a8c14>] ___slab_alloc+0x5a4/0x690
[71556.835519] [<00000000d61a9170>] __slab_alloc.constprop.0+0x70/0xb0
[71556.835524] [<00000000d61aaf66>] kmem_cache_alloc_trace+0x38e/0x3f8
[71556.835530] [<000003ff80549bbc>] __smc_buf_create+0x184/0x578 [smc]
[71556.835538] [<000003ff8054a396>] smc_buf_create+0x2e/0xe8 [smc]
[71556.835545] [<000003ff80540c16>] smc_listen_work+0x516/0x6a0 [smc]
[71556.835549] [<00000000d5f0f448>] process_one_work+0x280/0x478
[71556.835554] [<00000000d5f0f6a6>] worker_thread+0x66/0x368
[71556.835559] [<00000000d5f18692>] kthread+0x17a/0x1a0
[71556.835563] [<00000000d6abf3b8>] ret_from_fork+0x28/0x2c
[71556.835569] INFO: lockdep is turned off.
[71556.835573] FIX kmalloc-128: Restoring 0x00000000d20894be-0x00000000aaef63e9=0x6b
[71556.835577] FIX kmalloc-128: Marking all objects used
Fixes: fd7f3a746582 ("net/smc: remove freed buffer from list")
Reviewed-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
SMC tries to make use of SMCD first. If a problem shows up,
it tries to switch to SMCR. If the SMCD initializing problem shows
up after the SMCD connection has already been initialized, field
rx_off keeps the wrong SMCD value for SMCR, which results in corrupted
data at the receiver.
This patch adds an explicit (re-)setting of field rx_off to zero if the
connection uses SMCR.
Fixes: be244f28d22f ("net/smc: add SMC-D support in data transfer")
Reviewed-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Older SMCR implementations had no link failover support and used one
link only. Because the handshake protocol requires to try the
establishment of a second link the old code sent a fake add_link message
and declined any server response afterwards.
The current code supports multiple links and inspects the received fake
add_link message more closely. To tolerate the fake add_link messages
smc_llc_is_local_add_link() needs an improved check of the message to
be able to separate between locally enqueued and fake add_link messages.
And smc_llc_cli_add_link() needs to check if the provided qp_mtu size is
invalid and reject the add_link request in that case.
Fixes: c48254fa48e5 ("net/smc: move add link processing for new device into llc layer")
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix spacing issues reported for misaligned switch..case and extra new
lines.
Also updated the file header to comply with networking commet style.
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Dan Murphy <dmurphy@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Expose all exisiting inet sockopt bits through inet_diag for debug purpose.
Corresponding changes in iproute2 ss will be submitted to output all
these values.
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Florian Fainelli says:
====================
net: dsa: bcm_sf2: Clock support
This patch series adds support for controlling the SF2 switch core and
divider clock (where applicable).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Whenever a port gets enabled/disabled, recalcultate the required switch
clock rate to make sure it always gets set to the expected rate
targeting our switch use case. This is only done for the BCM7445 switch
as there is no clocking profile available for BCM7278.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fetch the corresponding clock resource and enable/disable it during
suspend/resume if and only if we have no ports defined for Wake-on-LAN.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Describe the two possible clocks feeding into the Broadcom SF2
integrated Ethernet switch. BCM7445 systems have two clocks, one for the
main switch core clock, and another for controlling the switch clock
divider whereas BCM7278 systems only have the first kind.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Florian Fainelli says:
====================
net: systemport: Clock support
This patch series makes the SYSTEMPORT driver request and manage its
main and Wake-on-LAN clocks appropriately.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It is necessary to manage the Wake-on-LAN clock to turn on the
appropriate blocks for MPD or CFP-based packet matching to work
otherwise we will not be able to reliably match packets during suspend.
Reported-by: Blair Prescott <blair.prescott@broadcom.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We disable clocks shortly after probing the device to save as much power as
possible in case the interface is never used. When bcm_sysport_open() is
invoked, clocks are enabled, and disabled in bcm_sysport_stop().
A similar scheme is applied to the suspend/resume functions.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The Broadcom SYSTEMPORT adapters require the use of two clocks for
normal operations and during Wake-on-LAN, document those in the binding
document.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If ops->set_phys_id() returned an error, previously we would only break
out of the inner loop, which neither stopped the outer loop nor returned
the error to the user (since 'rc' would be overwritten on the next pass
through the loop).
Thus, rewrite it to use a single loop, so that the break does the right
thing. Use u64 for 'count' and 'i' to prevent overflow in case of
(unreasonably) large values of id.data and n.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Before changing this it's a bit confusing to read test output:
raw csum_off with bad offset (fails)
./psock_snd: write: Invalid argument
Change "fails" in the test case description to "expected to fail", so
that the test output can be more understandable.
Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|