Age | Commit message (Collapse) | Author |
|
Ioana Radulescu says:
====================
dpaa2-eth: Add support for Rx flow classification
The Management Complex (MC) firmware initially allowed the
configuration of a single key to be used both for Rx flow hashing
and flow classification. This prevented us from supporting
Rx flow classification independently of the hash key configuration.
Newer firmware versions expose separate commands for
configuring the two types of keys, so we can use them to
introduce Rx classification support. For frames that don't match
any classification rule, we fall back to statistical distribution
based on the current hash key.
The first patch in this set updates the Rx hashing code to use
the new firmware API for key config. Subsequent patches introduce
the firmware API for configuring the classification and actual
support for adding and deleting rules via ethtool.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for inserting and deleting Rx flow classification
rules through ethtool.
We support classification based on some header fields for
flow-types ether, ip4, tcp4, udp4 and sctp4.
Rx queues are core affine, so the action argument effectively
selects on which cpu the matching frame will be processed.
Discarding the frame is also supported.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For firmware versions that support it, configure an Rx flow
classification key at probe time.
Hardware expects all rules in the classification table to share
the same key. So we setup a key containing all supported fields
at driver init and when a user adds classification rules through
ethtool, we will just mask out the unused header fields.
Since the key composition process is the same for flow
classification and hashing, reuse existing code where possible.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since the array of supported header fields will be used for
Rx flow classification as well, rename it from "hash_fields" to
the more inclusive "dist_fields".
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The Management Complex (MC) firmware initially allowed the
configuration of a single key to be used both for Rx flow hashing
and flow classification. This prevented us from supporting
Rx flow classification through ethtool.
Starting with version 10.7.0, the Management Complex(MC) offers
a new set of APIs for separate configuration of Rx hashing and
classification keys.
Update the Rx flow hashing support to use the new API, if available.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The driver_info field that is used for describing each of the usb-net
drivers using the usbnet.c core all declare their information as const
and the usbnet.c itself does not try and modify the struct.
It is therefore a good idea to make this const in the usbnet.c structure
in case anyone tries to modify it.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2018-10-01
This pull request includes some fixes to mlx5 driver,
Please pull and let me know if there's any problem.
For -stable v4.11:
"6e0a4a23c59a ('net/mlx5: E-Switch, Fix out of bound access when setting vport rate')"
For -stable v4.18:
"98d6627c372a ('net/mlx5e: Set vlan masks for all offloaded TC rules')"
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
syzkaller was able to hit the WARN_ON(sock_owned_by_user(sk));
in tcp_close()
While a socket is being closed, it is very possible other
threads find it in rtnetlink dump.
tcp_get_info() will acquire the socket lock for a short amount
of time (slow = lock_sock_fast(sk)/unlock_sock_fast(sk, slow);),
enough to trigger the warning.
Fixes: 67db3e4bfbc9 ("tcp: no longer hold ehash lock while calling tcp_get_info()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Subash Abhinov Kasiviswanathan says:
====================
net: qualcomm: rmnet: Updates 2018-10-02
This series is a set of small fixes for rmnet driver
Patch 1 is a fix for a scenario reported by syzkaller
Patch 2 & 3 are fixes for incorrect allocation flags
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The incoming skb needs to be reallocated in case the headroom
is not sufficient to adjust the ethernet header. This allocation
needs to be atomic otherwise it results in this splat
[<600601bb>] ___might_sleep+0x185/0x1a3
[<603f6314>] ? _raw_spin_unlock_irqrestore+0x0/0x27
[<60069bb0>] ? __wake_up_common_lock+0x95/0xd1
[<600602b0>] __might_sleep+0xd7/0xe2
[<60065598>] ? enqueue_task_fair+0x112/0x209
[<600eea13>] __kmalloc_track_caller+0x5d/0x124
[<600ee9b6>] ? __kmalloc_track_caller+0x0/0x124
[<602696d5>] __kmalloc_reserve.isra.34+0x30/0x7e
[<603f629b>] ? _raw_spin_lock_irqsave+0x0/0x3d
[<6026b744>] pskb_expand_head+0xbf/0x310
[<6025ca6a>] rmnet_rx_handler+0x7e/0x16b
[<6025c9ec>] ? rmnet_rx_handler+0x0/0x16b
[<6027ad0c>] __netif_receive_skb_core+0x301/0x96f
[<60033c17>] ? set_signals+0x0/0x40
[<6027bbcb>] __netif_receive_skb+0x24/0x8e
Fixes: 74692caf1b0b ("net: qualcomm: rmnet: Process packets over ethernet")
Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The incoming skb needs to be reallocated in case the headroom
is not sufficient to add the MAP header. This allocation needs to
be atomic otherwise it results in the following splat
[32805.801456] BUG: sleeping function called from invalid context
[32805.841141] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[32805.904773] task: ffffffd7c5f62280 task.stack: ffffff80464a8000
[32805.910851] pc : ___might_sleep+0x180/0x188
[32805.915143] lr : ___might_sleep+0x180/0x188
[32806.131520] Call trace:
[32806.134041] ___might_sleep+0x180/0x188
[32806.137980] __might_sleep+0x50/0x84
[32806.141653] __kmalloc_track_caller+0x80/0x3bc
[32806.146215] __kmalloc_reserve+0x3c/0x88
[32806.150241] pskb_expand_head+0x74/0x288
[32806.154269] rmnet_egress_handler+0xb0/0x1d8
[32806.162239] rmnet_vnd_start_xmit+0xc8/0x13c
[32806.166627] dev_hard_start_xmit+0x148/0x280
[32806.181181] sch_direct_xmit+0xa4/0x198
[32806.185125] __qdisc_run+0x1f8/0x310
[32806.188803] net_tx_action+0x23c/0x26c
[32806.192655] __do_softirq+0x220/0x408
[32806.196420] do_softirq+0x4c/0x70
Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
RMNET RX handler was processing invalid packets that were
originally sent on the real device and were looped back via
dev_loopback_xmit(). This was detected using syzkaller.
Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Joe Stringer says:
====================
This series proposes a new helper for the BPF API which allows BPF programs to
perform lookups for sockets in a network namespace. This would allow programs
to determine early on in processing whether the stack is expecting to receive
the packet, and perform some action (eg drop, forward somewhere) based on this
information.
The series is structured roughly into:
* Misc refactor
* Add the socket pointer type
* Add reference tracking to ensure that socket references are freed
* Extend the BPF API to add sk_lookup_xxx() / sk_release() functions
* Add tests/documentation
The helper proposed in this series includes a parameter for a tuple which must
be filled in by the caller to determine the socket to look up. The simplest
case would be filling with the contents of the packet, ie mapping the packet's
5-tuple into the parameter. In common cases, it may alternatively be useful to
reverse the direction of the tuple and perform a lookup, to find the socket
that initiates this connection; and if the BPF program ever performs a form of
IP address translation, it may further be useful to be able to look up
arbitrary tuples that are not based upon the packet, but instead based on state
held in BPF maps or hardcoded in the BPF program.
Currently, access into the socket's fields are limited to those which are
otherwise already accessible, and are restricted to read-only access.
Changes since v3:
* New patch: "bpf: Reuse canonical string formatter for ctx errs"
* Add PTR_TO_SOCKET to is_ctx_reg().
* Add a few new checks to prevent mixing of socket/non-socket pointers.
* Swap order of checks in sock_filter_is_valid_access().
* Prefix register spill macros with "bpf_".
* Add acks from previous round
* Rebase
Changes since v2:
* New patch: "selftests/bpf: Generalize dummy program types".
This enables adding verifier tests for socket lookup with tail calls.
* Define the semantics of the new helpers more clearly in uAPI header.
* Fix release of caller_net when netns is not specified.
* Use skb->sk to find caller net when skb->dev is unavailable.
* Fix build with !CONFIG_NET.
* Replace ptr_id defensive coding when releasing reference state with an
internal error (-EFAULT).
* Remove flags argument to sk_release().
* Add several new assembly tests suggested by Daniel.
* Add a few new C tests.
* Fix typo in verifier error message.
Changes since v1:
* Limit netns_id field to 32 bits
* Reuse reg_type_mismatch() in more places
* Reduce the number of passes at convert_ctx_access()
* Replace ptr_id defensive coding when releasing reference state with an
internal error (-EFAULT)
* Rework 'struct bpf_sock_tuple' to allow passing a packet pointer
* Allow direct packet access from helper
* Fix compile error with CONFIG_IPV6 enabled
* Improve commit messages
Changes since RFC:
* Split up sk_lookup() into sk_lookup_tcp(), sk_lookup_udp().
* Only take references on the socket when necessary.
* Make sk_release() only free the socket reference in this case.
* Fix some runtime reference leaks:
* Disallow BPF_LD_[ABS|IND] instructions while holding a reference.
* Disallow bpf_tail_call() while holding a reference.
* Prevent the same instruction being used for reference and other
pointer type.
* Simplify locating copies of a reference during helper calls by caching
the pointer id from the caller.
* Fix kbuild compilation warnings with particular configs.
* Improve code comments describing the new verifier pieces.
* Tested by Nitin
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Document the new pointer types in the verifier and how the pointer ID
tracking works to ensure that references which are taken are later
released.
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Add some tests that demonstrate and test the balanced lookup/free
nature of socket lookup. Section names that start with "fail" represent
programs that are expected to fail verification; all others should
succeed.
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Allow the individual program load to be invoked. This will help with
testing, where a single ELF may contain several sections, some of which
denote subprograms that are expected to fail verification, along with
some which are expected to pass verification. By allowing programs to be
iterated and individually loaded, each program can be independently
checked against its expected verification result.
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
reference tracking: leak potential reference
reference tracking: leak potential reference on stack
reference tracking: leak potential reference on stack 2
reference tracking: zero potential reference
reference tracking: copy and zero potential references
reference tracking: release reference without check
reference tracking: release reference
reference tracking: release reference twice
reference tracking: release reference twice inside branch
reference tracking: alloc, check, free in one subbranch
reference tracking: alloc, check, free in both subbranches
reference tracking in call: free reference in subprog
reference tracking in call: free reference in subprog and outside
reference tracking in call: alloc & leak reference in subprog
reference tracking in call: alloc in subprog, release outside
reference tracking in call: sk_ptr leak into caller stack
reference tracking in call: sk_ptr spill into caller stack
reference tracking: allow LD_ABS
reference tracking: forbid LD_ABS while holding reference
reference tracking: allow LD_IND
reference tracking: forbid LD_IND while holding reference
reference tracking: check reference or tail call
reference tracking: release reference then tail call
reference tracking: leak possible reference over tail call
reference tracking: leak checked reference over tail call
reference tracking: mangle and release sock_or_null
reference tracking: mangle and release sock
reference tracking: access member
reference tracking: write to member
reference tracking: invalid 64-bit access of member
reference tracking: access after release
reference tracking: direct access for lookup
unpriv: spill/fill of different pointers stx - ctx and sock
unpriv: spill/fill of different pointers stx - leak sock
unpriv: spill/fill of different pointers stx - sock and ctx (read)
unpriv: spill/fill of different pointers stx - sock and ctx (write)
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Don't hardcode the dummy program types to SOCKET_FILTER type, as this
prevents testing bpf_tail_call in conjunction with other program types.
Instead, use the program type specified in the test case.
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
This patch adds new BPF helper functions, bpf_sk_lookup_tcp() and
bpf_sk_lookup_udp() which allows BPF programs to find out if there is a
socket listening on this host, and returns a socket pointer which the
BPF program can then access to determine, for instance, whether to
forward or drop traffic. bpf_sk_lookup_xxx() may take a reference on the
socket, so when a BPF program makes use of this function, it must
subsequently pass the returned pointer into the newly added sk_release()
to return the reference.
By way of example, the following pseudocode would filter inbound
connections at XDP if there is no corresponding service listening for
the traffic:
struct bpf_sock_tuple tuple;
struct bpf_sock_ops *sk;
populate_tuple(ctx, &tuple); // Extract the 5tuple from the packet
sk = bpf_sk_lookup_tcp(ctx, &tuple, sizeof tuple, netns, 0);
if (!sk) {
// Couldn't find a socket listening for this traffic. Drop.
return TC_ACT_SHOT;
}
bpf_sk_release(sk, 0);
return TC_ACT_OK;
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Allow helper functions to acquire a reference and return it into a
register. Specific pointer types such as the PTR_TO_SOCKET will
implicitly represent such a reference. The verifier must ensure that
these references are released exactly once in each path through the
program.
To achieve this, this commit assigns an id to the pointer and tracks it
in the 'bpf_func_state', then when the function or program exits,
verifies that all of the acquired references have been freed. When the
pointer is passed to a function that frees the reference, it is removed
from the 'bpf_func_state` and all existing copies of the pointer in
registers are marked invalid.
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
An upcoming commit will need very similar copy/realloc boilerplate, so
refactor the existing stack copy/realloc functions into macros to
simplify it.
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Teach the verifier a little bit about a new type of pointer, a
PTR_TO_SOCKET. This pointer type is accessed from BPF through the
'struct bpf_sock' structure.
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
This check will be reused by an upcoming commit for conditional jump
checks for sockets. Refactor it a bit to simplify the later commit.
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
The array "reg_type_str" provides canonical formatting of register
types, however a couple of places would previously check whether a
register represented the context and write the name "context" directly.
An upcoming commit will add another pointer type to these statements, so
to provide more accurate error messages in the verifier, update these
error messages to use "reg_type_str" instead.
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
An upcoming commit will add another two pointer types that need very
similar behaviour, so generalise this function now.
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Add this iterator for spilled registers, it concentrates the details of
how to get the current frame's spilled registers into a single macro
while clarifying the intention of the code which is calling the macro.
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
The AON_PM_L2 is normally used to trigger and identify the source of a
wake-up event. Since the RX_SYS clock is no longer turned off, we also
have an interrupt being sent to the SYSTEMPORT INTRL_2_0 controller, and
that interrupt remains active up until the magic packet detector is
disabled which happens much later during the driver resumption.
The race happens if we have a CPU that is entering the SYSTEMPORT
INTRL2_0 handler during resume, and another CPU has managed to clear the
wake-up interrupt during bcm_sysport_resume_from_wol(). In that case, we
have the first CPU stuck in the interrupt handler with an interrupt
cause that has been cleared under its feet, and so we keep returning
IRQ_NONE and we never make any progress.
This was not a problem before because we would always turn off the
RX_SYS clock during WoL, so the SYSTEMPORT INTRL2_0 would also be turned
off as well, thus not latching the interrupt.
The fix is to make sure we do not enable either the MPD or
BRCM_TAG_MATCH interrupts since those are redundant with what the
AON_PM_L2 interrupt controller already processes and they would cause
such a race to occur.
Fixes: bb9051a2b230 ("net: systemport: Add support for WAKE_FILTER")
Fixes: 83e82f4c706b ("net: systemport: add Wake-on-LAN support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fixes problem (discovered by Aurelien) introduced by recent commit:
commit b24df3e30cbf48255db866720fb71f14bf9d2f39
("cifs: update receive_encrypted_standard to handle compounded responses")
which broke the ability to respond to some lease breaks
(lease breaks being ignored is a problem since can block
server response for duration of the lease break timeout).
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
|
|
For compounded PDUs we whould only wake the waiting thread for the
very last PDU of the compound.
We do this so that we are guaranteed that the demultiplex_thread will
not process or access any of those MIDs any more once the send/recv
thread starts processing.
Else there is a race where at the end of the send/recv processing we
will try to delete all the mids of the compound. If the multiplex
thread still has other mids to process at this point for this compound
this can lead to an oops.
Needed to fix recent commit:
commit 730928c8f4be88e9d6a027a16b1e8fa9c59fc077
("cifs: update smb2_queryfs() to use compounding")
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for 4.19
First, and also hopefully the last, set of fixes for 4.19. All small
but still important fixes
mt76x0
* fix a bug when a virtual interface is removed multiple times
b43
* fix DMA error related regression with proprietary firmware
iwlwifi
* fix an oops which was a regression in v4.19-rc1
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
(allows for better compiler optimization)
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
(allows for better compiler optimization)
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
(allows for better compiler optimization)
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
(the parameter in question is mark)
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
memset
(allows for better compiler optimization)
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
(allows for better compiler optimization)
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
(allows for better compiler optimization)
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
cifs_delete_mid() is called once we are finished handling a mid and we
expect no more work done on this mid.
Needed to fix recent commit:
commit 730928c8f4be88e9d6a027a16b1e8fa9c59fc077
("cifs: update smb2_queryfs() to use compounding")
Add a warning if someone tries to dequeue a mid that has already been
flagged to be deleted.
Also change list_del() to list_del_init() so that if we have similar bugs
resurface in the future we will not oops.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
|
|
When mounting a Windows share that is the root of a drive (eg. C$)
the server does not return . and .. directory entries. This results in
the smb2 code path erroneously skipping the 2 first entries.
Pseudo-code of the readdir() code path:
cifs_readdir(struct file, struct dir_context)
initiate_cifs_search <-- if no reponse cached yet
server->ops->query_dir_first
dir_emit_dots
dir_emit <-- adds "." and ".." if we're at pos=0
find_cifs_entry
initiate_cifs_search <-- if pos < start of current response
(restart search)
server->ops->query_dir_next <-- if pos > end of current response
(fetch next search res)
for(...) <-- loops over cur response entries
starting at pos
cifs_filldir <-- skip . and .., emit entry
cifs_fill_dirent
dir_emit
pos++
A) dir_emit_dots() always adds . & ..
and sets the current dir pos to 2 (0 and 1 are done).
Therefore we always want the index_to_find to be 2 regardless of if
the response has . and ..
B) smb1 code initializes index_of_last_entry with a +2 offset
in cifssmb.c CIFSFindFirst():
psrch_inf->index_of_last_entry = 2 /* skip . and .. */ +
psrch_inf->entries_in_buffer;
Later in find_cifs_entry() we want to find the next dir entry at pos=2
as a result of (A)
first_entry_in_buffer = cfile->srch_inf.index_of_last_entry -
cfile->srch_inf.entries_in_buffer;
This var is the dir pos that the first entry in the buffer will
have therefore it must be 2 in the first call.
If we don't offset index_of_last_entry by 2 (like in (B)),
first_entry_in_buffer=0 but we were instructed to get pos=2 so this
code in find_cifs_entry() skips the 2 first which is ok for non-root
shares, as it skips . and .. from the response but is not ok for root
shares where the 2 first are actual files
pos_in_buf = index_to_find - first_entry_in_buffer;
// pos_in_buf=2
// we skip 2 first response entries :(
for (i = 0; (i < (pos_in_buf)) && (cur_ent != NULL); i++) {
/* go entry by entry figuring out which is first */
cur_ent = nxt_dir_entry(cur_ent, end_of_smb,
cfile->srch_inf.info_level);
}
C) cifs_filldir() skips . and .. so we can safely ignore them for now.
Sample program:
int main(int argc, char **argv)
{
const char *path = argc >= 2 ? argv[1] : ".";
DIR *dh;
struct dirent *de;
printf("listing path <%s>\n", path);
dh = opendir(path);
if (!dh) {
printf("opendir error %d\n", errno);
return 1;
}
while (1) {
de = readdir(dh);
if (!de) {
if (errno) {
printf("readdir error %d\n", errno);
return 1;
}
printf("end of listing\n");
break;
}
printf("off=%lu <%s>\n", de->d_off, de->d_name);
}
return 0;
}
Before the fix with SMB1 on root shares:
<.> off=1
<..> off=2
<$Recycle.Bin> off=3
<bootmgr> off=4
and on non-root shares:
<.> off=1
<..> off=4 <-- after adding .., the offsets jumps to +2 because
<2536> off=5 we skipped . and .. from response buffer (C)
<411> off=6 but still incremented pos
<file> off=7
<fsx> off=8
Therefore the fix for smb2 is to mimic smb1 behaviour and offset the
index_of_last_entry by 2.
Test results comparing smb1 and smb2 before/after the fix on root
share, non-root shares and on large directories (ie. multi-response
dir listing):
PRE FIX
=======
pre-1-root VS pre-2-root:
ERR pre-2-root is missing [bootmgr, $Recycle.Bin]
pre-1-nonroot VS pre-2-nonroot:
OK~ same files, same order, different offsets
pre-1-nonroot-large VS pre-2-nonroot-large:
OK~ same files, same order, different offsets
POST FIX
========
post-1-root VS post-2-root:
OK same files, same order, same offsets
post-1-nonroot VS post-2-nonroot:
OK same files, same order, same offsets
post-1-nonroot-large VS post-2-nonroot-large:
OK same files, same order, same offsets
REGRESSION?
===========
pre-1-root VS post-1-root:
OK same files, same order, same offsets
pre-1-nonroot VS post-1-nonroot:
OK same files, same order, same offsets
BugLink: https://bugzilla.samba.org/show_bug.cgi?id=13107
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Paulo Alcantara <palcantara@suse.deR>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
|
|
We have an impressive number of syzkaller bugs that are linked
to the fact that syzbot was able to create a networking device
with millions of TX (or RX) queues.
Let's limit the number of RX/TX queues to 4096, this really should
cover all known cases.
A separate patch will add various cond_resched() in the loops
handling sysfs entries at device creation and dismantle.
Tested:
lpaa6:~# ip link add gre-4097 numtxqueues 4097 numrxqueues 4097 type ip6gretap
RTNETLINK answers: Invalid argument
lpaa6:~# time ip link add gre-4096 numtxqueues 4096 numrxqueues 4096 type ip6gretap
real 0m0.180s
user 0m0.000s
sys 0m0.107s
Fixes: 76ff5cc91935 ("rtnl: allow to specify number of rx and tx queues on device creation")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
RX queue config for bonding master could be different from its slave
device(s). With the commit 6a9e461f6fe4 ("bonding: pass link-local
packets to bonding master also."), the packet is reinjected into stack
with skb->dev as bonding master. This potentially triggers the
message:
"bondX received packet on queue Y, but number of RX queues is Z"
whenever the queue that packet is received on is higher than the
numrxqueues on bonding master (Y > Z).
Fixes: 6a9e461f6fe4 ("bonding: pass link-local packets to bonding master also.")
Reported-by: John Sperbeck <jsperbeck@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Timer handlers do not imply rcu_read_lock(), so my recent fix
triggered a LOCKDEP warning when SYNACK is retransmit.
Lets add rcu_read_lock()/rcu_read_unlock() pairs around ireq->ireq_opt
usages instead of guessing what is done by callers, since it is
not worth the pain.
Get rid of ireq_opt_deref() helper since it hides the logic
without real benefit, since it is now a standard rcu_dereference().
Fixes: 1ad98e9d1bdf ("tcp/dccp: fix lockdep issue when SYN is backlogged")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit 2d4dd0da45401c7ae7332b4d1eb7bbb1348edde9.
This broke earlycon on all Renesas ARM platforms using a SCIF port for the
serial console (R-Car, RZ/A1, RZ/G1, RZ/G2 SoCs), due to an incorrect value
of port->regshift.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This reverts commit 7acece71a517cad83a0842a94d94c13f271b680c.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This reverts commit d76c74387e1c978b6c5524a146ab0f3f72206f98.
While commit d76c74387e1c ("serial: 8250_dw: Fix runtime PM handling")
fixes runtime PM handling when using kgdb, it introduces a traceback for
everyone else.
BUG: sleeping function called from invalid context at
/mnt/host/source/src/third_party/kernel/next/drivers/base/power/runtime.c:1034
in_atomic(): 1, irqs_disabled(): 1, pid: 1, name: swapper/0
7 locks held by swapper/0/1:
#0: 000000005ec5bc72 (&dev->mutex){....}, at: __driver_attach+0xb5/0x12b
#1: 000000005d5fa9e5 (&dev->mutex){....}, at: __device_attach+0x3e/0x15b
#2: 0000000047e93286 (serial_mutex){+.+.}, at: serial8250_register_8250_port+0x51/0x8bb
#3: 000000003b328f07 (port_mutex){+.+.}, at: uart_add_one_port+0xab/0x8b0
#4: 00000000fa313d4d (&port->mutex){+.+.}, at: uart_add_one_port+0xcc/0x8b0
#5: 00000000090983ca (console_lock){+.+.}, at: vprintk_emit+0xdb/0x217
#6: 00000000c743e583 (console_owner){-...}, at: console_unlock+0x211/0x60f
irq event stamp: 735222
__down_trylock_console_sem+0x4a/0x84
console_unlock+0x338/0x60f
__do_softirq+0x4a4/0x50d
irq_exit+0x64/0xe2
CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc5 #6
Hardware name: Google Caroline/Caroline, BIOS Google_Caroline.7820.286.0 03/15/2017
Call Trace:
dump_stack+0x7d/0xbd
___might_sleep+0x238/0x259
__pm_runtime_resume+0x4e/0xa4
? serial8250_rpm_get+0x2e/0x44
serial8250_console_write+0x44/0x301
? lock_acquire+0x1b8/0x1fa
console_unlock+0x577/0x60f
vprintk_emit+0x1f0/0x217
printk+0x52/0x6e
register_console+0x43b/0x524
uart_add_one_port+0x672/0x8b0
? set_io_from_upio+0x150/0x162
serial8250_register_8250_port+0x825/0x8bb
dw8250_probe+0x80c/0x8b0
? dw8250_serial_inq+0x8e/0x8e
? dw8250_check_lcr+0x108/0x108
? dw8250_runtime_resume+0x5b/0x5b
? dw8250_serial_outq+0xa1/0xa1
? dw8250_remove+0x115/0x115
platform_drv_probe+0x76/0xc5
really_probe+0x1f1/0x3ee
? driver_allows_async_probing+0x5d/0x5d
driver_probe_device+0xd6/0x112
? driver_allows_async_probing+0x5d/0x5d
bus_for_each_drv+0xbe/0xe5
__device_attach+0xdd/0x15b
bus_probe_device+0x5a/0x10b
device_add+0x501/0x894
? _raw_write_unlock+0x27/0x3a
platform_device_add+0x224/0x2b7
mfd_add_device+0x718/0x75b
? __kmalloc+0x144/0x16a
? mfd_add_devices+0x38/0xdb
mfd_add_devices+0x9b/0xdb
intel_lpss_probe+0x7d4/0x8ee
intel_lpss_pci_probe+0xac/0xd4
pci_device_probe+0x101/0x18e
...
Revert the offending patch until a more comprehensive solution
is available.
Cc: Tony Lindgren <tony@atomide.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Phil Edworthy <phil.edworthy@renesas.com>
Fixes: d76c74387e1c ("serial: 8250_dw: Fix runtime PM handling")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Use memblock_end_of_DRAM which provides correct last low memory
PFN. Without that, DMA32 region becomes empty resulting in zero
pages being allocated for DMA32.
This patch is based on earlier patch from palmer which never
merged into 4.19. I just edited the commit text to make more
sense.
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
|
|
The recent rework of the TSC calibration code introduced a regression on UV
systems as it added a call to tsc_early_init() which initializes the TSC
ADJUST values before acpi_boot_table_init(). In the case of UV systems,
that is a necessary step that calls uv_system_init(). This informs
tsc_sanitize_first_cpu() that the kernel runs on a platform with async TSC
resets as documented in commit 341102c3ef29 ("x86/tsc: Add option that TSC
on Socket 0 being non-zero is valid")
Fix it by skipping the early tsc initialization on UV systems and let TSC
init tests take place later in tsc_init().
Fixes: cf7a63ef4e02 ("x86/tsc: Calibrate tsc only once")
Suggested-by: Hedi Berriche <hedi.berriche@hpe.com>
Signed-off-by: Mike Travis <mike.travis@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Russ Anderson <rja@hpe.com>
Reviewed-by: Dimitri Sivanich <sivanich@hpe.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Russ Anderson <russ.anderson@hpe.com>
Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
Cc: Xiaoming Gao <gxm.linux.kernel@gmail.com>
Cc: Rajvi Jingar <rajvi.jingar@intel.com>
Link: https://lkml.kernel.org/r/20181002180144.923579706@stormcage.americas.sgi.com
|
|
Introduce is_early_uv_system() which uses efi.uv_systab to decide early
in the boot process whether the kernel runs on a UV system.
This is needed to skip other early setup/init code that might break
the UV platform if done too early such as before necessary ACPI tables
parsing takes place.
Suggested-by: Hedi Berriche <hedi.berriche@hpe.com>
Signed-off-by: Mike Travis <mike.travis@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Russ Anderson <rja@hpe.com>
Reviewed-by: Dimitri Sivanich <sivanich@hpe.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Russ Anderson <russ.anderson@hpe.com>
Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
Cc: Xiaoming Gao <gxm.linux.kernel@gmail.com>
Cc: Rajvi Jingar <rajvi.jingar@intel.com>
Link: https://lkml.kernel.org/r/20181002180144.801700401@stormcage.americas.sgi.com
|