summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-10-02Merge branch 'dpaa2-eth-Add-support-for-Rx-flow-classification'David S. Miller
Ioana Radulescu says: ==================== dpaa2-eth: Add support for Rx flow classification The Management Complex (MC) firmware initially allowed the configuration of a single key to be used both for Rx flow hashing and flow classification. This prevented us from supporting Rx flow classification independently of the hash key configuration. Newer firmware versions expose separate commands for configuring the two types of keys, so we can use them to introduce Rx classification support. For frames that don't match any classification rule, we fall back to statistical distribution based on the current hash key. The first patch in this set updates the Rx hashing code to use the new firmware API for key config. Subsequent patches introduce the firmware API for configuring the classification and actual support for adding and deleting rules via ethtool. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02dpaa2-eth: Add ethtool support for flow classificationIoana Radulescu
Add support for inserting and deleting Rx flow classification rules through ethtool. We support classification based on some header fields for flow-types ether, ip4, tcp4, udp4 and sctp4. Rx queues are core affine, so the action argument effectively selects on which cpu the matching frame will be processed. Discarding the frame is also supported. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02dpaa2-eth: Configure Rx flow classification keyIoana Radulescu
For firmware versions that support it, configure an Rx flow classification key at probe time. Hardware expects all rules in the classification table to share the same key. So we setup a key containing all supported fields at driver init and when a user adds classification rules through ethtool, we will just mask out the unused header fields. Since the key composition process is the same for flow classification and hashing, reuse existing code where possible. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02dpaa2-eth: Rename structureIoana Radulescu
Since the array of supported header fields will be used for Rx flow classification as well, rename it from "hash_fields" to the more inclusive "dist_fields". Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02dpaa2-eth: Use new API for Rx flow hashingIoana Radulescu
The Management Complex (MC) firmware initially allowed the configuration of a single key to be used both for Rx flow hashing and flow classification. This prevented us from supporting Rx flow classification through ethtool. Starting with version 10.7.0, the Management Complex(MC) offers a new set of APIs for separate configuration of Rx hashing and classification keys. Update the Rx flow hashing support to use the new API, if available. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02net: usbnet: make driver_info constBen Dooks
The driver_info field that is used for describing each of the usb-net drivers using the usbnet.c core all declare their information as const and the usbnet.c itself does not try and modify the struct. It is therefore a good idea to make this const in the usbnet.c structure in case anyone tries to modify it. Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02Merge tag 'mlx5-fixes-2018-10-01' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2018-10-01 This pull request includes some fixes to mlx5 driver, Please pull and let me know if there's any problem. For -stable v4.11: "6e0a4a23c59a ('net/mlx5: E-Switch, Fix out of bound access when setting vport rate')" For -stable v4.18: "98d6627c372a ('net/mlx5e: Set vlan masks for all offloaded TC rules')" ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02tcp: do not release socket ownership in tcp_close()Eric Dumazet
syzkaller was able to hit the WARN_ON(sock_owned_by_user(sk)); in tcp_close() While a socket is being closed, it is very possible other threads find it in rtnetlink dump. tcp_get_info() will acquire the socket lock for a short amount of time (slow = lock_sock_fast(sk)/unlock_sock_fast(sk, slow);), enough to trigger the warning. Fixes: 67db3e4bfbc9 ("tcp: no longer hold ehash lock while calling tcp_get_info()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02Merge branch 'rmnet-fixes'David S. Miller
Subash Abhinov Kasiviswanathan says: ==================== net: qualcomm: rmnet: Updates 2018-10-02 This series is a set of small fixes for rmnet driver Patch 1 is a fix for a scenario reported by syzkaller Patch 2 & 3 are fixes for incorrect allocation flags ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02net: qualcomm: rmnet: Fix incorrect allocation flag in receive pathSubash Abhinov Kasiviswanathan
The incoming skb needs to be reallocated in case the headroom is not sufficient to adjust the ethernet header. This allocation needs to be atomic otherwise it results in this splat [<600601bb>] ___might_sleep+0x185/0x1a3 [<603f6314>] ? _raw_spin_unlock_irqrestore+0x0/0x27 [<60069bb0>] ? __wake_up_common_lock+0x95/0xd1 [<600602b0>] __might_sleep+0xd7/0xe2 [<60065598>] ? enqueue_task_fair+0x112/0x209 [<600eea13>] __kmalloc_track_caller+0x5d/0x124 [<600ee9b6>] ? __kmalloc_track_caller+0x0/0x124 [<602696d5>] __kmalloc_reserve.isra.34+0x30/0x7e [<603f629b>] ? _raw_spin_lock_irqsave+0x0/0x3d [<6026b744>] pskb_expand_head+0xbf/0x310 [<6025ca6a>] rmnet_rx_handler+0x7e/0x16b [<6025c9ec>] ? rmnet_rx_handler+0x0/0x16b [<6027ad0c>] __netif_receive_skb_core+0x301/0x96f [<60033c17>] ? set_signals+0x0/0x40 [<6027bbcb>] __netif_receive_skb+0x24/0x8e Fixes: 74692caf1b0b ("net: qualcomm: rmnet: Process packets over ethernet") Signed-off-by: Sean Tranchetti <stranche@codeaurora.org> Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02net: qualcomm: rmnet: Fix incorrect allocation flag in transmitSubash Abhinov Kasiviswanathan
The incoming skb needs to be reallocated in case the headroom is not sufficient to add the MAP header. This allocation needs to be atomic otherwise it results in the following splat [32805.801456] BUG: sleeping function called from invalid context [32805.841141] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [32805.904773] task: ffffffd7c5f62280 task.stack: ffffff80464a8000 [32805.910851] pc : ___might_sleep+0x180/0x188 [32805.915143] lr : ___might_sleep+0x180/0x188 [32806.131520] Call trace: [32806.134041] ___might_sleep+0x180/0x188 [32806.137980] __might_sleep+0x50/0x84 [32806.141653] __kmalloc_track_caller+0x80/0x3bc [32806.146215] __kmalloc_reserve+0x3c/0x88 [32806.150241] pskb_expand_head+0x74/0x288 [32806.154269] rmnet_egress_handler+0xb0/0x1d8 [32806.162239] rmnet_vnd_start_xmit+0xc8/0x13c [32806.166627] dev_hard_start_xmit+0x148/0x280 [32806.181181] sch_direct_xmit+0xa4/0x198 [32806.185125] __qdisc_run+0x1f8/0x310 [32806.188803] net_tx_action+0x23c/0x26c [32806.192655] __do_softirq+0x220/0x408 [32806.196420] do_softirq+0x4c/0x70 Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation") Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02net: qualcomm: rmnet: Skip processing loopback packetsSean Tranchetti
RMNET RX handler was processing invalid packets that were originally sent on the real device and were looped back via dev_loopback_xmit(). This was detected using syzkaller. Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation") Signed-off-by: Sean Tranchetti <stranche@codeaurora.org> Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-03Merge branch 'bpf-sk-lookup'Daniel Borkmann
Joe Stringer says: ==================== This series proposes a new helper for the BPF API which allows BPF programs to perform lookups for sockets in a network namespace. This would allow programs to determine early on in processing whether the stack is expecting to receive the packet, and perform some action (eg drop, forward somewhere) based on this information. The series is structured roughly into: * Misc refactor * Add the socket pointer type * Add reference tracking to ensure that socket references are freed * Extend the BPF API to add sk_lookup_xxx() / sk_release() functions * Add tests/documentation The helper proposed in this series includes a parameter for a tuple which must be filled in by the caller to determine the socket to look up. The simplest case would be filling with the contents of the packet, ie mapping the packet's 5-tuple into the parameter. In common cases, it may alternatively be useful to reverse the direction of the tuple and perform a lookup, to find the socket that initiates this connection; and if the BPF program ever performs a form of IP address translation, it may further be useful to be able to look up arbitrary tuples that are not based upon the packet, but instead based on state held in BPF maps or hardcoded in the BPF program. Currently, access into the socket's fields are limited to those which are otherwise already accessible, and are restricted to read-only access. Changes since v3: * New patch: "bpf: Reuse canonical string formatter for ctx errs" * Add PTR_TO_SOCKET to is_ctx_reg(). * Add a few new checks to prevent mixing of socket/non-socket pointers. * Swap order of checks in sock_filter_is_valid_access(). * Prefix register spill macros with "bpf_". * Add acks from previous round * Rebase Changes since v2: * New patch: "selftests/bpf: Generalize dummy program types". This enables adding verifier tests for socket lookup with tail calls. * Define the semantics of the new helpers more clearly in uAPI header. * Fix release of caller_net when netns is not specified. * Use skb->sk to find caller net when skb->dev is unavailable. * Fix build with !CONFIG_NET. * Replace ptr_id defensive coding when releasing reference state with an internal error (-EFAULT). * Remove flags argument to sk_release(). * Add several new assembly tests suggested by Daniel. * Add a few new C tests. * Fix typo in verifier error message. Changes since v1: * Limit netns_id field to 32 bits * Reuse reg_type_mismatch() in more places * Reduce the number of passes at convert_ctx_access() * Replace ptr_id defensive coding when releasing reference state with an internal error (-EFAULT) * Rework 'struct bpf_sock_tuple' to allow passing a packet pointer * Allow direct packet access from helper * Fix compile error with CONFIG_IPV6 enabled * Improve commit messages Changes since RFC: * Split up sk_lookup() into sk_lookup_tcp(), sk_lookup_udp(). * Only take references on the socket when necessary. * Make sk_release() only free the socket reference in this case. * Fix some runtime reference leaks: * Disallow BPF_LD_[ABS|IND] instructions while holding a reference. * Disallow bpf_tail_call() while holding a reference. * Prevent the same instruction being used for reference and other pointer type. * Simplify locating copies of a reference during helper calls by caching the pointer id from the caller. * Fix kbuild compilation warnings with particular configs. * Improve code comments describing the new verifier pieces. * Tested by Nitin ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03Documentation: Describe bpf reference trackingJoe Stringer
Document the new pointer types in the verifier and how the pointer ID tracking works to ensure that references which are taken are later released. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03selftests/bpf: Add C tests for reference trackingJoe Stringer
Add some tests that demonstrate and test the balanced lookup/free nature of socket lookup. Section names that start with "fail" represent programs that are expected to fail verification; all others should succeed. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03libbpf: Support loading individual progsJoe Stringer
Allow the individual program load to be invoked. This will help with testing, where a single ELF may contain several sections, some of which denote subprograms that are expected to fail verification, along with some which are expected to pass verification. By allowing programs to be iterated and individually loaded, each program can be independently checked against its expected verification result. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03selftests/bpf: Add tests for reference trackingJoe Stringer
reference tracking: leak potential reference reference tracking: leak potential reference on stack reference tracking: leak potential reference on stack 2 reference tracking: zero potential reference reference tracking: copy and zero potential references reference tracking: release reference without check reference tracking: release reference reference tracking: release reference twice reference tracking: release reference twice inside branch reference tracking: alloc, check, free in one subbranch reference tracking: alloc, check, free in both subbranches reference tracking in call: free reference in subprog reference tracking in call: free reference in subprog and outside reference tracking in call: alloc & leak reference in subprog reference tracking in call: alloc in subprog, release outside reference tracking in call: sk_ptr leak into caller stack reference tracking in call: sk_ptr spill into caller stack reference tracking: allow LD_ABS reference tracking: forbid LD_ABS while holding reference reference tracking: allow LD_IND reference tracking: forbid LD_IND while holding reference reference tracking: check reference or tail call reference tracking: release reference then tail call reference tracking: leak possible reference over tail call reference tracking: leak checked reference over tail call reference tracking: mangle and release sock_or_null reference tracking: mangle and release sock reference tracking: access member reference tracking: write to member reference tracking: invalid 64-bit access of member reference tracking: access after release reference tracking: direct access for lookup unpriv: spill/fill of different pointers stx - ctx and sock unpriv: spill/fill of different pointers stx - leak sock unpriv: spill/fill of different pointers stx - sock and ctx (read) unpriv: spill/fill of different pointers stx - sock and ctx (write) Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03selftests/bpf: Generalize dummy program typesJoe Stringer
Don't hardcode the dummy program types to SOCKET_FILTER type, as this prevents testing bpf_tail_call in conjunction with other program types. Instead, use the program type specified in the test case. Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03bpf: Add helper to retrieve socket in BPFJoe Stringer
This patch adds new BPF helper functions, bpf_sk_lookup_tcp() and bpf_sk_lookup_udp() which allows BPF programs to find out if there is a socket listening on this host, and returns a socket pointer which the BPF program can then access to determine, for instance, whether to forward or drop traffic. bpf_sk_lookup_xxx() may take a reference on the socket, so when a BPF program makes use of this function, it must subsequently pass the returned pointer into the newly added sk_release() to return the reference. By way of example, the following pseudocode would filter inbound connections at XDP if there is no corresponding service listening for the traffic: struct bpf_sock_tuple tuple; struct bpf_sock_ops *sk; populate_tuple(ctx, &tuple); // Extract the 5tuple from the packet sk = bpf_sk_lookup_tcp(ctx, &tuple, sizeof tuple, netns, 0); if (!sk) { // Couldn't find a socket listening for this traffic. Drop. return TC_ACT_SHOT; } bpf_sk_release(sk, 0); return TC_ACT_OK; Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03bpf: Add reference tracking to verifierJoe Stringer
Allow helper functions to acquire a reference and return it into a register. Specific pointer types such as the PTR_TO_SOCKET will implicitly represent such a reference. The verifier must ensure that these references are released exactly once in each path through the program. To achieve this, this commit assigns an id to the pointer and tracks it in the 'bpf_func_state', then when the function or program exits, verifies that all of the acquired references have been freed. When the pointer is passed to a function that frees the reference, it is removed from the 'bpf_func_state` and all existing copies of the pointer in registers are marked invalid. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03bpf: Macrofy stack state copyJoe Stringer
An upcoming commit will need very similar copy/realloc boilerplate, so refactor the existing stack copy/realloc functions into macros to simplify it. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03bpf: Add PTR_TO_SOCKET verifier typeJoe Stringer
Teach the verifier a little bit about a new type of pointer, a PTR_TO_SOCKET. This pointer type is accessed from BPF through the 'struct bpf_sock' structure. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03bpf: Generalize ptr_or_null regs checkJoe Stringer
This check will be reused by an upcoming commit for conditional jump checks for sockets. Refactor it a bit to simplify the later commit. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03bpf: Reuse canonical string formatter for ctx errsJoe Stringer
The array "reg_type_str" provides canonical formatting of register types, however a couple of places would previously check whether a register represented the context and write the name "context" directly. An upcoming commit will add another pointer type to these statements, so to provide more accurate error messages in the verifier, update these error messages to use "reg_type_str" instead. Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03bpf: Simplify ptr_min_max_vals adjustmentJoe Stringer
An upcoming commit will add another two pointer types that need very similar behaviour, so generalise this function now. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03bpf: Add iterator for spilled registersJoe Stringer
Add this iterator for spilled registers, it concentrates the details of how to get the current frame's spilled registers into a single macro while clarifying the intention of the code which is calling the macro. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-02net: systemport: Fix wake-up interrupt race during resumeFlorian Fainelli
The AON_PM_L2 is normally used to trigger and identify the source of a wake-up event. Since the RX_SYS clock is no longer turned off, we also have an interrupt being sent to the SYSTEMPORT INTRL_2_0 controller, and that interrupt remains active up until the magic packet detector is disabled which happens much later during the driver resumption. The race happens if we have a CPU that is entering the SYSTEMPORT INTRL2_0 handler during resume, and another CPU has managed to clear the wake-up interrupt during bcm_sysport_resume_from_wol(). In that case, we have the first CPU stuck in the interrupt handler with an interrupt cause that has been cleared under its feet, and so we keep returning IRQ_NONE and we never make any progress. This was not a problem before because we would always turn off the RX_SYS clock during WoL, so the SYSTEMPORT INTRL2_0 would also be turned off as well, thus not latching the interrupt. The fix is to make sure we do not enable either the MPD or BRCM_TAG_MATCH interrupts since those are redundant with what the AON_PM_L2 interrupt controller already processes and they would cause such a race to occur. Fixes: bb9051a2b230 ("net: systemport: Add support for WAKE_FILTER") Fixes: 83e82f4c706b ("net: systemport: add Wake-on-LAN support") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02smb3: fix lease break problem introduced by compoundingSteve French
Fixes problem (discovered by Aurelien) introduced by recent commit: commit b24df3e30cbf48255db866720fb71f14bf9d2f39 ("cifs: update receive_encrypted_standard to handle compounded responses") which broke the ability to respond to some lease breaks (lease breaks being ignored is a problem since can block server response for duration of the lease break timeout). Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2018-10-02cifs: only wake the thread for the very last PDU in a compoundRonnie Sahlberg
For compounded PDUs we whould only wake the waiting thread for the very last PDU of the compound. We do this so that we are guaranteed that the demultiplex_thread will not process or access any of those MIDs any more once the send/recv thread starts processing. Else there is a race where at the end of the send/recv processing we will try to delete all the mids of the compound. If the multiplex thread still has other mids to process at this point for this compound this can lead to an oops. Needed to fix recent commit: commit 730928c8f4be88e9d6a027a16b1e8fa9c59fc077 ("cifs: update smb2_queryfs() to use compounding") Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2018-10-02Merge tag 'wireless-drivers-for-davem-2018-10-01' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for 4.19 First, and also hopefully the last, set of fixes for 4.19. All small but still important fixes mt76x0 * fix a bug when a virtual interface is removed multiple times b43 * fix DMA error related regression with proprietary firmware iwlwifi * fix an oops which was a regression in v4.19-rc1 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02net: inet6_rtm_getroute() - use new style struct initializer instead of memsetMaciej Żenczykowski
Signed-off-by: Maciej Żenczykowski <maze@google.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02net: rtm_to_fib6_config() - use new style struct initializer instead of memsetMaciej Żenczykowski
(allows for better compiler optimization) Signed-off-by: Maciej Żenczykowski <maze@google.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02net: rtmsg_to_fib6_config() - use new style struct initializer instead of memsetMaciej Żenczykowski
(allows for better compiler optimization) Signed-off-by: Maciej Żenczykowski <maze@google.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02net: ip6_update_pmtu() - use new style struct initializer instead of memsetMaciej Żenczykowski
(allows for better compiler optimization) Signed-off-by: Maciej Żenczykowski <maze@google.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02net: remove 1 always zero parameter from ip6_redirect_no_header()Maciej Żenczykowski
(the parameter in question is mark) Signed-off-by: Maciej Żenczykowski <maze@google.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02net: ip6_redirect_no_header() - use new style struct initializer instead of ↵Maciej Żenczykowski
memset (allows for better compiler optimization) Signed-off-by: Maciej Żenczykowski <maze@google.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02net: ip6_redirect() - use new style struct initializer instead of memsetMaciej Żenczykowski
(allows for better compiler optimization) Signed-off-by: Maciej Żenczykowski <maze@google.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02net: inet_rtm_getroute() - use new style struct initializer instead of memsetMaciej Żenczykowski
Signed-off-by: Maciej Żenczykowski <maze@google.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02net: ip_rt_get_source() - use new style struct initializer instead of memsetMaciej Żenczykowski
(allows for better compiler optimization) Signed-off-by: Maciej Żenczykowski <maze@google.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02cifs: add a warning if we try to to dequeue a deleted midRonnie Sahlberg
cifs_delete_mid() is called once we are finished handling a mid and we expect no more work done on this mid. Needed to fix recent commit: commit 730928c8f4be88e9d6a027a16b1e8fa9c59fc077 ("cifs: update smb2_queryfs() to use compounding") Add a warning if someone tries to dequeue a mid that has already been flagged to be deleted. Also change list_del() to list_del_init() so that if we have similar bugs resurface in the future we will not oops. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2018-10-02smb2: fix missing files in root share directory listingAurelien Aptel
When mounting a Windows share that is the root of a drive (eg. C$) the server does not return . and .. directory entries. This results in the smb2 code path erroneously skipping the 2 first entries. Pseudo-code of the readdir() code path: cifs_readdir(struct file, struct dir_context) initiate_cifs_search <-- if no reponse cached yet server->ops->query_dir_first dir_emit_dots dir_emit <-- adds "." and ".." if we're at pos=0 find_cifs_entry initiate_cifs_search <-- if pos < start of current response (restart search) server->ops->query_dir_next <-- if pos > end of current response (fetch next search res) for(...) <-- loops over cur response entries starting at pos cifs_filldir <-- skip . and .., emit entry cifs_fill_dirent dir_emit pos++ A) dir_emit_dots() always adds . & .. and sets the current dir pos to 2 (0 and 1 are done). Therefore we always want the index_to_find to be 2 regardless of if the response has . and .. B) smb1 code initializes index_of_last_entry with a +2 offset in cifssmb.c CIFSFindFirst(): psrch_inf->index_of_last_entry = 2 /* skip . and .. */ + psrch_inf->entries_in_buffer; Later in find_cifs_entry() we want to find the next dir entry at pos=2 as a result of (A) first_entry_in_buffer = cfile->srch_inf.index_of_last_entry - cfile->srch_inf.entries_in_buffer; This var is the dir pos that the first entry in the buffer will have therefore it must be 2 in the first call. If we don't offset index_of_last_entry by 2 (like in (B)), first_entry_in_buffer=0 but we were instructed to get pos=2 so this code in find_cifs_entry() skips the 2 first which is ok for non-root shares, as it skips . and .. from the response but is not ok for root shares where the 2 first are actual files pos_in_buf = index_to_find - first_entry_in_buffer; // pos_in_buf=2 // we skip 2 first response entries :( for (i = 0; (i < (pos_in_buf)) && (cur_ent != NULL); i++) { /* go entry by entry figuring out which is first */ cur_ent = nxt_dir_entry(cur_ent, end_of_smb, cfile->srch_inf.info_level); } C) cifs_filldir() skips . and .. so we can safely ignore them for now. Sample program: int main(int argc, char **argv) { const char *path = argc >= 2 ? argv[1] : "."; DIR *dh; struct dirent *de; printf("listing path <%s>\n", path); dh = opendir(path); if (!dh) { printf("opendir error %d\n", errno); return 1; } while (1) { de = readdir(dh); if (!de) { if (errno) { printf("readdir error %d\n", errno); return 1; } printf("end of listing\n"); break; } printf("off=%lu <%s>\n", de->d_off, de->d_name); } return 0; } Before the fix with SMB1 on root shares: <.> off=1 <..> off=2 <$Recycle.Bin> off=3 <bootmgr> off=4 and on non-root shares: <.> off=1 <..> off=4 <-- after adding .., the offsets jumps to +2 because <2536> off=5 we skipped . and .. from response buffer (C) <411> off=6 but still incremented pos <file> off=7 <fsx> off=8 Therefore the fix for smb2 is to mimic smb1 behaviour and offset the index_of_last_entry by 2. Test results comparing smb1 and smb2 before/after the fix on root share, non-root shares and on large directories (ie. multi-response dir listing): PRE FIX ======= pre-1-root VS pre-2-root: ERR pre-2-root is missing [bootmgr, $Recycle.Bin] pre-1-nonroot VS pre-2-nonroot: OK~ same files, same order, different offsets pre-1-nonroot-large VS pre-2-nonroot-large: OK~ same files, same order, different offsets POST FIX ======== post-1-root VS post-2-root: OK same files, same order, same offsets post-1-nonroot VS post-2-nonroot: OK same files, same order, same offsets post-1-nonroot-large VS post-2-nonroot-large: OK same files, same order, same offsets REGRESSION? =========== pre-1-root VS post-1-root: OK same files, same order, same offsets pre-1-nonroot VS post-1-nonroot: OK same files, same order, same offsets BugLink: https://bugzilla.samba.org/show_bug.cgi?id=13107 Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Paulo Alcantara <palcantara@suse.deR> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org>
2018-10-02rtnl: limit IFLA_NUM_TX_QUEUES and IFLA_NUM_RX_QUEUES to 4096Eric Dumazet
We have an impressive number of syzkaller bugs that are linked to the fact that syzbot was able to create a networking device with millions of TX (or RX) queues. Let's limit the number of RX/TX queues to 4096, this really should cover all known cases. A separate patch will add various cond_resched() in the loops handling sysfs entries at device creation and dismantle. Tested: lpaa6:~# ip link add gre-4097 numtxqueues 4097 numrxqueues 4097 type ip6gretap RTNETLINK answers: Invalid argument lpaa6:~# time ip link add gre-4096 numtxqueues 4096 numrxqueues 4096 type ip6gretap real 0m0.180s user 0m0.000s sys 0m0.107s Fixes: 76ff5cc91935 ("rtnl: allow to specify number of rx and tx queues on device creation") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02bonding: fix warning messageMahesh Bandewar
RX queue config for bonding master could be different from its slave device(s). With the commit 6a9e461f6fe4 ("bonding: pass link-local packets to bonding master also."), the packet is reinjected into stack with skb->dev as bonding master. This potentially triggers the message: "bondX received packet on queue Y, but number of RX queues is Z" whenever the queue that packet is received on is higher than the numrxqueues on bonding master (Y > Z). Fixes: 6a9e461f6fe4 ("bonding: pass link-local packets to bonding master also.") Reported-by: John Sperbeck <jsperbeck@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02inet: make sure to grab rcu_read_lock before using ireq->ireq_optEric Dumazet
Timer handlers do not imply rcu_read_lock(), so my recent fix triggered a LOCKDEP warning when SYNACK is retransmit. Lets add rcu_read_lock()/rcu_read_unlock() pairs around ireq->ireq_opt usages instead of guessing what is done by callers, since it is not worth the pain. Get rid of ireq_opt_deref() helper since it hides the logic without real benefit, since it is now a standard rcu_dereference(). Fixes: 1ad98e9d1bdf ("tcp/dccp: fix lockdep issue when SYN is backlogged") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02Revert "serial: sh-sci: Allow for compressed SCIF address"Geert Uytterhoeven
This reverts commit 2d4dd0da45401c7ae7332b4d1eb7bbb1348edde9. This broke earlycon on all Renesas ARM platforms using a SCIF port for the serial console (R-Car, RZ/A1, RZ/G1, RZ/G2 SoCs), due to an incorrect value of port->regshift. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Chris Brandt <chris.brandt@renesas.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-02Revert "serial: sh-sci: Remove SCIx_RZ_SCIFA_REGTYPE"Geert Uytterhoeven
This reverts commit 7acece71a517cad83a0842a94d94c13f271b680c. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Chris Brandt <chris.brandt@renesas.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-02Revert "serial: 8250_dw: Fix runtime PM handling"Guenter Roeck
This reverts commit d76c74387e1c978b6c5524a146ab0f3f72206f98. While commit d76c74387e1c ("serial: 8250_dw: Fix runtime PM handling") fixes runtime PM handling when using kgdb, it introduces a traceback for everyone else. BUG: sleeping function called from invalid context at /mnt/host/source/src/third_party/kernel/next/drivers/base/power/runtime.c:1034 in_atomic(): 1, irqs_disabled(): 1, pid: 1, name: swapper/0 7 locks held by swapper/0/1: #0: 000000005ec5bc72 (&dev->mutex){....}, at: __driver_attach+0xb5/0x12b #1: 000000005d5fa9e5 (&dev->mutex){....}, at: __device_attach+0x3e/0x15b #2: 0000000047e93286 (serial_mutex){+.+.}, at: serial8250_register_8250_port+0x51/0x8bb #3: 000000003b328f07 (port_mutex){+.+.}, at: uart_add_one_port+0xab/0x8b0 #4: 00000000fa313d4d (&port->mutex){+.+.}, at: uart_add_one_port+0xcc/0x8b0 #5: 00000000090983ca (console_lock){+.+.}, at: vprintk_emit+0xdb/0x217 #6: 00000000c743e583 (console_owner){-...}, at: console_unlock+0x211/0x60f irq event stamp: 735222 __down_trylock_console_sem+0x4a/0x84 console_unlock+0x338/0x60f __do_softirq+0x4a4/0x50d irq_exit+0x64/0xe2 CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc5 #6 Hardware name: Google Caroline/Caroline, BIOS Google_Caroline.7820.286.0 03/15/2017 Call Trace: dump_stack+0x7d/0xbd ___might_sleep+0x238/0x259 __pm_runtime_resume+0x4e/0xa4 ? serial8250_rpm_get+0x2e/0x44 serial8250_console_write+0x44/0x301 ? lock_acquire+0x1b8/0x1fa console_unlock+0x577/0x60f vprintk_emit+0x1f0/0x217 printk+0x52/0x6e register_console+0x43b/0x524 uart_add_one_port+0x672/0x8b0 ? set_io_from_upio+0x150/0x162 serial8250_register_8250_port+0x825/0x8bb dw8250_probe+0x80c/0x8b0 ? dw8250_serial_inq+0x8e/0x8e ? dw8250_check_lcr+0x108/0x108 ? dw8250_runtime_resume+0x5b/0x5b ? dw8250_serial_outq+0xa1/0xa1 ? dw8250_remove+0x115/0x115 platform_drv_probe+0x76/0xc5 really_probe+0x1f1/0x3ee ? driver_allows_async_probing+0x5d/0x5d driver_probe_device+0xd6/0x112 ? driver_allows_async_probing+0x5d/0x5d bus_for_each_drv+0xbe/0xe5 __device_attach+0xdd/0x15b bus_probe_device+0x5a/0x10b device_add+0x501/0x894 ? _raw_write_unlock+0x27/0x3a platform_device_add+0x224/0x2b7 mfd_add_device+0x718/0x75b ? __kmalloc+0x144/0x16a ? mfd_add_devices+0x38/0xdb mfd_add_devices+0x9b/0xdb intel_lpss_probe+0x7d4/0x8ee intel_lpss_pci_probe+0xac/0xd4 pci_device_probe+0x101/0x18e ... Revert the offending patch until a more comprehensive solution is available. Cc: Tony Lindgren <tony@atomide.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Phil Edworthy <phil.edworthy@renesas.com> Fixes: d76c74387e1c ("serial: 8250_dw: Fix runtime PM handling") Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-02RISCV: Fix end PFN for low memoryAtish Patra
Use memblock_end_of_DRAM which provides correct last low memory PFN. Without that, DMA32 region becomes empty resulting in zero pages being allocated for DMA32. This patch is based on earlier patch from palmer which never merged into 4.19. I just edited the commit text to make more sense. Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-02x86/tsc: Fix UV TSC initializationMike Travis
The recent rework of the TSC calibration code introduced a regression on UV systems as it added a call to tsc_early_init() which initializes the TSC ADJUST values before acpi_boot_table_init(). In the case of UV systems, that is a necessary step that calls uv_system_init(). This informs tsc_sanitize_first_cpu() that the kernel runs on a platform with async TSC resets as documented in commit 341102c3ef29 ("x86/tsc: Add option that TSC on Socket 0 being non-zero is valid") Fix it by skipping the early tsc initialization on UV systems and let TSC init tests take place later in tsc_init(). Fixes: cf7a63ef4e02 ("x86/tsc: Calibrate tsc only once") Suggested-by: Hedi Berriche <hedi.berriche@hpe.com> Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Russ Anderson <rja@hpe.com> Reviewed-by: Dimitri Sivanich <sivanich@hpe.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Russ Anderson <russ.anderson@hpe.com> Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Len Brown <len.brown@intel.com> Cc: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Xiaoming Gao <gxm.linux.kernel@gmail.com> Cc: Rajvi Jingar <rajvi.jingar@intel.com> Link: https://lkml.kernel.org/r/20181002180144.923579706@stormcage.americas.sgi.com
2018-10-02x86/platform/uv: Provide is_early_uv_system()Mike Travis
Introduce is_early_uv_system() which uses efi.uv_systab to decide early in the boot process whether the kernel runs on a UV system. This is needed to skip other early setup/init code that might break the UV platform if done too early such as before necessary ACPI tables parsing takes place. Suggested-by: Hedi Berriche <hedi.berriche@hpe.com> Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Russ Anderson <rja@hpe.com> Reviewed-by: Dimitri Sivanich <sivanich@hpe.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Russ Anderson <russ.anderson@hpe.com> Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Len Brown <len.brown@intel.com> Cc: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Xiaoming Gao <gxm.linux.kernel@gmail.com> Cc: Rajvi Jingar <rajvi.jingar@intel.com> Link: https://lkml.kernel.org/r/20181002180144.801700401@stormcage.americas.sgi.com