summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-11-05icmp: remove duplicate codeMatteo Croce
The same code which recognizes ICMP error packets is duplicated several times. Use the icmp_is_err() and icmpv6_is_err() helpers instead, which do the same thing. ip_multipath_l3_keys() and tcf_nat_act() didn't check for all the error types, assume that they should instead. Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-05icmp: add helpers to recognize ICMP error packetsMatteo Croce
Add two helper functions, one for IPv4 and one for IPv6, to recognize the ICMP packets which are error responses. This packets are special because they have as payload the original header of the packet which generated it (RFC 792 says at least 8 bytes, but Linux actually includes much more than that). Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-05Merge branch 'netvsc-RSS-related-patches'David S. Miller
Stephen Hemminger says: ==================== netvsc: RSS related patches Address a couple of issues related to recording RSS hash value in skb. These were found by reviewing RSS support. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-05hv_netvsc: record hardware hash in skbStephen Hemminger
Since RSS hash is available from the host, record it in the skb. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-05hv_netvsc: flag software created hash valueStephen Hemminger
When the driver needs to create a hash value because it was not done at higher level, then the hash should be marked as a software not hardware hash. Fixes: f72860afa2e3 ("hv_netvsc: Exclude non-TCP port numbers from vRSS hashing") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-05taprio: fix panic while hw offload sched list swapIvan Khoronzhuk
Don't swap oper and admin schedules too early, it's not correct and causes crash. Steps to reproduce: 1) tc qdisc replace dev eth0 parent root handle 100 taprio \ num_tc 3 \ map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \ queues 1@0 1@1 1@2 \ base-time $SOME_BASE_TIME \ sched-entry S 01 80000 \ sched-entry S 02 15000 \ sched-entry S 04 40000 \ flags 2 2) tc qdisc replace dev eth0 parent root handle 100 taprio \ base-time $SOME_BASE_TIME \ sched-entry S 01 90000 \ sched-entry S 02 20000 \ sched-entry S 04 40000 \ flags 2 3) tc qdisc replace dev eth0 parent root handle 100 taprio \ base-time $SOME_BASE_TIME \ sched-entry S 01 150000 \ sched-entry S 02 200000 \ sched-entry S 04 40000 \ flags 2 Do 2 3 2 .. steps more times if not happens and observe: [ 305.832319] Unable to handle kernel write to read-only memory at virtual address ffff0000087ce7f0 [ 305.910887] CPU: 0 PID: 0 Comm: swapper/0 Not tainted [ 305.919306] Hardware name: Texas Instruments AM654 Base Board (DT) [...] [ 306.017119] x1 : ffff800848031d88 x0 : ffff800848031d80 [ 306.022422] Call trace: [ 306.024866] taprio_free_sched_cb+0x4c/0x98 [ 306.029040] rcu_process_callbacks+0x25c/0x410 [ 306.033476] __do_softirq+0x10c/0x208 [ 306.037132] irq_exit+0xb8/0xc8 [ 306.040267] __handle_domain_irq+0x64/0xb8 [ 306.044352] gic_handle_irq+0x7c/0x178 [ 306.048092] el1_irq+0xb0/0x128 [ 306.051227] arch_cpu_idle+0x10/0x18 [ 306.054795] do_idle+0x120/0x138 [ 306.058015] cpu_startup_entry+0x20/0x28 [ 306.061931] rest_init+0xcc/0xd8 [ 306.065154] start_kernel+0x3bc/0x3e4 [ 306.068810] Code: f2fbd5b7 f2fbd5b6 d503201f f9400422 (f9000662) [ 306.074900] ---[ end trace 96c8e2284a9d9d6e ]--- [ 306.079507] Kernel panic - not syncing: Fatal exception in interrupt [ 306.085847] SMP: stopping secondary CPUs [ 306.089765] Kernel Offset: disabled Try to explain one of the possible crash cases: The "real" admin list is assigned when admin_sched is set to new_admin, it happens after "swap", that assigns to oper_sched NULL. Thus if call qdisc show it can crash. Farther, next second time, when sched list is updated, the admin_sched is not NULL and becomes the oper_sched, previous oper_sched was NULL so just skipped. But then admin_sched is assigned new_admin, but schedules to free previous assigned admin_sched (that already became oper_sched). Farther, next third time, when sched list is updated, while one more swap, oper_sched is not null, but it was happy to be freed already (while prev. admin update), so while try to free oper_sched the kernel panic happens at taprio_free_sched_cb(). So, move the "swap emulation" where it should be according to function comment from code. Fixes: 9c66d15646760e ("taprio: Add support for hardware offloading") Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-05Merge branch '100GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 100GbE Intel Wired LAN Driver Updates 2019-11-04 This series contains updates to the ice driver only. Anirudh refactors the code to reduce the kernel configuration flags and introduces ice_base.c file. Maciej does additional refactoring on the configuring of transmit rings so that we are not configuring per each traffic class flow. Added support for XDP in the ice driver. Provides additional re-organizing of the code in preparation for adding build_skb() support in the driver. Adjusted the computational padding logic for headroom and tailroom to better support build_skb(), which also aligns with the logic in other Intel LAN drivers. Added build_skb support and make use of the XDP's data_meta. Krzysztof refactors the driver to prepare for AF_XDP support in the driver and then adds support for AF_XDP. v2: Updated patch 3 of the series based on community feedback with the following changes... - return -EOPNOTSUPP instead of ENOTSUPP for too large MTU which makes it impossible to attach XDP prog - don't check for case when there's no XDP prog currently on interface and ice_xdp() is called with NULL bpf_prog; this happens when user does "ip link set eth0 xdp off" and no prog is present on VSI; no need for that as it is handled by higher layer - drop the extack message for unknown xdp->command - use the smp_processor_id() for accessing the XDP Tx ring for XDP_TX action - don't leave the interface in downed state in case of any failure during the XDP Tx resources handling - undo rename of ice_build_ctob The above changes caused a ripple effect in patches 4 & 5 to update references to ice_build_ctob() which are now build_ctob() ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-05Merge branch '10GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 10GbE Intel Wired LAN Driver Updates 2019-11-04 This series contains old Halloween candy updates, yet still sweet, to fm10k, ixgbe and i40e. Jake adds the missing initializers for a couple of the TLV attribute macros. Added support for capturing and reporting statistics for all of the VFs in a given PF. Lastly, bump the version of the fm10k driver to reflect the recent changes. Alex addresses locality issues in the ixgbe driver when it is loaded on a system supporting multiple NUMA nodes. Manjunath Patil provides changes to the ixgbe driver, similar to those made to igb, to prevent transmit packets to request a hardware timestamp when the NIC has not been setup via the SIOCSHWTSTAMP ioctl. Alice adds support for x710 by adding the missing device id's in the appropriate places to ensure all the features are enabled in i40e. Jesse adds support for VF stats gathering in the i40e via the kernel via ndo_get_vf_stats function. v2: Fixed up commit id references in patch 5's description to align with how commit id's should be referenced. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-05Merge tag 'linux-can-fixes-for-5.4-20191105' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2019-11-05 this is a pull request of 33 patches for net/master. In the first patch Wen Yang's patch adds a missing of_node_put() to CAN device infrastructure. Navid Emamdoost's patch for the gs_usb driver fixes a memory leak in the gs_can_open() error path. Johan Hovold provides two patches, one for the mcba_usb, the other for the usb_8dev driver. Both fix a use-after-free after USB-disconnect. Joakim Zhang's patch improves the flexcan driver, the ECC mechanism is now completely disabled instead of masking the interrupts. The next three patches all target the peak_usb driver. Stephane Grosjean's patch fixes a potential out-of-sync while decoding packets, Johan Hovold's patch fixes a slab info leak, Jeroen Hofstee's patch adds missing reporting of bus off recovery events. Followed by three patches for the c_can driver. Kurt Van Dijck's patch fixes detection of potential missing status IRQs, Jeroen Hofstee's patches add a chip reset on open and add missing reporting of bus off recovery events. Appana Durga Kedareswara rao's patch for the xilinx driver fixes the flags field initialization for axi CAN. The next seven patches target the rx-offload helper, they are by me and Jeroen Hofstee. The error handling in case of a queue overflow is fixed removing a memory leak. Further the error handling in case of queue overflow and skb OOM is cleaned up. The next two patches are by me and target the flexcan and ti_hecc driver. In case of a error during can_rx_offload_queue_sorted() the error counters in the drivers are incremented. Jeroen Hofstee provides 6 patches for the ti_hecc driver, which properly stop the device in ifdown, improve the rx-offload support (which hit mainline in v5.4-rc1), and add missing FIFO overflow and state change reporting. The following four patches target the j1939 protocol. Colin Ian King's patch fixes a memory leak in the j1939_sk_errqueue() handling. Three patches by Oleksij Rempel fix a memory leak on socket release and fix the EOMA packet in the transport protocol. Timo Schlüßler's patch fixes a potential race condition in the mcp251x driver on after suspend. The last patch is by Yegor Yefremov and updates the SPDX-License-Identifier to v3.0. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06nvme: change nvme_passthru_cmd64 to explicitly mark rsvdCharles Machalow
Changing nvme_passthru_cmd64 to add a field: rsvd2. This field is an explicit marker for the padding space added on certain platforms as a result of the enlargement of the result field from 32 bit to 64 bits in size, and fixes differences in struct size when using compat ioctl for 32-bit binaries on 64-bit architecture. Fixes: 65e68edce0db ("nvme: allow 64-bit results in passthru commands") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Charles Machalow <csm10495@gmail.com> [changelog] Signed-off-by: Keith Busch <kbusch@kernel.org>
2019-11-05ALSA: hda: hdmi - add Tigerlake supportKai Vehmanen
Add Tigerlake HDMI codec support. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=205379 BugLink: https://bugs.freedesktop.org/show_bug.cgi?id=112171 Cc: Pan Xiuli <xiuli.pan@linux.intel.com> Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Link: https://lore.kernel.org/r/20191105161053.22958-1-kai.vehmanen@linux.intel.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-11-05ASoC: max98373: replace gpio_request with devm_gpio_requestYong Zhi
Use devm_gpio_request() to automatic unroll when fails and avoid resource leaks at error paths. Signed-off-by: Yong Zhi <yong.zhi@intel.com> Link: https://lore.kernel.org/r/1572905399-22402-1-git-send-email-yong.zhi@intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2019-11-05ASoC: stm32: sai: add restriction on mmap supportOlivier Moysan
Do not support mmap in S/PDIF mode. In S/PDIF mode the buffer has to be copied, to allow the channel status bits insertion. Signed-off-by: Olivier Moysan <olivier.moysan@st.com> Link: https://lore.kernel.org/r/20191104133654.28750-1-olivier.moysan@st.com Signed-off-by: Mark Brown <broonie@kernel.org>
2019-11-05Merge tag 'for-linus-2019-11-05' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull clone3 stack argument update from Christian Brauner: "This changes clone3() to do basic stack validation and to set up the stack depending on whether or not it is growing up or down. With clone3() the expectation is now very simply that the .stack argument points to the lowest address of the stack and that .stack_size specifies the initial stack size. This is diferent from legacy clone() where the "stack" argument had to point to the lowest or highest address of the stack depending on the architecture. clone3() was released with 5.3. Currently, it is not documented and very unclear to userspace how the stack and stack_size argument have to be passed. After talking to glibc folks we concluded that changing clone3() to determine stack direction and doing basic validation is the right course of action. Note, this is a potentially user visible change. In the very unlikely case, that it breaks someone's use-case we will revert. (And then e.g. place the new behavior under an appropriate flag.) Note that passing an empty stack will continue working just as before. Breaking someone's use-case is very unlikely. Neither glibc nor musl currently expose a wrapper for clone3(). There is currently also no real motivation for anyone to use clone3() directly. First, because using clone{3}() with stacks requires some assembly (see glibc and musl). Second, because it does not provide features that legacy clone() doesn't. New features for clone3() will first happen in v5.5 which is why v5.4 is still a good time to try and make that change now and backport it to v5.3. I did a codesearch on https://codesearch.debian.net, github, and gitlab and could not find any software currently relying directly on clone3(). I expect this to change once we land CLONE_CLEAR_SIGHAND which was a request coming from glibc at which point they'll likely start using it" * tag 'for-linus-2019-11-05' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: clone3: validate stack arguments
2019-11-05Merge tag 'gpio-v5.4-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: "More GPIO fixes! We found a late regression in the Intel Merrifield driver. Oh well. We fixed it up. - Fix a build error in the tools used for kselftest - A series of reverts to bring the Intel Merrifield back to working. We will likely unrevert the reverts for v5.5 but we can't have v5.4 broken" * tag 'gpio-v5.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: Revert "gpio: merrifield: Pass irqchip when adding gpiochip" Revert "gpio: merrifield: Restore use of irq_base" Revert "gpio: merrifield: Move hardware initialization to callback" tools: gpio: Use !building_out_of_srctree to determine srctree
2019-11-06nvme-multipath: fix crash in nvme_mpath_clear_ctrl_pathsAnton Eidelman
nvme_mpath_clear_ctrl_paths() iterates through the ctrl->namespaces list while holding ctrl->scan_lock. This does not seem to be the correct way of protecting from concurrent list modification. Specifically, nvme_scan_work() sorts ctrl->namespaces AFTER unlocking scan_lock. This may result in the following (rare) crash in ctrl disconnect during scan_work: BUG: kernel NULL pointer dereference, address: 0000000000000050 Oops: 0000 [#1] SMP PTI CPU: 0 PID: 3995 Comm: nvme 5.3.5-050305-generic RIP: 0010:nvme_mpath_clear_current_path+0xe/0x90 [nvme_core] ... Call Trace: nvme_mpath_clear_ctrl_paths+0x3c/0x70 [nvme_core] nvme_remove_namespaces+0x35/0xe0 [nvme_core] nvme_do_delete_ctrl+0x47/0x90 [nvme_core] nvme_sysfs_delete+0x49/0x60 [nvme_core] dev_attr_store+0x17/0x30 sysfs_kf_write+0x3e/0x50 kernfs_fop_write+0x11e/0x1a0 __vfs_write+0x1b/0x40 vfs_write+0xb9/0x1a0 ksys_write+0x67/0xe0 __x64_sys_write+0x1a/0x20 do_syscall_64+0x5a/0x130 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f8d02bfb154 Fix: After taking scan_lock in nvme_mpath_clear_ctrl_paths() down_read(&ctrl->namespaces_rwsem) as well to make list traversal safe. This will not cause deadlocks because taking scan_lock never happens while holding the namespaces_rwsem. Moreover, scan work downs namespaces_rwsem in the same order. Alternative: sort ctrl->namespaces in nvme_scan_work() while still holding the scan_lock. This would leave nvme_mpath_clear_ctrl_paths() without correct protection against ctrl->namespaces modification by anyone other than scan_work. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anton Eidelman <anton@lightbitslabs.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2019-11-06nvme-rdma: fix a segmentation fault during module unloadMax Gurtovoy
In case there are controllers that are not associated with any RDMA device (e.g. during unsuccessful reconnection) and the user will unload the module, these controllers will not be freed and will access already freed memory. The same logic appears in other fabric drivers as well. Fixes: 87fd125344d6 ("nvme-rdma: remove redundant reference between ib_device and tagset") Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2019-11-05clone3: validate stack argumentsChristian Brauner
Validate the stack arguments and setup the stack depening on whether or not it is growing down or up. Legacy clone() required userspace to know in which direction the stack is growing and pass down the stack pointer appropriately. To make things more confusing microblaze uses a variant of the clone() syscall selected by CONFIG_CLONE_BACKWARDS3 that takes an additional stack_size argument. IA64 has a separate clone2() syscall which also takes an additional stack_size argument. Finally, parisc has a stack that is growing upwards. Userspace therefore has a lot nasty code like the following: #define __STACK_SIZE (8 * 1024 * 1024) pid_t sys_clone(int (*fn)(void *), void *arg, int flags, int *pidfd) { pid_t ret; void *stack; stack = malloc(__STACK_SIZE); if (!stack) return -ENOMEM; #ifdef __ia64__ ret = __clone2(fn, stack, __STACK_SIZE, flags | SIGCHLD, arg, pidfd); #elif defined(__parisc__) /* stack grows up */ ret = clone(fn, stack, flags | SIGCHLD, arg, pidfd); #else ret = clone(fn, stack + __STACK_SIZE, flags | SIGCHLD, arg, pidfd); #endif return ret; } or even crazier variants such as [3]. With clone3() we have the ability to validate the stack. We can check that when stack_size is passed, the stack pointer is valid and the other way around. We can also check that the memory area userspace gave us is fine to use via access_ok(). Furthermore, we probably should not require userspace to know in which direction the stack is growing. It is easy for us to do this in the kernel and I couldn't find the original reasoning behind exposing this detail to userspace. /* Intentional user visible API change */ clone3() was released with 5.3. Currently, it is not documented and very unclear to userspace how the stack and stack_size argument have to be passed. After talking to glibc folks we concluded that trying to change clone3() to setup the stack instead of requiring userspace to do this is the right course of action. Note, that this is an explicit change in user visible behavior we introduce with this patch. If it breaks someone's use-case we will revert! (And then e.g. place the new behavior under an appropriate flag.) Breaking someone's use-case is very unlikely though. First, neither glibc nor musl currently expose a wrapper for clone3(). Second, there is no real motivation for anyone to use clone3() directly since it does not provide features that legacy clone doesn't. New features for clone3() will first happen in v5.5 which is why v5.4 is still a good time to try and make that change now and backport it to v5.3. Searches on [4] did not reveal any packages calling clone3(). [1]: https://lore.kernel.org/r/CAG48ez3q=BeNcuVTKBN79kJui4vC6nw0Bfq6xc-i0neheT17TA@mail.gmail.com [2]: https://lore.kernel.org/r/20191028172143.4vnnjpdljfnexaq5@wittgenstein [3]: https://github.com/systemd/systemd/blob/5238e9575906297608ff802a27e2ff9effa3b338/src/basic/raw-clone.h#L31 [4]: https://codesearch.debian.net Fixes: 7f192e3cd316 ("fork: add clone3") Cc: Kees Cook <keescook@chromium.org> Cc: Jann Horn <jannh@google.com> Cc: David Howells <dhowells@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Florian Weimer <fweimer@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-api@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: <stable@vger.kernel.org> # 5.3 Cc: GNU C Library <libc-alpha@sourceware.org> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Aleksa Sarai <cyphar@cyphar.com> Link: https://lore.kernel.org/r/20191031113608.20713-1-christian.brauner@ubuntu.com
2019-11-05ceph: don't allow copy_file_range when stripe_count != 1Luis Henriques
copy_file_range tries to use the OSD 'copy-from' operation, which simply performs a full object copy. Unfortunately, the implementation of this system call assumes that stripe_count is always set to 1 and doesn't take into account that the data may be striped across an object set. If the file layout has stripe_count different from 1, then the destination file data will be corrupted. For example: Consider a 8 MiB file with 4 MiB object size, stripe_count of 2 and stripe_size of 2 MiB; the first half of the file will be filled with 'A's and the second half will be filled with 'B's: 0 4M 8M Obj1 Obj2 +------+------+ +----+ +----+ file: | AAAA | BBBB | | AA | | AA | +------+------+ |----| |----| | BB | | BB | +----+ +----+ If we copy_file_range this file into a new file (which needs to have the same file layout!), then it will start by copying the object starting at file offset 0 (Obj1). And then it will copy the object starting at file offset 4M -- which is Obj1 again. Unfortunately, the solution for this is to not allow remote object copies to be performed when the file layout stripe_count is not 1 and simply fallback to the default (VFS) copy_file_range implementation. Cc: stable@vger.kernel.org Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2019-11-05ceph: don't try to handle hashed dentries in non-O_CREAT atomic_openJeff Layton
If ceph_atomic_open is handed a !d_in_lookup dentry, then that means that it already passed d_revalidate so we *know* that it's negative (or at least was very recently). Just return -ENOENT in that case. This also addresses a subtle bug in dentry handling. Non-O_CREAT opens call atomic_open with the parent's i_rwsem shared, but calling d_splice_alias on a hashed dentry requires the exclusive lock. If ceph_atomic_open receives a hashed, negative dentry on a non-O_CREAT open, and another client were to race in and create the file before we issue our OPEN, ceph_fill_trace could end up calling d_splice_alias on the dentry with the new inode with insufficient locks. Cc: stable@vger.kernel.org Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2019-11-05ALSA: hda/ca0132 - Fix possible workqueue stallTakashi Iwai
The unsolicited event handler for the headphone jack on CA0132 codec driver tries to reschedule the another delayed work with cancel_delayed_work_sync(). It's no good idea, unfortunately, especially after we changed the work queue to the standard global one; this may lead to a stall because both works are using the same global queue. Fix it by dropping the _sync but does call cancel_delayed_work() instead. Fixes: 993884f6a26c ("ALSA: hda/ca0132 - Delay HP amp turnon.") BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1155836 Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20191105134316.19294-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-11-05scripts/nsdeps: make sure to pass all module source files to spatchJessica Yu
The nsdeps script passes a list of the module source files to generate_deps_for_ns() as a space delimited string named $mod_source_files, which then passes it to spatch. But since $mod_source_files is not encased in quotes, each source file in that string is treated as a separate shell function argument (as $2, $3, $4, etc.). However, the spatch invocation only refers to $2, so only the first file out of $mod_source_files is processed by spatch. This causes problems (namely, the MODULE_IMPORT_NS() statement doesn't get inserted) when a module is composed of many source files and the "main" module file containing the MODULE_LICENSE() statement is not the first file listed in $mod_source_files. Fix this by encasing $mod_source_files in quotes so that the entirety of the string is treated as a single argument and can be referred to as $2. In addition, put quotes in the variable assignment of mod_source_files to prevent any shell interpretation and field splitting. Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Matthias Maennich <maennich@google.com> Signed-off-by: Jessica Yu <jeyu@kernel.org>
2019-11-05can: don't use deprecated license identifiersYegor Yefremov
The "GPL-2.0" license identifier changed to "GPL-2.0-only" in SPDX v3.0. Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-11-05can: mcp251x: mcp251x_restart_work_handler(): Fix potential force_quit race ↵Timo Schlüßler
condition In mcp251x_restart_work_handler() the variable to stop the interrupt handler (priv->force_quit) is reset after the chip is restarted and thus a interrupt might occur. This patch fixes the potential race condition by resetting force_quit before enabling interrupts. Signed-off-by: Timo Schlüßler <schluessler@krause.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-11-04drm/i915/dp: Do not switch aux to TBT mode for non-TC portsJosé Roberto de Souza
Non-TC ports always have tc_mode == TC_PORT_TBT_ALT so it was switching aux to TBT mode for all combo-phy ports, happily this did not caused any issue but is better follow BSpec. Also this is reserved bit before ICL. Cc: Imre Deak <imre.deak@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Fixes: e9b7e1422d40 ("drm/i915: Sanitize the terminology used for TypeC port modes") Reviewed-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191029011014.286885-1-jose.souza@intel.com (cherry picked from commit 49748264826ff4cc7f0ebbdd6b0d1a36b13b1cee) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-11-04drm/i915: Avoid HPD poll detect triggering a new detect cycleImre Deak
For the HPD interrupt functionality the HW depends on power wells in the display core domain to be on. Accordingly when enabling these power wells the HPD polling logic will force an HPD detection cycle to account for hotplug events that may have happened when such a power well was off. Thus a detect cycle started by polling could start a new detect cycle if a power well in the display core domain gets enabled during detect and stays enabled after detect completes. That in turn can lead to a detection cycle runaway. To prevent re-triggering a poll-detect cycle make sure we drop all power references we acquired during detect synchronously by the end of detect. This will let the poll-detect logic continue with polling (matching the off state of the corresponding power wells) instead of scheduling a new detection cycle. Fixes: 6cfe7ec02e85 ("drm/i915: Remove the unneeded AUX power ref from intel_dp_detect()") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112125 Reported-and-tested-by: Val Kulkov <val.kulkov@gmail.com> Reported-and-tested-by: wangqr <wqr.prg@gmail.com> Cc: Val Kulkov <val.kulkov@gmail.com> Cc: wangqr <wqr.prg@gmail.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191028181517.22602-1-imre.deak@intel.com (cherry picked from commit a8ddac7c9f06a12227a4f5febd1cbe0575a33179) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-11-04i40e: implement VF stats NDOJesse Brandeburg
Implement the VF stats gathering via the kernel via ndo_get_vf_stats(). The driver will show per-VF stats in the output of the command: ip -s link show dev <PF> Testing Hints: ip -s link show dev eth0 will return non-zero VF stats. ... vf 0 MAC 00:55:aa:00:55:aa, spoof checking on, link-state enable, trust off RX: bytes packets mcast bcast 128000 1000 104 104 TX: bytes packets 128000 1000 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-04i40e: enable X710 supportAlice Michael
The I40E_DEV_ID_10G_BASE_T_BC device id was added previously, but was not enabled in all the appropriate places. Adding it to enable it's use. Signed-off-by: Alice Michael <alice.michael@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-04ixgbe: protect TX timestamping from API misuseManjunath Patil
HW timestamping can only be requested for a packet if the NIC is first setup via ioctl(SIOCSHWTSTAMP). If this step was skipped, then the ixgbe driver still allowed TX packets to request HW timestamping. In this situation, we see 'clearing Tx Timestamp hang' noise in the log. Fix this by checking that the NIC is configured for HW TX timestamping before accepting a HW TX timestamping request. Similar-to: commit 26bd4e2db06b ("igb: protect TX timestamping from API misuse") commit 0a6f2f05a2f5 ("igb: Fix a test with HWTSTAMP_TX_ON") Signed-off-by: Manjunath Patil <manjunath.b.patil@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-04fm10k: update driver version to match out-of-treeJacob Keller
An upcoming out-of-tree release will be occurring which will include the recent functionality to support virtual function statistics. Update the kernel driver version to match this. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-04ixgbe: Make use of cpumask_local_spread to improve RSS localityAlexander Duyck
This patch is meant to address locality issues present in the ixgbe driver when it is loaded on a system supporting multiple NUMA nodes and more CPUs then the device can map in a 1:1 fashion. Instead of just arbitrarily mapping itself to CPUs 0-62 it would make much more sense to map itself to the local CPUs first, and then map itself to any remaining CPUs that might be used. The first effect of this is that queue 0 should always be allocated on the local CPU/NUMA node. This is important as it is the default destination if a packet doesn't match any existing flow director filter or RSS rule and as such having it local should help to reduce QPI cross-talk in the event of an unrecognized traffic type. In addition this should increase the likelihood of the RSS queues being allocated and used on CPUs local to the device while the ATR/Flow Director queues would be able to route traffic directly to the CPU that is likely to be processing it. Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-04fm10k: add support for ndo_get_vf_stats operationJacob Keller
Support capturing and reporting statistics for all of the VFs associated with a given PF device via the ndo_get_vf_stats callback. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-04fm10k: add missing field initializers to TLV attributes)Jacob Keller
Add the missing field initializers for a couple of the TLV attribute macros. This resolves the last few -Wmissing-field-initializers warnings for the fm10k Linux driver. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-04ice: allow 3k MTU for XDPMaciej Fijalkowski
At this point ice driver is able to work on order 1 pages that are split onto two 3k buffers. Let's reflect that when user is setting new MTU size and XDP is present on interface. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-04ice: add build_skb() supportMaciej Fijalkowski
Driver is now prepared for building the skb around the existing Rx buffer, so introduce the ice_build_skb responsible for it. Make use of XDP's data_meta as well. I've observed around 30% less CPU consumption with build_skb Rx path, in comparison to legacy Rx. What stands behind such result is the avoidance of flow_dissector (which we were diving into via eth_get_headlen) and no memcpy calls. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-04ice: introduce frame padding computation logicMaciej Fijalkowski
Take into account the underlying architecture specific settings and based on that calculate the possible padding that can be supplied. Typically, for x86 and standard MTU size we will end up with 192 bytes of headroom. This is the same behavior as our other drivers have and we can dedicate it for XDP purposes. Furthermore, introduce the Rx ring flag for indicating whether build_skb is used on particular. Based on that invoke the routines for padding calculation. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-04ice: introduce legacy Rx flagMaciej Fijalkowski
Add an ethtool "legacy-rx" priv flag for toggling the Rx path. This control knob will be mainly used for build_skb usage as well as buffer size/MTU manipulation. In preparation for adding build_skb support in a way that it takes care of how we set the values of max_frame and rx_buf_len fields of struct ice_vsi. Specifically, in this patch mentioned fields are set to values that will allow us to provide headroom and tailroom in-place. This can be mostly broken down onto following: - for legacy-rx "on" ethtool control knob, old behaviour is kept; - for standard 1500 MTU size configure the buffer of size 1536, as network stack is expecting the NET_SKB_PAD to be provided and NET_IP_ALIGN can have a non-zero value (these can be typically equal to 32 and 2, respectively); - for larger MTUs go with max_frame set to 9k and configure the 3k buffer in case when PAGE_SIZE of underlying arch is less than 8k; 3k buffer is implying the need for order 1 page, so that our page recycling scheme can still be applied; With that said, substitute the hardcoded ICE_RXBUF_2048 and PAGE_SIZE values in DMA API that we're making use of with rx_ring->rx_buf_len and ice_rx_pg_size(rx_ring). The latter is an introduced helper for determining the page size based on its order (which was figured out via ice_rx_pg_order). Last but not least, take care of truesize calculation. In the followup patch the headroom/tailroom computation logic will be introduced. This change aligns the buffer and frame configuration with other Intel drivers, most importantly with iavf. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-04can: j1939: transport: j1939_xtp_rx_eoma_one(): Add sanity check for correct ↵Oleksij Rempel
total message size We were sending malformed EOMA with total message size set to 0. This issue has been fixed in the previous patch. In this patch a sanity check is added to the RX path and a error message is displayed. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-11-04can: j1939: transport: j1939_session_fresh_new(): make sure EOMA is send ↵Oleksij Rempel
with the total message size set We were sending malformed EOMA messageswith total message size set to 0. This patch fixes the bug. Reported-by: https://github.com/linux-can/can-utils/issues/159 Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-11-04can: j1939: fix memory leak if filters was setOleksij Rempel
Filters array is coped from user space and linked to the j1939 socket. On socket release this memory was not freed. Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-11-04can: j1939: fix resource leak of skb on error return pathsColin Ian King
Currently the error return paths do not free skb and this results in a memory leak. Fix this by freeing them before the return. Addresses-Coverity: ("Resource leak") Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-11-04can: ti_hecc: add missing state changesJeroen Hofstee
While the ti_hecc has interrupts to report when the error counters increase to a certain level and which change state it doesn't handle the case that the error counters go down again, so the reported state can actually be wrong. Since there is no interrupt for that, do update state based on the error counters, when the state is not error active and goes down again. Signed-off-by: Jeroen Hofstee <jhofstee@victronenergy.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-11-04can: ti_hecc: properly report state changesJeroen Hofstee
The HECC_CANES register handles the flags specially, it only updates the flags after a one is written to them. Since the interrupt for frame errors is not enabled an old error can hence been seen when a state interrupt arrives. For example if the device is not connected to the CAN-bus the error warning interrupt will have HECC_CANES indicating there is no ack. The error passive interrupt thereafter will have HECC_CANES flagging that there is a warning level. And if thereafter there is a message successfully send HECC_CANES points to an error passive event, while in reality it became error warning again. In summary, the state is not always reported correctly. So handle the state changes and frame errors separately. The state changes are now based on the interrupt flags and handled directly when they occur. The reporting of the frame errors is still done as before, as a side effect of another interrupt. note: the hecc_clear_bit will do a read, modify, write. So it will not only clear the bit, but also reset all other bits being set as a side affect, hence it is replaced with only clearing the flags. note: The HECC_CANMC_CCR is no longer cleared in the state change interrupt, it is completely unrelated. And use net_ratelimit to make checkpatch happy. Signed-off-by: Jeroen Hofstee <jhofstee@victronenergy.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-11-04can: ti_hecc: add fifo overflow error reportingJeroen Hofstee
When the rx FIFO overflows the ti_hecc would silently drop them since the overwrite protection is enabled for all mailboxes. So disable it for the lowest priority mailbox and return a proper error value when receive message lost is set. Drop the message itself in that case, since it might be partially updated. Signed-off-by: Jeroen Hofstee <jhofstee@victronenergy.com> Acked-by: Jeroen Hofstee <jhofstee@victronenergy.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-11-04can: ti_hecc: release the mailbox a bit earlierJeroen Hofstee
Release the mailbox after reading it, so it can be reused a bit earlier. Since "can: rx-offload: continue on error" all pending message bits are cleared directly, so remove clearing them in ti_hecc. Suggested-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Jeroen Hofstee <jhofstee@victronenergy.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-11-04can: ti_hecc: keep MIM and MD setJeroen Hofstee
The HECC_CANMIM is set in the xmit path and cleared in the interrupt. Since this is done with a read, modify, write action the register might end up with some more MIM enabled then intended, since it is not protected. That doesn't matter at all, since the tx interrupt disables the mailbox with HECC_CANME (while holding a spinlock). So lets just always keep MIM set. While at it, since the mailbox direction never changes, don't set it every time a message is send, ti_hecc_reset() already sets them to tx. Signed-off-by: Jeroen Hofstee <jhofstee@victronenergy.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-11-04can: ti_hecc: ti_hecc_stop(): stop the CPK on downJeroen Hofstee
When the interface goes down, the CPK should no longer take an active part in the CAN-bus communication, like sending acks and error frames. So enable configuration mode in ti_hecc_stop, so the CPK is no longer active. When a transceiver switch is present the acks and errors don't make it to the bus, but disabling the CPK then does prevent oddities, like ti_hecc_reset() failing, since the CPK can become bus-off and starts counting the 11 bit recessive bits, which seems to block the reset. It can also cause invalid interrupts and disrupt the CAN-bus, since transmission can be stopped in the middle of a message, by disabling the tranceiver while the CPK is sending. Since the CPK is disabled after normal power on, it is typically only seen when the interface is restarted. Signed-off-by: Jeroen Hofstee <jhofstee@victronenergy.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-11-04can: ti_hecc: ti_hecc_error(): increase error counters if skb enqueueing via ↵Marc Kleine-Budde
can_rx_offload_queue_sorted() fails The call to can_rx_offload_queue_sorted() may fail and return an error (in the current implementation due to resource shortage). The passed skb is consumed. This patch adds incrementing of the appropriate error counters to let the device statistics reflect that there's a problem. Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-11-04can: flexcan: increase error counters if skb enqueueing via ↵Marc Kleine-Budde
can_rx_offload_queue_sorted() fails The call to can_rx_offload_queue_sorted() may fail and return an error (in the current implementation due to resource shortage). The passed skb is consumed. This patch adds incrementing of the appropriate error counters to let the device statistics reflect that there's a problem. Reported-by: Martin Hundebøll <martin@geanix.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-11-04can: rx-offload: can_rx_offload_irq_offload_fifo(): continue on errorMarc Kleine-Budde
In case of a resource shortage, i.e. the rx_offload queue will overflow or a skb fails to be allocated (due to OOM), can_rx_offload_offload_one() will call mailbox_read() to discard the mailbox and return an ERR_PTR. If the hardware FIFO is empty can_rx_offload_offload_one() will return NULL. In case a CAN frame was read from the hardware, can_rx_offload_offload_one() returns the skb containing it. Without this patch can_rx_offload_irq_offload_fifo() bails out if no skb returned, regardless of the reason. Similar to can_rx_offload_irq_offload_timestamp() in case of a resource shortage the whole FIFO should be discarded, to avoid an IRQ storm and give the system some time to recover. However if the FIFO is empty the loop can be left. With this patch the loop is left in case of empty FIFO, but not on errors. Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>