summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2018-08-06KVM: X86: Implement "send IPI" hypercallWanpeng Li
Using hypercall to send IPIs by one vmexit instead of one by one for xAPIC/x2APIC physical mode and one vmexit per-cluster for x2APIC cluster mode. Intel guest can enter x2apic cluster mode when interrupt remmaping is enabled in qemu, however, latest AMD EPYC still just supports xapic mode which can get great improvement by Exit-less IPIs. This patchset lets a guest send multicast IPIs, with at most 128 destinations per hypercall in 64-bit mode and 64 vCPUs per hypercall in 32-bit mode. Hardware: Xeon Skylake 2.5GHz, 2 sockets, 40 cores, 80 threads, the VM is 80 vCPUs, IPI microbenchmark(https://lkml.org/lkml/2017/12/19/141): x2apic cluster mode, vanilla Dry-run: 0, 2392199 ns Self-IPI: 6907514, 15027589 ns Normal IPI: 223910476, 251301666 ns Broadcast IPI: 0, 9282161150 ns Broadcast lock: 0, 8812934104 ns x2apic cluster mode, pv-ipi Dry-run: 0, 2449341 ns Self-IPI: 6720360, 15028732 ns Normal IPI: 228643307, 255708477 ns Broadcast IPI: 0, 7572293590 ns => 22% performance boost Broadcast lock: 0, 8316124651 ns x2apic physical mode, vanilla Dry-run: 0, 3135933 ns Self-IPI: 8572670, 17901757 ns Normal IPI: 226444334, 255421709 ns Broadcast IPI: 0, 19845070887 ns Broadcast lock: 0, 19827383656 ns x2apic physical mode, pv-ipi Dry-run: 0, 2446381 ns Self-IPI: 6788217, 15021056 ns Normal IPI: 219454441, 249583458 ns Broadcast IPI: 0, 7806540019 ns => 154% performance boost Broadcast lock: 0, 9143618799 ns Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-06KVM: x86: Add tlb remote flush callback in kvm_x86_ops.Tianyu Lan
This patch is to provide a way for platforms to register hv tlb remote flush callback and this helps to optimize operation of tlb flush among vcpus for nested virtualization case. Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-06kvm: nVMX: Introduce KVM_CAP_NESTED_STATEJim Mattson
For nested virtualization L0 KVM is managing a bit of state for L2 guests, this state can not be captured through the currently available IOCTLs. In fact the state captured through all of these IOCTLs is usually a mix of L1 and L2 state. It is also dependent on whether the L2 guest was running at the moment when the process was interrupted to save its state. With this capability, there are two new vcpu ioctls: KVM_GET_NESTED_STATE and KVM_SET_NESTED_STATE. These can be used for saving and restoring a VM that is in VMX operation. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Jim Mattson <jmattson@google.com> [karahmed@ - rename structs and functions and make them ready for AMD and address previous comments. - handle nested.smm state. - rebase & a bit of refactoring. - Merge 7/8 and 8/8 into one patch. ] Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-06KVM: Switch 'requests' to be 64-bit (explicitly)KarimAllah Ahmed
Switch 'requests' to be explicitly 64-bit and update BUILD_BUG_ON check to use the size of "requests" instead of the hard-coded '32'. That gives us a bit more room again for arch-specific requests as we already ran out of space for x86 due to the hard-coded check. The only exception here is ARM32 as it is still 32-bits. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim KrÄmář <rkrcmar@redhat.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-06Merge tag 'v4.18-rc6' into HEADPaolo Bonzini
Pull bug fixes into the KVM development tree to avoid nasty conflicts.
2018-08-06btrfs: Get rid of the confusing btrfs_file_extent_inline_lenQu Wenruo
We used to call btrfs_file_extent_inline_len() to get the uncompressed data size of an inlined extent. However this function is hiding evil, for compressed extent, it has no choice but to directly read out ram_bytes from btrfs_file_extent_item. While for uncompressed extent, it uses item size to calculate the real data size, and ignoring ram_bytes completely. In fact, for corrupted ram_bytes, due to above behavior kernel btrfs_print_leaf() can't even print correct ram_bytes to expose the bug. Since we have the tree-checker to verify all EXTENT_DATA, such mismatch can be detected pretty easily, thus we can trust ram_bytes without the evil btrfs_file_extent_inline_len(). Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: remove the logged extents infrastructureJosef Bacik
This is no longer used anywhere, remove all of it. Signed-off-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06PM / reboot: Eliminate race between reboot and suspendPingfan Liu
At present, "systemctl suspend" and "shutdown" can run in parrallel. A system can suspend after devices_shutdown(), and resume. Then the shutdown task goes on to power off. This causes many devices are not really shut off. Hence replacing reboot_mutex with system_transition_mutex (renamed from pm_mutex) to achieve the exclusion. The renaming of pm_mutex as system_transition_mutex can be better to reflect the purpose of the mutex. Signed-off-by: Pingfan Liu <kernelfans@gmail.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-08-06aio: implement IOCB_CMD_POLLChristoph Hellwig
Simple one-shot poll through the io_submit() interface. To poll for a file descriptor the application should submit an iocb of type IOCB_CMD_POLL. It will poll the fd for the events specified in the the first 32 bits of the aio_buf field of the iocb. Unlike poll or epoll without EPOLLONESHOT this interface always works in one shot mode, that is once the iocb is completed, it will have to be resubmitted. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Avi Kivity <avi@scylladb.com>
2018-08-06Merge back cpufreq changes for 4.19.Rafael J. Wysocki
2018-08-06Merge remote-tracking branch 'net-next/master'Stefan Schmidt
2018-08-05Merge branch 'nvme-4.19' of git://git.infradead.org/nvme into for-4.19/block2Jens Axboe
Pull NVMe changes from Christoph: "This contains the support for TP4004, Asymmetric Namespace Access, which makes NVMe multipathing usable in practice." * 'nvme-4.19' of git://git.infradead.org/nvme: nvmet: use Retain Async Event bit to clear AEN nvmet: support configuring ANA groups nvmet: add minimal ANA support nvmet: track and limit the number of namespaces per subsystem nvmet: keep a port pointer in nvmet_ctrl nvme: add ANA support nvme: remove nvme_req_needs_failover nvme: simplify the API for getting log pages nvme.h: add ANA definitions nvme.h: add support for the log specific field Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-05Merge tag 'v4.18-rc6' into for-4.19/block2Jens Axboe
Pull in 4.18-rc6 to get the NVMe core AEN change to avoid a merge conflict down the line. Signed-of-by: Jens Axboe <axboe@kernel.dk>
2018-08-05Merge branch 'for-upstream' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2018-08-05 Here's the main bluetooth-next pull request for the 4.19 kernel. - Added support for Bluetooth Advertising Extensions - Added vendor driver support to hci_h5 HCI driver - Added serdev support to hci_h5 driver - Added support for Qualcomm wcn3990 controller - Added support for RTL8723BS and RTL8723DS controllers - btusb: Added new ID for Realtek 8723DE - Several other smaller fixes & cleanups Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05ip: use rb trees for IP frag queue.Peter Oskolkov
Similar to TCP OOO RX queue, it makes sense to use rb trees to store IP fragments, so that OOO fragments are inserted faster. Tested: - a follow-up patch contains a rather comprehensive ip defrag self-test (functional) - ran neper `udp_stream -c -H <host> -F 100 -l 300 -T 20`: netstat --statistics Ip: 282078937 total packets received 0 forwarded 0 incoming packets discarded 946760 incoming packets delivered 18743456 requests sent out 101 fragments dropped after timeout 282077129 reassemblies required 944952 packets reassembled ok 262734239 packet reassembles failed (The numbers/stats above are somewhat better re: reassemblies vs a kernel without this patchset. More comprehensive performance testing TBD). Reported-by: Jann Horn <jannh@google.com> Reported-by: Juha-Matti Tilli <juha-matti.tilli@iki.fi> Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05net: modify skb_rbtree_purge to return the truesize of all purged skbs.Peter Oskolkov
Tested: see the next patch is the series. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05ip: discard IPv4 datagrams with overlapping segments.Peter Oskolkov
This behavior is required in IPv6, and there is little need to tolerate overlapping fragments in IPv4. This change simplifies the code and eliminates potential DDoS attack vectors. Tested: ran ip_defrag selftest (not yet available uptream). Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Florian Westphal <fw@strlen.de> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05ptp_qoriq: support automatic configuration for ptp timerYangbo Lu
This patch is to support automatic configuration for ptp timer. If required ptp dts properties are not provided, driver could try to calculate a set of default configurations to initialize the ptp timer. This makes the driver work for many boards which don't have the required ptp dts properties in current kernel. Also the users could set dts properties by themselves according to their requirement. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree: 1) Support for transparent proxying for nf_tables, from Mate Eckl. 2) Patchset to add OS passive fingerprint recognition for nf_tables, from Fernando Fernandez. This takes common code from xt_osf and place it into the new nfnetlink_osf module for codebase sharing. 3) Lightweight tunneling support for nf_tables. 4) meta and lookup are likely going to be used in rulesets, make them direct calls. From Florian Westphal. A bunch of incremental updates: 5) use PTR_ERR_OR_ZERO() from nft_numgen, from YueHaibing. 6) Use kvmalloc_array() to allocate hashtables, from Li RongQing. 7) Explicit dependencies between nfnetlink_cttimeout and conntrack timeout extensions, from Harsha Sharma. 8) Simplify NLM_F_CREATE handling in nf_tables. 9) Removed unused variable in the get element command, from YueHaibing. 10) Expose bridge hook priorities through uapi, from Mate Eckl. And a few fixes for previous Netfilter batch for net-next: 11) Use per-netns mutex from flowtable event, from Florian Westphal. 12) Remove explicit dependency on iptables CT target from conntrack zones, from Florian. 13) Fix use-after-free in rmmod nf_conntrack path, also from Florian. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Lots of overlapping changes, mostly trivial in nature. The mlxsw conflict was resolving using the example resolution at: https://github.com/jpirko/linux_mlxsw/blob/combined_queue/drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_actions.c Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05Merge 4.18-rc7 into master to pick up the KVM dependcyThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2018-08-04ethtool: Remove trailing semicolon for static inlineFlorian Fainelli
Android's header sanitization tool chokes on static inline functions having a trailing semicolon, leading to an incorrectly parsed header file. While the tool should obviously be fixed, also fix the header files for the two affected functions: ethtool_get_flow_spec_ring() and ethtool_get_flow_spec_ring_vf(). Fixes: 8cf6f497de40 ("ethtool: Add helper routines to pass vf to rx_flow_spec") Reporetd-by: Blair Prescott <blair.prescott@broadcom.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-04i2c: quirks: add zero length checksWolfram Sang
Some adapters do not support a message length of 0. Add this as a quirk so drivers don't have to open code it. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-08-04include/net/bond_3ad: Simplify the code by using the ARRAY_SIZEzhong jiang
We prefer to ARRAY_SIZE rather than the open code to calculate size. Signed-off-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-04Merge tag 'qcom-arm64-for-4.19-2' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux into next/dt Qualcomm ARM64 Updates for v4.19 - Part 2 * Add thermal nodes for MSM8996 and SDM845 * tag 'qcom-arm64-for-4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux: (21 commits) arm64: dts: sdm845: Add tsens nodes arm64: dts: msm8996: thermal: Initialise via DT and add second controller soc: qcom: rmtfs-mem: fix memleak in probe error paths soc: qcom: llc-slice: Add missing MODULE_LICENSE() drivers: qcom: rpmh: fix unwanted error check for get_tcs_of_type() drivers: qcom: rpmh-rsc: fix the loop index check in get_req_from_tcs firmware: qcom: scm: add a dummy qcom_scm_assign_mem() drivers: qcom: rpmh-rsc: Check cmd_db_ready() to help children drivers: qcom: rpmh-rsc: allow active requests from wake TCS drivers: qcom: rpmh: add support for batch RPMH request drivers: qcom: rpmh: allow requests to be sent asynchronously drivers: qcom: rpmh: cache sleep/wake state requests drivers: qcom: rpmh-rsc: allow invalidation of sleep/wake TCS drivers: qcom: rpmh-rsc: write sleep/wake requests to TCS drivers: qcom: rpmh: add RPMH helper functions drivers: qcom: rpmh-rsc: log RPMH requests in FTRACE dt-bindings: introduce RPMH RSC bindings for Qualcomm SoCs drivers: qcom: rpmh-rsc: add RPMH controller for QCOM SoCs drivers: soc: Add LLCC driver dt-bindings: Documentation for qcom, llcc ...
2018-08-03fork: Have new threads join on-going signal group stopsEric W. Biederman
There are only two signals that are delivered to every member of a signal group: SIGSTOP and SIGKILL. Signal delivery requires every signal appear to be delivered either before or after a clone syscall. SIGKILL terminates the clone so does not need to be considered. Which leaves only SIGSTOP that needs to be considered when creating new threads. Today in the event of a group stop TIF_SIGPENDING will get set and the fork will restart ensuring the fork syscall participates in the group stop. A fork (especially of a process with a lot of memory) is one of the most expensive system so we really only want to restart a fork when necessary. It is easy so check to see if a SIGSTOP is ongoing and have the new thread join it immediate after the clone completes. Making it appear the clone completed happened just before the SIGSTOP. The calculate_sigpending function will see the bits set in jobctl and set TIF_SIGPENDING to ensure the new task takes the slow path to userspace. V2: The call to task_join_group_stop was moved before the new task is added to the thread group list. This should not matter as sighand->siglock is held over both the addition of the threads, the call to task_join_group_stop and do_signal_stop. But the change is trivial and it is one less thing to worry about when reading the code. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-08-03fork: Skip setting TIF_SIGPENDING in ptrace_init_taskEric W. Biederman
The code in calculate_sigpending will now handle this so it is just redundant and possibly a little confusing to continue setting TIF_SIGPENDING in ptrace_init_task. Suggested-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-08-03signal: Add calculate_sigpending()Eric W. Biederman
Add a function calculate_sigpending to test to see if any signals are pending for a new task immediately following fork. Signals have to happen either before or after fork. Today our practice is to push all of the signals to before the fork, but that has the downside that frequent or periodic signals can make fork take much much longer than normal or prevent fork from completing entirely. So we need move signals that we can after the fork to prevent that. This updates the code to set TIF_SIGPENDING on a new task if there are signals or other activities that have moved so that they appear to happen after the fork. As the code today restarts if it sees any such activity this won't immediately have an effect, as there will be no reason for it to set TIF_SIGPENDING immediately after the fork. Adding calculate_sigpending means the code in fork can safely be changed to not always restart if a signal is pending. The new calculate_sigpending function sets sigpending if there are pending bits in jobctl, pending signals, the freezer needs to freeze the new task or the live kernel patching framework need the new thread to take the slow path to userspace. I have verified that setting TIF_SIGPENDING does make a new process take the slow path to userspace before it executes it's first userspace instruction. I have looked at the callers of signal_wake_up and the code paths setting TIF_SIGPENDING and I don't see anything else that needs to be handled. The code probably doesn't need to set TIF_SIGPENDING for the kernel live patching as it uses a separate thread flag as well. But at this point it seems safer reuse the recalc_sigpending logic and get the kernel live patching folks to sort out their story later. V2: I have moved the test into schedule_tail where siglock can be grabbed and recalc_sigpending can be reused directly. Further as the last action of setting up a new task this guarantees that TIF_SIGPENDING will be properly set in the new process. The helper calculate_sigpending takes the siglock and uncontitionally sets TIF_SIGPENDING and let's recalc_sigpending clear TIF_SIGPENDING if it is unnecessary. This allows reusing the existing code and keeps maintenance of the conditions simple. Oleg Nesterov <oleg@redhat.com> suggested the movement and pointed out the need to take siglock if this code was going to be called while the new task is discoverable. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-08-03new helper: inode_fake_hash()Al Viro
open-coded in a quite a few places... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-08-03media: vsp1: Support Interlaced display pipelinesKieran Bingham
Calculate the top and bottom fields for the interlaced frames and utilise the extended display list command feature to implement the auto-field operations. This allows the DU to update the VSP2 registers dynamically based upon the currently processing field. Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-08-03new primitive: discard_new_inode()Al Viro
We don't want open-by-handle picking half-set-up in-core struct inode from e.g. mkdir() having failed halfway through. In other words, we don't want such inodes returned by iget_locked() on their way to extinction. However, we can't just have them unhashed - otherwise open-by-handle immediately *after* that would've ended up creating a new in-core inode over the on-disk one that is in process of being freed right under us. Solution: new flag (I_CREATING) set by insert_inode_locked() and removed by unlock_new_inode() and a new primitive (discard_new_inode()) to be used by such halfway-through-setup failure exits instead of unlock_new_inode() / iput() combinations. That primitive unlocks new inode, but leaves I_CREATING in place. iget_locked() treats finding an I_CREATING inode as failure (-ESTALE, once we sort out the error propagation). insert_inode_locked() treats the same as instant -EBUSY. ilookup() treats those as icache miss. [Fix by Dan Carpenter <dan.carpenter@oracle.com> folded in] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-08-03rxrpc: Push iov_iter up from rxrpc_kernel_recv_data() to callerDavid Howells
Push iov_iter up from rxrpc_kernel_recv_data() to its caller to allow non-contiguous iovs to be passed down, thereby permitting file reading to be simplified in the AFS filesystem in a future patch. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-03netfilter: bridge: Expose nf_tables bridge hook priorities through uapiMáté Eckl
Netfilter exposes standard hook priorities in case of ipv4, ipv6 and arp but not in case of bridge. This patch exposes the hook priority values of the bridge family (which are different from the formerly mentioned) via uapi so that they can be used by user-space applications just like the others. Signed-off-by: Máté Eckl <ecklm94@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03netfilter: nf_tables: match on tunnel metadataPablo Neira Ayuso
This patch allows us to match on the tunnel metadata that is available of the packet. We can use this to validate if the packet comes from/goes to tunnel and the corresponding tunnel ID. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03netfilter: nf_tables: add tunnel supportPablo Neira Ayuso
This patch implements the tunnel object type that can be used to configure tunnels via metadata template through the existing lightweight API from the ingress path. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03l2tp: ignore L2TP_ATTR_MTUGuillaume Nault
This attribute's handling is broken. It can only be used when creating Ethernet pseudo-wires, in which case its value can be used as the initial MTU for the l2tpeth device. However, when handling update requests, L2TP_ATTR_MTU only modifies session->mtu. This value is never propagated to the l2tpeth device. Dump requests also return the value of session->mtu, which is not synchronised anymore with the device MTU. The same problem occurs if the device MTU is properly updated using the generic IFLA_MTU attribute. In this case, session->mtu is not updated, and L2TP_ATTR_MTU will report an invalid value again when dumping the session. It does not seem worthwhile to complexify l2tp_eth.c to synchronise session->mtu with the device MTU. Even the ip-l2tp manpage advises to use 'ip link' to initialise the MTU of l2tpeth devices (iproute2 does not handle L2TP_ATTR_MTU at all anyway). So let's just ignore it entirely. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-03netfilter: nfnetlink_osf: rename nf_osf header file to nfnetlink_osfFernando Fernandez Mancera
The first client of the nf_osf.h userspace header is nft_osf, coming in this batch, rename it to nfnetlink_osf.h as there are no userspace clients for this yet, hence this looks consistent with other nfnetlink subsystem. Suggested-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03netfilter: nf_osf: move nf_osf_fingers to non-uapi header fileFernando Fernandez Mancera
All warnings (new ones prefixed by >>): >> ./usr/include/linux/netfilter/nf_osf.h:73: userspace cannot reference function or variable defined in the kernel Fixes: f9324952088f ("netfilter: nfnetlink_osf: extract nfnetlink_subsystem code from xt_osf.c") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03netfilter: use kvmalloc_array to allocate memory for hashtableLi RongQing
nf_ct_alloc_hashtable is used to allocate memory for conntrack, NAT bysrc and expectation hashtable. Assuming 64k bucket size, which means 7th order page allocation, __get_free_pages, called by nf_ct_alloc_hashtable, will trigger the direct memory reclaim and stall for a long time, when system has lots of memory stress so replace combination of __get_free_pages and vzalloc with kvmalloc_array, which provides a overflow check and a fallback if no high order memory is available, and do not retry to reclaim memory, reduce stall and remove nf_ct_free_hashtable, since it is just a kvfree Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Signed-off-by: Wang Li <wangli39@baidu.com> Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03mailbox: mediatek: Add Mediatek CMDQ driverHoulong Wei
This patch is first version of Mediatek Command Queue(CMDQ) driver. The CMDQ is used to help write registers with critical time limitation, such as updating display configuration during the vblank. It controls Global Command Engine (GCE) hardware to achieve this requirement. Currently, CMDQ only supports display related hardwares, but we expect it can be extended to other hardwares for future requirements. Signed-off-by: Houlong Wei <houlong.wei@mediatek.com> Signed-off-by: HS Liao <hs.liao@mediatek.com> Signed-off-by: CK Hu <ck.hu@mediatek.com> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2018-08-03dt-bindings: soc: Add documentation for the MediaTek GCE unitHoulong Wei
This adds documentation for the MediaTek Global Command Engine (GCE) unit found in MT8173 SoCs. Signed-off-by: Houlong Wei <houlong.wei@mediatek.com> Signed-off-by: HS Liao <hs.liao@mediatek.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2018-08-03mailbox/omap: switch to SPDX license identifierSuman Anna
Use the appropriate SPDX license identifier in the OMAP Mailbox driver source files and drop the previous boilerplate license text. Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2018-08-03crypto: scatterwalk - remove scatterwalk_samebuf()Eric Biggers
scatterwalk_samebuf() is never used. Remove it. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-08-03crypto: scatterwalk - remove 'chain' argument from scatterwalk_crypto_chain()Eric Biggers
All callers pass chain=0 to scatterwalk_crypto_chain(). Remove this unneeded parameter. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-08-03crypto: drbg - in-place cipher operation for CTRStephan Müller
The cipher implementations of the kernel crypto API favor in-place cipher operations. Thus, switch the CTR cipher operation in the DRBG to perform in-place operations. This is implemented by using the output buffer as input buffer and zeroizing it before the cipher operation to implement a CTR encryption of a NULL buffer. The speed improvement is quite visibile with the following comparison using the LRNG implementation. Without the patch set: 16 bytes| 12.267661 MB/s| 61338304 bytes | 5000000213 ns 32 bytes| 23.603770 MB/s| 118018848 bytes | 5000000073 ns 64 bytes| 46.732262 MB/s| 233661312 bytes | 5000000241 ns 128 bytes| 90.038042 MB/s| 450190208 bytes | 5000000244 ns 256 bytes| 160.399616 MB/s| 801998080 bytes | 5000000393 ns 512 bytes| 259.878400 MB/s| 1299392000 bytes | 5000001675 ns 1024 bytes| 386.050662 MB/s| 1930253312 bytes | 5000001661 ns 2048 bytes| 493.641728 MB/s| 2468208640 bytes | 5000001598 ns 4096 bytes| 581.835981 MB/s| 2909179904 bytes | 5000003426 ns With the patch set: 16 bytes | 17.051142 MB/s | 85255712 bytes | 5000000854 ns 32 bytes | 32.695898 MB/s | 163479488 bytes | 5000000544 ns 64 bytes | 64.490739 MB/s | 322453696 bytes | 5000000954 ns 128 bytes | 123.285043 MB/s | 616425216 bytes | 5000000201 ns 256 bytes | 233.434573 MB/s | 1167172864 bytes | 5000000573 ns 512 bytes | 384.405197 MB/s | 1922025984 bytes | 5000000671 ns 1024 bytes | 566.313370 MB/s | 2831566848 bytes | 5000001080 ns 2048 bytes | 744.518042 MB/s | 3722590208 bytes | 5000000926 ns 4096 bytes | 867.501670 MB/s | 4337508352 bytes | 5000002181 ns Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-08-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linuxHerbert Xu
Merge mainline to pick up c7513c2a2714 ("crypto/arm64: aes-ce-gcm - add missing kernel_neon_begin/end pair").
2018-08-03spi: spi-mem: Constify spi_mem->nameBoris Brezillon
There is no reason to make spi_mem->name modifiable. Moreover, spi_mem_ops->get_name() returns a const char *, which generates a gcc warning when assigning the value returned by spi_mem_ops->get_name() to spi_mem->name. Fixes: 5d27a9c8ea9e ("spi: spi-mem: Extend the SPI mem interface to set a custom memory name") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-08-02RDMA/netdev: Use priv_destructor for netdev cleanupJason Gunthorpe
Now that the unregister_netdev flow for IPoIB no longer relies on external code we can now introduce the use of priv_destructor and needs_free_netdev. The rdma_netdev flow is switched to use the netdev common priv_destructor instead of the special free_rdma_netdev and the IPOIB ULP adjusted: - priv_destructor needs to switch to point to the ULP's destructor which will then call the rdma_ndev's in the right order - We need to be careful around the error unwind of register_netdev as it sometimes calls priv_destructor on failure - ULPs need to use ndo_init/uninit to ensure proper ordering of failures around register_netdev Switching to priv_destructor is a necessary pre-requisite to using the rtnl new_link mechanism. The VNIC user for rdma_netdev should also be revised, but that is left for another patch. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Denis Drozdov <denisd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-08-02iw_cxgb4: RDMA write with immediate supportPotnuri Bharat Teja
Adds iw_cxgb4 functionality to support RDMA_WRITE_WITH_IMMEDATE opcode. Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-02RDMA/hns: Support flush cqe for hip08 in kernel spaceYixian Liu
According to IB protocol, there are some cases that work requests must return the flush error completion status through the completion queue. Due to hardware limitation, the driver needs to assist the flush process. This patch adds the support of flush cqe for hip08 in the cases that needed, such as poll cqe, post send, post recv and aeqe handle. The patch also considered the compatibility between kernel and user space. Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>