path: root/include/uapi
Age  Commit message  Author
2017-04-22  net/devlink: Add E-Switch encapsulation control  Roi Dayan
This is an e-switch global knob to enable HW support for applying encapsulation/decapsulation to VF traffic as part of SRIOV e-switch offloading. The actual encap/decap is carried out (along with the matching and other actions) per offloaded e-switch rule, e.g. as done when offloading the TC tunnel key action. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-04-21  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net  David S. Miller
Both conflicts were simple overlapping changes. In the kaweth case, Eric Dumazet's skb_cow() bug fix overlapped the conversion of the driver in net-next to use in-netdev stats. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21  net: Remove NET_CORE_BUDGET_USECS from sysctl binary interface.  David S. Miller
We are not supposed to add new entries to this thing any more. Thanks to Eric Dumazet for noticing this. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21  Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next  David S. Miller
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2017-04-20
This adds the basic infrastructure for IPsec hardware offloading, it creates a configuration API and adjusts the packet path.
1) Add the needed netdev features to configure IPsec offloads.
2) Add the IPsec hardware offloading API.
3) Prepare the ESP packet path for hardware offloading.
4) Add gso handlers for esp4 and esp6, this implements the software fallback for GSO packets.
5) Add xfrm replay handler functions for offloading.
6) Change ESP to use a synchronous crypto algorithm on offloading, we don't have the option for asynchronous returns when we handle IPsec at layer2.
7) Add a xfrm validate function to validate_xmit_skb. This implements the software fallback for non GSO packets.
8) Set the inner_network and inner_transport members of the SKB, as well as encapsulation, to reflect the actual positions of these headers, and removes them only once encryption is done on the payload. From Ilan Tayari.
9) Prepare the ESP GRO codepath for hardware offloading.
10) Fix incorrect null pointer check in esp6. From Colin Ian King.
11) Fix for the GSO software fallback path to detect the fallback correctly. From Ilan Tayari.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21  net: ipv6: RTF_PCPU should not be settable from userspace  David Ahern
Andrey reported a fault in the IPv6 route code: kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN Modules linked in: CPU: 1 PID: 4035 Comm: a.out Not tainted 4.11.0-rc7+ #250 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 task: ffff880069809600 task.stack: ffff880062dc8000 RIP: 0010:ip6_rt_cache_alloc+0xa6/0x560 net/ipv6/route.c:975 RSP: 0018:ffff880062dced30 EFLAGS: 00010206 RAX: dffffc0000000000 RBX: ffff8800670561c0 RCX: 0000000000000006 RDX: 0000000000000003 RSI: ffff880062dcfb28 RDI: 0000000000000018 RBP: ffff880062dced68 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: ffff880062dcfb28 R14: dffffc0000000000 R15: 0000000000000000 FS: 00007feebe37e7c0(0000) GS:ffff88006cb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000205a0fe4 CR3: 000000006b5c9000 CR4: 00000000000006e0 Call Trace: ip6_pol_route+0x1512/0x1f20 net/ipv6/route.c:1128 ip6_pol_route_output+0x4c/0x60 net/ipv6/route.c:1212 ... Andrey's syzkaller program passes rtmsg.rtmsg_flags with the RTF_PCPU bit set. Flags passed to the kernel are blindly copied to the allocated rt6_info by ip6_route_info_create making a newly inserted route appear as though it is a per-cpu route. ip6_rt_cache_alloc sees the flag set and expects rt->dst.from to be set - which it is not since it is not really a per-cpu copy. The subsequent call to __ip6_dst_alloc then generates the fault. Fix by checking for the flag and failing with EINVAL. Fixes: d52d3997f843f ("ipv6: Create percpu rt6_info") Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
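The fix described above amounts to a flag check on the userspace-supplied value. A minimal sketch of that kind of check follows; the function name and the EXAMPLE_RTF_PCPU constant (including its numeric value) are assumptions made for illustration and are not taken from the patch itself.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed value, for illustration only; the real RTF_PCPU definition
 * lives in the kernel's IPv6 route UAPI header. */
#define EXAMPLE_RTF_PCPU 0x40000000u

/* Hypothetical stand-in for the validation step: refuse route flags
 * that only the kernel itself is allowed to set. */
static int example_check_route_flags(uint32_t rtmsg_flags)
{
	if (rtmsg_flags & EXAMPLE_RTF_PCPU)
		return -EINVAL;	/* kernel-internal flag passed in by userspace */
	return 0;
}
```

Rejecting the request up front keeps the internal flag out of the newly allocated route, so later code never mistakes it for a per-cpu copy.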
2017-04-21  bpf: add napi_id read access to __sk_buff  Daniel Borkmann
Add napi_id access to __sk_buff for socket filter program types, tc program types and other bpf_convert_ctx_access() users. Having access to skb->napi_id is useful for per RX queue listener siloing, f.e. in combination with SO_ATTACH_REUSEPORT_EBPF and when busy polling is used, meaning SO_REUSEPORT enabled listeners can then select the corresponding socket at SYN time already [1]. The skb is marked via skb_mark_napi_id() early in the receive path (e.g., napi_gro_receive()). Currently, sockets can only use SO_INCOMING_NAPI_ID from 6d4339028b35 ("net: Introduce SO_INCOMING_NAPI_ID") as a socket option to look up the NAPI ID associated with the queue for steering, which requires a prior sk_mark_napi_id() after the socket was looked up. Semantics for the __sk_buff napi_id access are similar, meaning if skb->napi_id is < MIN_NAPI_ID (e.g. outgoing packets using sender_cpu), then an invalid napi_id of 0 is returned to the program, otherwise a valid non-zero napi_id. [1] http://netdevconf.org/2.1/slides/apr6/dumazet-BUSY-POLLING-Netdev-2.1.pdf Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
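The read semantics described above can be sketched as a small pure function. This is only an illustration: the function name is hypothetical, and the minimum valid ID is passed in as a parameter rather than taken from the kernel's MIN_NAPI_ID.

```c
#include <stdint.h>

/* Hypothetical helper: the value a program would observe for napi_id,
 * given the raw skb field and the smallest valid NAPI ID. */
static uint32_t example_visible_napi_id(uint32_t skb_napi_id,
					uint32_t min_napi_id)
{
	/* Values below the minimum are not NAPI IDs (the field may hold
	 * something else, such as a sender CPU), so report 0 instead. */
	return skb_napi_id < min_napi_id ? 0 : skb_napi_id;
}
```

A program can therefore treat 0 as "no usable NAPI ID" and fall back to its default socket selection.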
2017-04-21  Replace 2 jiffies with sysctl netdev_budget_usecs to enable softirq tuning  Matthew Whitehead
Constants used for tuning are generally a bad idea, especially as hardware changes over time. Replace the constant 2 jiffies with sysctl variable netdev_budget_usecs to enable sysadmins to tune the softirq processing. Also document the variable. For example, a very fast machine might tune this to 1000 microseconds, while my regression testing 486DX-25 needs it to be 4000 microseconds on a nearly idle network to prevent time_squeeze from being incremented. Version 2: changed jiffies to microseconds for predictable units. Signed-off-by: Matthew Whitehead <tedheadster@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
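To see why "2 jiffies" is not a predictable unit, recall that one jiffy lasts 1/HZ seconds, so the same constant means different wall-clock budgets on differently configured kernels. The helper below is a sketch with a hypothetical name; the HZ values mentioned afterwards are common configuration choices, not values taken from this commit.

```c
#include <stdint.h>

/* Convert a jiffies count to microseconds for a given tick rate (HZ). */
static uint64_t example_jiffies_to_usecs(uint64_t jiffies, uint64_t hz)
{
	return jiffies * 1000000ull / hz;
}
```

With this, 2 jiffies corresponds to 2000 microseconds at HZ=1000, 8000 at HZ=250 and 20000 at HZ=100, which is the spread that a sysctl expressed in microseconds avoids.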
2017-04-21  ip6_tunnel: Allow policy-based routing through tunnels  Craig Gallek
This feature allows the administrator to set an fwmark for packets traversing a tunnel. This allows the use of independent routing tables for tunneled packets without the use of iptables. Signed-off-by: Craig Gallek <kraig@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21  IB/core: Introduce drop flow specification  Slava Shwartsman
This flow steering specification identifies flows to be dropped by the HW. If the user creates a flow with only the drop specification, then all the packets that hit this flow will be dropped; otherwise the HW will drop only the packets that match the other L2/L3/L4 specifications. Signed-off-by: Slava Shwartsman <slavash@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21  kvm: better MWAIT emulation for guests  Michael S. Tsirkin
Guests that are heavy on futexes end up IPI'ing each other a lot. That can lead to significant slowdowns and latency increase for those guests when running within KVM. If only a single guest is needed on a host, we have a lot of spare host CPU time we can throw at the problem. Modern CPUs implement a feature called "MWAIT" which allows guests to wake up sleeping remote CPUs without an IPI - thus without an exit - at the expense of never going out of guest context. The decision whether this is something sensible to use should be up to the VM admin, so to user space. We can however allow MWAIT execution on systems that support it properly hardware wise. This patch adds a CAP to user space and a KVM cpuid leaf to indicate availability of native MWAIT execution. With that enabled, the worst a guest can do is waste as many cycles as a "jmp ." would do, so it's not a privilege problem. We consciously do *not* expose the feature in our CPUID bitmap, as most people will want to benefit from sleeping vCPUs to allow for over commit. Reported-by: "Gabriel L. Somlo" <gsomlo@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> [agraf: fix amd, change commit message] Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-21  s390/gs: add regset for the guarded storage broadcast control block  Martin Schwidefsky
The guarded storage interface allows a control block to be registered for each thread; the block is activated with the guarded storage broadcast event. To retrieve the complete state of a process from the kernel, a register set for the stored broadcast control block is required. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-04-20  Merge tag 'mac80211-next-for-davem-2017-04-18' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next  David S. Miller
Johannes Berg says:
====================
My last pull request has been a while, we now have:
* connection quality monitoring with multiple thresholds
* support for FILS shared key authentication offload
* pre-CAC regulatory compliance - only ETSI allows this
* sanity check for some rate confusion that hit ChromeOS (but nobody else uses it, evidently)
* some documentation updates
* lots of cleanups
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20  uapi: fix linux/raid/md_p.h userspace compilation error  Artur Paszkiewicz
Use __le32 and __le64 instead of u32 and u64. This fixes klibc build error: In file included from /klibc/usr/klibc/../include/sys/md.h:30:0, from /klibc/usr/kinit/do_mounts_md.c:19: /linux-next/usr/include/linux/raid/md_p.h:414:51: error: 'u32' undeclared here (not in a function) (PPL_HEADER_SIZE - PPL_HDR_RESERVED - 4 * sizeof(u32) - sizeof(u64)) Reported-by: Greg Thelen <gthelen@google.com> Reported-by: Nigel Croxon <ncroxon@redhat.com> Tested-by: Greg Thelen <gthelen@google.com> Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: Shaohua Li <shli@fb.com>
2017-04-20  nubus: Add MVC and VSC video card definitions  Finn Thain
Also move the NUBUS_DRHW_APPLE_JET definition, for numerical order. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2017-04-20  KVM: PPC: VFIO: Add in-kernel acceleration for VFIO  Alexey Kardashevskiy
This allows the host kernel to handle H_PUT_TCE, H_PUT_TCE_INDIRECT and H_STUFF_TCE requests targeting an IOMMU TCE table used for VFIO without passing them to user space, which saves time on switching to user space and back. This adds H_PUT_TCE/H_PUT_TCE_INDIRECT/H_STUFF_TCE handlers to KVM. KVM tries to handle a TCE request in real mode; if that fails, it passes the request to virtual mode to complete the operation. If a virtual mode handler fails, the request is passed to user space; this is not expected to happen though. To avoid dealing with page use counters (which is tricky in real mode), this only accelerates SPAPR TCE IOMMU v2 clients which are required to pre-register the userspace memory. The very first TCE request will be handled in the VFIO SPAPR TCE driver anyway as the userspace view of the TCE table (iommu_table::it_userspace) is not allocated till the very first mapping happens and we cannot call vmalloc in real mode. If we fail to update a hardware IOMMU table for an unexpected reason, we just clear it and move on as there is nothing really we can do about it - for example, if we hot plug a VFIO device to a guest, existing TCE tables will be mirrored automatically to the hardware and there is no interface to report to the guest about possible failures. This adds a new attribute - KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE - to the VFIO KVM device. It takes a VFIO group fd and SPAPR TCE table fd and associates a physical IOMMU table with the SPAPR TCE table (which is a guest view of the hardware IOMMU table). The iommu_table object is cached and referenced so we do not have to look it up in real mode. This does not implement the UNSET counterpart as there is no use for it - once the acceleration is enabled, the existing userspace won't disable it unless a VFIO container is destroyed; this adds necessary cleanup to the KVM_DEV_VFIO_GROUP_DEL handler. This advertises the new KVM_CAP_SPAPR_TCE_VFIO capability to the user space.
This adds a real mode version of WARN_ON_ONCE() as the generic version causes problems with rcu_sched. Since we are testing what vmalloc_to_phys() returns in the code, this also adds a check for the already existing vmalloc_to_phys() call in kvmppc_rm_h_put_tce_indirect(). This finally makes use of vfio_external_user_iommu_id() which was introduced quite some time ago and was considered for removal. Tests show that this patch increases transmission speed from 220MB/s to 750..1020MB/s on 10Gb network (Chelsio CXGB3 10Gb ethernet card). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-20  KVM: PPC: Reserve KVM_CAP_SPAPR_TCE_VFIO capability number  Alexey Kardashevskiy
This adds a capability number for in-kernel support for VFIO on the SPAPR platform. The capability will tell user space whether in-kernel handlers of H_PUT_TCE can handle VFIO-targeted requests or not. If not, user space must not attempt to allocate a TCE table in the host kernel via the KVM_CREATE_SPAPR_TCE KVM ioctl, because in that case TCE requests will not be passed to user space, which is the desired behavior in such a situation. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-19  netfilter: ecache: reduce struct size from 32 to 24 byte  Florian Westphal
Only "cache" needs to use ulong (it's used with set_bit()), missed can use u16. Also add build-time assertion to ensure event bits fit. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
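The effect of narrowing one member can be illustrated with two stand-in structs. The layouts below are hypothetical and only modelled on the description above (one bitmap kept as unsigned long, the other narrowed to 16 bits); they are not copied from the kernel header, and EXAMPLE_EVENT_BITS is an assumed count.

```c
#include <stdint.h>

/* Hypothetical "before" layout: both bitmaps are unsigned long. */
struct example_ecache_before {
	unsigned long cache;	/* needs unsigned long for set_bit()-style use */
	unsigned long missed;
	uint16_t ctmask;
	uint16_t expmask;
	uint32_t portid;
	int state;
};

/* Hypothetical "after" layout: missed narrowed, small members grouped. */
struct example_ecache_after {
	unsigned long cache;
	uint16_t missed;
	uint16_t ctmask;
	uint16_t expmask;
	uint8_t state;
	uint32_t portid;
};

/* Assumed number of event bits; the build-time check mirrors the idea of
 * asserting that all event bits fit into the narrowed 16-bit field. */
#define EXAMPLE_EVENT_BITS 10
_Static_assert(EXAMPLE_EVENT_BITS <= 16, "event bits must fit in a u16");
```

On a typical LP64 target these stand-ins occupy 32 and 24 bytes respectively, matching the figures in the subject line; on other ABIs the exact sizes differ but the second layout is still smaller.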
2017-04-19  netfilter: nft_ct: allow to set ctnetlink event types of a connection  Florian Westphal
By default the kernel emits all ctnetlink events for a connection. This allows selecting the types of events to generate. This can be used to e.g. only send DESTROY events but no NEW/UPDATE ones and will work even if sysctl net.netfilter.nf_conntrack_events is set to 0. This was already possible via iptables' CT target, but the nft version has the advantage that it can also be used with already-established conntracks. The added nf_ct_is_template() check isn't a bug fix as we only support mark and labels (and unlike ecache the conntrack core doesn't copy those). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-19  powerpc/perf: Define big-endian version of perf_mem_data_src  Sukadev Bhattiprolu
perf_mem_data_src is a union that is initialized in the kernel via the ->val field and accessed by userspace via the mem_xxx bitfields. For this to work correctly on big endian platforms, we need a big-endian definition for the bitfields. Currently on a big endian system, if a user requests PERF_SAMPLE_DATA_SRC (perf report -d), they will get the default value from perf_sample_data_init(), which is PERF_MEM_NA. The value for PERF_MEM_NA is constructed using shifts:
  /* TLB access */
  #define PERF_MEM_TLB_NA	0x01 /* not available */
  ...
  #define PERF_MEM_TLB_SHIFT	26
  #define PERF_MEM_S(a, s) \
	(((__u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
  #define PERF_MEM_NA (PERF_MEM_S(OP, NA)   |\
		    PERF_MEM_S(LVL, NA)   |\
		    PERF_MEM_S(SNOOP, NA) |\
		    PERF_MEM_S(LOCK, NA)  |\
		    PERF_MEM_S(TLB, NA))
Which works out as:
  ((0x01 << 0) | (0x01 << 5) | (0x01 << 19) | (0x01 << 24) | (0x01 << 26))
Which means the PERF_MEM_NA value comes out of the kernel as 0x5080021 in CPU endian. But then in the perf tool, the code uses the bitfields to inspect the value, and currently the bitfields are defined using little endian ordering. So eg. in perf_mem__tlb_scnprintf() we see:
  data_src->val = 0x5080021
             op = 0x0
            lvl = 0x0
          snoop = 0x0
           lock = 0x0
           dtlb = 0x0
           rsvd = 0x5080021
Because of the way the perf tool code is written this is still displayed to the user as "N/A", so there is no bug visible at the UI level. Currently there are no big endian architectures which export a meaningful value (ie. other than PERF_MEM_NA), so the extent of the bug on big endian platforms is that the PERF_MEM_NA value is exported incorrectly as described above. Subsequent patches will add support on big endian powerpc for populating the data source value. This patch does a minimal fix of adding big endian definition of the bitfields to match the values that are already exported by the kernel on big endian. And it makes no change on little endian.
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
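The arithmetic in the message above can be checked with a short sketch. The EX_* names and both helper functions are illustrative stand-ins; the NA code and the shifts are the ones quoted in the commit message, while the field widths passed by a caller are assumptions (the first four follow from the gaps between consecutive shifts).

```c
#include <stdint.h>

/* "Not available" code and shifts as quoted in the commit message. */
#define EX_MEM_NA	0x01u
#define EX_OP_SHIFT	0
#define EX_LVL_SHIFT	5
#define EX_SNOOP_SHIFT	19
#define EX_LOCK_SHIFT	24
#define EX_TLB_SHIFT	26

/* Build the default "everything not available" value from shifts. */
static uint64_t example_perf_mem_na(void)
{
	return ((uint64_t)EX_MEM_NA << EX_OP_SHIFT) |
	       ((uint64_t)EX_MEM_NA << EX_LVL_SHIFT) |
	       ((uint64_t)EX_MEM_NA << EX_SNOOP_SHIFT) |
	       ((uint64_t)EX_MEM_NA << EX_LOCK_SHIFT) |
	       ((uint64_t)EX_MEM_NA << EX_TLB_SHIFT);
}

/* Extract `width` bits starting at bit `shift`. Shift-and-mask decoding
 * does not depend on how the compiler orders bitfields, unlike reading
 * the value through a bitfield union. */
static uint64_t example_field(uint64_t val, unsigned shift, unsigned width)
{
	return (val >> shift) & ((1ull << width) - 1);
}
```

Decoding 0x5080021 this way yields 0x1 for every field on any byte order, whereas the little-endian bitfield view on a big endian machine produced the all-zero fields with everything in rsvd shown above.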
2017-04-19  Merge tag 'v4.11-rc7' into drm-next  Dave Airlie
Backmerge Linux 4.11-rc7 from Linus tree, to fix some conflicts that were causing problems with the rerere cache in drm-tip.
2017-04-18  PCI: Make PCI_ROM_ADDRESS_MASK a 32-bit constant  Matthias Kaehlcke
A 64-bit value is not needed since a PCI ROM address consists of 32 bits. This fixes a clang warning about "implicit conversion from 'unsigned long' to 'u32'". Also remove now unnecessary casts to u32 from __pci_read_base() and pci_std_update_resource(). Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
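The warning comes from the type of the constant rather than its value. The sketch below uses hypothetical macro names, and the two spellings of the mask are an assumption about what an unsigned long versus an unsigned int definition would look like; they are not quoted from the header.

```c
#include <stdint.h>

/* Assumed shapes of the mask: an unsigned long constant (64 bits wide on
 * LP64 targets) versus an unsigned int constant (32 bits wide). */
#define EXAMPLE_ROM_MASK_OLD	(~0x7ffUL)
#define EXAMPLE_ROM_MASK_NEW	(~0x7ffU)

/* Mask off the low 11 bits of a 32-bit ROM BAR value. With the unsigned
 * int constant the whole expression stays 32 bits wide, so no implicit
 * narrowing back to a 32-bit type (and no cast) is needed. */
static uint32_t example_rom_base(uint32_t reg)
{
	return reg & EXAMPLE_ROM_MASK_NEW;
}
```

With the unsigned long spelling, the same expression would be computed in a 64-bit type on LP64 and then narrowed on return, which is the conversion clang complains about.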
2017-04-18  usb: fix some references for /proc/bus/usb  Mauro Carvalho Chehab
Since we got rid of usbfs, /proc/bus/usb is now elsewhere. Fix references to it. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Acked-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-18  Btrfs: consistent usage of types in balance_args  Hans van Kranenburg
The btrfs_balance_args are only used for the balance ioctl, so use __u instead of __le here for consistency. The __le usage was introduced in bc3094673f22d and dee32d0ac3719 and was probably a result of copy/pasting when the code was written. The usage of __le did not break anything, but it's unnecessary. Also, this change makes the code less confusing for the careful reader. Signed-off-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-04-17  Remove compat_sys_getdents64()  Al Viro
Unlike normal compat syscall variants, it is needed only for biarch architectures that have different alignment requirements for u64 in 32bit and 64bit ABI *and* have __put_user() that won't handle a store of 64bit value at 32bit-aligned address. We used to have one such (ia64), but its biarch support has been gone since 2010 (after being broken in 2008, which went unnoticed since nobody had been using it). It had escaped removal at the same time only because back in 2004 a patch that switched several syscalls on amd64 from private wrappers to generic compat ones had switched to use of compat_sys_getdents64(), which hadn't needed (or used) a compat wrapper on amd64. Let's bury it - it's at least 7 years overdue. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-04-17  nbd: add a flag to destroy an nbd device on disconnect  Josef Bacik
For ease of management it would be nice for users to specify that the device node for a nbd device is destroyed once it is disconnected and there are no more users. Add a client flag and enable this operation to happen. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-17  nbd: add a status netlink command  Josef Bacik
Allow users to query the status of existing nbd devices. Right now this only returns whether or not the device is connected, but could be extended in the future to include more information. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-17  nbd: handle dead connections  Josef Bacik
Sometimes we like to upgrade our server without making all of our clients freak out and reconnect. This patch provides a way to specify a dead connection timeout to allow us to pause all requests and wait for new connections to be opened. With this in place I can take down the nbd server for less than the dead connection timeout time and bring it back up and everything resumes gracefully. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-17  nbd: multicast dead link notifications  Josef Bacik
Provide a mechanism to notify userspace that there's been a link problem on a NBD device. This will allow userspace to re-establish a connection and provide the new socket to the device without disrupting the device. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-17  nbd: add a reconfigure netlink command  Josef Bacik
We want to be able to reconnect dead connections to existing block devices, so add a reconfigure netlink command. We will also allow users to change their timeout on the fly, but everything else will require a disconnect and reconnect. You won't be able to add more connections either, simply replace dead connections with new more lively connections. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-17  nbd: add a basic netlink interface  Josef Bacik
The existing ioctl interface for configuring NBD devices is a bit cumbersome and hard to extend. The other problem is we leave a userspace app sitting in its syscall until the device disconnects, which is less than ideal. This patch introduces a netlink interface for adding and disconnecting nbd devices. This has the benefits of being easily extendable without breaking older userspace applications, and allows us to configure a nbd device without leaving a userspace app sitting waiting for the device to disconnect. With this interface we also gain the ability to configure more devices than are preallocated at insmod time. We also have gained the ability to not specify a particular device and be provided one for us so that userspace doesn't need to find a free device to configure. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-16  lightnvm: allow to init targets on factory mode  Javier González
Target initialization has two responsibilities: creating the target partition and instantiating the target. This patch makes it possible to create a factory partition (e.g., do not trigger recovery on the given target). This is useful for target development and for being able to restore the device state at any moment in time without requiring a full-device erase. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-15  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net  David S. Miller
Conflicts were simply overlapping changes. In the net/ipv4/route.c case the code had simply moved around a little bit and the same fix was made in both 'net' and 'net-next'. In the net/sched/sch_generic.c case a fix in 'net' happened at the same time that a new argument was added to qdisc_hash_add(). Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-15  drm/i915: Copy user requested buffers into the error state  Chris Wilson
Introduce a new execobject.flag (EXEC_OBJECT_CAPTURE) that userspace may use to indicate that it wants the contents of this buffer preserved in the error state (/sys/class/drm/cardN/error) following a GPU hang involving this batch. Use this at your discretion: the contents of the error state, although compressed, are allocated with GFP_ATOMIC (i.e. limited) and kept for all eternity (until the error state is destroyed). Based on an earlier patch by Ben Widawsky <ben@bwidawsk.net> Testcase: igt/gem_exec_capture Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Matt Turner <mattst88@gmail.com> Acked-by: Ben Widawsky <ben@bwidawsk.net> Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170415093902.22581-1-chris@chris-wilson.co.uk
2017-04-15  netfilter: kill the fake untracked conntrack objects  Florian Westphal
Resurrect an old patch from Pablo Neira to remove the untracked objects. Currently, there are four possible states of an skb wrt. conntrack:
1. No conntrack attached, ct is NULL.
2. Normal (kmem cache allocated) ct attached.
3. A template (kmalloc'd), not in any hash tables at any point in time.
4. The 'untracked' conntrack, a percpu nf_conn object, tagged via IPS_UNTRACKED_BIT in ct->status.
Untracked is supposed to be identical to case 1. It exists only so users can check -m conntrack --ctstate UNTRACKED vs. -m conntrack --ctstate INVALID, e.g. attempts to set connmark on INVALID or UNTRACKED conntracks are supposed to be a no-op. Thus currently we need to check ct == NULL || nf_ct_is_untracked(ct) in a lot of places in order to avoid altering untracked objects. The other consequence of the percpu untracked object is that all -j NOTRACK (and, later, kfree_skb of such skbs) result in an atomic op (inc/dec the untracked conntracks refcount). This adds a new kernel-private ctinfo state, IP_CT_UNTRACKED, to make the distinction instead. The (few) places that care about packet invalid (ct is NULL) vs. packet untracked now need to test ct == NULL vs. ctinfo == IP_CT_UNTRACKED, but all other places can omit the nf_ct_is_untracked() check. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
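The distinction described above (no conntrack object in either case, with the ctinfo value telling "untracked" apart from "invalid") can be sketched as follows. All type, enum and function names here are hypothetical stand-ins, and the enum values are arbitrary rather than the kernel's.

```c
#include <stddef.h>

/* Stand-ins for the conntrack object and the per-packet ctinfo state. */
struct example_conn { unsigned long status; };
enum example_ctinfo { EXAMPLE_CT_OTHER, EXAMPLE_CT_UNTRACKED };

enum example_pkt_state {
	EXAMPLE_PKT_INVALID,	/* no ct attached and not marked untracked */
	EXAMPLE_PKT_UNTRACKED,	/* no ct attached, ctinfo says untracked */
	EXAMPLE_PKT_TRACKED	/* a real ct object is attached */
};

/* Classify a packet the way the few interested callers would. */
static enum example_pkt_state
example_classify(const struct example_conn *ct, enum example_ctinfo ctinfo)
{
	if (ct != NULL)
		return EXAMPLE_PKT_TRACKED;
	return ctinfo == EXAMPLE_CT_UNTRACKED ? EXAMPLE_PKT_UNTRACKED
					      : EXAMPLE_PKT_INVALID;
}
```

Callers that only need to avoid touching a missing object can keep a plain ct == NULL test, which is why most of the former nf_ct_is_untracked() checks can go away.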
2017-04-14  [media] videodev.h: add V4L2_CTRL_FLAG_MODIFY_LAYOUT  Hans Verkuil
Add new flag to indicate that changing this control will change the buffer/mediabus layout as well. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-04-14  [media] v4l: Define a pixel format for the R-Car VSP1 2-D histogram engine  Niklas Söderlund
The format is used on the R-Car VSP1 video queues that carry 2-D histogram statistics data. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-04-14  [media] v4l: Define a pixel format for the R-Car VSP1 1-D histogram engine  Laurent Pinchart
The format is used on the R-Car VSP1 video queues that carry 1-D histogram statistics data. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-04-14  [media] v4l: Add metadata buffer type and format  Laurent Pinchart
The metadata buffer type is used to transfer metadata between userspace and kernelspace through a V4L2 buffers queue. It comes with a new metadata capture capability and format description. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Tested-by: Guennadi Liakhovetski <guennadi.liakhovetski@intel.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Acked-by: Hans Verkuil <hans.verkuil@cisco.com> [hans.verkuil@cisco.com: removed left-over 'experimental' note] [hans.verkuil@cisco.com: add newline after _v4l2-meta-format label] Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-04-14  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input  Linus Torvalds
Pull input fixes from Dmitry Torokhov: "Just a small update to xpad driver to recognize yet another gamepad, and another change making sure userio.h is exported"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: xpad - add support for Razer Wildcat gamepad
  uapi: add missing install of userio.h
2017-04-14  Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost  Linus Torvalds
Pull virtio fixes from Michael S. Tsirkin: "virtio oops fixes. The virtio pci rework using shared interrupts caused a lot of issues. We tried to fix them but ran out of time. Revert for now, and revisit the issue for the next kernel. Luckily we are able to do this without losing automatic interrupt NUMA affinity which was the main motivator for the rework"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio-pci: Remove affinity hint before freeing the interrupt
  Revert "virtio_pci: remove struct virtio_pci_vq_info"
  Revert "virtio_pci: use shared interrupts for virtqueues"
  Revert "virtio_pci: don't duplicate the msix_enable flag in struct pci_dev"
  Revert "virtio_pci: simplify MSI-X setup"
  Revert "virtio_pci: fix out of bound access for msix_names"
  MAINTAINERS: fix virtio file pattern
  virtio_console: fix uninitialized variable use
  virtio_net: clear MTU when out of range
  virtio: allow drivers to validate features
  virtio_net: enable big packets for large MTU values
2017-04-14  xfrm: Add an IPsec hardware offloading API  Steffen Klassert
This patch adds all the bits that are needed to do IPsec hardware offload for IPsec states and ESP packets. We add xfrmdev_ops to the net_device. xfrmdev_ops has function pointers that are needed to manage the xfrm states in the hardware and to do a per packet offloading decision. Joint work with: Ilan Tayari <ilant@mellanox.com> Guy Shapiro <guysh@mellanox.com> Yossi Kuperman <yossiku@mellanox.com> Signed-off-by: Guy Shapiro <guysh@mellanox.com> Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Yossi Kuperman <yossiku@mellanox.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-04-13  netlink: allow sending extended ACK with cookie on success  Johannes Berg
Now that we have extended error reporting and a new message format for netlink ACK messages, also extend this to be able to return arbitrary cookie data on success. This will allow, for example, nl80211 to not send an extra message for cookies identifying newly created objects, but return those directly in the ACK message. The cookie data size is currently limited to 20 bytes (since Jamal talked about using SHA1 for identifiers.) Thanks to Jamal Hadi Salim for bringing up this idea during the discussions. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
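The 20-byte limit mentioned above is the size of a SHA-1 digest. A bounded cookie buffer of that kind could look like the sketch below; the struct, the constant name and the function are hypothetical and do not reproduce the kernel's netlink structures.

```c
#include <stddef.h>
#include <string.h>

/* Assumed limit, matching the 20 bytes quoted in the commit message. */
#define EXAMPLE_MAX_COOKIE_LEN 20

/* Hypothetical holder for cookie data returned alongside a success ACK. */
struct example_ack {
	unsigned char cookie[EXAMPLE_MAX_COOKIE_LEN];
	size_t cookie_len;
};

/* Store an opaque cookie; refuse anything longer than the limit. */
static int example_set_cookie(struct example_ack *ack,
			      const void *data, size_t len)
{
	if (len > EXAMPLE_MAX_COOKIE_LEN)
		return -1;
	memcpy(ack->cookie, data, len);
	ack->cookie_len = len;
	return 0;
}
```

An 8-byte object identifier fits comfortably, while anything larger than a SHA-1 digest is rejected.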
2017-04-13  netlink: extended ACK reporting  Johannes Berg
Add the base infrastructure and UAPI for netlink extended ACK reporting. All "manual" calls to netlink_ack() pass NULL for now and thus don't get extended ACK reporting. Big thanks goes to Pablo Neira Ayuso for not only bringing up the whole topic at netconf (again) but also coming up with the nlattr passing trick and various other ideas. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-12Merge branch 'for-4.11/libnvdimm' into for-4.12/daxDan Williams
2017-04-12switchtec: Add IOCTLs to the Switchtec driverLogan Gunthorpe
Add a couple of special IOCTLs to: * Inform userspace of firmware partition locations * Pass event counts and allow userspace to wait on events * Translate PFF numbers used by the switch to port numbers [Dan Carpenter <dan.carpenter@oracle.com>: fix off-by-one in ioctl_event_ctl()] Tested-by: Krishna Dhulipala <krishnad@fb.com> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Stephen Bates <stephen.bates@microsemi.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Wei Zhang <wzhang@fb.com> Reviewed-by: Jens Axboe <axboe@fb.com>
2017-04-12drm/i915: Treat WC a separate cache domainChris Wilson
When discussing a new WC mmap, we based the interface upon the assumption that GTT was fully coherent. How naive! Commits 3b5724d702ef ("drm/i915: Wait for writes through the GTT to land before reading back") and ed4596ea992d ("drm/i915/guc: WA to address the Ringbuffer coherency issue") demonstrate that writes through the GTT are indeed delayed and may be overtaken by direct WC access. To be safe, if userspace is mixing WC mmaps with other potential GTT access (pwrite, GTT mmaps) it should use set_domain(WC). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96563 Testcase: igt/gem_pwrite/small-gtt* Testcase: igt/drv_selftest/coherency Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170412110111.26626-2-chris@chris-wilson.co.uk
2017-04-11Merge tag 'kvm-s390-next-4.12-1' of ↵Radim Krčmář
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux From: Christian Borntraeger <borntraeger@de.ibm.com> KVM: s390: features for 4.12 1. guarded storage support for guests This contains an s390 base Linux feature branch that is necessary to implement the KVM part 2. Provide an interface to implement adapter interruption suppression which is necessary for proper zPCI support 3. Use more defines instead of numbers 4. Provide logging for lazy enablement of runtime instrumentation
2017-04-11usb: gadget: f_fs: Fix ExtCompat documentation in uapi headerVincent Pelletier
The code was fixed in commit 53642399aa71 ("usb: gadget: f_fs: Fix wrong check on reserved1 of OS_DESC_EXT_COMPAT") but the in-header documentation kept referencing 0 as the expected value. Reference 1 instead as per original commit message. Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-04-11Merge tag 'v4.11-rc6' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-11Merge branch 'msm-next' of git://people.freedesktop.org/~robclark/linux into ↵Dave Airlie
drm-next Noteworthy changes this time: 1) 4k support for newer chips (ganging up hwpipes and mixers) 2) using OPP bindings for gpu 3) more prep work towards per-process pagetables * 'msm-next' of git://people.freedesktop.org/~robclark/linux: (47 commits) msm/drm: gpu: Dynamically locate the clocks from the device tree drm/msm: gpu: Use OPP tables if we can drm/msm: Hard code the GPU "slow frequency" drm/msm: Add MSM_PARAM_GMEM_BASE drm/msm: Reference count address spaces drm/msm: Make sure to detach the MMU during GPU cleanup drm/msm/mdp5: Enable 3D mux in mdp5_ctl drm/msm/mdp5: Reset CTL blend registers before configuring them drm/msm/mdp5: Assign 'right' mixer to CRTC state drm/msm/mdp5: Stage border out on base stage if CRTC has 2 LMs drm/msm/mdp5: Stage right side hwpipes on Right-side Layer Mixer drm/msm/mdp5: Prepare Layer Mixers for source split drm/msm/mdp5: Configure 'right' hwpipe drm/msm/mdp5: Assign a 'right hwpipe' to plane state drm/msm/mdp5: Create mdp5_hwpipe_mode_set drm/msm/mdp5: Add optional 'right' Layer Mixer in CRTC state drm/msm/mdp5: Add a CAP for Source Split drm/msm/mdp5: Remove mixer/intf pointers from mdp5_ctl drm/msm/mdp5: Start using parameters from CRTC state drm/msm/mdp5: Add more stuff to CRTC state ...