summaryrefslogtreecommitdiff
path: root/include/uapi
AgeCommit message (Collapse)Author
2017-01-31KVM: PPC: Book3S HV: Add userspace interfaces for POWER9 MMUPaul Mackerras
This adds two capabilities and two ioctls to allow userspace to find out about and configure the POWER9 MMU in a guest. The two capabilities tell userspace whether KVM can support a guest using the radix MMU, or using the hashed page table (HPT) MMU with a process table and segment tables. (Note that the MMUs in the POWER9 processor cores do not use the process and segment tables when in HPT mode, but the nest MMU does). The KVM_PPC_CONFIGURE_V3_MMU ioctl allows userspace to specify whether a guest will use the radix MMU or the HPT MMU, and to specify the size and location (in guest space) of the process table. The KVM_PPC_GET_RMMU_INFO ioctl gives userspace information about the radix MMU. It returns a list of supported radix tree geometries (base page size and number of bits indexed at each level of the radix tree) and the encoding used to specify the various page sizes for the TLB invalidate entry instruction. Initially, both capabilities return 0 and the ioctls return -EINVAL, until the necessary infrastructure for them to operate correctly is added. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-30net: ethtool: add support for 2500BaseT and 5000BaseT link modesPavel Belous
This patch introduce support for 2500BaseT and 5000BaseT link modes. These modes are included in the new IEEE 802.3bz standard. Signed-off-by: Pavel Belous <pavel.s.belous@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-29tcp: record pkts sent and retransmisttedYuchung Cheng
Add two stats in SCM_TIMESTAMPING_OPT_STATS: TCP_NLA_DATA_SEGS_OUT: total data packets sent including retransmission TCP_NLA_TOTAL_RETRANS: total data packets retransmitted The names are picked to be consistent with corresponding fields in TCP_INFO. This allows applications that are using the timestamping API to measure latency stats to also retrive retransmission rate of application write. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Two trivial overlapping changes conflicts in MPLS and mlx5. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) GTP fixes from Andreas Schultz (missing genl module alias, clear IP DF on transmit). 2) Netfilter needs to reflect the fwmark when sending resets, from Pau Espin Pedrol. 3) nftable dump OOPS fix from Liping Zhang. 4) Fix erroneous setting of VIRTIO_NET_HDR_F_DATA_VALID on transmit, from Rolf Neugebauer. 5) Fix build error of ipt_CLUSTERIP when procfs is disabled, from Arnd Bergmann. 6) Fix regression in handling of NETIF_F_SG in harmonize_features(), from Eric Dumazet. 7) Fix RTNL deadlock wrt. lwtunnel module loading, from David Ahern. 8) tcp_fastopen_create_child() needs to setup tp->max_window, from Alexey Kodanev. 9) Missing kmemdup() failure check in ipv6 segment routing code, from Eric Dumazet. 10) Don't execute unix_bind() under the bindlock, otherwise we deadlock with splice. From WANG Cong. 11) ip6_tnl_parse_tlv_enc_lim() potentially reallocates the skb buffer, therefore callers must reload cached header pointers into that skb. Fix from Eric Dumazet. 12) Fix various bugs in legacy IRQ fallback handling in alx driver, from Tobias Regnery. 13) Do not allow lwtunnel drivers to be unloaded while they are referenced by active instances, from Robert Shearman. 14) Fix truncated PHY LED trigger names, from Geert Uytterhoeven. 15) Fix a few regressions from virtio_net XDP support, from John Fastabend and Jakub Kicinski. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (102 commits) ISDN: eicon: silence misleading array-bounds warning net: phy: micrel: add support for KSZ8795 gtp: fix cross netns recv on gtp socket gtp: clear DF bit on GTP packet tx gtp: add genl family modules alias tcp: don't annotate mark on control socket from tcp_v6_send_response() ravb: unmap descriptors when freeing rings virtio_net: reject XDP programs using header adjustment virtio_net: use dev_kfree_skb for small buffer XDP receive r8152: check rx after napi is enabled r8152: re-schedule napi for tx r8152: avoid start_xmit to schedule napi when napi is disabled r8152: avoid start_xmit to call napi_schedule during autosuspend net: dsa: Bring back device detaching in dsa_slave_suspend() net: phy: leds: Fix truncated LED trigger names net: phy: leds: Break dependency of phy.h on phy_led_triggers.h net: phy: leds: Clear phy_num_led_triggers on failure to avoid crash net-next: ethernet: mediatek: change the compatible string Documentation: devicetree: change the mediatek ethernet compatible string bnxt_en: Fix RTNL lock usage on bnxt_get_port_module_status(). ...
2017-01-27Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma fixes from Doug Ledford: "Second round of -rc fixes for 4.10. This -rc cycle has been slow for the rdma subsystem. I had already sent you the first batch before the Holiday break. After that, we kept only getting a few here or there. Up until this week, when I got a drop of 13 to one driver (qedr). So, here's the -rc patches I have. I currently have none held in reserve, so unless something new comes in, this is it until the next merge window opens. Summary: - series of iw_cxgb4 fixes to make it work with the drain cq API - one or two patches each to: srp, iser, cxgb3, vmw_pvrdma, umem, rxe, and ipoib - one big series (13 patches) for the new qedr driver" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (27 commits) RDMA/cma: Fix unknown symbol when CONFIG_IPV6 is not enabled IB/rxe: Prevent from completer to operate on non valid QP IB/rxe: Fix rxe dev insertion to rxe_dev_list IB/umem: Release pid in error and ODP flow RDMA/qedr: Dispatch port active event from qedr_add RDMA/qedr: Fix and simplify memory leak in PD alloc RDMA/qedr: Fix RDMA CM loopback RDMA/qedr: Fix formatting RDMA/qedr: Mark three functions as static RDMA/qedr: Don't reset QP when queues aren't flushed RDMA/qedr: Don't spam dmesg if QP is in error state RDMA/qedr: Remove CQ spinlock from CM completion handlers RDMA/qedr: Return max inline data in QP query result RDMA/qedr: Return success when not changing QP state RDMA/qedr: Add uapi header qedr-abi.h RDMA/qedr: Fix MTU returned from QP query RDMA/core: Add the function ib_mtu_int_to_enum IB/vmw_pvrdma: Fix incorrect cleanup on pvrdma_pci_probe error path IB/vmw_pvrdma: Don't leak info from alloc_ucontext IB/cxgb3: fix misspelling in header guard ...
2017-01-27Merge tag 'media/v4.10-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: - fix a regression on tvp5150 causing failures at input selection and image glitches - CEC was moved out of staging for v4.10. Fix some bugs on it while not too late - fix a regression on pctv452e caused by VM stack changes - fix suspend issued with smiapp - fix a regression on cobalt driver - fix some warnings and Kconfig issues with some random configs. * tag 'media/v4.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] s5k4ecgx: select CRC32 helper [media] dvb: avoid warning in dvb_net [media] v4l: tvp5150: Don't override output pinmuxing at stream on/off time [media] v4l: tvp5150: Fix comment regarding output pin muxing [media] v4l: tvp5150: Reset device at probe time, not in get/set format handlers [media] pctv452e: move buffer to heap, no mutex [media] media/cobalt: use pci_irq_allocate_vectors [media] cec: fix race between configuring and unconfiguring [media] cec: move cec_report_phys_addr into cec_config_thread_func [media] cec: replace cec_report_features by cec_fill_msg_report_features [media] cec: update log_addr[] before finishing configuration [media] cec: CEC_MSG_GIVE_FEATURES should abort for CEC version < 2 [media] cec: when canceling a message, don't overwrite old status info [media] cec: fix report_current_latency [media] smiapp: Make suspend and resume functions __maybe_unused [media] smiapp: Implement power-on and power-off sequences without runtime PM
2017-01-27Merge tag 'linux-can-next-for-4.11-20170124' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== pull-request: can-next 2017-01-24 this is a pull request of 4 patches for net-next/master. The first patch by Oliver Hartkopp adds a netlink API to configure the interface termination of a CAN card. The next two patches are by me and add a netlink API to query and configure CAN interfaces that only support fixed bitrates. The last patch by Colin Ian King simplifies the return path in the softing_cs driver's softingcs_probe() function. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-27drm/amd/amdgpu: get maximum and used UVD handles (v4)Arindam Nath
Change History -------------- v4: Changes suggested by Emil, Christian - return -ENODATA for asics with unlimited sessions v3: changes suggested by Christian - Add a check for UVD IP block using AMDGPU_HW_IP_UVD query type. - Add a check for asic_type to be less than CHIP_POLARIS10 since starting Polaris, we support unlimited UVD instances. - Add kerneldoc style comment for amdgpu_uvd_used_handles(). v2: as suggested by Christian - Add a new query AMDGPU_INFO_NUM_HANDLES - Create a helper function to return the number of currently used UVD handles. - Modify the logic to count the number of used UVD handles since handles can be freed in non-linear fashion. v1: - User might want to query the maximum number of UVD instances supported by firmware. In addition to that, if there are multiple applications using UVD handles at the same time, he might also want to query the currently used number of handles. For this we add two variables max_handles and used_handles inside drm_amdgpu_info_hw_ip. So now an application (or libdrm) can use AMDGPU_INFO IOCTL with AMDGPU_INFO_HW_IP_INFO query type to get these values. Signed-off-by: Arindam Nath <arindam.nath@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-01-27net/ipv6: allow sysctl to change link-local address generation modeFelix Jia
The address generation mode for IPv6 link-local can only be configured by netlink messages. This patch adds the ability to change the address generation mode via sysctl. v1 -> v2 Removed the rtnl lock and switch to use RCU lock to iterate through the netdev list. v2 -> v3 Removed the addrgenmode variable from the idev structure and use the systcl storage for the flag. Simplifed the logic for sysctl handling by removing the supported for all operation. Added support for more types of tunnel interfaces for link-local address generation. Based the patches from net-next. v3 -> v4 Removed unnecessary whitespace changes. Signed-off-by: Felix Jia <felix.jia@alliedtelesis.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-27drm/fourcc: add vivante tiled layout format modifiersPhilipp Zabel
Vivante GC hardware uses simple 4x4 tiled and nested 64x64 supertiled formats as well as so-called split-tiled variants for dual-pipe hardware, where even and odd tiles start at different base addresses. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170126153217.26916-1-p.zabel@pengutronix.de
2017-01-27Merge tag 'drm-misc-next-2017-01-23' of ↵Dave Airlie
git://anongit.freedesktop.org/git/drm-misc into drm-next - cleanups&fixes for dw-hdmi bride driver (Laurent) - updates for adv bridge driver (John Stultz) for nexus - drm_crtc_from_index helper rollout (Shawn Guo) - removing drm_framebuffer_unregister_private from drivers&core - target_vblank (Andrey Grodzovsky) - misc tiny stuff * tag 'drm-misc-next-2017-01-23' of git://anongit.freedesktop.org/git/drm-misc: (49 commits) drm: qxl: Open code teardown function for qxl drm: qxl: Open code probing sequence for qxl drm/bridge: adv7511: Re-write the i2c address before EDID probing drm/bridge: adv7511: Reuse __adv7511_power_on/off() when probing EDID drm/bridge: adv7511: Rework adv7511_power_on/off() so they can be reused internally drm/bridge: adv7511: Enable HPD interrupts to support hotplug and improve monitor detection drm/bridge: adv7511: Switch to using drm_kms_helper_hotplug_event() drm/bridge: adv7511: Use work_struct to defer hotplug handing to out of irq context drm: vc4: use crtc helper drm_crtc_from_index() drm: tegra: use crtc helper drm_crtc_from_index() drm: nouveau: use crtc helper drm_crtc_from_index() drm: mediatek: use crtc helper drm_crtc_from_index() drm: kirin: use crtc helper drm_crtc_from_index() drm: exynos: use crtc helper drm_crtc_from_index() dt-bindings: display: dw-hdmi: Clean up DT bindings documentation drm: bridge: dw-hdmi: Assert SVSRET before resetting the PHY drm: bridge: dw-hdmi: Fix the name of the PHY reset macros drm: bridge: dw-hdmi: Define and use macros for PHY register addresses drm: bridge: dw-hdmi: Detect PHY type at runtime drm: bridge: dw-hdmi: Handle overflow workaround based on device version ...
2017-01-27Merge tag 'drm-intel-next-2017-01-23' of ↵Dave Airlie
git://anongit.freedesktop.org/git/drm-intel into drm-next Final block of feature work for 4.11: - gen8 pd cleanup from Matthew Auld - more cleanups for view/vma (Chris) - dmc support on glk (Anusha Srivatsa) - use core crc api (Tomue) - track wedged requests using fence.error (Chris) - lots of psr fixes (Nagaraju, Vathsala) - dp mst support, acked for merging through drm-intel by Takashi (Libin) - huc loading support, including uapi for libva to use it (Anusha Srivatsa) * tag 'drm-intel-next-2017-01-23' of git://anongit.freedesktop.org/git/drm-intel: (111 commits) drm/i915: Update DRIVER_DATE to 20170123 drm/i915: reinstate call to trace_i915_vma_bind drm/i915: Assert that created vma has a whole number of pages drm/i915: Assert the drm_mm_node is allocated when on the VM lists drm/i915: Treat an error from i915_vma_instance() as unlikely drm/i915: Reject vma creation larger than address space drm/i915: Use common LRU inactive vma bumping for unpin_from_display drm/i915: Do an unlocked wait before set-cache-level ioctl drm/i915/huc: Assert that HuC vma is placed in GuC accessible range drm/i915/huc: Avoid attempting to authenticate non-existent fw drm/i915: Set adjustment to zero on Up/Down interrupts if freq is already max/min drm/i915: Remove the double handling of 'flags from intel_mode_from_pipe_config() drm/i915: Remove crtc->config usage from intel_modeset_readout_hw_state() drm/i915: Release temporary load-detect state upon switching drm/i915: Remove i915_gem_object_to_ggtt() drm/i915: Remove i915_vma_create from VMA API drm/i915: Add a check that the VMA instance we lookup matches the request drm/i915: Rename some warts in the VMA API drm/i915: Track pinned vma in intel_plane_state drm/i915/get_params: Add HuC status to getparams ...
2017-01-27Merge branch 'master' of ↵Dave Airlie
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into drm-next Backmerge Linus master to get the connector locking revert. * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux: (645 commits) sysctl: fix proc_doulongvec_ms_jiffies_minmax() Revert "drm/probe-helpers: Drop locking from poll_enable" MAINTAINERS: add Dan Streetman to zbud maintainers MAINTAINERS: add Dan Streetman to zswap maintainers mm: do not export ioremap_page_range symbol for external module mn10300: fix build error of missing fpu_save() romfs: use different way to generate fsid for BLOCK or MTD frv: add missing atomic64 operations mm, page_alloc: fix premature OOM when racing with cpuset mems update mm, page_alloc: move cpuset seqcount checking to slowpath mm, page_alloc: fix fast-path race with cpuset update or removal mm, page_alloc: fix check for NULL preferred_zone kernel/panic.c: add missing \n fbdev: color map copying bounds checking frv: add atomic64_add_unless() mm/mempolicy.c: do not put mempolicy before using its nodemask radix-tree: fix private list warnings Documentation/filesystems/proc.txt: add VmPin mm, memcg: do not retry precharge charges proc: add a schedule point in proc_pid_readdir() ...
2017-01-26Merge tag 'batadv-next-for-davem-20170126' of ↵David S. Miller
git://git.open-mesh.org/linux-merge Simon Wunderlich says: ==================== This feature/cleanup patchset includes the following patches: - bump version strings, by Simon Wunderlich - ignore self-generated loop detect MAC addresses in translation table, by Simon Wunderlich - install uapi batman_adv.h header, by Sven Eckelmann - bump copyright years, by Sven Eckelmann - Remove an unused variable in translation table code, by Sven Eckelmann - Handle NET_XMIT_CN like NET_XMIT_SUCCESS (revised according to Davids suggestion), and a follow up code clean up, by Gao Feng (2 patches) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains a large batch with Netfilter fixes for your net tree, they are: 1) Two patches to solve conntrack garbage collector cpu hogging, one to remove GC_MAX_EVICTS and another to look at the ratio (scanned entries vs. evicted entries) to make a decision on whether to reduce or not the scanning interval. From Florian Westphal. 2) Two patches to fix incorrect set element counting if NLM_F_EXCL is is not set. Moreover, don't decrenent set->nelems from abort patch if -ENFILE which leaks a spare slot in the set. This includes a patch to deconstify the set walk callback to update set->ndeact. 3) Two fixes for the fwmark_reflect sysctl feature: Propagate mark to reply packets both from nf_reject and local stack, from Pau Espin Pedrol. 4) Fix incorrect handling of loopback traffic in rpfilter and nf_tables fib expression, from Liping Zhang. 5) Fix oops on stateful objects netlink dump, when no filter is specified. Also from Liping Zhang. 6) Fix a build error if proc is not available in ipt_CLUSTERIP, related to fix that was applied in the previous batch for net. From Arnd Bergmann. 7) Fix lack of string validation in table, chain, set and stateful object names in nf_tables, from Liping Zhang. Moreover, restrict maximum log prefix length to 127 bytes, otherwise explicitly bail out. 8) Two patches to fix spelling and typos in nf_tables uapi header file and Kconfig, patches from Alexander Alemayhu and William Breathitt Gray. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-26batman-adv: update copyright years for 2017Sven Eckelmann
Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2017-01-26uapi: install batman_adv.h headerSven Eckelmann
09748a22f4ab ("batman-adv: add generic netlink family for batman-adv") introduced the new batman_adv.h which describes the netlink attributes and commands of batman-adv. But the Kbuild entry to install the header was not added. All currently known tools ship their own copy of batman_adv.h but it should be installed anyway to later be able to migrate to the system batman_adv.h. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2017-01-25net/tcp-fastopen: Add new API supportWei Wang
This patch adds a new socket option, TCP_FASTOPEN_CONNECT, as an alternative way to perform Fast Open on the active side (client). Prior to this patch, a client needs to replace the connect() call with sendto(MSG_FASTOPEN). This can be cumbersome for applications who want to use Fast Open: these socket operations are often done in lower layer libraries used by many other applications. Changing these libraries and/or the socket call sequences are not trivial. A more convenient approach is to perform Fast Open by simply enabling a socket option when the socket is created w/o changing other socket calls sequence: s = socket() create a new socket setsockopt(s, IPPROTO_TCP, TCP_FASTOPEN_CONNECT …); newly introduced sockopt If set, new functionality described below will be used. Return ENOTSUPP if TFO is not supported or not enabled in the kernel. connect() With cookie present, return 0 immediately. With no cookie, initiate 3WHS with TFO cookie-request option and return -1 with errno = EINPROGRESS. write()/sendmsg() With cookie present, send out SYN with data and return the number of bytes buffered. With no cookie, and 3WHS not yet completed, return -1 with errno = EINPROGRESS. No MSG_FASTOPEN flag is needed. read() Return -1 with errno = EWOULDBLOCK/EAGAIN if connect() is called but write() is not called yet. Return -1 with errno = EWOULDBLOCK/EAGAIN if connection is established but no msg is received yet. Return number of bytes read if socket is established and there is msg received. The new API simplifies life for applications that always perform a write() immediately after a successful connect(). Such applications can now take advantage of Fast Open by merely making one new setsockopt() call at the time of creating the socket. Nothing else about the application's socket call sequence needs to change. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-25Merge tag 'mlx5-updates-2017-01-24' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2017-24-01 The first seven patches from Or Gerlitz in this series further enhances the mlx5 SRIOV switchdev mode to support offloading IPv6 tunnels using the TC tunnel key set (encap) and unset (decap) actions. Or Gerlitz says: ======================== As part of doing this change, few cleanups are done in the IPv4 code, later we move to use the full tunnel key info provided to the driver as the key for our internal hashing which is used to identify cases where the same tunnel is used for encapsulating multiple flows. As done in the IPv4 case, the control path for offloading IPv6 tunnels uses route/neigh lookups and construction of the IPv6 tunnel headers on the encap path and matching on the outer hears in the decap path. The last patch of the series enlarges the HW FDB size for the switchdev mode, so it has now room to contain offloaded flows as many as min(max number of HW flow counters supported, max HW table size supported). ======================== Next to Or's series you can find several patches handling several topics. From Mohamad, add support for SRIOV VF min rate guarantee by using the TSAR BW share weights mechanism. From Or, Two patches to enable Eth VFs to query their min-inline value for user-space. for that we move a mlx5 low level min inline helper function from mlx5 ethernet driver into the core driver and then use it in mlx5_ib to expose the inline mode to rdma applications through libmlx5. From Kamal Heib, Reduce memory consumption on kdump kernel. From Shaker Daibes, code reuse in CQE compression control logic ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-25net sched actions: Add support for user cookiesJamal Hadi Salim
Introduce optional 128-bit action cookie. Like all other cookie schemes in the networking world (eg in protocols like http or existing kernel fib protocol field, etc) the idea is to save user state that when retrieved serves as a correlator. The kernel _should not_ intepret it. The user can store whatever they wish in the 128 bits. Sample exercise(showing variable length use of cookie) .. create an accept action with cookie a1b2c3d4 sudo $TC actions add action ok index 1 cookie a1b2c3d4 .. dump all gact actions.. sudo $TC -s actions ls action gact action order 0: gact action pass random type none pass val 0 index 1 ref 1 bind 0 installed 5 sec used 5 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 cookie a1b2c3d4 .. bind the accept action to a filter.. sudo $TC filter add dev lo parent ffff: protocol ip prio 1 \ u32 match ip dst 127.0.0.1/32 flowid 1:1 action gact index 1 ... send some traffic.. $ ping 127.0.0.1 -c 3 PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data. 64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.020 ms 64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.027 ms 64 bytes from 127.0.0.1: icmp_seq=3 ttl=64 time=0.038 ms Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-25nsfs: Add an ioctl() to return the namespace typeMichael Kerrisk (man-pages)
Linux 4.9 added two ioctl() operations that can be used to discover: * the parental relationships for hierarchical namespaces (user and PID) [NS_GET_PARENT] * the user namespaces that owns a specified non-user-namespace [NS_GET_USERNS] For no good reason that I can glean, NS_GET_USERNS was made synonymous with NS_GET_PARENT for user namespaces. It might have been better if NS_GET_USERNS had returned an error if the supplied file descriptor referred to a user namespace, since it suggests that the caller may be confused. More particularly, if it had generated an error, then I wouldn't need the new ioctl() operation proposed here. (On the other hand, what I propose here may be more generally useful.) I would like to write code that discovers namespace relationships for the purpose of understanding the namespace setup on a running system. In particular, given a file descriptor (or pathname) for a namespace, N, I'd like to obtain the corresponding user namespace. Namespace N might be a user namespace (in which case my code would just use N) or a non-user namespace (in which case my code will use NS_GET_USERNS to get the user namespace associated with N). The problem is that there is no way to tell the difference by looking at the file descriptor (and if I try to use NS_GET_USERNS on an N that is a user namespace, I get the parent user namespace of N, which is not what I want). This patch therefore adds a new ioctl(), NS_GET_NSTYPE, which, given a file descriptor that refers to a user namespace, returns the namespace type (one of the CLONE_NEW* constants). Signed-off-by: Michael Kerrisk <mtk-manpages@gmail.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2017-01-24netfilter: nft_log: restrict the log prefix length to 127Liping Zhang
First, log prefix will be truncated to NF_LOG_PREFIXLEN-1, i.e. 127, at nf_log_packet(), so the extra part is useless. Second, after adding a log rule with a very very long prefix, we will fail to dump the nft rules after this _special_ one, but acctually, they do exist. For example: # name_65000=$(printf "%0.sQ" {1..65000}) # nft add rule filter output log prefix "$name_65000" # nft add rule filter output counter # nft add rule filter output counter # nft list chain filter output table ip filter { chain output { type filter hook output priority 0; policy accept; } } So now, restrict the log prefix length to NF_LOG_PREFIXLEN-1. Fixes: 96518518cc41 ("netfilter: add nftables") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-01-24RDMA/qedr: Add uapi header qedr-abi.hAmrani, Ram
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24bpf: allow option for setting bpf_l4_csum_replace from scratchDaniel Borkmann
When programs need to calculate the csum from scratch for small UDP packets and use bpf_l4_csum_replace() to feed the result from helpers like bpf_csum_diff(), then we need a flag besides BPF_F_MARK_MANGLED_0 that would ignore the case of current csum being 0, and which would still allow for the helper to set the csum and transform when needed to CSUM_MANGLED_0. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24IB/mlx5: Enable Eth VFs to query their min-inline value for user-spaceOr Gerlitz
For some mlx5 HW models (CX4, CX4Lx), the VF driver needs to put part of the packet headers on the TX descriptor so the e-switch can do proper matching and steering. This is called "min-inline", it's advertized to the VF by the FW and also enforced on them by the HW, such that if they don't obey, their packets are dropped. SRIOV VF libmlx5 instances should take into account the min-inline value of their vports. For that end, we provide this value through the vendor response part of init_ucontext command. The min inline value is reported in a way which will let newer libmlx5 instances realize that they are running over an older kernel and act accordingly (e.g apply some educated guess). Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-24net/sched: Introduce sample tc actionYotam Gigi
This action allows the user to sample traffic matched by tc classifier. The sampling consists of choosing packets randomly and sampling them using the psample module. The user can configure the psample group number, the sampling rate and the packet's truncation (to save kernel-user traffic). Example: To sample ingress traffic from interface eth1, one may use the commands: tc qdisc add dev eth1 handle ffff: ingress tc filter add dev eth1 parent ffff: \ matchall action sample rate 12 group 4 Where the first command adds an ingress qdisc and the second starts sampling randomly with an average of one sampled packet per 12 packets on dev eth1 to psample group 4. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24net: Introduce psample, a new genetlink channel for packet samplingYotam Gigi
Add a general way for kernel modules to sample packets, without being tied to any specific subsystem. This netlink channel can be used by tc, iptables, etc. and allow to standardize packet sampling in the kernel. For every sampled packet, the psample module adds the following metadata fields: PSAMPLE_ATTR_IIFINDEX - the packets input ifindex, if applicable PSAMPLE_ATTR_OIFINDEX - the packet output ifindex, if applicable PSAMPLE_ATTR_ORIGSIZE - the packet's original size, in case it has been truncated during sampling PSAMPLE_ATTR_SAMPLE_GROUP - the packet's sample group, which is set by the user who initiated the sampling. This field allows the user to differentiate between several samplers working simultaneously and filter packets relevant to him PSAMPLE_ATTR_GROUP_SEQ - sequence counter of last sent packet. The sequence is kept for each group PSAMPLE_ATTR_SAMPLE_RATE - the sampling rate used for sampling the packets PSAMPLE_ATTR_DATA - the actual packet bits The sampled packets are sent to the PSAMPLE_NL_MCGRP_SAMPLE multicast group. In addition, add the GET_GROUPS netlink command which allows the user to see the current sample groups, their refcount and sequence number. This command currently supports only netlink dump mode. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24bridge: multicast to unicastFelix Fietkau
Implements an optional, per bridge port flag and feature to deliver multicast packets to any host on the according port via unicast individually. This is done by copying the packet per host and changing the multicast destination MAC to a unicast one accordingly. multicast-to-unicast works on top of the multicast snooping feature of the bridge. Which means unicast copies are only delivered to hosts which are interested in it and signalized this via IGMP/MLD reports previously. This feature is intended for interface types which have a more reliable and/or efficient way to deliver unicast packets than broadcast ones (e.g. wifi). However, it should only be enabled on interfaces where no IGMPv2/MLDv1 report suppression takes place. This feature is disabled by default. The initial patch and idea is from Felix Fietkau. Signed-off-by: Felix Fietkau <nbd@nbd.name> [linus.luessing@c0d3.blue: various bug + style fixes, commit message] Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24IB/cxgb3: fix misspelling in header guardNicolas Iooss
Use CXGB3_... instead of CXBG3_... Fixes: a85fb3383340 ("IB/cxgb3: Move user vendor structures") Cc: stable@vger.kernel.org # 4.9 Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Steve Wise <swise@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24can: dev: add CAN interface API for fixed bitratesMarc Kleine-Budde
Some CAN interfaces only support fixed fixed bitrates. This patch adds a netlink interface to get the list of the CAN interface's fixed bitrates and data bitrates. Inside the driver arrays of supported data- bitrate values are defined. const u32 drvname_bitrate[] = { 20000, 50000, 100000 }; const u32 drvname_data_bitrate[] = { 200000, 500000, 1000000 }; struct drvname_priv *priv; priv = netdev_priv(dev); priv->bitrate_const = drvname_bitrate; priv->bitrate_const_cnt = ARRAY_SIZE(drvname_bitrate); priv->data_bitrate_const = drvname_data_bitrate; priv->data_bitrate_const_cnt = ARRAY_SIZE(drvname_data_bitrate); Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2017-01-24can: dev: add CAN interface termination APIOliver Hartkopp
This patch adds a netlink interface to configure the CAN bus termination of CAN interfaces. Inside the driver an array of supported termination values is defined: const u16 drvname_termination[] = { 60, 120, CAN_TERMINATION_DISABLED }; struct drvname_priv *priv; priv = netdev_priv(dev); priv->termination_const = drvname_termination; priv->termination_const_cnt = ARRAY_SIZE(drvname_termination); priv->termination = CAN_TERMINATION_DISABLED; And the funtion to set the value has to be defined: priv->do_set_termination = drvname_set_termination; Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Reviewed-by: Ramesh Shanmugasundaram <Ramesh.shanmugasundaram@bp.renesas.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2017-01-23bpf: add a longest prefix match trie map implementationDaniel Mack
This trie implements a longest prefix match algorithm that can be used to match IP addresses to a stored set of ranges. Internally, data is stored in an unbalanced trie of nodes that has a maximum height of n, where n is the prefixlen the trie was created with. Tries may be created with prefix lengths that are multiples of 8, in the range from 8 to 2048. The key used for lookup and update operations is a struct bpf_lpm_trie_key, and the value is a uint64_t. The code carries more information about the internal implementation. Signed-off-by: Daniel Mack <daniel@zonque.org> Reviewed-by: David Herrmann <dh.herrmann@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-23Merge tag 'omapdrm-4.11' of ↵Dave Airlie
git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux into drm-next omapdrm changes for 4.11 The main change here is the IRQ code cleanup, which gives us properly working vblank counts and timestamps. We also get much less calls to runtime PM gets & puts. * tag 'omapdrm-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux: (26 commits) drm/omap: panel-sony-acx565akm.c: Add MODULE_ALIAS drm/omap: dsi: fix compile errors when enabling debug prints drm: omapdrm: Perform initialization/cleanup at probe/remove time drm: Move vblank cleanup from unregister to release drm: omapdrm: Use sizeof(*var) instead of sizeof(type) for structures drm: omapdrm: Remove global variables drm: omapdrm: Simplify IRQ wait implementation drm: omapdrm: Inline the pipe2vbl function drm: omapdrm: Don't call DISPC power handling in IRQ wait functions drm: omapdrm: Remove unused parameter from omap_drm_irq handler drm: omapdrm: Don't expose the omap_irq_(un)register() functions drm: omapdrm: Keep vblank interrupt enabled while CRTC is active drm: omapdrm: Use a spinlock to protect the CRTC pending flag drm: omapdrm: Prevent processing the same event multiple times drm: omapdrm: Check the CRTC software state at enable/disable time drm: omapdrm: Let the DRM core skip plane commit on inactive CRTCs drm: omapdrm: Replace DSS manager state check with omapdrm CRTC state drm: omapdrm: Handle OCP error IRQ directly drm: omapdrm: Handle CRTC error IRQs directly drm: omapdrm: Handle FIFO underflow IRQs internally ...
2017-01-20tipc: make replicast a user selectable optionJon Paul Maloy
If the bearer carrying multicast messages supports broadcast, those messages will be sent to all cluster nodes, irrespective of whether these nodes host any actual destinations socket or not. This is clearly wasteful if the cluster is large and there are only a few real destinations for the message being sent. In this commit we extend the eligibility of the newly introduced "replicast" transmit option. We now make it possible for a user to select which method he wants to be used, either as a mandatory setting via setsockopt(), or as a relative setting where we let the broadcast layer decide which method to use based on the ratio between cluster size and the message's actual number of destination nodes. In the latter case, a sending socket must stick to a previously selected method until it enters an idle period of at least 5 seconds. This eliminates the risk of message reordering caused by method change, i.e., when changes to cluster size or number of destinations would otherwise mandate a new method to be used. Reviewed-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20bpf: add bpf_probe_read_str helperGianluca Borello
Provide a simple helper with the same semantics of strncpy_from_unsafe(): int bpf_probe_read_str(void *dst, int size, const void *unsafe_addr) This gives more flexibility to a bpf program. A typical use case is intercepting a file name during sys_open(). The current approach is: SEC("kprobe/sys_open") void bpf_sys_open(struct pt_regs *ctx) { char buf[PATHLEN]; // PATHLEN is defined to 256 bpf_probe_read(buf, sizeof(buf), ctx->di); /* consume buf */ } This is suboptimal because the size of the string needs to be estimated at compile time, causing more memory to be copied than often necessary, and can become more problematic if further processing on buf is done, for example by pushing it to userspace via bpf_perf_event_output(), since the real length of the string is unknown and the entire buffer must be copied (and defining an unrolled strnlen() inside the bpf program is a very inefficient and unfeasible approach). With the new helper, the code can easily operate on the actual string length rather than the buffer size: SEC("kprobe/sys_open") void bpf_sys_open(struct pt_regs *ctx) { char buf[PATHLEN]; // PATHLEN is defined to 256 int res = bpf_probe_read_str(buf, sizeof(buf), ctx->di); /* consume buf, for example push it to userspace via * bpf_perf_event_output(), but this time we can use * res (the string length) as event size, after checking * its boundaries. */ } Another useful use case is when parsing individual process arguments or individual environment variables navigating current->mm->arg_start and current->mm->env_start: using this helper and the return value, one can quickly iterate at the right offset of the memory area. The code changes simply leverage the already existent strncpy_from_unsafe() kernel function, which is safe to be called from a bpf program as it is used in bpf_trace_printk(). Signed-off-by: Gianluca Borello <g.borello@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-19Merge tag 'iio-for-4.11a' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next Jonathan writes: First round of new device support, features and cleanups for IIO in the 4.11 cycle. It's shaping to be another fairly busy cycle. Lots more on the way! New device support * ads7950 - new driver supporting ads7950, ads7951, ads7952, ads7953, ads7954, ads7955, ads7956, ads7957, ads7958, ads7959, ads7960, and ads7961 ADCs. * cm3605 - New driver for this light sensor and proximity sensor which is an analog part with some additional digital controls. * hx711 - New driver. Core new stuff * Gravity sensor type. This is a processed datastream in which the device will try to work out which way is down. * Split the buffer.h file into two parts. One provides the interface to 'use' a buffer, the second provides the internals of the buffer functionality as needed by implementations of buffers. - Move documentation inline so as to allow use of private: tag when generating documentation. - Add some utility functions for the few things that are directly done with the buffers. - Stop exporting functions that no-one uses outside of the core code. - Push docs down by the code in the c file where they should have always been. - Fix typo in kernel-doc for buffer. - push down some includes that were previously happening implicitly. - stop enabling the timestamp of the dummy device. Features and cleanups * ad5592r - ACPI support * ad5593r -ACPI support. * ad5933 - Fix a false comment about size of a particular register. * ad7150 - replace S_IRUGO | S_IWUSR with 0644. I'm not that keen on these patches in general, but as it was nicely presented I took this one anyway. As a general rule will only take these as part of a larger driver cleanup. - don't eat an error but rather reutnr it in the write_event_config callback. * ad7606 - replace non standard range attibute with _scale * ade7753 - use usleep_range for short sleeps * ade7754 - use usleep_range for short sleeps * ade7758 - use usleep_range for short sleeps * ade7759 - use usleep_range for short sleeps * ade7854 - use usleep_range for short sleeps * adis16201 - fix description * adis16203 - fix description - fix copyright year * adis16209 - fix description * adt7316 - Add braces to arms of if else statement (for consistency) - Alignment fixes. * axp288 - Fix up an issue with accidental overwrites of data. * bmi160 - add deivce tables for i2c and spi to support correctly identifying the full dt name (including manufacturer). - device tree binding. * bmp280 - use usleep_range for short sleeps. * cm3232 - return error from cm3232_reg_init rather than eating it if the last write fails. * dummy driver - remove a semicolor found at end of a function defintition. * exynos-adc - use usleep_range for short sleeps. * hid-sensor (accel) - Add timestamp support. The hardware can provide timestamps so lets support them. If not fall back to timestamps estimated in kernel. * hid-sensor (light) - Add a duplicate ID for the light channels so as to keep existing interface whilst also using the more standard IIO interface. * hts221 - acpi probing * imx25-gcq - Add a macro call to allow this driver to be automatically loaded. * isl29028 - reorganise code to avoid deep nesting of if statements. - move chip test and default regs into a function suitable or sharing with power management code. - tidy up some code alignment. * lidar-lite-v3 - introduce compatible strings that make it clear Garmin have consideral friends. * mma8452 - avoid returning signed value when unsigned is appropriate * spmi-vadc - Update function for generic voltage conversion to take into account that different channels on this device should be handled differently. - Rework code to allow per channel voltage scaling and support the standard options for this hardware. - Fixup three minor issues with the above patches for this part. These all effect test builds rather than the native builds for the part, but good to clean them up anyway. * st_sensors - support device matching from the ACPI DST tables. - acpi based probing for accelerometers - acpi based probing for pressure sensors - Allow pressure sensors to read negative values. - Export sampling frequency for lps25h and lps331ap. - Add support for the old DT bindings from the period when these deivces were often supported through windows. Docs fixup: * typo in sysfs-bus-iio
2017-01-19drm/i915/get_params: Add HuC status to getparamsAnusha Srivatsa
This patch will allow for getparams to return the status of the HuC. As the HuC has to be validated by the GuC this patch uses the validated status to show when the HuC is loaded and ready for use. You cannot use the loaded status as with the GuC as the HuC is verified after it is loaded and is not usable until it is verified. v2: removed the forewakes as the registers are already force-woken. (T.Ursulin) v3: rebased on top of drm-tip. Removed any reference to intel_huc.h v4: rebased. Rename I915_PARAM_HAS_HUC to I915_PARAM_HUC_STATUS. Remove intel_is_huc_valid() since it is used only in one place. Put the case of I915_PARAM_HAS_HUC() in the right place. v5: rebased. Add a comment to specify that I915_READ(reg) does not read garbage value. The register HUC_STATUS2 is force woken and no rpm is needed. Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com> Signed-off-by: Peter Antoine <peter.antoine@intel.com> Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1484755558-1234-6-git-send-email-anusha.srivatsa@intel.com
2017-01-18sctp: implement sender-side procedures for SSN Reset Request ParameterXin Long
This patch is to implement sender-side procedures for the Outgoing and Incoming SSN Reset Request Parameter described in rfc6525 section 5.1.2 and 5.1.3. It is also add sockopt SCTP_RESET_STREAMS in rfc6525 section 6.3.2 for users. Note that the new asoc member strreset_outstanding is to make sure only one reconf request chunk on the fly as rfc6525 section 5.1.1 demands. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-18sctp: add sockopt SCTP_ENABLE_STREAM_RESETXin Long
This patch is to add sockopt SCTP_ENABLE_STREAM_RESET to get/set strreset_enable to indicate which reconf request type it supports, which is described in rfc6525 section 6.3.1. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-18audit: add feature audit_lost resetRichard Guy Briggs
Add a method to reset the audit_lost value. An AUDIT_SET message with the AUDIT_STATUS_LOST flag set by itself will return a positive value repesenting the current audit_lost value and reset the counter to zero. If AUDIT_STATUS_LOST is not the only flag set, the reset command will be ignored. The value sent with the command is ignored. The return value will be the +ve lost value at reset time. An AUDIT_CONFIG_CHANGE message will be queued to the listening audit daemon. The message will be a standard CONFIG_CHANGE message with the fields "lost=0" and "old=" with the latter containing the value of audit_lost at reset time. See: https://github.com/linux-audit/audit-kernel/issues/3 Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Acked-by: Steve Grubb <sgrubb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-01-18rpmsg: Driver for user space endpoint interfaceBjorn Andersson
This driver allows rpmsg instances to expose access to rpmsg endpoints to user space processes. It provides a control interface, allowing userspace to export endpoints and an endpoint interface for each exposed endpoint. The implementation is based on prior art by Texas Instrument, Google, PetaLogix and was derived from a FreeRTOS performance statistics driver written by Michal Simek. The control interface provides a "create endpoint" ioctl, which is fed a name, source and destination address. The three values are used to create the endpoint, in a backend-specific way, and a rpmsg endpoint device is created - with the three parameters are available in sysfs for udev usage. E.g. to create an endpoint device for one of the Qualcomm SMD channel related to DIAG one would issue: struct rpmsg_endpoint_info info = { "DIAG_CNTL", 0, 0 }; int fd = open("/dev/rpmsg_ctrl0", O_RDWR); ioctl(fd, RPMSG_CREATE_EPT_IOCTL, &info); Each created endpoint device shows up as an individual character device in /dev, allowing permission to be controlled on a per-endpoint basis. The rpmsg endpoint will be created and destroyed following the opening and closing of the endpoint device, allowing rpmsg backends to open and close the physical channel, if supported by the wire protocol. Cc: Marek Novak <marek.novak@nxp.com> Cc: Matteo Sartori <matteo.sartori@t3lab.it> Cc: Michal Simek <monstr@monstr.eu> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2017-01-17bridge: sparse fixes in br_ip6_multicast_alloc_query()Lance Richardson
Changed type of csum field in struct igmpv3_query from __be16 to __sum16 to eliminate type warning, made same change in struct igmpv3_report for consistency. Fixed up an ntohs() where htons() should have been used instead. Signed-off-by: Lance Richardson <lrichard@redhat.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2017-01-17mpls: Packet statsRobert Shearman
Having MPLS packet stats is useful for observing network operation and for diagnosing network problems. In the absence of anything better, RFC2863 and RFC3813 are used for guidance for which stats to expose and the semantics of them. In particular rx_noroutes maps to in unknown protos in RFC2863. The stats are exposed to userspace via AF_MPLS attributes embedded in the IFLA_STATS_AF_SPEC attribute of RTM_GETSTATS messages. All the introduced fields are 64-bit, even error ones, to ensure no overflow with long uptimes. Per-CPU counters are used to avoid cache-line contention on the commonly used fields. The other fields have also been made per-CPU for code to avoid performance problems in error conditions on the assumption that on some platforms the cost of atomic operations could be more expensive than sending the packet (which is what would be done in the success case). If that's not the case, we could instead not use per-CPU counters for these fields. Only unicast and non-fragment are exposed at the moment, but other counters can be exposed in the future either by adding to the end of struct mpls_link_stats or by additional netlink attributes in the AF_MPLS IFLA_STATS_AF_SPEC nested attribute. Signed-off-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-17net: AF-specific RTM_GETSTATS attributesRobert Shearman
Add the functionality for including address-family-specific per-link stats in RTM_GETSTATS messages. This is done through adding a new IFLA_STATS_AF_SPEC attribute under which address family attributes are nested and then the AF-specific attributes can be further nested. This follows the model of IFLA_AF_SPEC on RTM_*LINK messages and it has the advantage of presenting an easily extended hierarchy. The rtnl_af_ops structure is extended to provide AFs with the opportunity to fill and provide the size of their stats attributes. One alternative would have been to provide AFs with the ability to add attributes directly into the RTM_GETSTATS message without a nested hierarchy. I discounted this approach as it increases the rate at which the 32 attribute number space is used up and it makes implementation a little more tricky for stats dump resuming (at the moment the order in which attributes are added to the message has to match the numeric order of the attributes). Another alternative would have been to register per-AF RTM_GETSTATS handlers. I discounted this approach as I perceived a common use-case to be getting all the stats for an interface and this approach would necessitate multiple requests/dumps to retrieve them all. Signed-off-by: Robert Shearman <rshearma@brocade.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-16ipv6: sr: add missing Kbuild export for header filesDavid Lebrun
Add missing IPv6-SR header files in include/uapi/linux/Kbuild. Also, prevent seg6_lwt_headroom() from being exported and add missing linux/types.h include. Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-16bpf: rework prog_digest into prog_tagDaniel Borkmann
Commit 7bd509e311f4 ("bpf: add prog_digest and expose it via fdinfo/netlink") was recently discussed, partially due to admittedly suboptimal name of "prog_digest" in combination with sha1 hash usage, thus inevitably and rightfully concerns about its security in terms of collision resistance were raised with regards to use-cases. The intended use cases are for debugging resp. introspection only for providing a stable "tag" over the instruction sequence that both kernel and user space can calculate independently. It's not usable at all for making a security relevant decision. So collisions where two different instruction sequences generate the same tag can happen, but ideally at a rather low rate. The "tag" will be dumped in hex and is short enough to introspect in tracepoints or kallsyms output along with other data such as stack trace, etc. Thus, this patch performs a rename into prog_tag and truncates the tag to a short output (64 bits) to make it obvious it's not collision-free. Should in future a hash or facility be needed with a security relevant focus, then we can think about requirements, constraints, etc that would fit to that situation. For now, rework the exposed parts for the current use cases as long as nothing has been released yet. Tested on x86_64 and s390x. Fixes: 7bd509e311f4 ("bpf: add prog_digest and expose it via fdinfo/netlink") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-16Merge 4.10-rc4 into tty-nextGreg Kroah-Hartman
We want the serial/tty fixes in here as well to build on top of. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-16netfilter: nf_tables: fix spelling mistakesAlexander Alemayhu
o s/numerice/numeric o s/opertaor/operator Signed-off-by: Alexander Alemayhu <alexander@alemayhu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>