summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2016-12-05bpf: add prog_digest and expose it via fdinfo/netlinkDaniel Borkmann
When loading a BPF program via bpf(2), calculate the digest over the program's instruction stream and store it in struct bpf_prog's digest member. This is done at a point in time before any instructions are rewritten by the verifier. Any unstable map file descriptor number part of the imm field will be zeroed for the hash. fdinfo example output for progs: # cat /proc/1590/fdinfo/5 pos: 0 flags: 02000002 mnt_id: 11 prog_type: 1 prog_jited: 1 prog_digest: b27e8b06da22707513aa97363dfb11c7c3675d28 memlock: 4096 When programs are pinned and retrieved by an ELF loader, the loader can check the program's digest through fdinfo and compare it against one that was generated over the ELF file's program section to see if the program needs to be reloaded. Furthermore, this can also be exposed through other means such as netlink in case of a tc cls/act dump (or xdp in future), but also through tracepoints or other facilities to identify the program. Other than that, the digest can also serve as a base name for the work in progress kallsyms support of programs. The digest doesn't depend/select the crypto layer, since we need to keep dependencies to a minimum. iproute2 will get support for this facility. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-05make skb_add_data,{_nocache}() and skb_copy_to_page_nocache() advance only ↵Al Viro
on success Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-12-05[iov_iter] new primitives - copy_from_iter_full() and friendsAl Viro
copy_from_iter_full(), copy_from_iter_full_nocache() and csum_and_copy_from_iter_full() - counterparts of copy_from_iter() et.al., advancing iterator only in case of successful full copy and returning whether it had been successful or not. Convert some obvious users. *NOTE* - do not blindly assume that something is a good candidate for those unless you are sure that not advancing iov_iter in failure case is the right thing in this case. Anything that does short read/short write kind of stuff (or is in a loop, etc.) is unlikely to be a good one. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-12-05ahci-remap.h: add ahci remapping definitionsDan Williams
Signed-off-by: Dan Williams <dan.j.williams@intel.com> [hch: split into a separate header and commit] Signed-off-by: Christoph Hellwig <hch@lst.de> [tj: dropped duplicate definition of AHCI_VSCAP spotted by Sergei] Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-12-05nvme: move NVMe class code to pci_ids.hChristoph Hellwig
We'll need to check for it in the AHCI drivers (yes, really) soon. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-12-05tcp: tsq: move tsq_flags close to sk_wmem_allocEric Dumazet
tsq_flags being in the same cache line than sk_wmem_alloc makes a lot of sense. Both fields are changed from tcp_wfree() and more generally by various TSQ related functions. Prior patch made room in struct sock and added sk_tsq_flags, this patch deletes tsq_flags from struct tcp_sock. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-05tcp: tsq: add tsq_flags / tsq_enumEric Dumazet
This is a cleanup, to ease code review of following patches. Old 'enum tsq_flags' is renamed, and a new enumeration is added with the flags used in cmpxchg() operations as opposed to single bit operations. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-05usb: hcd.h: construct hub class request constants from simpler constantsTal Shorer
Currently, each hub class request constant is defined by a line like: #define ClearHubFeature (0x2000 | USB_REQ_CLEAR_FEATURE) The "magic" number for the high byte is one of 0x20, 0xa0, 0x23, 0xa3. The 0x80 bit that changes inditace USB_DIR_IN, and the 0x03 that pops up is the difference between USB_RECIP_DEVICE (0x00) and USB_RECIP_OTHER (0x03). The constant 0x20 bit is USB_TYPE_CLASS. This patch eliminates those magic numbers by defining a macro to help construct these hub class request from simpler constants. Note that USB_RT_HUB is defined as (USB_TYPE_CLASS | USB_RECIP_DEVICE) and that USB_RT_PORT is defined as (USB_TYPE_CLASS | USB_RECIP_OTHER). Signed-off-by: Tal Shorer <tal.shorer@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-12-05fsl/usb: Workarourd for USB erratum-A005697Changming Huang
The EHCI specification states the following in the SUSP bit description: In the Suspend state, the port is sensitive to resume detection. Note that the bit status does not change until the port is suspended and that there may be a delay in suspending a port if there is a transaction currently in progress on the USB. However, in NXP USBDR controller, the PORTSCx[SUSP] bit changes immediately when the application sets it and not when the port is actually suspended. So the application must wait for at least 10 milliseconds after a port indicates that it is suspended, to make sure this port has entered suspended state before initiating this port resume using the Force Port Resume bit. This bit is for NXP controller, not EHCI compatible. Signed-off-by: Changming Huang <jerry.huang@nxp.com> Signed-off-by: Ramneek Mehresh <ramneek.mehresh@nxp.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-12-05driver core: Silence device links sphinx warningLukas Wunner
Silence this warning emitted by sphinx: include/linux/device.h:938: warning: No description found for parameter 'links' While at it, fix typos in comments of device links code. Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Silvio Fricke <silvio.fricke@gmail.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-12-05mmc: mmc: Introduce mmc_abort_tuning()Adrian Hunter
If a tuning command times out, the card could still be processing it, which will cause problems for recovery. The eMMC specification says that CMD12 can be used to stop CMD21, so add a function that does that. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-12-05regulator: Fix regulator_get_error_flags() signature mismatchDavid Lechner
The function signature of does not match regulator_get_error_flags() when CONFIG_REGULATOR is not defined vs. when it is not defined. This makes both declarations match to prevent compiler errors. Signed-off-by: David Lechner <david@lechnology.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-12-05mmc: mmc: Add Command Queue definitionsAdrian Hunter
Add definitions relating to Command Queuing. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-12-05mmc: queue: Fix queue thread wake-upAdrian Hunter
The only time the driver sleeps expecting to be woken upon the arrival of a new request, is when the dispatch queue is empty. The only time that it is known whether the dispatch queue is empty is after NULL is returned from blk_fetch_request() while under the queue lock. Recognizing those facts, simplify the synchronization between the queue thread and the request function. A couple of flags tell the request function what to do, and the queue lock and barriers associated with wake-ups ensure synchronization. The result is simpler and allows the removal of the context_info lock. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Harjani Ritesh <riteshh@codeaurora.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-12-05Backmerge tag 'v4.9-rc8' into drm-nextDave Airlie
Linux 4.9-rc8 Daniel requested this so we could apply some follow on fixes cleanly to -next.
2016-12-04NFS: Only look at the change attribute cache state in nfs_check_verifierTrond Myklebust
When looking at whether or not our dcache is valid, we really don't care about the general state of the directory attribute cache. Instead, we we only care about the state of the change attribute. This fixes a performance issue when the client is responsible for changing the directory contents; a number of NFSv4 operations will atomically update the directory change attribute, but may not return all the other attributes. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-04netfilter: conntrack: built-in support for DCCPDavide Caratti
CONFIG_NF_CT_PROTO_DCCP is no more a tristate. When set to y, connection tracking support for DCCP protocol is built-in into nf_conntrack.ko. footprint test: $ ls -l net/netfilter/nf_conntrack{_proto_dccp,}.ko \ net/ipv4/netfilter/nf_conntrack_ipv4.ko \ net/ipv6/netfilter/nf_conntrack_ipv6.ko (builtin)|| dccp | ipv4 | ipv6 | nf_conntrack ---------++--------+--------+--------+-------------- none || 469140 | 828755 | 828676 | 6141434 DCCP || - | 830566 | 829935 | 6533526 Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-03ipv6 addrconf: Implemented enhanced DAD (RFC7527)Erik Nordmark
Implemented RFC7527 Enhanced DAD. IPv6 duplicate address detection can fail if there is some temporary loopback of Ethernet frames. RFC7527 solves this by including a random nonce in the NS messages used for DAD, and if an NS is received with the same nonce it is assumed to be a looped back DAD probe and is ignored. RFC7527 is enabled by default. Can be disabled by setting both of conf/{all,interface}/enhanced_dad to zero. Signed-off-by: Erik Nordmark <nordmark@arista.com> Signed-off-by: Bob Gilligan <gilligan@arista.com> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03vfs: remove unused have_submounts() functionIan Kent
Now that path_has_submounts() has been added have_submounts() is no longer used so remove it. Link: http://lkml.kernel.org/r/20161011053428.27645.12310.stgit@pluto.themaw.net Signed-off-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Omar Sandoval <osandov@osandov.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-12-03vfs: add path_has_submounts()Ian Kent
d_mountpoint() can only be used reliably to establish if a dentry is not mounted in any namespace. It isn't aware of the possibility there may be multiple mounts using the given dentry, possibly in a different namespace. Add function, path_has_submounts(), that checks is a struct path contains mounts (or is a mountpoint itself) to handle this case. Link: http://lkml.kernel.org/r/20161011053403.27645.55242.stgit@pluto.themaw.net Signed-off-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Omar Sandoval <osandov@osandov.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-12-03vfs: add path_is_mountpoint() helperIan Kent
d_mountpoint() can only be used reliably to establish if a dentry is not mounted in any namespace. It isn't aware of the possibility there may be multiple mounts using a given dentry that may be in a different namespace. Add helper functions, path_is_mountpoint(), that checks if a struct path is a mountpoint for this case. Link: http://lkml.kernel.org/r/20161011053358.27645.9729.stgit@pluto.themaw.net Signed-off-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Omar Sandoval <osandov@osandov.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-12-03vmxnet3: Move PCI Id to pci_ids.hAdit Ranadive
The VMXNet3 PCI Id will be shared with our paravirtual RDMA driver. Moved it to the shared location in pci_ids.h. Suggested-by: Leon Romanovsky <leon@kernel.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03pNFS/flexfiles: Minor refactoring before adding iostats to layoutreturnTrond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Couple conflicts resolved here: 1) In the MACB driver, a bug fix to properly initialize the RX tail pointer properly overlapped with some changes to support variable sized rings. 2) In XGBE we had a "CONFIG_PM" --> "CONFIG_PM_SLEEP" fix overlapping with a reorganization of the driver to support ACPI, OF, as well as PCI variants of the chip. 3) In 'net' we had several probe error path bug fixes to the stmmac driver, meanwhile a lot of this code was cleaned up and reorganized in 'net-next'. 4) The cls_flower classifier obtained a helper function in 'net-next' called __fl_delete() and this overlapped with Daniel Borkamann's bug fix to use RCU for object destruction in 'net'. It also overlapped with Jiri's change to guard the rhashtable_remove_fast() call with a check against tc_skip_sw(). 5) In mlx4, a revert bug fix in 'net' overlapped with some unrelated changes in 'net-next'. 6) In geneve, a stale header pointer after pskb_expand_head() bug fix in 'net' overlapped with a large reorganization of the same code in 'net-next'. Since the 'net-next' code no longer had the bug in question, there was nothing to do other than to simply take the 'net-next' hunks. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-02pNFS: Allow layout drivers to manage private data in struct nfs4_layoutreturnTrond Myklebust
Cleanup to allow layout drivers to attach private data to layoutreturn, and manage the data. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02vfs: change d_manage() to take a struct pathIan Kent
For the autofs module to be able to reliably check if a dentry is a mountpoint in a multiple namespace environment the ->d_manage() dentry operation will need to take a path argument instead of a dentry. Link: http://lkml.kernel.org/r/20161011053352.27645.83962.stgit@pluto.themaw.net Signed-off-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Omar Sandoval <osandov@osandov.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-12-02Input: synaptics-rmi4 - store the attn data in the driverBenjamin Tissoires
Now that we have a proper API to set the attention data, there is no point in keeping it in the transport driver. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Reviewed-by: Andrew Duggan <aduggan@synaptics.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2016-12-02Input: synaptics-rmi4 - allow to add attention dataBenjamin Tissoires
The HID implementation of RMI4 provides the data during the interrupt (in the input report). We need to provide a way for this transport driver to provide the attention data while calling an IRQ. We use a fifo in rmi_core to not lose any incoming event. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Reviewed-by: Andrew Duggan <aduggan@synaptics.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2016-12-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Lots more phydev and probe error path leaks in various drivers by Johan Hovold. 2) Fix race in packet_set_ring(), from Philip Pettersson. 3) Use after free in dccp_invalid_packet(), from Eric Dumazet. 4) Signnedness overflow in SO_{SND,RCV}BUFFORCE, also from Eric Dumazet. 5) When tunneling between ipv4 and ipv6 we can be left with the wrong skb->protocol value as we enter the IPSEC engine and this causes all kinds of problems. Set it before the output path does any dst_output() calls, from Eli Cooper. 6) bcmgenet uses wrong device struct pointer in DMA API calls, fix from Florian Fainelli. 7) Various netfilter nat bug fixes from FLorian Westphal. 8) Fix memory leak in ipvlan_link_new(), from Gao Feng. 9) Locking fixes, particularly wrt. socket lookups, in l2tp from Guillaume Nault. 10) Avoid invoking rhash teardowns in atomic context by moving netlink cb->done() dump completion from a worker thread. Fix from Herbert Xu. 11) Buffer refcount problems in tun and macvtap on errors, from Jason Wang. 12) We don't set Kconfig symbol DEFAULT_TCP_CONG properly when the user selects BBR. Fix from Julian Wollrath. 13) Fix deadlock in transmit path on altera TSE driver, from Lino Sanfilippo. 14) Fix unbalanced reference counting in dsa_switch_tree, from Nikita Yushchenko. 15) tc_tunnel_key needs to be properly exported to userspace via uapi, fix from Roi Dayan. 16) rds_tcp_init_net() doesn't unregister notifier in error path, fix from Sowmini Varadhan. 17) Stale packet header pointer access after pskb_expand_head() in genenve driver, fix from Sabrina Dubroca. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (103 commits) net: avoid signed overflows for SO_{SND|RCV}BUFFORCE geneve: avoid use-after-free of skb->data tipc: check minimum bearer MTU net: renesas: ravb: unintialized return value sh_eth: remove unchecked interrupts for RZ/A1 net: bcmgenet: Utilize correct struct device for all DMA operations NET: usb: qmi_wwan: add support for Telit LE922A PID 0x1040 cdc_ether: Fix handling connection notification ip6_offload: check segs for NULL in ipv6_gso_segment. RDS: TCP: unregister_netdevice_notifier() in error path of rds_tcp_init_net Revert: "ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()" ipv6: Set skb->protocol properly for local output ipv4: Set skb->protocol properly for local output packet: fix race condition in packet_set_ring net: ethernet: altera: TSE: do not use tx queue lock in tx completion handler net: ethernet: altera: TSE: Remove unneeded dma sync for tx buffers net: ethernet: stmmac: fix of-node and fixed-link-phydev leaks net: ethernet: stmmac: platform: fix outdated function header net: ethernet: stmmac: dwmac-meson8b: fix probe error path net: ethernet: stmmac: dwmac-generic: fix probe error path ...
2016-12-02bpf: Add new cgroup attach type to enable sock modificationsDavid Ahern
Add new cgroup based program type, BPF_PROG_TYPE_CGROUP_SOCK. Similar to BPF_PROG_TYPE_CGROUP_SKB programs can be attached to a cgroup and run any time a process in the cgroup opens an AF_INET or AF_INET6 socket. Currently only sk_bound_dev_if is exported to userspace for modification by a bpf program. This allows a cgroup to be configured such that AF_INET{6} sockets opened by processes are automatically bound to a specific device. In turn, this enables the running of programs that do not support SO_BINDTODEVICE in a specific VRF context / L3 domain. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-02bpf: Refactor cgroups code in prep for new typeDavid Ahern
Code move and rename only; no functional change intended. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-02net/sched: cls_flower: Add offload support using egress Hardware deviceHadar Hen Zion
In order to support hardware offloading when the device given by the tc rule is different from the Hardware underline device, extract the mirred (egress) device from the tc action when a filter is added, using the new tc_action_ops, get_dev(). Flower caches the information about the mirred device and use it for calling ndo_setup_tc in filter change, update stats and delete. Calling ndo_setup_tc of the mirred (egress) device instead of the ingress device will allow a resolution between the software ingress device and the underline hardware device. The resolution will take place inside the offloading driver using 'egress_device' flag added to tc_to_netdev struct which is provided to the offloading driver. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-02tcp: randomize tcp timestamp offsets for each connectionFlorian Westphal
jiffies based timestamps allow for easy inference of number of devices behind NAT translators and also makes tracking of hosts simpler. commit ceaa1fef65a7c2e ("tcp: adding a per-socket timestamp offset") added the main infrastructure that is needed for per-connection ts randomization, in particular writing/reading the on-wire tcp header format takes the offset into account so rest of stack can use normal tcp_time_stamp (jiffies). So only two items are left: - add a tsoffset for request sockets - extend the tcp isn generator to also return another 32bit number in addition to the ISN. Re-use of ISN generator also means timestamps are still monotonically increasing for same connection quadruple, i.e. PAWS will still work. Includes fixes from Eric Dumazet. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-02qed: Add support for hardware offloaded iSCSI.Yuval Mintz
This adds the backbone required for the various HW initalizations which are necessary for the iSCSI driver (qedi) for QLogic FastLinQ 4xxxx line of adapters - FW notification, resource initializations, etc. Signed-off-by: Arun Easi <arun.easi@cavium.com> Signed-off-by: Yuval Mintz <yuval.mintz@cavium.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-02NFSv4: Add a generic structure for managing layout-private informationTrond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02bpf, xdp: drop rcu_read_lock from bpf_prog_run_xdp and move to callerDaniel Borkmann
After 326fe02d1ed6 ("net/mlx4_en: protect ring->xdp_prog with rcu_read_lock"), the rcu_read_lock() in bpf_prog_run_xdp() is superfluous, since callers need to hold rcu_read_lock() already to make sure BPF program doesn't get released in the background. Thus, drop it from bpf_prog_run_xdp(), as it can otherwise be misleading. Still keeping the bpf_prog_run_xdp() is useful as it allows for grepping in XDP supported drivers and to keep the typecheck on the context intact. For mlx4, this means we don't have a double rcu_read_lock() anymore. nfp can just make use of bpf_prog_run_xdp(), too. For qede, just move rcu_read_lock() out of the helper. When the driver gets atomic replace support, this will move to call-sites eventually. mlx5 needs actual fixing as it has the same issue as described already in 326fe02d1ed6 ("net/mlx4_en: protect ring->xdp_prog with rcu_read_lock"), that is, we're under RCU bh at this time, BPF programs are released via call_rcu(), and call_rcu() != call_rcu_bh(), so we need to properly mark read side as programs can get xchg()'ed in mlx5e_xdp_set() without queue reset. Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-02bpf: BPF for lightweight tunnel infrastructureThomas Graf
Registers new BPF program types which correspond to the LWT hooks: - BPF_PROG_TYPE_LWT_IN => dst_input() - BPF_PROG_TYPE_LWT_OUT => dst_output() - BPF_PROG_TYPE_LWT_XMIT => lwtunnel_xmit() The separate program types are required to differentiate between the capabilities each LWT hook allows: * Programs attached to dst_input() or dst_output() are restricted and may only read the data of an skb. This prevent modification and possible invalidation of already validated packet headers on receive and the construction of illegal headers while the IP headers are still being assembled. * Programs attached to lwtunnel_xmit() are allowed to modify packet content as well as prepending an L2 header via a newly introduced helper bpf_skb_change_head(). This is safe as lwtunnel_xmit() is invoked after the IP header has been assembled completely. All BPF programs receive an skb with L3 headers attached and may return one of the following error codes: BPF_OK - Continue routing as per nexthop BPF_DROP - Drop skb and return EPERM BPF_REDIRECT - Redirect skb to device as per redirect() helper. (Only valid in lwtunnel_xmit() context) The return codes are binary compatible with their TC_ACT_ relatives to ease compatibility. Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-02net/mlx5e: Implement Fragmented Work Queue (WQ)Tariq Toukan
Add new type of struct mlx5_frag_buf which is used to allocate fragmented buffers rather than contiguous, and make the Completion Queues (CQs) use it as they are big (default of 2MB per CQ in Striding RQ). This fixes the failures of type: "mlx5e_open_locked: mlx5e_open_channels failed, -12" due to dma_zalloc_coherent insufficient contiguous coherent memory to satisfy the driver's request when the user tries to setup more or larger rings. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-02Merge branch 'locking/urgent' into locking/core, to pick up dependent fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-01Merge tag 'pci-v4.9-fixes-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: "PCI fixes: - Fix Read Completion Boundary setting, which fixes a boot failure on IBM x3850 with Mellanox MT27500 ConnectX-3 - Update some MAINTAINERS entries and email addresses" * tag 'pci-v4.9-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX) PCI: Export pcie_find_root_port PCI: designware-plat: Update author email PCI: designware: Change maintainer to Joao Pinto MAINTAINERS: Add devicetree binding to PCI i.MX6 entry MAINTAINERS: Update Richard Zhu's email address
2016-12-02zram: Convert to hotplug state machineAnna-Maria Gleixner
Install the callbacks via the state machine with multi instance support and let the core invoke the callbacks on the already online CPUs. [bigeasy: wire up the multi instance stuff] Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: rt@linutronix.de Cc: Nitin Gupta <ngupta@vflare.org> Link: http://lkml.kernel.org/r/20161126231350.10321-19-bigeasy@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-02KVM/PPC/Book3S HV: Convert to hotplug state machineAnna-Maria Gleixner
Install the callbacks via the state machine. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: kvm-ppc@vger.kernel.org Cc: Paul Mackerras <paulus@samba.org> Cc: rt@linutronix.de Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Alexander Graf <agraf@suse.com> Link: http://lkml.kernel.org/r/20161126231350.10321-18-bigeasy@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-02iommu/vt-d: Convert to hotplug state machineAnna-Maria Gleixner
Install the callbacks via the state machine. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Joerg Roedel <joro@8bytes.org> Cc: iommu@lists.linux-foundation.org Cc: rt@linutronix.de Cc: David Woodhouse <dwmw2@infradead.org> Link: http://lkml.kernel.org/r/20161126231350.10321-14-bigeasy@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-02mm/zswap: Convert pool to hotplug state machineSebastian Andrzej Siewior
Install the callbacks via the state machine. Multi state is used to address the per-pool notifier. Uppon adding of the intance the callback is invoked for all online CPUs so the manual init can go. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: linux-mm@kvack.org Cc: Seth Jennings <sjenning@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20161126231350.10321-13-bigeasy@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-02mm/zswap: Convert dst-mem to hotplug state machineSebastian Andrzej Siewior
Install the callbacks via the state machine and let the core invoke the callbacks on the already online CPUs. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: linux-mm@kvack.org Cc: Seth Jennings <sjenning@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20161126231350.10321-12-bigeasy@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-02mm/zsmalloc: Convert to hotplug state machineSebastian Andrzej Siewior
Install the callbacks via the state machine and let the core invoke the callbacks on the already online CPUs. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: linux-mm@kvack.org Cc: Minchan Kim <minchan@kernel.org> Cc: rt@linutronix.de Cc: Nitin Gupta <ngupta@vflare.org> Link: http://lkml.kernel.org/r/20161126231350.10321-11-bigeasy@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-02mm/vmstat: Convert to hotplug state machineSebastian Andrzej Siewior
Install the callbacks via the state machine, but do not invoke them as we can initialize the node state without calling the callbacks on all online CPUs. start_shepherd_timer() is now called outside the get_online_cpus() block which is safe as it only operates on cpu possible mask. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Cc: rt@linutronix.de Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Vlastimil Babka <vbabka@suse.cz> Link: http://lkml.kernel.org/r/20161129145221.ffc3kg3hd7lxiwj6@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-02tracing/rb: Convert to hotplug state machineSebastian Andrzej Siewior
Install the callbacks via the state machine. The notifier in struct ring_buffer is replaced by the multi instance interface. Upon __ring_buffer_alloc() invocation, cpuhp_state_add_instance() will invoke the trace_rb_cpu_prepare() on each CPU. This callback may now fail. This means __ring_buffer_alloc() will fail and cleanup (like previously) and during a CPU up event this failure will not allow the CPU to come up. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20161126231350.10321-7-bigeasy@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-01NFS: discard nfs_lockowner structure.NeilBrown
It now has only one field and is only used in one structure. So replaced it in that structure by the field it contains. Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-01NFSv4: add flock_owner to open contextNeilBrown
An open file description (struct file) in a given process can be associated with two different lock owners. It can have a Posix lock owner which will be different in each process that has a fd on the file. It can have a Flock owner which will be the same in all processes. When searching for a lock stateid to use, we need to consider both of these owners So add a new "flock_owner" to the "nfs_open_context" (of which there is one for each open file description). This flock_owner does not need to be reference-counted as there is a 1-1 relation between 'struct file' and nfs open contexts, and it will never be part of a list of contexts. So there is no need for a 'flock_context' - just the owner is enough. The io_count included in the (Posix) lock_context provides no guarantee that all read-aheads that could use the state have completed, so not supporting it for flock locks in not a serious problem. Synchronization between flock and read-ahead can be added later if needed. When creating an open_context for a non-openning create call, we don't have a 'struct file' to pass in, so the lock context gets initialized with a NULL owner, but this will never be used. The flock_owner is not used at all in this patch, that will come later. Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>