summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2016-01-20ptrace: use fsuid, fsgid, effective creds for fs access checksJann Horn
By checking the effective credentials instead of the real UID / permitted capabilities, ensure that the calling process actually intended to use its credentials. To ensure that all ptrace checks use the correct caller credentials (e.g. in case out-of-tree code or newly added code omits the PTRACE_MODE_*CREDS flag), use two new flags and require one of them to be set. The problem was that when a privileged task had temporarily dropped its privileges, e.g. by calling setreuid(0, user_uid), with the intent to perform following syscalls with the credentials of a user, it still passed ptrace access checks that the user would not be able to pass. While an attacker should not be able to convince the privileged task to perform a ptrace() syscall, this is a problem because the ptrace access check is reused for things in procfs. In particular, the following somewhat interesting procfs entries only rely on ptrace access checks: /proc/$pid/stat - uses the check for determining whether pointers should be visible, useful for bypassing ASLR /proc/$pid/maps - also useful for bypassing ASLR /proc/$pid/cwd - useful for gaining access to restricted directories that contain files with lax permissions, e.g. in this scenario: lrwxrwxrwx root root /proc/13020/cwd -> /root/foobar drwx------ root root /root drwxr-xr-x root root /root/foobar -rw-r--r-- root root /root/foobar/secret Therefore, on a system where a root-owned mode 6755 binary changes its effective credentials as described and then dumps a user-specified file, this could be used by an attacker to reveal the memory layout of root's processes or reveal the contents of files he is not allowed to access (through /proc/$pid/cwd). [akpm@linux-foundation.org: fix warning] Signed-off-by: Jann Horn <jann@thejh.net> Acked-by: Kees Cook <keescook@chromium.org> Cc: Casey Schaufler <casey@schaufler-ca.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Morris <james.l.morris@oracle.com> Cc: "Serge E. Hallyn" <serge.hallyn@ubuntu.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20epoll: add EPOLLEXCLUSIVE flagJason Baron
Currently, epoll file descriptors or epfds (the fd returned from epoll_create[1]()) that are added to a shared wakeup source are always added in a non-exclusive manner. This means that when we have multiple epfds attached to a shared fd source they are all woken up. This creates thundering herd type behavior. Introduce a new 'EPOLLEXCLUSIVE' flag that can be passed as part of the 'event' argument during an epoll_ctl() EPOLL_CTL_ADD operation. This new flag allows for exclusive wakeups when there are multiple epfds attached to a shared fd event source. The implementation walks the list of exclusive waiters, and queues an event to each epfd, until it finds the first waiter that has threads blocked on it via epoll_wait(). The idea is to search for threads which are idle and ready to process the wakeup events. Thus, we queue an event to at least 1 epfd, but may still potentially queue an event to all epfds that are attached to the shared fd source. Performance testing was done by Madars Vitolins using a modified version of Enduro/X. The use of the 'EPOLLEXCLUSIVE' flag reduce the length of this particular workload from 860s down to 24s. Sample epoll_clt text: EPOLLEXCLUSIVE Sets an exclusive wakeup mode for the epfd file descriptor that is being attached to the target file descriptor, fd. Thus, when an event occurs and multiple epfd file descriptors are attached to the same target file using EPOLLEXCLUSIVE, one or more epfds will receive an event with epoll_wait(2). The default in this scenario (when EPOLLEXCLUSIVE is not set) is for all epfds to receive an event. EPOLLEXCLUSIVE may only be specified with the op EPOLL_CTL_ADD. Signed-off-by: Jason Baron <jbaron@akamai.com> Tested-by: Madars Vitolins <m@silodev.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Eric Wong <normalperson@yhbt.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20include/linux/radix-tree.h: fix error in docs about locksAdam Barth
This text refers to the "first 7 functions", which was correct when written but became incorrect when Johannes Weiner added another function to the list in 139e561660fe ("lib: radix_tree: tree node interface"). Change the text to correctly refer to the first 8 functions. Signed-off-by: Adam Barth <aurorean@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20lib/iomap_copy.c: add __ioread32_copy()Stephen Boyd
Some drivers need to read data out of iomem areas 32-bits at a time. Add an API to do this. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Cc: Bjorn Andersson <bjorn.andersson@sonymobile.com> Cc: <zajec5@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Hauke Mehrtens <hauke@hauke-m.de> Cc: Paul Walmsley <paul@pwsan.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-21Merge branch 'pm-cpuidle'Rafael J. Wysocki
* pm-cpuidle: cpuidle: menu: Avoid pointless checks in menu_select() sched / idle: Drop default_idle_call() fallback from call_cpuidle() cpuidle: Don't enable all governors by default cpuidle: Default to ladder governor on ticking systems time: nohz: Expose tick_nohz_enabled cpuidle: menu: Fix menu_select() for CPUIDLE_DRIVER_STATE_START == 0
2016-01-21Merge branch 'pm-devfreq'Rafael J. Wysocki
* pm-devfreq: MAINTAINERS: Add devfreq-event entry MAINTAINERS: Add missing git repository and directory for devfreq PM / devfreq: Do not show statistics if it's not ready. PM / devfreq: Modify the indentation of trans_stat sysfs for readability PM / devfreq: Set the freq_table of devfreq device PM / devfreq: Add show_one macro to delete the duplicate code PM / devfreq: event: Fix the error and warning from script/checkpatch.pl PM / devfreq: event: Remove the error log of devfreq_event_get_edev_by_phandle()
2016-01-21Merge branch 'pm-core'Rafael J. Wysocki
* pm-core: driver core: Avoid NULL pointer dereferences in device_is_bound() platform: Do not detach from PM domains on shutdown USB / PM: Allow USB devices to remain runtime-suspended when sleeping PM / sleep: Go direct_complete if driver has no callbacks PM / Domains: add setter for dev.pm_domain device core: add device_is_bound()
2016-01-21Merge branches 'acpica', 'acpi-video' and 'acpi-fan'Rafael J. Wysocki
* acpica: ACPICA: Update version to 20160108 ACPICA: Silence a -Wbad-function-cast warning when acpi_uintptr_t is 'uintptr_t' ACPICA: Additional 2016 copyright changes ACPICA: Reduce regression fix divergence from upstream ACPICA * acpi-video: ACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Satellite R830 ACPI / video: Revert "thinkpad_acpi: Use acpi_video_handles_brightness_key_presses()" ACPI / video: Document acpi_video_handles_brightness_key_presses() a bit ACPI / video: Fix using an uninitialized mutex / list_head in acpi_video_handles_brightness_key_presses() ACPI / video: Revert "ACPI / video: driver must be registered before checking for keypresses" ACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Portege R700 * acpi-fan: ACPI / fan: Improve acpi_device_update_power error message
2016-01-20swiotlb: Make linux/swiotlb.h standalone includibleThierry Reding
This header file uses the enum dma_data_direction and struct page types without explicitly including the corresponding header files. This makes it rely on the includer to have included the proper headers before. To fix this, include linux/dma-direction.h and forward-declare struct page. The swiotlb_free() function is also annotated __init, therefore requires linux/init.h to be included as well. Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2016-01-20Merge branch 'pci/trivial' into nextBjorn Helgaas
* pci/trivial: PCI: shpchp: Constify hpc_ops structure PCI: Use kobj_to_dev() instead of open-coding it PCI: Use to_pci_dev() instead of open-coding it PCI: Fix all whitespace issues PCI/MSI: Fix typos in <linux/msi.h>
2016-01-20Merge branches 'pci/iommu' and 'pci/misc' into nextBjorn Helgaas
* pci/iommu: PCI: Add function 1 DMA alias quirk for Lite-On/Plextor M6e/Marvell 88SS9183 * pci/misc: PCI: Limit config space size for Netronome NFP4000 PCI: Add Netronome NFP4000 PF device ID
2016-01-20netfilter: nf_conntrack: use safer way to lock all bucketsSasha Levin
When we need to lock all buckets in the connection hashtable we'd attempt to lock 1024 spinlocks, which is way more preemption levels than supported by the kernel. Furthermore, this behavior was hidden by checking if lockdep is enabled, and if it was - use only 8 buckets(!). Fix this by using a global lock and synchronize all buckets on it when we need to lock them all. This is pretty heavyweight, but is only done when we need to resize the hashtable, and that doesn't happen often enough (or at all). Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-01-20target: Obtain se_node_acl->acl_kref during get_initiator_node_aclNicholas Bellinger
This patch addresses a long standing race where obtaining se_node_acl->acl_kref in __transport_register_session() happens a bit too late, and leaves open the potential for core_tpg_del_initiator_node_acl() to hit a NULL pointer dereference. Instead, take ->acl_kref in core_tpg_get_initiator_node_acl() while se_portal_group->acl_node_mutex is held, and move the final target_put_nacl() from transport_deregister_session() into transport_free_session() so that fabric driver login failure handling using the modern method to still work as expected. Also, update core_tpg_get_initiator_node_acl() to take an extra reference for dynamically generated acls for demo-mode, before returning to fabric caller. Also update iscsi-target sendtargets special case handling to use target_tpg_has_node_acl() when checking if demo_mode_discovery == true during discovery lookup. Note the existing wait_for_completion(&acl->acl_free_comp) in core_tpg_del_initiator_node_acl() does not change. Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Andy Grover <agrover@redhat.com> Cc: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-01-20target: Convert ACL change queue_depth se_session reference usageNicholas Bellinger
This patch converts core_tpg_set_initiator_node_queue_depth() to use struct se_node_acl->acl_sess_list when performing explicit se_tpg_tfo->shutdown_session() for active sessions, in order for new se_node_acl->queue_depth to take effect. This follows how core_tpg_del_initiator_node_acl() currently works when invoking se_tpg_tfo->shutdown-session(), and ahead of the next patch to take se_node_acl->acl_kref during lookup, the extra get_initiator_node_acl() can go away. In order to achieve this, go ahead and change target_get_session() to use kref_get_unless_zero() and propigate up the return value to know when a session is already being released. This is because se_node_acl->acl_group is already protecting se_node_acl->acl_group reference via configfs, and shutdown within core_tpg_del_initiator_node_acl() won't occur until sys_write() to core_tpg_set_initiator_node_queue_depth() attribute returns back to user-space. Also, drop the left-over iscsi-target hack, and obtain se_portal_group->session_lock in lio_tpg_shutdown_session() internally. Remove iscsi-target wrapper and unused se_tpg + force parameters and associated code. Reported-by: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Andy Grover <agrover@redhat.com> Cc: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-01-20drm/atomic-helper: Export framebuffer_changed()John Keeping
The Rockchip driver cannot use drm_atomic_helper_wait_for_vblanks() because it has hardware counters for neither vblanks nor scanlines. In order to simplify re-implementing the functionality for this driver, export the framebuffer_changed() helper so it can be reused. Signed-off-by: John Keeping <john@metanate.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-01-19pipe: limit the per-user amount of pages allocated in pipesWilly Tarreau
On no-so-small systems, it is possible for a single process to cause an OOM condition by filling large pipes with data that are never read. A typical process filling 4000 pipes with 1 MB of data will use 4 GB of memory. On small systems it may be tricky to set the pipe max size to prevent this from happening. This patch makes it possible to enforce a per-user soft limit above which new pipes will be limited to a single page, effectively limiting them to 4 kB each, as well as a hard limit above which no new pipes may be created for this user. This has the effect of protecting the system against memory abuse without hurting other users, and still allowing pipes to work correctly though with less data at once. The limit are controlled by two new sysctls : pipe-user-pages-soft, and pipe-user-pages-hard. Both may be disabled by setting them to zero. The default soft limit allows the default number of FDs per process (1024) to create pipes of the default size (64kB), thus reaching a limit of 64MB before starting to create only smaller pipes. With 256 processes limited to 1024 FDs each, this results in 1024*64kB + (256*1024 - 1024) * 4kB = 1084 MB of memory allocated for a user. The hard limit is disabled by default to avoid breaking existing applications that make intensive use of pipes (eg: for splicing). Reported-by: socketpair@gmail.com Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Mitigates: CVE-2013-4312 (Linux 2.0+) Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-19Merge branch 'for-4.5/core' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull core block updates from Jens Axboe: "We don't have a lot of core changes this time around, it's mostly in drivers, which will come in a subsequent pull. The cores changes include: - blk-mq - Prep patch from Christoph, changing blk_mq_alloc_request() to take flags instead of just using gfp_t for sleep/nosleep. - Doc patch from me, clarifying the difference between legacy and blk-mq for timer usage. - Fixes from Raghavendra for memory-less numa nodes, and a reuse of CPU masks. - Cleanup from Geliang Tang, using offset_in_page() instead of open coding it. - From Ilya, rename request_queue slab to it reflects what it holds, and a fix for proper use of bdgrab/put. - A real fix for the split across stripe boundaries from Keith. We yanked a broken version of this from 4.4-rc final, this one works. - From Mike Krinkin, emit a trace message when we split. - From Wei Tang, two small cleanups, not explicitly clearing memory that is already cleared" * 'for-4.5/core' of git://git.kernel.dk/linux-block: block: use bd{grab,put}() instead of open-coding block: split bios to max possible length block: add call to split trace point blk-mq: Avoid memoryless numa node encoded in hctx numa_node blk-mq: Reuse hardware context cpumask for tags blk-mq: add a flags parameter to blk_mq_alloc_request Revert "blk-flush: Queue through IO scheduler when flush not required" block: clarify blk_add_timer() use case for blk-mq bio: use offset_in_page macro block: do not initialise statics to 0 or NULL block: do not initialise globals to 0 or NULL block: rename request_queue slab cache
2016-01-19IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headersMoni Shoua
RoCEv2 packets are sent over IP/UDP protocols. The mlx4 driver uses a type of RAW QP to send packets for QP1 and therefore needs to build the network headers below BTH in software. This patch adds option to build QP1 packets with IP and UDP headers if RoCEv2 is requested. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19IB/mlx4: Support modify_qp for RoCE v2Moni Shoua
In order to support modify_qp for RoCE v2, we need to set the gid_type in the QP context. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19IB/core: Add definition for the standard RoCE V2 UDP portMoni Shoua
This will be used in hardware device driver when building QP or AH contexts. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19net/mlx4_core: Add support for RoCE v2 entropyMoni Shoua
In RoCE v2 we need to choose a source UDP port, we do so by using entropy over the source and dest QPNs. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19net/mlx4_core: Add support for configuring RoCE v2 UDP portMoni Shoua
In order to support RoCE v2, the hardware needs to be configured to classify certain UDP packets as RoCE v2 packets and pass it through its RoCE pipeline. This patch enables configuring this UDP port. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19IB/mlx4: Add support for setting RoCEv2 gids in hardwareMoni Shoua
To tell hardware about a gid with type RoCEv2, software needs a new modifier to the SET_PORT command: MLX4_SET_PORT_ROCE_ADDR. This can replace the old method, MLX4_SET_PORT_GID_TABLE, for RoCEv1 gids. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19net/mlx4: Query RoCE supportMoni Shoua
Query the RoCE support from firmware using the appropriate firmware commands. Downstream patches will read these capabilities and act accordingly. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19svc_rdma: use local_dma_lkeyChristoph Hellwig
We now alwasy have a per-PD local_dma_lkey available. Make use of that fact in svc_rdma and stop registering our own MR. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Acked-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19svcrdma: Add class for RDMA backwards direction transportChuck Lever
To support the server-side of an NFSv4.1 backchannel on RDMA connections, add a transport class that enables backward direction messages on an existing forward channel connection. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Bruce Fields <bfields@fieldses.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19svcrdma: Define maximum number of backchannel requestsChuck Lever
Extra resources for handling backchannel requests have to be pre-allocated when a transport instance is created. Set up additional fields in svcxprt_rdma to track these resources. The max_requests fields are elements of the RPC-over-RDMA protocol, so they should be u32. To ensure that unsigned arithmetic is used everywhere, some other fields in the svcxprt_rdma struct are updated. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Bruce Fields <bfields@fieldses.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19svcrdma: Make map_xdr non-staticChuck Lever
Pre-requisite to use map_xdr in the backchannel code. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Bruce Fields <bfields@fieldses.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19svcrdma: Add gfp flags to svc_rdma_post_recv()Chuck Lever
svc_rdma_post_recv() allocates pages for receive buffers on-demand. It uses GFP_KERNEL so the allocator tries hard, and may sleep. But I'm about to add a call to svc_rdma_post_recv() from a function that may not sleep. Since all svc_rdma_post_recv() call sites can tolerate its failure, allow it to fail if the page allocator returns nothing. Longer term, receive buffers, being a finite resource per-connection, should be pre-allocated and re-used. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Bruce Fields <bfields@fieldses.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19svcrdma: Remove unused req_map and ctxt kmem_cachesChuck Lever
Clean up. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Bruce Fields <bfields@fieldses.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19svcrdma: Improve allocation of struct svc_rdma_req_mapChuck Lever
To ensure this allocation cannot fail and will not sleep, pre-allocate the req_map structures per-connection. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Bruce Fields <bfields@fieldses.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19svcrdma: Improve allocation of struct svc_rdma_op_ctxtChuck Lever
When the maximum payload size of NFS READ and WRITE was increased by commit cc9a903d915c ("svcrdma: Change maximum server payload back to RPCSVC_MAXPAYLOAD"), the size of struct svc_rdma_op_ctxt increased to over 6KB (on x86_64). That makes allocating one of these from a kmem_cache more likely to fail in situations when system memory is exhausted. Since I'm about to add a caller where this allocation must always work _and_ it cannot sleep, pre-allocate ctxts for each connection. Another motivation for this change is that NFSv4.x servers are required by specification not to drop NFS requests. Pre-allocating memory resources reduces the likelihood of a drop. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Bruce Fields <bfields@fieldses.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19IB/core: Use hop-limit from IP stack for RoCEMatan Barak
Previously, IPV6_DEFAULT_HOPLIMIT was used as the hop limit value for RoCE. Fixing that by taking ip4_dst_hoplimit and ip6_dst_hoplimit as hop limit values. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19IB/core: Rename rdma_addr_find_dmac_by_grhMatan Barak
rdma_addr_find_dmac_by_grh resolves dmac, vlan_id and if_index and downsteram patch will also add hop_limit as an output parameter, thus we rename it to rdma_addr_find_l2_eth_by_grh. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19IB/mad: pass ib_mad_send_buf explicitly to the recv_handlerChristoph Hellwig
Stop abusing wr_id and just pass the parameter explicitly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19net/mlx4: Remove unused macroMoni Shoua
The macro mlx4_foreach_non_ib_transport_port() is not used anywhere. Remove it. Fixes: aa9a2d51a3e7 ("mlx4: Activate RoCE/SRIOV") Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19soreuseport: fix NULL ptr dereference SO_REUSEPORT after bindCraig Gallek
Marc Dionne discovered a NULL pointer dereference when setting SO_REUSEPORT on a socket after it is bound. This patch removes the assumption that at least one socket in the reuseport group is bound with the SO_REUSEPORT option before other bind calls occur. Fixes: e32ea7e74727 ("soreuseport: fast reuseport UDP socket selection") Reported-by: Marc Dionne <marc.c.dionne@gmail.com> Signed-off-by: Craig Gallek <kraig@google.com> Tested-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-18Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio barrier rework+fixes from Michael Tsirkin: "This adds a new kind of barrier, and reworks virtio and xen to use it. Plus some fixes here and there" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (44 commits) checkpatch: add virt barriers checkpatch: check for __smp outside barrier.h checkpatch.pl: add missing memory barriers virtio: make find_vqs() checkpatch.pl-friendly virtio_balloon: fix race between migration and ballooning virtio_balloon: fix race by fill and leak s390: more efficient smp barriers s390: use generic memory barriers xen/events: use virt_xxx barriers xen/io: use virt_xxx barriers xenbus: use virt_xxx barriers virtio_ring: use virt_store_mb sh: move xchg_cmpxchg to a header by itself sh: support 1 and 2 byte xchg virtio_ring: update weak barriers to use virt_xxx Revert "virtio_ring: Update weak barriers to use dma_wmb/rmb" asm-generic: implement virt_xxx memory barriers x86: define __smp_xxx xtensa: define __smp_xxx tile: define __smp_xxx ...
2016-01-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tileLinus Torvalds
Pull arch/tile updates from Chris Metcalf: "This is a grab bag of changes that includes some NOHZ and context-tracking related changes, some debugging improvements, JUMP_LABEL support, and some fixes for tilepro allmodconfig support. We also remove the now-unused node_has_online_mem() definitions both for tile's asm/topology.h as well as in linux/topology.h itself" * git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: numa: remove stale node_has_online_mem() define arch/tile: move user_exit() to early kernel entry sequence tile: fix bug in setting PT_FLAGS_DISABLE_IRQ on kernel entry tile: fix tilepro casts for readl, writel, etc tile: fix a -Wframe-larger-than warning tile: include the syscall number in the backtrace MAINTAINERS: add git URL for tile arch/tile: adopt prepare_exit_to_usermode() model from x86 tile/jump_label: add jump label support for TILE-Gx tile: define a macro ktext_writable_addr to get writable kernel text address
2016-01-18Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32 Pull AVR32 updates from Hans-Christian Noren Egtvedt. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32: mmc: atmel: get rid of struct mci_dma_data mmc: atmel-mci: restore dma on AVR32 avr32: wire up missing syscalls avr32: wire up accept4 syscall
2016-01-18Merge branch 'for-linus-4.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs updates from Chris Mason: "This has our usual assortment of fixes and cleanups, but the biggest change included is Omar Sandoval's free space tree. It's not the default yet, mounting -o space_cache=v2 enables it and sets a readonly compat bit. The tree can actually be deleted and regenerated if there are any problems, but it has held up really well in testing so far. For very large filesystems (30T+) our existing free space caching code can end up taking a huge amount of time during commits. The new tree based code is faster and less work overall to update as the commit progresses. Omar worked on this during the summer and we'll hammer on it in production here at FB over the next few months" * 'for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (73 commits) Btrfs: fix fitrim discarding device area reserved for boot loader's use Btrfs: Check metadata redundancy on balance btrfs: statfs: report zero available if metadata are exhausted btrfs: preallocate path for snapshot creation at ioctl time btrfs: allocate root item at snapshot ioctl time btrfs: do an allocation earlier during snapshot creation btrfs: use smaller type for btrfs_path locks btrfs: use smaller type for btrfs_path lowest_level btrfs: use smaller type for btrfs_path reada btrfs: cleanup, use enum values for btrfs_path reada btrfs: constify static arrays btrfs: constify remaining structs with function pointers btrfs tests: replace whole ops structure for free space tests btrfs: use list_for_each_entry* in backref.c btrfs: use list_for_each_entry_safe in free-space-cache.c btrfs: use list_for_each_entry* in check-integrity.c Btrfs: use linux/sizes.h to represent constants btrfs: cleanup, remove stray return statements btrfs: zero out delayed node upon allocation btrfs: pass proper enum type to start_transaction() ...
2016-01-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull more networking fixes from David Miller: 1) Fix brcmfmac build with older gcc, from Arend van Spriel. 2) IRQ values unintentionally truncated to u8 in mlx5 driver, from Doron Tsur. 3) Fix build warnings wrt tcp cgroup changes, from Geert Uytterhoeven. 4) Limit deep recursion in ovs stack, from Hannes Frederic Sowa. 5) at803x phy driver bug fixes from, Martin Blumenstingl. 6) Fix TSO handling in hns driver, from Daode Huang * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (22 commits) ovs: limit ovs recursions in ovs_execute_actions to not corrupt stack team: Replace rcu_read_lock with a mutex in team_vlan_rx_kill_vid net: hns: bug fix about hisilicon TSO BD mode brcmfmac: fix BRCMF_FW_NVRAM_DEF macro for older gcc compilers net: phy: at803x: Add the interrupt register bit definitions net: phy: at803x: Clean up duplicate register definitions net: phy: at803x: Allow specifying the RGMII RX clock delay via phy mode net: phy: at803x: Don't set gbit features for the AR8030 phy arm64: bpf: add extra pass to handle faulty codegen arm64: insn: remove BUG_ON from codegen sctp: the temp asoc's transports should not be hashed/unhashed net/mlx5_core: Fix trimming down IRQ number tcp_memcontrol: Forward declare cgroup_subsys and mem_cgroup stucts batman-adv: Drop immediate orig_node free function batman-adv: Drop immediate batadv_hard_iface free function batman-adv: Drop immediate neigh_ifinfo free function batman-adv: Drop immediate batadv_hardif_neigh_node free function batman-adv: Drop immediate batadv_neigh_node free function batman-adv: Drop immediate batadv_orig_ifinfo free function batman-adv: Avoid recursive call_rcu for batadv_nc_node ...
2016-01-18Merge tag 'rtc-4.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux Pull RTC updates from Alexandre Belloni: "Core: - fix module reference count in rtc-proc - Replace simple_strtoul by kstrtoul New driver: - Epson RX8010SJ Subsystem wide cleanups: - use %ph for short hex dumps - constify *_chip_ops structures Drivers: - abx80x: Microcrystal rv1805 support, alarm support - cmos: prevent kernel warning on IRQ flags mismatch - s5m: various cleanups - rv8803: rx8900 compatibility, small error path fix - sunxi: various cleanups - lpc32xx: remove irq > NR_IRQS check from probe() - imxdi: fix spelling mistake in warning message - ds1685: don't try to micromanage sysfs output size - da9063: avoid writing undefined data to rtc - gemini: Remove unnecessary platform_set_drvdata() - efi: add efi_procfs in efi_rtc_ops - pcf8523: refuse to write dates later than 2099" * tag 'rtc-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (24 commits) rtc: cmos: prevent kernel warning on IRQ flags mismatch rtc: rtc-ds2404: constify ds2404_chip_ops structures rtc: s5m: Make register configuration per S2MPS device to remove exceptions rtc: s5m: Add separate field for storing auto-cleared mask in register config rtc: s5m: Cleanup by removing useless 'rtc' prefix from fields rtc: Replace simple_strtoul by kstrtoul rtc: abx80x: add alarm support rtc: abx80x: Add Microcrystal rv1805 support rtc: v3020: constify v3020_chip_ops structures rtc: rv8803: Extend compatibility with the rx8900 rtc: rv8803: fix handling return value of i2c_smbus_read_byte_data rtc: Add Epson RX8010SJ RTC driver rtc: lpc32xx: remove irq > NR_IRQS check from probe() rtc: imxdi: fix spelling mistake in warning message rtc: ds1685: don't try to micromanage sysfs output size rtc: use %ph for short hex dumps rtc: da9063: avoid writing undefined data to rtc rtc: sunxi: use of_device_get_match_data rtc: sunxi: constify the data_year_param structure rtc: sunxi: fix signedness issues ...
2016-01-18Merge tag 'fbdev-4.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux Pull fbdev updates from Tomi Valkeinen: "Summary: - pxafb: device-tree support - An unsafe kernel parameter 'lockless_register_fb' for debugging problems happening while inside the console lock - Small miscellaneous fixes & cleanups - omapdss: add writeback support functions - Separation of omapfb and omapdrm (see below) About the separation of omapfb and omapdrm, see http://permalink.gmane.org/gmane.comp.video.dri.devel/143151 for longer story. The short version: omapfb and omapdrm have shared low level drivers (omapdss and panel drivers), making further development of omapdrm difficult. After these patches omapfb and omapdrm have their own versions of the drivers, which are more or less direct copies for now but will diverge soon. This also means that omapfb (everything under drivers/video/fbdev/omap2/) is now in maintenance mode, and all new development will be done for omapdrm (drivers/gpu/drm/omapdrm/)" * tag 'fbdev-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux: (49 commits) video: fbdev: pxafb: fix out of memory error path drm/omap: make omapdrm select OMAP2_DSS drm/omap: move omapdss & displays under omapdrm omapfb: move vrfb into omapfb omapfb: take omapfb's private omapdss into use omapfb/displays: change CONFIG_DISPLAY_* to CONFIG_FB_OMAP2_* omapfb/dss: change CONFIG_OMAP* to CONFIG_FB_OMAP* omapdss: remove CONFIG_OMAP2_DSS_VENC from omapdss.h omapfb: copy omapdss & displays for omapfb omapfb: allow compilation only if DRM_OMAP is disabled fbdev: omap2: panel-dpi: simplify gpio setting fbdev: omap2: panel-dpi: in .disable first disable backlight then display OMAPDSS: DSS: fix a warning message video: omapdss: delete unneeded of_node_put OMAPDSS: DISPC: Remove boolean comparisons OMAPDSS: DSI: cleanup DSI_IRQ_ERROR_MASK define OMAPDSS: remove extra out == NULL checks OMAPDSS: change internal dispc functions to static OMAPDSS: make a two dss feat funcs internal to omapdss OMAPDSS: remove extra EXPORT_SYMBOLs ...
2016-01-18numa: remove stale node_has_online_mem() defineChris Metcalf
This isn't used anywhere, so delete it. Looks like the last usage (in x86-specific code) was removed by Tejun in 2011 in commit bd6709a91a59 ("x86, NUMA: Make 32bit use common NUMA init path"). Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
2016-01-18crypto: af_alg - Allow af_af_alg_release_parent to be called on nokey pathHerbert Xu
This patch allows af_alg_release_parent to be called even for nokey sockets. Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-01-18crypto: skcipher - Add crypto_skcipher_has_setkeyHerbert Xu
This patch adds a way for skcipher users to determine whether a key is required by a transform. Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-01-18crypto: hash - Add crypto_ahash_has_setkeyHerbert Xu
This patch adds a way for ahash users to determine whether a key is required by a crypto_ahash transform. Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-01-18crypto: af_alg - Add nokey compatibility pathHerbert Xu
This patch adds a compatibility path to support old applications that do acept(2) before setkey. Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-01-18crypto: af_alg - Disallow bind/setkey/... after accept(2)Herbert Xu
Each af_alg parent socket obtained by socket(2) corresponds to a tfm object once bind(2) has succeeded. An accept(2) call on that parent socket creates a context which then uses the tfm object. Therefore as long as any child sockets created by accept(2) exist the parent socket must not be modified or freed. This patch guarantees this by using locks and a reference count on the parent socket. Any attempt to modify the parent socket will fail with EBUSY. Cc: stable@vger.kernel.org Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>