summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2012-07-30libceph: define ceph_extract_encoded_string()Alex Elder
This adds a new utility routine which will return a dynamically- allocated buffer containing a string that has been decoded from ceph over-the-wire format. It also returns the length of the string if the address of a size variable is supplied to receive it. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2012-07-30rbd: drop a useless local variableAlex Elder
In rbd_req_sync_notify_ack(), a local variable was needlessly being used to hold a null pointer. Just pass NULL instead. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Yehuda Sadeh <yehuda@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-07-30libceph: fix off-by-one bug in ceph_encode_filepath()Alex Elder
There is a BUG_ON() call that doesn't account for the single byte structure version at the start of an encoded filepath in ceph_encode_filepath(). Fix that. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Yehuda Sadeh <yehuda@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-07-30ceph: clean up useless d_parent checksSage Weil
d_parent is never NULL, and IS_ROOT() is the proper way to check for a (non-self-referential) parent. Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-30libceph: prevent the race of incoming work during teardownGuanjun He
Add an atomic variable 'stopping' as flag in struct ceph_messenger, set this flag to 1 in function ceph_destroy_client(), and add the condition code in function ceph_data_ready() to test the flag value, if true(1), just return. Signed-off-by: Guanjun He <gjhe@suse.com> Reviewed-by: Sage Weil <sage@inktank.com>
2012-07-30libceph: fix messenger retrySage Weil
In ancient times, the messenger could both initiate and accept connections. An artifact if that was data structures to store/process an incoming ceph_msg_connect request and send an outgoing ceph_msg_connect_reply. Sadly, the negotiation code was referencing those structures and ignoring important information (like the peer's connect_seq) from the correct ones. Among other things, this fixes tight reconnect loops where the server sends RETRY_SESSION and we (the client) retries with the same connect_seq as last time. This bug pretty easily triggered by injecting socket failures on the MDS and running some fs workload like workunits/direct_io/test_sync_io. Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-30libceph: initialize rb, list nodes in ceph_osd_requestSage Weil
These don't strictly need to be initialized based on how they are used, but it is good practice to do so. Reported-by: Alex Elder <elder@inktank.com> Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-30libceph: initialize msgpool message typesSage Weil
Initialize the type field for messages in a msgpool. The caller was doing this for osd ops, but not for the reply messages. Reported-by: Alex Elder <elder@inktank.com> Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-30Merge branch 'for-3.6' of git://gitorious.org/linux-pwm/linux-pwmLinus Torvalds
Pull PWM subsystem from Thierry Reding: "The new PWM subsystem aims at collecting all implementations of the legacy PWM API and to eventually replace it completely. The subsystem has been in development for over half a year now and many drivers have already been converted. It has been in linux-next for a couple of weeks and there have been no major issues so I think it is ready for inclusion in your tree." Arnd Bergmann <arnd@arndb.de>: "Very much Ack on the new subsystem. It uses the interface declarations as the previously separate pwm drivers, so nothing changes for now in the drivers using it, although it enables us to change those more easily in the future if we want to. This work is also one of the missing pieces that are required to eventually build ARM kernels for multiple platforms, which is currently prohibited (amongs other things) by the fact that you cannot have more than one driver exporting the pwm functions." Tested-and-acked-by: Alexandre Courbot <acourbot@nvidia.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: Philip, Avinash <avinashphilip@ti.com> # TI's AM33xx platforms Acked-By: Alexandre Pereira da Silva <aletes.xgr@gmail.com> # LPC32XX Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Sachin Kamat <sachin.kamat@linaro.org> Fix up trivial conflicts with other cleanups and DT updates. * 'for-3.6' of git://gitorious.org/linux-pwm/linux-pwm: (36 commits) pwm: pwm-tiehrpwm: PWM driver support for EHRPWM pwm: pwm-tiecap: PWM driver support for ECAP APWM pwm: fix used-uninitialized warning in pwm_get() pwm: add lpc32xx PWM support pwm_backlight: pass correct brightness to callback pwm: Use pr_* functions in pwm-samsung.c file pwm: Convert pwm-samsung to use devm_* APIs pwm: Convert pwm-tegra to use devm_clk_get() pwm: pwm-mxs: Return proper error if pwmchip_remove() fails pwm: pwm-bfin: Return proper error if pwmchip_remove() fails pwm: pxa: Propagate pwmchip_remove() error pwm: Convert pwm-pxa to use devm_* APIs pwm: Convert pwm-vt8500 to use devm_* APIs pwm: Convert pwm-imx to use devm_* APIs pwm: Conflict with legacy PWM API pwm: pwm-mxs: add pinctrl support pwm: pwm-mxs: use devm_* managed functions pwm: pwm-mxs: use global reset function stmp_reset_block pwm: pwm-mxs: encode soc name in compatible string pwm: Take over maintainership of the PWM subsystem ...
2012-07-30[media] Fix VIDIOC_TRY_EXT_CTRLS regressionHans Verkuil
Fixes an omission in the new v4l2_ioctls table: VIDIOC_TRY_EXT_CTRLS must get the INFO_FL_CTRL flag, just like all the other control related ioctls, otherwise the ioctl core won't know it also has to check whether v4l2_fh->ctrl_handler is non-zero before it can decide that this ioctl is not implemented. Caught by v4l2-compliance while I was testing the mem2mem_testdev driver. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-07-30ARM: dma-mapping: add support for DMA_ATTR_SKIP_CPU_SYNC attributeMarek Szyprowski
This patch adds support for DMA_ATTR_SKIP_CPU_SYNC attribute for dma_(un)map_(single,page,sg) functions family. It lets dma mapping clients to create a mapping for the buffer for the given device without performing a CPU cache synchronization. CPU cache synchronization can be skipped for the buffers which it is known that they are already in 'device' domain (CPU caches have been already synchronized or there are only coherent mappings for the buffer). For advanced users only, please use it with care. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-07-30common: DMA-mapping: add DMA_ATTR_SKIP_CPU_SYNC attributeMarek Szyprowski
This patch adds DMA_ATTR_SKIP_CPU_SYNC attribute to the DMA-mapping subsystem. By default dma_map_{single,page,sg} functions family transfer a given buffer from CPU domain to device domain. Some advanced use cases might require sharing a buffer between more than one device. This requires having a mapping created separately for each device and is usually performed by calling dma_map_{single,page,sg} function more than once for the given buffer with device pointer to each device taking part in the buffer sharing. The first call transfers a buffer from 'CPU' domain to 'device' domain, what synchronizes CPU caches for the given region (usually it means that the cache has been flushed or invalidated depending on the dma direction). However, next calls to dma_map_{single,page,sg}() for other devices will perform exactly the same sychronization operation on the CPU cache. CPU cache sychronization might be a time consuming operation, especially if the buffers are large, so it is highly recommended to avoid it if possible. DMA_ATTR_SKIP_CPU_SYNC allows platform code to skip synchronization of the CPU cache for the given buffer assuming that it has been already transferred to 'device' domain. This attribute can be also used for dma_unmap_{single,page,sg} functions family to force buffer to stay in device domain after releasing a mapping for it. Use this attribute with care! Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-07-30ARM: dma-mapping: add support for dma_get_sgtable()Marek Szyprowski
This patch adds support for dma_get_sgtable() function which is required to let drivers to share the buffers allocated by DMA-mapping subsystem. Generic implementation based on virt_to_page() is not suitable for ARM dma-mapping subsystem. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-07-30common: dma-mapping: introduce dma_get_sgtable() functionMarek Szyprowski
This patch adds dma_get_sgtable() function which is required to let drivers to share the buffers allocated by DMA-mapping subsystem. Right now the driver gets a dma address of the allocated buffer and the kernel virtual mapping for it. If it wants to share it with other device (= map into its dma address space) it usually hacks around kernel virtual addresses to get pointers to pages or assumes that both devices share the DMA address space. Both solutions are just hacks for the special cases, which should be avoided in the final version of buffer sharing. To solve this issue in a generic way, a new call to DMA mapping has been introduced - dma_get_sgtable(). It allocates a scatter-list which describes the allocated buffer and lets the driver(s) to use it with other device(s) by calling dma_map_sg() on it. This patch provides a generic implementation based on virt_to_page() call. Architectures which require more sophisticated translation might provide their own get_sgtable() methods. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-30ARM: dma-mapping: add support for DMA_ATTR_NO_KERNEL_MAPPING attributeMarek Szyprowski
This patch adds support for DMA_ATTR_NO_KERNEL_MAPPING attribute for IOMMU allocations, what let drivers to save precious kernel virtual address space for large buffers that are intended to be accessed only from userspace. This patch is heavily based on initial work kindly provided by Abhinav Kochhar <abhinav.k@samsung.com>. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-07-30common: DMA-mapping: add DMA_ATTR_NO_KERNEL_MAPPING attributeMarek Szyprowski
This patch adds DMA_ATTR_NO_KERNEL_MAPPING attribute which lets the platform to avoid creating a kernel virtual mapping for the allocated buffer. On some architectures creating such mapping is non-trivial task and consumes very limited resources (like kernel virtual address space or dma consistent address space). Buffers allocated with this attribute can be only passed to user space by calling dma_mmap_attrs(). Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-30common: dma-mapping: add support for generic dma_mmap_* callsMarek Szyprowski
Commit 9adc5374 ('common: dma-mapping: introduce mmap method') added a generic method for implementing mmap user call to dma_map_ops structure. This patch converts ARM and PowerPC architectures (the only providers of dma_mmap_coherent/dma_mmap_writecombine calls) to use this generic dma_map_ops based call and adds a generic cross architecture definition for dma_mmap_attrs, dma_mmap_coherent, dma_mmap_writecombine functions. The generic mmap virt_to_page-based fallback implementation is provided for architectures which don't provide their own implementation for mmap method. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-07-30ARM: dma-mapping: fix error path for memory allocation failureMarek Szyprowski
This patch fixes incorrect check in error path. When the allocation of first page fails, the kernel ops appears due to accessing -1 element of the pages array. Reported-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-07-30ARM: dma-mapping: add more sanity checks in arm_dma_mmap()Marek Szyprowski
Add some sanity checks and forbid mmaping of buffers into vma areas larger than allocated dma buffer. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-07-30ARM: dma-mapping: remove custom consistent dma regionMarek Szyprowski
This patch changes dma-mapping subsystem to use generic vmalloc areas for all consistent dma allocations. This increases the total size limit of the consistent allocations and removes platform hacks and a lot of duplicated code. Atomic allocations are served from special pool preallocated on boot, because vmalloc areas cannot be reliably created in atomic context. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com> Reviewed-by: Minchan Kim <minchan@kernel.org>
2012-07-30mm: vmalloc: use const void * for caller argumentMarek Szyprowski
'const void *' is a safer type for caller function type. This patch updates all references to caller function type. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com> Reviewed-by: Minchan Kim <minchan@kernel.org>
2012-07-30scatterlist: add sg_alloc_table_from_pages functionTomasz Stanislawski
This patch adds a new constructor for an sg table. The table is constructed from an array of struct pages. All contiguous chunks of the pages are merged into a single sg nodes. A user may provide an offset and a size of a buffer if the buffer is not page-aligned. The function is dedicated for DMABUF exporters which often perform conversion from an page array to a scatterlist. Moreover the scatterlist should be squashed in order to save memory and to speed-up the process of DMA mapping using dma_map_sg. The code is based on the patch 'v4l: vb2-dma-contig: add support for scatterlist in userptr mode' and hints from Laurent Pinchart. Signed-off-by: Tomasz Stanislawski <t.stanislaws@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> CC: Andrew Morton <akpm@linux-foundation.org>
2012-07-30mm: Fix build warning in kmem_cache_create()Shuah Khan
The label oops is used in CONFIG_DEBUG_VM ifdef block and is defined outside ifdef CONFIG_DEBUG_VM block. This results in the following build warning when built with CONFIG_DEBUG_VM disabled. Fix to move label oops definition to inside a CONFIG_DEBUG_VM block. mm/slab_common.c: In function ‘kmem_cache_create’: mm/slab_common.c:101:1: warning: label ‘oops’ defined but not used [-Wunused-label] Signed-off-by: Shuah Khan <shuah.khan@hp.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-07-30hwmon: struct x86_cpu_id arrays can be __initconstJan Beulich
... as being referenced from __init code only. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2012-07-30uprobes: __replace_page() needs munlock_vma_page()Oleg Nesterov
Like do_wp_page(), __replace_page() should do munlock_vma_page() for the case when the old page still has other !VM_LOCKED mappings. Unfortunately this needs mm/internal.h. Also, move put_page() outside of ptl lock. This doesn't really matter but looks a bit better. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182249.GA20372@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-30uprobes: Rename vma_address() and make it return "unsigned long"Oleg Nesterov
1. vma_address() returns loff_t, this looks confusing and this is unnecessary after the previous change. Make it return "ulong", all callers truncate the result anyway. 2. Its name conflicts with mm/rmap.c:vma_address(), rename it to offset_to_vaddr(), this matches vaddr_to_offset(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182247.GA20365@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-30uprobes: Fix register_for_each_vma()->vma_address() checkOleg Nesterov
1. register_for_each_vma() checks that vma_address() == vaddr, but this is not enough. We should also ensure that vaddr >= vm_start, find_vma() guarantees "vaddr < vm_end" only. 2. After the prevous changes, register_for_each_vma() is the only reason why vma_address() has to return loff_t, all other users know that we have the valid mapping at this offset and thus the overflow is not possible. Change the code to use vaddr_to_offset() instead, imho this looks more clean/understandable and now we can change vma_address(). 3. While at it, remove the unnecessary type-cast. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182244.GA20362@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-30uprobes: Introduce vaddr_to_offset(vma, vaddr)Oleg Nesterov
Add the new helper, vaddr_to_offset(vma, vaddr) which returns the offset in vma->vm_file this vaddr is mapped at. Change build_probe_list() and find_active_uprobe() to use the new helper, the next patch adds another user. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182242.GA20355@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-30uprobes: Teach build_probe_list() to consider the rangeOleg Nesterov
Currently build_probe_list() builds the list of all uprobes attached to the given inode, and the caller should filter out those who don't fall into the [start,end) range, this is sub-optimal. This patch turns find_least_offset_node() into find_node_in_range() which returns the first node inside the [min,max] range, and changes build_probe_list() to use this node as a starting point for rb_prev() and rb_next() to find all other nodes the caller needs. The resulting list is no longer sorted but we do not care. This can speed up both build_probe_list() and the callers, but there is another reason to introduce find_node_in_range(). It can be used to figure out whether the given vma has uprobes or not, this will be needed soon. While at it, shift INIT_LIST_HEAD(tmp_list) into build_probe_list(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182240.GA20352@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-30uprobes: Remove insert_vm_struct()->uprobe_mmap()Oleg Nesterov
Remove insert_vm_struct()->uprobe_mmap(). It is not needed, nobody except arch/ia64/kernel/perfmon.c uses insert_vm_struct(vma) with vma->vm_file != NULL. And it is wrong. Again, get_user_pages() can not succeed before vma_link(vma) makes is visible to find_vma(). And even if this worked, we must not insert the new bp before this mapping is visible to vma_prio_tree_foreach() for uprobe_unregister(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182238.GA20349@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-30uprobes: Remove copy_vma()->uprobe_mmap()Oleg Nesterov
Remove copy_vma()->uprobe_mmap(new_vma), it is absolutely wrong. This new_vma was just initialized to represent the new unmapped area, [vm_start, vm_end) was returned by get_unmapped_area() in the caller. This means that uprobe_mmap()->get_user_pages() will fail for sure, simply because find_vma() can never succeed. And I verified that sys_mremap()->mremap_to() indeed always fails with the wrong ENOMEM code if [addr, addr+old_len] is probed. And why this uprobe_mmap() was added? I believe the intent was wrong. Note that the caller is going to do move_page_tables(), all registered uprobes are already faulted in, we only change the virtual addresses. NOTE: However, somehow we need to close the race with uprobe_register() which relies on map_info->vaddr. This needs another fix I'll try to do later. Probably we need uprobe_mmap() in move_vma() but we can not do this right now, this can confuse uprobes_state.counter (which I still hope we are going to kill). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182236.GA20342@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-30uprobes: Fix overflow in vma_address()/find_active_uprobe()Oleg Nesterov
vma->vm_pgoff is "unsigned long", it should be promoted to loff_t before the multiplication to avoid the overflow. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182233.GA20339@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-30uprobes: Suppress uprobe_munmap() from mmput()Oleg Nesterov
uprobe_munmap() does get_user_pages() and it is also called from the final mmput()->exit_mmap() path. This slows down exit/mmput() for no reason, and I think it is simply dangerous/wrong to try to fault-in a page into the dying mm. If nothing else, this happens after the last sync_mm_rss(), afaics handle_mm_fault() can change the task->rss_stat and make the subsequent check_mm() unhappy. Change uprobe_munmap() to check mm->mm_users != 0. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182231.GA20336@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-30uprobes: Uprobe_mmap/munmap needs list_for_each_entry_safe()Oleg Nesterov
The bug was introduced by me in 449d0d7c ("uprobes: Simplify the usage of uprobe->pending_list"). Yes, we do not care about uprobe->pending_list after return and nobody can remove the current list entry, but put_uprobe(uprobe) can actually free it and thus we need list_for_each_safe(). Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Link: http://lkml.kernel.org/r/20120729182229.GA20329@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-30uprobes: Clean up and document write_opcode()->lock_page(old_page)Oleg Nesterov
The comment above write_opcode()->lock_page(old_page) tells about the race with do_wp_page(). I don't really understand which exactly race it means, but afaics this lock_page() was not enough to close all races with do_wp_page(). Anyway, since: 77fc4af1b59d uprobes: Change register_for_each_vma() to take mm->mmap_sem for writing this code is always called with ->mmap_sem held for writing, so we can forget about do_wp_page(). However, we can't simply remove this lock_page(), and the only (afaics) reason is __replace_page()->try_to_free_swap(). Nothing in write_opcode() needs it, move it into __replace_page() and fix the comment. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182220.GA20322@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-30uprobes: Kill write_opcode()->lock_page(new_page)Oleg Nesterov
write_opcode() does lock_page(new_page) for no reason. Nobody can see this page until __replace_page() exposes it under ptl lock, and we do nothing with this page after pte_unmap_unlock(). If nothing else, the similar code in do_wp_page() doesn't lock the new page for page_add_new_anon_rmap/set_pte_at_notify. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182218.GA20315@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-30uprobes: __replace_page() should not use page_address_in_vma()Oleg Nesterov
page_address_in_vma(old_page) in __replace_page() is ugly and wrong. The caller already knows the correct virtual address, this page was found by get_user_pages(vaddr). However, page_address_in_vma() can actually fail if page->mapping was cleared by __delete_from_page_cache() after get_user_pages() returns. But this means the race with page reclaim, write_opcode() should not fail, it should retry and read this page again. Probably the race with remove_mapping() is not possible due to page_freeze_refs() logic, but afaics at least shmem_writepage()->shmem_delete_from_page_cache() can clear ->mapping. We could change __replace_page() to return -EAGAIN in this case, but it would be better to simply use the caller's vaddr and rely on page_check_address(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182216.GA20311@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-30uprobes: Don't recheck vma/f_mapping in write_opcode()Oleg Nesterov
write_opcode() rechecks valid_vma() and ->f_mapping, this is pointless. The caller, register_for_each_vma() or uprobe_mmap(), has already done these checks under mmap_sem. To clarify, uprobe_mmap() checks valid_vma() only, but we can rely on build_probe_list(vm_file->f_mapping->host). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182212.GA20304@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-30s390: make use of user_mode() macro where possibleHeiko Carstens
We use the user_mode() helper already at several places but also have the open coded variant at other places. Convert the code to always use the helper function. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-07-30s390/mm: rename user_mode variable to addressing_modeHeiko Carstens
Fix name clash with user_mode() define which is also used in common code. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-07-30s390/mm: fix fault handling for page table walk caseHeiko Carstens
Make sure the kernel does not incorrectly create a SIGBUS signal during user space accesses: For user space accesses in the switched addressing mode case the kernel may walk page tables and access user address space via the kernel mapping. If a page table entry is invalid the function __handle_fault() gets called in order to emulate a page fault and trigger all the usual actions like paging in a missing page etc. by calling handle_mm_fault(). If handle_mm_fault() returns with an error fixup handling is necessary. For the switched addressing mode case all errors need to be mapped to -EFAULT, so that the calling uaccess function can return -EFAULT to user space. Unfortunately the __handle_fault() incorrectly calls do_sigbus() if VM_FAULT_SIGBUS is set. This however should only happen if a page fault was triggered by a user space instruction. For kernel mode uaccesses the correct action is to only return -EFAULT. So user space may incorrectly see SIGBUS signals because of this bug. For current machines this would only be possible for the switched addressing mode case in conjunction with futex operations. Cc: stable@vger.kernel.org Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-07-30s390/mm: make page faults killableHeiko Carstens
This is the s390 variant of 37b23e05 "x86,mm: make pagefault killable". Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-07-29net/tun: fix ioctl() based info leaksMathias Krause
The tun module leaks up to 36 bytes of memory by not fully initializing a structure located on the stack that gets copied to user memory by the TUNGETIFF and SIOCGIFHWADDR ioctl()s. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-29tg3: Update version to 3.124Michael Chan
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-29tg3: Fix race condition in tg3_get_stats64()Michael Chan
Spinlock should be taken before checking for tp->hw_stats. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-29tg3: Add New 5719 Read DMA workaroundMichael Chan
After Power-on-reset, the 5719's TX DMA length registers may contain uninitialized values and cause TX DMA to stall. Check for invalid values and set a register bit to flush the TX channels. The bit needs to be turned off after the DMA channels have been flushed. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-29tg3: Fix Read DMA workaround for 5719 A0.Michael Chan
The workaround was mis-applied to all 5719 and 5720 chips. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-29tg3: Request APE_LOCK_PHY before PHY accessMichael Chan
to prevent PHY access conflict with APE firmware. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-29ipv6: fix incorrect route 'expires' value passed to userspaceLi Wei
When userspace use RTM_GETROUTE to dump route table, with an already expired route entry, we always got an 'expires' value(2147157) calculated base on INT_MAX. The reason of this problem is in the following satement: rt->dst.expires - jiffies < INT_MAX gcc promoted the type of both sides of '<' to unsigned long, thus a small negative value would be considered greater than INT_MAX. With the help of Eric Dumazet, do the out of bound checks in rtnl_put_cacheinfo(), _after_ conversion to clock_t. Signed-off-by: Li Wei <lw@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-29mISDN: Bugfix only few bytes are transfered on a connectionKarsten Keil
The test for the fillempty condition was wrong in one place. Changed the variable to the right boolean type. Signed-off-by: Karsten Keil <keil@b1-systems.de> Signed-off-by: David S. Miller <davem@davemloft.net>