linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2014-01-27	ocfs2: do not log ENOENT in unlink()	Xiaowei.Hu
	Suppress log message like this: (open_delete,8328,0):ocfs2_unlink:951 ERROR: status = -2 Orabug:17445485 Signed-off-by: Xiaowei Hu <xiaowei.hu@oracle.com> Cc: Joe Jin <joe.jin@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-27	mm: bring back /sys/kernel/mm	Hugh Dickins
	Commit da29bd36224b ("mm/mm_init.c: make creation of the mm_kobj happen earlier than device_initcall") changed to pure_initcall(mm_sysfs_init). That's too early: mm_sysfs_init() depends on core_initcall(ksysfs_init) to have made the kernel_kobj directory "kernel" in which to create "mm". Make it postcore_initcall(mm_sysfs_init). We could use core_initcall(), and depend upon Makefile link order kernel/ mm/ fs/ ipc/ security/ ... as core_initcall(debugfs_init) and core_initcall(securityfs_init) do; but better not. Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-27	arch/unicore32/kernel/early_printk.c:setup_early_printk: missing initialization	Heinrich Schuchardt
	It is based on uninitialized value keep_early. This leads to unpredictable result. [akpm@linux-foundation.org: simplify code] Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-27	Revert "mm/vmalloc: interchage the implementation of vmalloc_to_{pfn,page}"	malc
	Revert commit ece86e222db4, which was intended as a small performance improvement. Despite the claim that the patch doesn't introduce any functional changes in fact it does. The "no page" path behaves different now. Originally, vmalloc_to_page might return NULL under some conditions, with new implementation it returns pfn_to_page(0) which is not the same as NULL. Simple test shows the difference. test.c #include <linux/kernel.h> #include <linux/module.h> #include <linux/vmalloc.h> #include <linux/mm.h> int __init myi(void) { struct page p; void v; v = vmalloc(PAGE_SIZE); /* trigger the "no page" path in vmalloc_to_page*/ vfree(v); p = vmalloc_to_page(v); pr_err("expected val = NULL, returned val = %p", p); return -EBUSY; } void __exit mye(void) { } module_init(myi) module_exit(mye) Before interchange: expected val = NULL, returned val = (null) After interchange: expected val = NULL, returned val = c7ebe000 Signed-off-by: Vladimir Murzin <murzin.v@gmail.com> Cc: Jianyu Zhan <nasa4836@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-27	memblock: don't silently align size in memblock_virt_alloc()	Yinghai Lu
	In original __alloc_memory_core_early() for bootmem wrapper, we do not align size silently. We should not do that, as later free with old size will leave some range not freed. It's obvious that code is copied from memblock_base_nid(), and that code is wrong for the same reason. Also remove that in memblock_alloc_base. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-27	x86: revert wrong memblock current limit setting	Yinghai Lu
	Dave reported big numa system booting is broken. It turns out that commit 5b6e529521d3 ("x86: memblock: set current limit to max low memory address") sets the limit to low wrongly. max_low_pfn_mapped is different from max_pfn_mapped. max_low_pfn_mapped is always under 4G. That will memblock_alloc_nid all go under 4G. Revert 5b6e529521d3 to fix a no-boot regression which was triggered by 457ff1de2d24 ("lib/swiotlb.c: use memblock apis for early memory allocations"). Signed-off-by: Yinghai Lu <yinghai@kernel.org> Reported-by: Dave Hansen <dave.hansen@intel.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-27	memblock, nobootmem: add memblock_virt_alloc_low()	Yinghai Lu
	The new memblock_virt APIs are used to replaced old bootmem API. We need to allocate page below 4G for swiotlb. That should fix regression on Andrew's system that is using swiotlb. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-27	net: Document promote_secondaries	Martin Schwenke
	From 038a821667f62c496f2bbae27081b1b612122a97 Mon Sep 17 00:00:00 2001 From: Martin Schwenke <martin@meltin.net> Date: Tue, 28 Jan 2014 15:16:49 +1100 Subject: [PATCH] net: Document promote_secondaries This option was added a long time ago... commit 8f937c6099858eee15fae14009dcbd05177fa91d Author: Harald Welte <laforge@gnumonks.org> Date: Sun May 29 20:23:46 2005 -0700 [IPV4]: Primary and secondary addresses Signed-off-by: Martin Schwenke <martin@meltin.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-27	net: gre: use icmp_hdr() to get inner ip header	Duan Jiong
	When dealing with icmp messages, the skb->data points the ip header that triggered the sending of the icmp message. In gre_cisco_err(), the parse_gre_header() is called, and the iptunnel_pull_header() is called to pull the skb at the end of the parse_gre_header(), so the skb->data doesn't point the inner ip header. Unfortunately, the ipgre_err still needs those ip addresses in inner ip header to look up tunnel by ip_tunnel_lookup(). So just use icmp_hdr() to get inner ip header instead of skb->data. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-27	i40e: Add missing braces to i40e_dcb_need_reconfig()	Dave Jones
	Indentation mismatch spotted with Coverity. Introduced in 4e3b35b044ea ("i40e: add DCB and DCBNL support") Signed-off-by: Dave Jones <davej@fedoraproject.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-27	xen-netfront: fix resource leak in netfront	Annie Li
	This patch removes grant transfer releasing code from netfront, and uses gnttab_end_foreign_access to end grant access since gnttab_end_foreign_access_ref may fail when the grant entry is currently used for reading or writing. * clean up grant transfer code kept from old netfront(2.6.18) which grants pages for access/map and transfer. But grant transfer is deprecated in current netfront, so remove corresponding release code for transfer. * fix resource leak, release grant access (through gnttab_end_foreign_access) and skb for tx/rx path, use get_page to ensure page is released when grant access is completed successfully. Xen-blkfront/xen-tpmfront/xen-pcifront also have similar issue, but patches for them will be created separately. V6: Correct subject line and commit message. V5: Remove unecessary change in xennet_end_access. V4: Revert put_page in gnttab_end_foreign_access, and keep netfront change in single patch. V3: Changes as suggestion from David Vrabel, ensure pages are not freed untill grant acess is ended. V2: Improve patch comments. Signed-off-by: Annie Li <annie.li@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-28	powerpc: Implement arch_spin_is_locked() using arch_spin_value_unlocked()	Michael Ellerman
	At a glance these are just the inverse of each other. The one subtlety is that arch_spin_value_unlocked() takes the lock by value, rather than as a pointer, which is important for the lockref code. On the other hand arch_spin_is_locked() doesn't really care, so implement it in terms of arch_spin_value_unlocked(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-28	powerpc: Add support for the optimised lockref implementation	Michael Ellerman
	This commit adds the architecture support required to enable the optimised implementation of lockrefs. That's as simple as defining arch_spin_value_unlocked() and selecting the Kconfig option. We also define cmpxchg64_relaxed(), because the lockref code does not need the cmpxchg to have barrier semantics. Using Linus' test case[1] on one system I see a 4x improvement for the basic enablement, and a further 1.3x for cmpxchg64_relaxed(), for a total of 5.3x vs the baseline. On another system I see more like 2x improvement. [1]: http://marc.info/?l=linux-fsdevel&m=137782380714721&w=4 Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-28	dma: pl08x: Export pl08x_filter_id	Sachin Kamat
	Export the symbol so that it is accessible to modules. Fixes the following error: ERROR: "pl08x_filter_id" [sound/soc/samsung/snd-soc-s3c-dma.ko] undefined! Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Cc: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2014-01-27	leds: s3c24xx: Remove hardware.h inclusion	Sachin Kamat
	The contents of this header file is not referenced in the led driver. Remove its inclusion. While at it, re-arrange the headers as per the category. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Bryan Wu <cooloney@gmail.com>
2014-01-27	leds: replace list_for_each with list_for_each_entry	ZHAO Gang
	Use the more convenient macro. Signed-off-by: ZHAO Gang <gamerh2o@gmail.com> Signed-off-by: Bryan Wu <cooloney@gmail.com>
2014-01-27	leds: kirkwood: Cleanup in header files	Sachin Kamat
	Commit c02cecb92ed4 ("ARM: orion: move platform_data definitions") moved the files to the current location but forgot to remove the pointer to its previous location. Clean it up. While at it also change the header file protection macros appropriately. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Bryan Wu <cooloney@gmail.com>
2014-01-27	leds: pwm: Remove a warning on non-DT platforms	Olof Johansson
	This removes a warning on non-DT-enabled platforms: drivers/leds/leds-pwm.c: In function 'led_pwm_create_of': drivers/leds/leds-pwm.c:88:22: warning: unused variable 'node' Really caused by the local variable that is assigned to and then never used. Just do away with the local var, it's not needed. Technically this code path can never be entered without DT enabled, since there's an earlier check about number of children in the calling function, but the compiler can't see that. Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Bryan Wu <cooloney@gmail.com>
2014-01-27	leds: leds-pwm: fix duty time overflow.	Xiubo Li
	Overflow maybe occurs when calculates the duty time. For instance, the period time is 990000000ns, and the max_brightness is 127, when setting the brightness to 12, the duty value will be 25906026ns, but it should be 93543307ns. Signed-off-by: Bryan Wu <cooloney@gmail.com>
2014-01-27	leds: leds-mc13783: Remove unneeded mc13xxx_{un}lock	Alexander Shiyan
	LED registers are used only in this driver, so no additional locking is needed. Read-Modify-Write cycle in workqueue is already protected by regmap. Signed-off-by: Alexander Shiyan <shc_work@mail.ru> Signed-off-by: Bryan Wu <cooloney@gmail.com>
2014-01-27	leds: leds-mc13783: Remove duplicate field in platform data	Alexander Shiyan
	LED platform data are overwhelmed by excessive field "max_cur" which just replicates few bits of "led_control" field. This patch removes this field and adds a definition for the current settings in the header. Signed-off-by: Alexander Shiyan <shc_work@mail.ru> Signed-off-by: Bryan Wu <cooloney@gmail.com>
2014-01-27	drivers: leds: leds-tca6507: check CONFIG_GPIOLIB whether defined for ↵	Chen Gang
	'gpio_base' Need check CONFIG_GPIOLIB whether defined, just like another area has done within this file. Or can not pass compiling when CONFIG_GPIOLIB disabled. The related error (with allmodconfig for metag): CC [M] drivers/leds/leds-tca6507.o drivers/leds/leds-tca6507.c: In function 'tca6507_led_dt_init': drivers/leds/leds-tca6507.c:731: error: 'struct tca6507_platform_data' has no member named 'gpio_base' Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: Bryan Wu <cooloney@gmail.com>
2014-01-27	leds: lp5523: Support LED MUX configuration on running a pattern	Milo Kim
	There are two ways to run a pattern in LP5523. One is using legacy sysfs files such as 'enginex_mode','enginex_load' and 'enginex_leds'. ('x' is from 1 to 3). Among them, 'enginex_leds' are used for selecting specific LED channel MUX. (MUX means which LEDs are used for running a pattern from LED 1 to 9.) The other way is using the firmware interface. In this mode, the default LED MUX strings are used. In other words, LED MUX is not configurable on the fly. This patch enables dynamic LED MUX configuration when the firmware is loaded. By accessing the sysfs file 'enginex_leds', the LED channels can be configured. To synchronize the operation mode, each engine mode should be set to 'LOAD'. The documentation is updated as well. Cc: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Milo Kim <milo.kim@ti.com> Signed-off-by: Bryan Wu <cooloney@gmail.com>
2014-01-27	leds: lp5521/5523: Fix multiple engine usage bug	Milo Kim
	Whenever the engine is loaded by the user-application, the operation mode is reset first. But it has a problem in case of multiple engine used because previous engine settings are cleared. The driver should update not whole 8bits but each engine bit by masking. On the other hands, whole engines should be reset when the driver is unloaded and on initializing the LP5523 driver. So, new functions are used for this handling - lp5521/5523_stop_all_engines(). Cc: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Milo Kim <milo.kim@ti.com> Signed-off-by: Bryan Wu <cooloney@gmail.com>
2014-01-27	LEDS: tca6507 - fix up some comments.	NeilBrown
	In particular fix the capitalisation of GPIO and LED and correct TCA6507_MAKE_CPIO, but also rewrite the comment about platform-data to include reference to devicetree. Also re-wrap comments to fit 80 columns. Reported-by: Bryan Wu <cooloney@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Bryan Wu <cooloney@gmail.com>
2014-01-27	LEDS: tca6507: add device-tree support for GPIO configuration.	NeilBrown
	The 7 lines driven by the TCA6507 can either drive LEDs or act as output-only GPIOs. To make this distinction in devicetree we use the "compatible" property. If the device attached to a line is "compatible" with "gpio", we treat it like a GPIO. If it is "compatible" with "led" (or if no "compatible" value is set) we treat it like an LED. (cooloney@gmail.com: fix typo in the subject) Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Bryan Wu <cooloney@gmail.com>
2014-01-27	LEDS: tca6507 - fix bugs in parsing of device-tree configuration.	NeilBrown
	1/ The led_info array must be allocated to allow the full number of LEDs even if not all are present. The array maybe be sparsely filled but it is indexed by device address so we must at least allocate as many slots as the highest address used. It is easiest just to allocate all 7. 2/ range check the 'reg' value properly. 3/ led.flags must be initialised to zero, else all leds could be treated as GPIOs (depending on what happens to be on the stack). Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Bryan Wu <cooloney@gmail.com>
2014-01-27	net: 6lowpan: fixup for code movement	Stephen Rothwell
	Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-27	hyperv: Add support for physically discontinuous receive buffer	Haiyang Zhang
	This will allow us to use bigger receive buffer, and prevent allocation failure due to fragmented memory. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-27	sky2: initialize napi before registering device	Stanislaw Gruszka
	There is race condition when call netif_napi_add() after register_netdevice(), as ->open() can be called without napi initialized and trigger BUG_ON() on napi_enable(), like on below messages: [ 9.699863] sky2: driver version 1.30 [ 9.699960] sky2 0000:02:00.0: Yukon-2 EC Ultra chip revision 2 [ 9.700020] sky2 0000:02:00.0: irq 45 for MSI/MSI-X [ 9.700498] ------------[ cut here ]------------ [ 9.703391] kernel BUG at include/linux/netdevice.h:501! [ 9.703391] invalid opcode: 0000 [#1] PREEMPT SMP <snip> [ 9.830018] Call Trace: [ 9.830018] [<fa996169>] sky2_open+0x309/0x360 [sky2] [ 9.830018] [<c1007210>] ? via_no_dac+0x40/0x40 [ 9.830018] [<c1007210>] ? via_no_dac+0x40/0x40 [ 9.830018] [<c135ed4b>] __dev_open+0x9b/0x120 [ 9.830018] [<c1431cbe>] ? _raw_spin_unlock_bh+0x1e/0x20 [ 9.830018] [<c135efd9>] __dev_change_flags+0x89/0x150 [ 9.830018] [<c135f148>] dev_change_flags+0x18/0x50 [ 9.830018] [<c13bb8e0>] devinet_ioctl+0x5d0/0x6e0 [ 9.830018] [<c13bcced>] inet_ioctl+0x6d/0xa0 To fix the problem patch changes the order of initialization. Bug report: https://bugzilla.kernel.org/show_bug.cgi?id=67151 Reported-and-tested-by: ebrahim.azarisooreh@gmail.com Signed-off-by: Stanislaw Gruszka <stf_xl@wp.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-27	net: Fix memory leak if TPROXY used with TCP early demux	Holger Eitzenberger
	I see a memory leak when using a transparent HTTP proxy using TPROXY together with TCP early demux and Kernel v3.8.13.15 (Ubuntu stable): unreferenced object 0xffff88008cba4a40 (size 1696): comm "softirq", pid 0, jiffies 4294944115 (age 8907.520s) hex dump (first 32 bytes): 0a e0 20 6a 40 04 1b 37 92 be 32 e2 e8 b4 00 00 .. j@..7..2..... 02 00 07 01 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff810b710a>] kmem_cache_alloc+0xad/0xb9 [<ffffffff81270185>] sk_prot_alloc+0x29/0xc5 [<ffffffff812702cf>] sk_clone_lock+0x14/0x283 [<ffffffff812aaf3a>] inet_csk_clone_lock+0xf/0x7b [<ffffffff8129a893>] netlink_broadcast+0x14/0x16 [<ffffffff812c1573>] tcp_create_openreq_child+0x1b/0x4c3 [<ffffffff812c033e>] tcp_v4_syn_recv_sock+0x38/0x25d [<ffffffff812c13e4>] tcp_check_req+0x25c/0x3d0 [<ffffffff812bf87a>] tcp_v4_do_rcv+0x287/0x40e [<ffffffff812a08a7>] ip_route_input_noref+0x843/0xa55 [<ffffffff812bfeca>] tcp_v4_rcv+0x4c9/0x725 [<ffffffff812a26f4>] ip_local_deliver_finish+0xe9/0x154 [<ffffffff8127a927>] __netif_receive_skb+0x4b2/0x514 [<ffffffff8127aa77>] process_backlog+0xee/0x1c5 [<ffffffff8127c949>] net_rx_action+0xa7/0x200 [<ffffffff81209d86>] add_interrupt_randomness+0x39/0x157 But there are many more, resulting in the machine going OOM after some days. From looking at the TPROXY code, and with help from Florian, I see that the memory leak is introduced in tcp_v4_early_demux(): void tcp_v4_early_demux(struct sk_buff skb) { / ... */ iph = ip_hdr(skb); th = tcp_hdr(skb); if (th->doff < sizeof(struct tcphdr) / 4) return; sk = __inet_lookup_established(dev_net(skb->dev), &tcp_hashinfo, iph->saddr, th->source, iph->daddr, ntohs(th->dest), skb->skb_iif); if (sk) { skb->sk = sk; where the socket is assigned unconditionally to skb->sk, also bumping the refcnt on it. This is problematic, because in our case the skb has already a socket assigned in the TPROXY target. This then results in the leak I see. The very same issue seems to be with IPv6, but haven't tested. Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-27	DRM: armada: fix missing DRM_KMS_FB_HELPER select	Russell King
	Commit 92b6f89f6b8f (drm: Add separate Kconfig option for fbdev helpers) happened in parallel with the inclusion of Armada DRM into mainline, and so missed this update. Add the necessary select statement to avoid build errors. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-01-27	nfsd: consider CLAIM_FH when handing out delegation	Ming Chen
	CLAIM_FH was added by NFSv4.1. It is the same as CLAIM_NULL except that it uses only current FH to identify the file to be opened. The NFS client is using CLAIM_FH if the FH is available when opening a file. Currently, we cannot get any delegation if we stat a file before open it because the server delegation code does not recognize CLAIM_FH. We tested this patch and found delegation can be handed out now when claim is CLAIM_FH. See http://marc.info/?l=linux-nfs&m=136369847801388&w=2 and http://www.linux-nfs.org/wiki/index.php/Server_4.0_and_4.1_issues#New_open_claim_types Signed-off-by: Ming Chen <mchen@cs.stonybrook.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-27	libceph: support CEPH_FEATURE_OSD_CACHEPOOL feature	Ilya Dryomov
	Announce our (limited, see previous commit) support for CACHEPOOL feature. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-01-27	libceph: follow redirect replies from osds	Ilya Dryomov
	Follow redirect replies from osds, for details see ceph.git commit fbbe3ad1220799b7bb00ea30fce581c5eadaf034. v1 (current) version of redirect reply consists of oloc and oid, which expands to pool, key, nspace, hash and oid. However, server-side code that would populate anything other than pool doesn't exist yet, and hence this commit adds support for pool redirects only. To make sure that future server-side updates don't break us, we decode all fields and, if any of key, nspace, hash or oid have a non-default value, error out with "corrupt osd_op_reply ..." message. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-01-27	libceph: rename ceph_osd_request::r_{oloc,oid} to r_base_{oloc,oid}	Ilya Dryomov
	Rename ceph_osd_request::r_{oloc,oid} to r_base_{oloc,oid} before introducing r_target_{oloc,oid} needed for redirects. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-01-27	libceph: follow {read,write}_tier fields on osd request submission	Ilya Dryomov
	Overwrite ceph_osd_request::r_oloc.pool with read_tier for read ops and write_tier for write and read+write ops (aka basic tiering support). {read,write}_tier are part of pg_pool_t since v9. This commit bumps our pg_pool_t decode compat version from v7 to v9, all new fields except for {read,write}_tier are ignored. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-01-27	libceph: add ceph_pg_pool_by_id()	Ilya Dryomov
	"Lookup pool info by ID" function is hidden in osdmap.c. Expose it to the rest of libceph. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-01-27	libceph: CEPH_OSD_FLAG_* enum update	Ilya Dryomov
	Update CEPH_OSD_FLAG_* enum. (We need CEPH_OSD_FLAG_IGNORE_OVERLAY to support tiering). Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-01-27	libceph: replace ceph_calc_ceph_pg() with ceph_oloc_oid_to_pg()	Ilya Dryomov
	Switch ceph_calc_ceph_pg() to new oloc and oid abstractions and rename it to ceph_oloc_oid_to_pg() to make its purpose more clear. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-01-27	libceph: introduce and start using oid abstraction	Ilya Dryomov
	In preparation for tiering support, which would require having two (base and target) object names for each osd request and also copying those names around, introduce struct ceph_object_id (oid) and a couple helpers to facilitate those copies and encapsulate the fact that object name is not necessarily a NUL-terminated string. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-01-27	libceph: rename MAX_OBJ_NAME_SIZE to CEPH_MAX_OID_NAME_LEN	Ilya Dryomov
	In preparation for adding oid abstraction, rename MAX_OBJ_NAME_SIZE to CEPH_MAX_OID_NAME_LEN. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-01-27	libceph: move ceph_file_layout helpers to ceph_fs.h	Ilya Dryomov
	Move ceph_file_layout helper macros and inline functions to ceph_fs.h. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-01-27	libceph: start using oloc abstraction	Ilya Dryomov
	Instead of relying on pool fields in ceph_file_layout (for mapping) and ceph_pg (for enconding), start using ceph_object_locator (oloc) abstraction. Note that userspace oloc currently consists of pool, key, nspace and hash fields, while this one contains only a pool. This is OK, because at this point we only send (i.e. encode) olocs and never have to receive (i.e. decode) them. This makes keeping a copy of ceph_file_layout in every osd request unnecessary, so ceph_osd_request::r_file_layout field is nuked. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-01-27	clk: sort Makefile	Mike Turquette
	Signed-off-by: Mike Turquette <mturquette@linaro.org>
2014-01-27	Merge branch 'bonding'	David S. Miller
	Veaceslav Falico says: ==================== bonding: fix locking in bond_ab_arp_prob After the latest patches, on every call of bond_ab_arp_probe() without an active slave I see the following warning: [ 7.912314] RTNL: assertion failed at net/core/dev.c (4494) ... [ 7.922495] [<ffffffff817acc6f>] dump_stack+0x51/0x72 [ 7.923714] [<ffffffff8168795e>] netdev_master_upper_dev_get+0x6e/0x70 [ 7.924940] [<ffffffff816a2a66>] rtnl_link_fill+0x116/0x260 [ 7.926143] [<ffffffff817acc6f>] ? dump_stack+0x51/0x72 [ 7.927333] [<ffffffff816a350c>] rtnl_fill_ifinfo+0x95c/0xb90 [ 7.928529] [<ffffffff8167af2b>] ? __kmalloc_reserve+0x3b/0xa0 [ 7.929681] [<ffffffff8167bfcf>] ? __alloc_skb+0x9f/0x1e0 [ 7.930827] [<ffffffff816a3b64>] rtmsg_ifinfo+0x84/0x100 [ 7.931960] [<ffffffffa00bca07>] bond_ab_arp_probe+0x1a7/0x370 [bonding] [ 7.933133] [<ffffffffa00bcd78>] bond_activebackup_arp_mon+0x1a8/0x2f0 [bonding] ... It happens because in bond_ab_arp_probe() we change the flags of a slave without holding the RTNL lock. To fix this - remove the useless curr_active_lock, RCUify it and lock RTNL while changing the slave's flags. Also, remove bond_ab_arp_probe() from under any locks in bond_ab_arp_mon(). ==================== Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-27	bonding: restructure locking of bond_ab_arp_probe()	Veaceslav Falico
	Currently we're calling it from under RCU context, however we're using some functions that require rtnl to be held. Fix this by restructuring the locking - don't call it under any locks, aquire rcu_read_lock() if we're sending _only_ (i.e. we have the active slave present), and use rtnl locking otherwise - if we need to modify (in)active flags of a slave. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-27	bonding: RCUify bond_ab_arp_probe	Veaceslav Falico
	Currently bond_ab_arp_probe() is always called under rcu_read_lock(), however to work with curr_active_slave we're still holding the curr_slave_lock. To remove that curr_slave_lock - rcu_dereference the bond's curr_active_slave and use it further - so that we're sure the slave won't go away, and we don't care if it will change in the meanwhile. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-27	AF_PACKET: Add documentation for queue mapping fanout mode	Neil Horman
	Recently I added a new AF_PACKET fanout operation mode in commit 2d36097, but I forgot to document it. Add PACKET_FANOUT_QM as an available mode in the af_packet documentation. Applies to net-next. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: "David S. Miller" <davem@davemloft.net> CC: Daniel Borkmann <dborkman@redhat.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-27	bnx2x: More Shutdown revisions	Yuval Mintz
	Submission d9aee59 "bnx2x: Don't release PCI bars on shutdown" separated the PCI remove and shutdown flows, but pci_disable_device() is still being called on both. As a result, a dev_WARN_ONCE will be hit during shutdown for every bnx2x VF probed on a hypervisor (as its shutdown callback will be called and later pci_disable_sriov() will call its remove callback). This calls the pci_disable_device() only on the remove flow. Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>