linux.git - Linus' kernel tree

Age	Commit message	Author
2009-12-14	microblaze: ftrace: add function graph support	Michal Simek
	For more information look at Documentation/trace folder. Signed-off-by: Michal Simek <monstr@monstr.eu>
2009-12-14	microblaze: ftrace: Add dynamic trace support	Michal Simek
	With the dynamic function tracer, _mcount is by default defined as an "empty" function that returns directly without any further action. When tracing is enabled from user space, it jumps to a real tracing function (ftrace_caller), which does the real job for us. Unlike the static function tracer, the dynamic function tracer provides two functions, ftrace_make_call() and ftrace_make_nop(), to enable and disable tracing of selected kernel functions (set_ftrace_filter). In the kernel version, there is only one "_mcount" string for every kernel function, so we just need to match this one in mcount_regex of scripts/recordmcount.pl. For more information please look at the code and the Documentation/trace folder. Steven ACKed the scripts/recordmcount.pl part. Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Michal Simek <monstr@monstr.eu>
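The enable/disable mechanism described above can be pictured with a small user-space model. This is only a sketch: the real ftrace_make_call() and ftrace_make_nop() rewrite instructions at each call site, whereas the names below (call_site, model_make_call, model_make_nop, model_apply_filter) are hypothetical and merely flip a state flag.

```c
#include <string.h>

/* Hypothetical model of a patched call site; not a kernel structure. */
enum site_state { SITE_NOP, SITE_CALL };

struct call_site {
    const char *func;       /* function that contains the _mcount call site */
    enum site_state state;  /* what the call site currently does */
};

/* Model of enabling tracing at one site (the kernel patches in a call). */
int model_make_call(struct call_site *site)
{
    site->state = SITE_CALL;
    return 0;
}

/* Model of disabling tracing at one site (the kernel patches in a nop). */
int model_make_nop(struct call_site *site)
{
    site->state = SITE_NOP;
    return 0;
}

/* Enable only the sites whose function name equals the filter string,
 * loosely mirroring what writing to set_ftrace_filter selects.
 * Returns the number of sites left enabled. */
int model_apply_filter(struct call_site *sites, int n, const char *filter)
{
    int enabled = 0;

    for (int i = 0; i < n; i++) {
        if (strcmp(sites[i].func, filter) == 0) {
            model_make_call(&sites[i]);
            enabled++;
        } else {
            model_make_nop(&sites[i]);
        }
    }
    return enabled;
}
```

The point of the design is that untraced functions pay only for a nop, while the set of traced functions can change at run time without rebuilding the kernel.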
2009-12-14	microblaze: ftrace: enable HAVE_FUNCTION_TRACE_MCOUNT_TEST	Michal Simek
	Implement MCOUNT_TEST in asm code - it is faster than using the generic code. Signed-off-by: Michal Simek <monstr@monstr.eu>
2009-12-14	microblaze: ftrace: add static function tracer	Michal Simek
	If gcc's -pg option is enabled with CONFIG_FUNCTION_TRACER=y, a call to _mcount is inserted into each kernel function, so it is possible to trace kernel functions from _mcount. This patch adds the specific _mcount support for static function tracing. By default, ftrace_trace_function is initialized to ftrace_stub (an empty function), so the default _mcount introduces very little overhead. After ftrace is enabled from user space, it jumps to a real tracing function and does static function tracing for us. Commit message from Wu Zhangjin <wuzhangjin@gmail.com> Signed-off-by: Michal Simek <monstr@monstr.eu>
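A minimal user-space sketch of the hook pattern described above, assuming a plain function pointer stands in for ftrace_trace_function. The real _mcount is architecture assembly; mcount_model, count_tracer, set_tracer and traced_calls are hypothetical names used only for illustration.

```c
#include <stddef.h>

typedef void (*trace_fn)(unsigned long ip, unsigned long parent_ip);

/* Number of calls seen by the example tracer. */
unsigned long traced_calls;

/* Default hook: an empty function, so an untraced call stays cheap. */
void ftrace_stub(unsigned long ip, unsigned long parent_ip)
{
    (void)ip;
    (void)parent_ip;
}

/* Example "real" tracer: it only counts calls. */
void count_tracer(unsigned long ip, unsigned long parent_ip)
{
    (void)ip;
    (void)parent_ip;
    traced_calls++;
}

/* The hook starts out pointing at the stub. */
trace_fn ftrace_trace_function = ftrace_stub;

/* Stand-in for _mcount: every instrumented function calls this on entry. */
void mcount_model(unsigned long ip, unsigned long parent_ip)
{
    ftrace_trace_function(ip, parent_ip);
}

/* Install a tracer, or restore the stub when fn is NULL. */
void set_tracer(trace_fn fn)
{
    ftrace_trace_function = fn ? fn : ftrace_stub;
}
```

With the stub installed, mcount_model() does nothing beyond one indirect call; after set_tracer(count_tracer), each call increments traced_calls.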
2009-12-14	microblaze: Add TRACE_IRQFLAGS_SUPPORT	Michal Simek
	There are just two major changes: the local_irq functions in irq.c were renamed to raw_local_irq, and TRACE_IRQFLAGS_SUPPORT was added to Kconfig.debug. See Documentation/irqflags-tracing.txt. Signed-off-by: Michal Simek <monstr@monstr.eu>
2009-12-14	microblaze: preliminary enabling for LATENCYTOP support in Kconfig	Michal Simek
	Signed-off-by: Michal Simek <monstr@monstr.eu>
2009-12-14	microblaze: Lockdep support	Michal Simek
	Microblaze needs to do lock_init very soon because MMU init calls lock functions. Here is the explanation from Peter Zijlstra why we have to enable __ARCH_WANTS_INTERRUPTS_ON_CTSW. "So we schedule while holding rq->lock (for obvious reasons), but since lockdep tracks held locks per tasks, we need to transfer the held state from the prev to the next task. We do this by explicitly calling spin_release(&rq->lock) in context_switch() right before switch_to(), and calling spin_acquire(&rq->lock) in finish_task_switch()->finish_lock_switch(). Now, for some reason lockdep thinks that interrupts got enabled over the context switch (git grep __ARCH_WANTS_INTERRUPTS_ON_CTSW arch/microblaze doesn't seem to turn up anything). Clearly trying to acquire the rq->lock with interrupts enabled is a bad idea and lockdep warns you about this." Signed-off-by: Michal Simek <monstr@monstr.eu>
2009-12-14	microblaze: Register timecounter/cyclecounter	Michal Simek
	It is the same counter as we use as free running one. I would like to use it for ftrace. Signed-off-by: Michal Simek <monstr@monstr.eu>
2009-12-14	microblaze: Stack trace support	Michal Simek
	This is a working implementation, but the problem is that Microblaze lacks a frame pointer, which is why there is a big loop that traces and shows every address that lies in text. It also shows addresses that are held in registers, etc. This is a problem and is the reason why all Microblaze traces are wrong. There is an option to do hacks and trace the kernel code, but this is too complicated. Signed-off-by: Michal Simek <monstr@monstr.eu>
2009-12-14	microblaze: Add IRQENTRY_TEXT to lds	Michal Simek
	It is important for ftrace irqsoff support Signed-off-by: Michal Simek <monstr@monstr.eu>
2009-12-14	microblaze: __init_begin symbol must be aligned	Michal Simek
	The problem was that free_initmem passed a badly aligned __init_begin symbol to free_initrd_mem, and free_initrd_mem doesn't care about __init_end but takes PAGE_SIZE instead. As a result, ramdisk_execute_command (from init/main.c) was overwritten. Here is the behavior in the kernel bootlog: Freeing unused kernel memory: 6224k freed Failed to execute ��{�� Failed to execute ��{��. Attempting defaults... Mounting proc: Mounting var: Signed-off-by: Michal Simek <monstr@monstr.eu>
2009-12-14	microblaze: GPIO reset support	Michal Simek
	Signed-off-by: Michal Simek <monstr@monstr.eu>
2009-12-14	sh: Stub in P3 ioremap support for nommu parts.	Paul Mundt
	p3_ioremap() references __ioremap() which is presently undefined on nommu. This provides a trivial stub to fix the build up. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-12-14	sh: wire up vmallocinfo support in ioremap() implementations.	Paul Mundt
	This wires up the caller information for the ioremap VMA, which allows for more helpful caller tracking via /proc/vmallocinfo. Follows the x86 and powerpc changes of the same nature. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-12-13	drivers/net/bonding/: : use pr_fmt	Joe Perches
	Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt. Remove DRV_NAME from pr_<level>s. Consolidate long format strings. Remove some extra tab indents. Remove some unnecessary ()s from pr_<level>s arguments. Align pr_<level> arguments. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
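To show why the pr_fmt definition makes DRV_NAME redundant in each message, here is a user-space sketch. It assumes snprintf as a stand-in for the kernel log call; pr_info_model is a hypothetical macro, and KBUILD_MODNAME is defined by hand here although the kernel build system normally supplies it.

```c
#include <stdio.h>

/* Normally provided by the kernel build system; set by hand for this sketch. */
#define KBUILD_MODNAME "bonding"

/* The definition from the commit: prefix every format string with the module name. */
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

/* Hypothetical stand-in for pr_info(): formats into a buffer instead of the
 * kernel log. ##__VA_ARGS__ is a GNU extension, as also used in the kernel. */
#define pr_info_model(buf, size, fmt, ...) \
    snprintf(buf, size, pr_fmt(fmt), ##__VA_ARGS__)
```

Because the prefix is pasted onto the format string at compile time through string literal concatenation, call sites no longer need to pass the driver name as an argument.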
2009-12-13	can: CAN_MCP251X should depend on HAS_DMA	Geert Uytterhoeven
	When building for Sun 3: drivers/net/can/mcp251x.c:1074: undefined reference to `dma_free_coherent' drivers/net/can/mcp251x.c:976: undefined reference to `dma_alloc_coherent' drivers/net/can/mcp251x.c:1050: undefined reference to `dma_free_coherent' Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-13	drivers/net/usb: Correct code taking the size of a pointer	Julia Lawall
	sizeof(dev->dev_addr) is the size of a pointer. A few lines above, the size of this field is obtained using netdev->addr_len for a call to memcpy, so do the same here. A simplified version of the semantic patch that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression x; expression f; type T; @@ f(...,(T)x,...) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-13	drivers/net/cpmac.c: Correct code taking the size of a pointer	Julia Lawall
	sizeof(dev->dev_addr) is the size of a pointer. On the other hand, sizeof(pdata->dev_addr) is the size of an array, so use that instead. A simplified version of the semantic patch that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression x; expression f; type T; @@ f(...,(T)x,...) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-13	drivers/net/sfc: Correct code taking the size of a pointer	Julia Lawall
	The function efx_iterate_state contains the code memcpy(&payload->msg, payload_msg, sizeof(payload_msg)); This is the only use of payload_msg. The type of payload_msg is changed from a pointer to an array, so that the result of sizeof really is the length of the string. A simplified version of the semantic patch that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression x; expression f; type T; @@ f(...,(T)x,...) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
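The pointer-versus-array distinction behind this series of fixes can be seen in a few lines of C. The names msg_ptr, msg_arr, size_via_pointer and size_via_array are hypothetical, and the string is a placeholder rather than the driver's actual payload.

```c
#include <stddef.h>

/* Pointer form: sizeof yields the size of the pointer itself. */
const char *msg_ptr = "loopback payload";

/* Array form: sizeof yields the string length plus the trailing NUL. */
const char msg_arr[] = "loopback payload";

size_t size_via_pointer(void)
{
    return sizeof(msg_ptr);
}

size_t size_via_array(void)
{
    return sizeof(msg_arr);
}
```

Passing size_via_pointer() to memcpy would copy only as many bytes as a pointer occupies, which is the class of bug the semantic patch looks for.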
2009-12-13	drivers/atm: Correct code taking the size of a pointer	Julia Lawall
	sizeof(TstSchedTbl) is just the size of the pointer. Change it to the size of the referenced data. A simplified version of the semantic patch that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression x; expression f; type T; @@ f(...,(T)x,...) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-13	3c574_cs: disable irq before calling el3_interrupt	Ken Kawasaki
	3c574_cs, 3c589_cs: disable irq before calling el3_interrupt in the media_check function. Signed-off-by: Ken Kawasaki <ken_kawasaki@spring.nifty.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-13	mlx4_core: return a negative error value	roel kluin
	The return value should be negative. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-13	can: Fix data length code handling in rx path	Oliver Hartkopp
	A valid CAN dataframe can have a data length code (DLC) of 0 .. 8 data bytes. When reading the CAN controllers register the 4-bit value may contain values from 0 .. 15 which may exceed the reserved space in the socket buffer! The ISO 11898-1 Chapter 8.4.2.3 (DLC field) says that register values > 8 should be reduced to 8 without any error reporting or frame drop. This patch introduces a new helper macro to cast a given 4-bit data length code (dlc) to __u8 and ensure the DLC value to be max. 8 bytes. The different handlings in the rx path of the CAN netdevice drivers are fixed. Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net> Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: David S. Miller <davem@davemloft.net>
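The clamping rule can be sketched as a small helper. This is an illustration, not the macro the patch adds: clamp_can_dlc and CAN_MAX_DATA_BYTES are hypothetical names, and masking the register value to its low 4 bits is an assumption made here for safety.

```c
#include <stdint.h>

/* Maximum payload of a classic CAN data frame, in bytes. */
#define CAN_MAX_DATA_BYTES 8

/* Reduce a raw 4-bit DLC register value (0..15) to a usable length (0..8).
 * Values above 8 are clamped to 8 without reporting an error. */
uint8_t clamp_can_dlc(uint8_t reg)
{
    uint8_t dlc = reg & 0x0F;  /* assumption: only the low 4 bits carry the DLC */

    return dlc > CAN_MAX_DATA_BYTES ? CAN_MAX_DATA_BYTES : dlc;
}
```

Using the clamped value as the copy length keeps a receive path from writing past the 8 data bytes reserved in the frame buffer.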
2009-12-13	net: Fix userspace RTM_NEWLINK notifications.	Eric W. Biederman
	I received some bug reports about userspace programs having problems because after RTM_NEWLINK was received they could not immediately access files under /proc/sys/net/ because they had not been registered yet. The original problem was trivially fixed by moving the userspace notification from rtnetlink_event() to the end of register_netdevice(). When testing that change I discovered I was still getting RTM_NEWLINK events before I could access proc and I was also getting RTM_NEWLINK events after I was seeing RTM_DELLINK. Things practically guaranteed to confuse userspace. After a little more investigation these extra notifications proved to be from the new notifiers NETDEV_POST_INIT and NETDEV_UNREGISTER_BATCH hitting the default case in rtnetlink_event, and triggering unnecessary RTM_NEWLINK messages. rtnetlink_event now explicitly handles NETDEV_UNREGISTER_BATCH and NETDEV_POST_INIT to avoid sending the incorrect userspace notifications. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-13	udp: udp_lib_get_port() fix	Eric Dumazet
	Now that we can have a large udp hash table, the udp_lib_get_port() loop should be converted to a do {} while (cond) form, or we don't enter it at all if the hash table size is exactly 65536. Reported-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
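The wrap-around problem is easy to reproduce in isolation. The sketch below assumes 16-bit loop bounds, which is how a table size of 65536 makes the end bound equal to the start; visits_with_for and visits_with_do_while are hypothetical helpers that only count iterations and are not the kernel loop.

```c
#include <stdint.h>

/* Original loop shape: the condition is tested before the first iteration. */
unsigned int visits_with_for(uint16_t first, uint32_t table_size)
{
    uint16_t last = (uint16_t)(first + table_size);  /* wraps modulo 65536 */
    unsigned int visits = 0;

    for (; first != last; first++)
        visits++;
    return visits;
}

/* Fixed loop shape: the body always runs at least once. */
unsigned int visits_with_do_while(uint16_t first, uint32_t table_size)
{
    uint16_t last = (uint16_t)(first + table_size);  /* wraps modulo 65536 */
    unsigned int visits = 0;

    do {
        visits++;
    } while (++first != last);
    return visits;
}
```

For smaller table sizes both shapes visit the same number of slots; only the exact 65536 case differs, because there last equals first before the loop starts.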
2009-12-14	sh: Make the unaligned trap handler always obey notification levels.	Paul Mundt
	Presently there are a couple of paths into the alignment handler, of which only the address error path quiets the notification messages based on the configuration settings. We carry the notification level tests over to the default alignment handler itself so that they behave uniformly. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-12-14	sh: Couple kernel and user write page perm bits for CONFIG_X2TLB	Matt Fleming
	pte_write() should check whether the permissions include either the user or kernel write permission bits. Likewise, pte_wrprotect() needs to remove both the kernel and user write bits. Without this patch handle_tlbmiss() doesn't handle faulting in pages from the P3 area (our vmalloc space) because of a write. Mappings of the P3 space have the _PAGE_EXT_KERN_WRITE bit but not _PAGE_EXT_USER_WRITE. Signed-off-by: Matt Fleming <matt@console-pimps.org> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
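A sketch of the coupled permission checks, under the assumption that a PTE is a plain 32-bit value. The bit positions and the names EXT_USER_WRITE, EXT_KERN_WRITE, pte_write_model and pte_wrprotect_model are hypothetical; the real _PAGE_EXT_* definitions live in the sh headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this illustration. */
#define EXT_USER_WRITE (1u << 0)
#define EXT_KERN_WRITE (1u << 1)

/* Writable if either the user or the kernel write bit is set. */
bool pte_write_model(uint32_t pte)
{
    return (pte & (EXT_USER_WRITE | EXT_KERN_WRITE)) != 0;
}

/* Write-protect by clearing both write bits, leaving other bits intact. */
uint32_t pte_wrprotect_model(uint32_t pte)
{
    return pte & ~(EXT_USER_WRITE | EXT_KERN_WRITE);
}
```

A mapping that carries only the kernel write bit, as described for the P3 area, is reported as writable by this check, whereas a test of the user bit alone would miss it.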
2009-12-14	md: add 'recovery_start' per-device sysfs attribute	Dan Williams
	Enable external metadata arrays to manage rebuild checkpointing via a md/dev-XXX/recovery_start attribute which reflects rdev->recovery_offset. Also update resync_start_store to allow 'none' to be written, for consistency. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
2009-12-14	md: rcu_read_lock() walk of mddev->disks in md_do_sync()	Dan Williams
	Other walks of this list are either under rcu_read_lock() or the list mutation lock (mddev_lock()). This protects against the improbable case of a disk being removed from the array at the start of md_do_sync(). Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-14	md: integrate spares into array at earliest opportunity.	NeilBrown
	As v1.x metadata can record that a member of the array is not completely recovered, it makes sense to record that a spare has become a regular member of the array at the earliest opportunity. So remove the tests on "recovery_offset > 0" in super_1_sync as they really aren't needed, and schedule a metadata update immediately after adding spares to a degraded array. This means that if a crash happens immediately after a recovery starts, the new device will be included in the array and recovery will continue from wherever it was up to. Previously this didn't happen unless recovery was at least 1/16 of the way through. Signed-off-by: NeilBrown <neilb@suse.de>
2009-12-14	md: move compat_ioctl handling into md.c	Arnd Bergmann
	The RAID ioctls are only implemented in md.c, so the handling for them should also be moved there from fs/compat_ioctl.c. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Neil Brown <neilb@suse.de> Cc: Andre Noll <maan@systemlinux.org> Cc: linux-raid@vger.kernel.org Signed-off-by: NeilBrown <neilb@suse.de>
2009-12-14	md: revise Kconfig help for MD_MULTIPATH	NeilBrown
	Make it clear in the config message that MD_MULTIPATH is not under active development. Cc: Oren Held <orenhe@il.ibm.com> Signed-off-by: NeilBrown <neilb@suse.de>
2009-12-14	md: add MODULE_DESCRIPTION for all md related modules.	NeilBrown
	Suggested by Oren Held <orenhe@il.ibm.com> Signed-off-by: NeilBrown <neilb@suse.de>
2009-12-14	raid: improve MD/raid10 handling of correctable read errors.	Robert Becker
	We've noticed severe lasting performance degradation of our raid arrays when we have drives that yield large amounts of media errors. The raid10 module will queue each failed read for retry, and also will attempt to call fix_read_error() to perform the read recovery. Read recovery is performed while the array is frozen, so repeated recovery attempts can degrade the performance of the array for extended periods of time. With this patch I propose adding a per md device max number of corrected read attempts. Each rdev will maintain a count of read correction attempts in the rdev->read_errors field (not used currently for raid10). When we enter fix_read_error() we'll check to see when the last read error occurred, and divide the read error count by 2 for every hour since the last read error. If at that point our read error count exceeds the read error threshold, we'll fail the raid device. In addition in this patch I add sysfs nodes (get/set) for the per md max_read_errors attribute, the rdev->read_errors attribute, and added some printk's to indicate when fix_read_error fails to repair an rdev. For testing I used debugfs->fail_make_request to inject IO errors to the rdev while doing IO to the raid array. Signed-off-by: Robert Becker <Rob.Becker@riverbed.com> Signed-off-by: NeilBrown <neilb@suse.de>
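The hourly halving of the error count can be sketched as follows. This is a simplified model, not the raid10 code: rdev_model and note_read_error are hypothetical, time is passed in as seconds that are assumed never to run backwards, and counting the new error after the decay step is an assumption about ordering.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device state for this sketch. */
struct rdev_model {
    unsigned int read_errors;  /* decayed count of corrected read errors */
    uint64_t last_error_sec;   /* time of the previous read error, in seconds */
};

/* Record a read error at now_sec. The stored count is halved once for each
 * full hour since the previous error, then the new error is added.
 * Returns true when the count exceeds max_read_errors, meaning the device
 * should be failed. */
bool note_read_error(struct rdev_model *rdev, uint64_t now_sec,
                     unsigned int max_read_errors)
{
    uint64_t hours = (now_sec - rdev->last_error_sec) / 3600;

    if (hours >= 8 * sizeof(rdev->read_errors))
        rdev->read_errors = 0;       /* every bit would be shifted out */
    else
        rdev->read_errors >>= hours; /* divide by 2 per elapsed hour */

    rdev->last_error_sec = now_sec;
    rdev->read_errors++;
    return rdev->read_errors > max_read_errors;
}
```

The decay means that a drive with occasional errors spread over days stays in the array, while a burst of errors within a short window crosses the threshold.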
2009-12-14	md/raid10: print more useful messages on device failure.	Robert Becker
	When we get a read error on a device in a RAID10, and attempting to repair the error fails, print more useful messages about why it failed. Signed-off-by: Robert Becker <Rob.Becker@riverbed.com> Signed-off-by: NeilBrown <neilb@suse.de>
2009-12-14	md/bitmap: update dirty flag when bitmap bits are explicitly set.	NeilBrown
	There is a sysfs file which allows bits in the write-intent bitmap to be explicitly set - indicating that the block is thought to be 'dirty'. When this happens we should really set recovery_cp backwards to include the block to reflect this dirtiness. In particular, a 'resync' process will refuse to start if recovery_cp is beyond the end of the array, so this is needed to allow a resync to be triggered. Signed-off-by: NeilBrown <neilb@suse.de>
2009-12-14	md: Support write-intent bitmaps with externally managed metadata.	NeilBrown
	In this case, the metadata needs to not be in the same sector as the bitmap. md will not read/write any bitmap metadata. Config must be done via sysfs and when a recovery makes the array non-degraded again, writing 'true' to 'bitmap/can_clear' will allow bits in the bitmap to be cleared again. Signed-off-by: NeilBrown <neilb@suse.de>
2009-12-14	md/bitmap: move setting of daemon_lastrun out of bitmap_read_sb	NeilBrown
	Setting daemon_lastrun really has nothing to do with reading the bitmap superblock, it just happens to be needed at the same time. bitmap_read_sb is about to become optional, so move that code out to after the call to bitmap_read_sb. Signed-off-by: NeilBrown <neilb@suse.de>
2009-12-14	md: support updating bitmap parameters via sysfs.	NeilBrown
	A new attribute directory 'bitmap' in 'md' is created which contains files for configuring the bitmap. 'location' identifies where the bitmap is, either 'none', or 'file' or 'sector offset from metadata'. Writing 'location' can create or remove a bitmap. Adding a 'file' bitmap this way is not yet supported. 'chunksize' and 'time_base' must be set before 'location' can be set. 'chunksize' can be set before creating a bitmap, but is currently always over-ridden by the bitmap superblock. 'time_base' and 'backlog' can be updated at any time. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Andre Noll <maan@systemlinux.org>
2009-12-14	md: factor out parsing of fixed-point numbers	NeilBrown
	safe_delay_store can parse fixed point numbers (for fractions of a second). We will want to do that for another sysfs file soon, so factor out the code. Signed-off-by: NeilBrown <neilb@suse.de>
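Parsing a fixed-point value such as "1.25" into a scaled integer can be sketched like this. It is a user-space illustration and not the helper the patch factors out: parse_scaled is a hypothetical name, overflow is not checked, and fractional digits beyond the requested scale are simply truncated.

```c
#include <ctype.h>

/* Parse a non-negative decimal string into an integer scaled by 10^scale,
 * for example "1.25" with scale 3 gives 1250. One trailing newline is
 * accepted, as is typical for sysfs writes.
 * Returns 0 on success and -1 on malformed input. */
int parse_scaled(const char *s, unsigned long *res, int scale)
{
    unsigned long value = 0;
    int decimals = -1;  /* -1 until a '.' has been seen */
    int digits = 0;

    for (; isdigit((unsigned char)*s) || (*s == '.' && decimals < 0); s++) {
        if (*s == '.') {
            decimals = 0;
            continue;
        }
        digits++;
        if (decimals >= 0) {
            if (decimals >= scale)
                continue;  /* beyond the requested precision: drop the digit */
            decimals++;
        }
        value = value * 10 + (unsigned long)(*s - '0');
    }
    if (*s == '\n')
        s++;
    if (*s != '\0' || digits == 0)
        return -1;
    if (decimals < 0)
        decimals = 0;
    for (; decimals < scale; decimals++)
        value *= 10;  /* pad missing fractional digits with zeros */
    *res = value;
    return 0;
}
```

With scale 3 the result is in thousandths, so a delay written as "1.25" seconds becomes 1250 milliseconds without any floating-point arithmetic.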
2009-12-14	md: support bitmap offset appropriate for external-metadata arrays.	NeilBrown
	For md arrays where metadata is managed externally, the kernel does not know about a superblock so the superblock offset is 0. If we want to have a write-intent-bitmap near the end of the devices of such an array, we should support sector_t sized offset. We need the offset to be possibly negative for when the bitmap is before the metadata, so use loff_t instead. Also add a sanity check that the bitmap does not overlap with data. Signed-off-by: NeilBrown <neilb@suse.de>
2009-12-14	md: remove needless setting of thread->timeout in raid10_quiesce	NeilBrown
	As bitmap_create and bitmap_destroy already set thread->timeout as appropriate, there is no need to do it in raid10_quiesce. There is a possible need to wake the thread after the timeout has been set low, but it is better to do that where the timeout is actually set low, in bitmap_create. Signed-off-by: NeilBrown <neilb@suse.de>
2009-12-14	md: change daemon_sleep to be in 'jiffies' rather than 'seconds'.	NeilBrown
	This removes a lot of multiplications by HZ. Signed-off-by: NeilBrown <neilb@suse.de>
2009-12-14	md: move offset, daemon_sleep and chunksize out of bitmap structure	NeilBrown
	... and into bitmap_info. These are all configuration parameters that need to be set before the bitmap is created. Signed-off-by: NeilBrown <neilb@suse.de>
2009-12-14	md: collect bitmap-specific fields into one structure.	NeilBrown
	In preparation for making bitmap fields configurable via sysfs, start tidying up by making a single structure to contain the configuration fields. Signed-off-by: NeilBrown <neilb@suse.de>
2009-12-14	md/raid1: add takeover support for raid5->raid1	NeilBrown
	A 2-device raid5 array can now be converted to raid1. Signed-off-by: NeilBrown <neilb@suse.de>
2009-12-14	md: add honouring of suspend_{lo,hi} to raid1.	NeilBrown
	This will allow us to stop writeout to portions of the array while they are resynced by someone else - e.g. another node in a cluster. Signed-off-by: NeilBrown <neilb@suse.de>
2009-12-14	md/raid5: don't complete make_request on barrier until writes are scheduled	NeilBrown
	The post-barrier-flush is sent by md as soon as make_request on the barrier write completes. For raid5, the data might not be in the per-device queues yet. So for barrier requests, wait for any pre-reading to be done so that the request will be in the per-device queues. We use the 'preread_active' count to check that nothing is still in the preread phase, and delay the decrement of this count until after write requests have been submitted to the underlying devices. Signed-off-by: NeilBrown <neilb@suse.de>
2009-12-14	md: support barrier requests on all personalities.	NeilBrown
	Previously barriers were only supported on RAID1. This is because other levels require synchronisation across all devices and so needed a different approach. Here is that approach. When a barrier arrives, we send a zero-length barrier to every active device. When that completes - and if the original request was not empty - we submit the barrier request itself (with the barrier flag cleared) and then submit a fresh load of zero length barriers. The barrier request itself is asynchronous, but any subsequent request will block until the barrier completes. The reason for clearing the barrier flag is that a barrier request is allowed to fail. If we pass a non-empty barrier through a striping raid level it is conceivable that part of it could succeed and part could fail. That would be way too hard to deal with. So if the first run of zero length barriers succeed, we assume all is sufficiently well that we send the request and ignore errors in the second run of barriers. RAID5 needs extra care as write requests may not have been submitted to the underlying devices yet. So we flush the stripe cache before proceeding with the barrier. Note that the second set of zero-length barriers are submitted immediately after the original request is submitted. Thus when a personality finds mddev->barrier to be set during make_request, it should not return from make_request until the corresponding per-device request(s) have been queued. That will be done in later patches. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Andre Noll <maan@systemlinux.org>
2009-12-14	md: don't reset curr_resync_completed after an interrupted resync	NeilBrown
	If a resync/recovery/check/repair is interrupted for some reason, it can be useful to know exactly where it got up to. So in that case, do not clear curr_resync_completed. Initialise it when starting a resync/recovery/... instead. Signed-off-by: NeilBrown <neilb@suse.de>