linux.git - Linus' kernel tree

Age	Commit message	Author
2009-06-17	sky2: reduce default transmit ring	Stephen Hemminger
	Reduce the size of the driver transmit ring to reduce latency and allow qdisc to do better rate control. Also make it obvious what the minimum transmit ring allowed is and why. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-17	sky2: receive counter update	Stephen Hemminger
	Since it is likely that there are multiple packets received per interrupt, only update the receive counters once after all packets are processed. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-17	sky2: fix shutdown synchronization	Stephen Hemminger
	The logic in sky2_down was incorrect. Receiver could report status after rx_stop was called. The steps need to be: * stop new frames from being transmitted * shut off transmit/receive logic * synchronize with NAPI to process status info about transmitter and receiver Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-17	sky2: PCI irq issues	Stephen Hemminger
	Add some reads to avoid any PCI posting issues when controlling IRQs. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-17	sky2: more receive shutdown	Stephen Hemminger
	Reset more parts of the receive path when the device is taken offline. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-17	sky2: turn off pause during shutdown	Stephen Hemminger
	This unblocks the chip if it is stuck in pause cycle during shutdown. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-17	r8169: do not bring device down when suspending	françois romieu
	Stopping all activity through ChipCmd and blindly acking the irqs is neither nice nor completely needed: the transition to low-power mode does enough work and it apparently keeps the device in a sane state. Patch suggested by a fix for http://bugzilla.kernel.org/show_bug.cgi?id=9512 The rtl_shutdown path is kept unchanged so far. Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Tested-by: Anders Eriksson <aeriksson@fastmail.fm> Cc: Edward Hsu <edward_hsu@realtek.com.tw> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-17	sis190: use an adequate phy list entry as a fallback	françois romieu
	When sis190 driver is trying to get default phy, if it doesn't find home or lan phy, it falls back to the first phy in the phy list but list_entry() points to a bogus entry. list_first_entry() should be used instead. Signed-off-by: Arnaud Patard <apatard@mandriva.com> Acked-off-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-17	net/ucc_geth: Add SGMII support for UCC GETH driver	Haiying Wang
	-- derived from reverted commit 047584ce94108012288554a5f84585d792cc7f8f -- reworked by Grant Likely to play nice with commit: "net: Rework ucc_geth driver to use of_mdio infrastructure" (0b9da337dca972e7a4144e298ec3adb8f244d4a4) Signed-off-by: Haiying Wang <Haiying.Wang@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-17	Revert "net/ucc_geth: Add SGMII support for UEC GETH driver"	Grant Likely
	This reverts commit 047584ce94108012288554a5f84585d792cc7f8f. This patch meshes badly with "net: Rework ucc_geth driver to use of_mdio infrastructure" (0b9da337dca972e7a4144e298ec3adb8f244d4a4). Since most of the patch needs to be reworked, it is clearer to revert the patch and then apply the corrected version. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-17	skbuff: don't corrupt mac_header on skb expansion	Stephen Hemminger
	The skb mac_header field is sometimes NULL (or ~0u) as a sentinel value. The places where skb is expanded add an offset which would change this flag into an invalid pointer (or offset). Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
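The failure mode described above can be sketched in a few lines. This is an illustration in Python, not kernel code: it assumes a 32-bit offset field that uses ~0u as the "not set" sentinel, and the names `MAC_HEADER_UNSET`, `shift_naive`, and `shift_guarded` are hypothetical.

```python
U32_MASK = 0xFFFFFFFF
MAC_HEADER_UNSET = U32_MASK  # assumed sentinel: ~0u means "mac header not set"


def shift_naive(mac_header: int, delta: int) -> int:
    """Add the expansion offset unconditionally (the problematic behaviour)."""
    return (mac_header + delta) & U32_MASK


def shift_guarded(mac_header: int, delta: int) -> int:
    """Add the expansion offset only when the field holds a real offset."""
    if mac_header == MAC_HEADER_UNSET:
        return mac_header
    return (mac_header + delta) & U32_MASK


if __name__ == "__main__":
    # An unset header wraps around into a plausible-looking offset.
    print(hex(shift_naive(MAC_HEADER_UNSET, 64)))
    # The guarded version leaves the sentinel untouched.
    print(hex(shift_guarded(MAC_HEADER_UNSET, 64)))
```

With the naive version the sentinel wraps around to a small offset, so a later "was it set?" check can no longer tell that the field was never initialized.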
2009-06-17	skbuff: skb_mac_header_was_set is always true on >32 bit	Stephen Hemminger
	Looking at the crash in log_martians(), one suspect is that the check for mac header being set is not correct. The value of mac_header defaults to 0 on allocation, therefore skb_mac_header_was_set will always be true on platforms using NET_SKBUFF_USES_OFFSET. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-18	Merge commit 'gcl/merge' into next	Benjamin Herrenschmidt
	Manual merge of: drivers/net/fec_mpc52xx.c
2009-06-18	Merge commit 'origin/master' into next	Benjamin Herrenschmidt

2009-06-17	nfs: remove unnecessary NFS_INO_INVALID_ACL checks	James Morris
	Unless I'm mistaken, NFS_INO_INVALID_ACL is being checked twice during getacl calls (i.e. first via nfs_revalidate_inode() and then by each call site). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17	NFS: More "sloppy" parsing problems	Chuck Lever
	Specifying "port=-5" with the kernel's current mount option parser generates "unrecognized mount option". If "sloppy" is set, this causes the mount to succeed and use the default values; the desired behavior is that, since this is a valid option with an invalid value, the mount should fail, even with "sloppy." To properly handle "sloppy" parsing, we need to distinguish between correct options with invalid values, and incorrect options. We will need to parse integer values by hand, therefore, and not rely on match_token(). For instance, these must all fail with "invalid value": port=12345678 port=-5 port=samuel and not with "unrecognized option," as they do currently. Thus, for the sake of match_token() we need to treat the values for these options as strings, and do the conversion to integers using strict_strtol(). This is basically the same solution we used for the earlier "retry=" fix (commit ecbb3845), except in this case the kernel actually has to parse the value, rather than ignore it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
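The distinction drawn above, between a known option with a bad value and an unknown option, can be sketched as follows. This is a simplified Python model, not the kernel's parser: `INTEGER_OPTIONS`, `parse_option`, and `parse_mount_options` are hypothetical names, and the 0 to 65535 range for "port" is an assumption made for the example.

```python
from enum import Enum


class ParseResult(Enum):
    OK = "ok"
    INVALID_VALUE = "invalid value"
    UNRECOGNIZED = "unrecognized option"


# Hypothetical table: option name -> inclusive (min, max) range.
INTEGER_OPTIONS = {"port": (0, 65535)}


def parse_option(text: str, parsed: dict) -> ParseResult:
    """Classify one "name=value" option and store it in `parsed` if valid."""
    name, _, value = text.partition("=")
    if name not in INTEGER_OPTIONS:
        return ParseResult.UNRECOGNIZED
    # Treat the value as a string first, then convert strictly:
    # only ASCII digits are accepted, so "-5" and "samuel" stop here.
    if not (value.isascii() and value.isdigit()):
        return ParseResult.INVALID_VALUE
    number = int(value)
    low, high = INTEGER_OPTIONS[name]
    if not low <= number <= high:
        return ParseResult.INVALID_VALUE
    parsed[name] = number
    return ParseResult.OK


def parse_mount_options(options: str, sloppy: bool = False) -> dict:
    """Parse a comma-separated option string; "sloppy" forgives only unknown names."""
    parsed = {}
    for text in options.split(","):
        result = parse_option(text, parsed)
        if result is ParseResult.INVALID_VALUE:
            raise ValueError(f"invalid value: {text}")
        if result is ParseResult.UNRECOGNIZED and not sloppy:
            raise ValueError(f"unrecognized option: {text}")
    return parsed
```

In this model `parse_mount_options("bogus=1,port=2049", sloppy=True)` skips the unknown option, while `port=-5` raises regardless of the `sloppy` flag.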
2009-06-17	NFS: Invalid mount option values should always fail, even with "sloppy"	Chuck Lever
	Ian Kent reports: "I've noticed a couple of other regressions with the options vers and proto option of mount.nfs(8). The commands: mount -t nfs -o vers=<invalid version> <server>:/<path> /<mountpoint> mount -t nfs -o proto=<invalid proto> <server>:/<path> /<mountpoint> both immediately fail. But if the "-s" option is also used they both succeed with the mount falling back to defaults (by the look of it). In the past these failed even when the sloppy option was given, as I think they should. I believe the sloppy option is meant to allow the mount command to still function for mount options (for example in shared autofs maps) that exist on other Unix implementations but aren't present in the Linux mount.nfs(8). So, an invalid value specified for a known mount option is different to an unknown mount option and should fail appropriately." See RH bugzilla 486266. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17	NFS: Remove unused XDR decoder functions	Chuck Lever
	Clean up: Remove xdr_decode_fhstatus() and xdr_decode_fhstatus3(), now that they are unused. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17	NFS: Update MNT and MNT3 reply decoding functions	Chuck Lever
	Solder xdr_stream-based XDR decoding functions into the in-kernel mountd client that are more careful about checking data types and watching for buffer overflows. The new MNT3 decoder includes support for auth-flavor list decoding. The "_sz" macro for MNT3 replies was missing the size of the file handle. I've added this back, and included the size of the auth flavor array. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17	NFS: add XDR decoder for mountd version 3 auth-flavor lists	Chuck Lever
	Introduce an xdr_stream-based XDR decoder that can unpack the auth- flavor list returned in a MNT3 reply. The nfs_mount() function's caller allocates an array, and passes the size and a pointer to it. The decoder decodes all the flavors it can into the array, and returns the number of decoded flavors. If the caller is not interested in the auth flavors, it can pass a value of zero as the size of the pre-allocated array. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
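The calling convention described above (caller supplies the storage and its size, decoder returns how many flavors it stored) might look like this in outline. The sketch is in Python and is not the kernel decoder: `decode_auth_flavors` is a hypothetical name, and the layout assumed is the standard XDR variable-length array, a big-endian 32-bit count followed by that many 32-bit values.

```python
import struct


def decode_auth_flavors(buf: bytes, offset: int, flavors: list, capacity: int) -> int:
    """Decode an XDR array of uint32 auth flavors starting at `offset`.

    Appends at most `capacity` flavors to `flavors` and returns how many
    were stored. Raises ValueError if the buffer is too short for the
    count it advertises.
    """
    if offset + 4 > len(buf):
        raise ValueError("truncated flavor count")
    (count,) = struct.unpack_from(">I", buf, offset)
    offset += 4
    # Check the whole list against the buffer before reading any element.
    if offset + 4 * count > len(buf):
        raise ValueError("flavor list overruns buffer")
    wanted = min(count, capacity)
    for i in range(wanted):
        (flavor,) = struct.unpack_from(">I", buf, offset + 4 * i)
        flavors.append(flavor)
    return wanted
```

A caller that does not care about the flavors passes a capacity of zero; the list is still bounds-checked but nothing is stored.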
2009-06-17	NFS: add new file handle decoders to in-kernel mountd client	Chuck Lever
	Introduce xdr_stream-based XDR file handle decoders to the in-kernel mountd client. These are more careful than the existing decoder functions about buffer overflows and data type and range checking. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17	NFS: Add separate mountd status code decoders for each mountd version	Chuck Lever
	Introduce data structures and xdr_stream-based decoding functions for unmarshalling mountd status codes properly. Mountd version 3 uses specific standard error return codes that are not errno values and not NFS3ERR_ values. These have a well-defined standard mapping to local errno values. Introduce data structures and a decoder function that map these status codes to local errno values properly. This is new functionality (but not used yet). Version 1 mountd status values are defined by RFC 1094 as UNIX error values (errno values). Errno values on heterogeneous systems do not necessarily match each other. To avoid exposing possibly incorrect errno values to upper layers, the current XDR decoder converts all non-zero MNT version 1 status codes to -EACCES. The OpenGroup XNFS standard provides a mapping similar to but smaller than the version 3 error codes. Implement a decoder that uses the XNFS error codes, replacing the current decoder. For both mountd protocol versions, map unrecognized errors to -EACCES. Finally we introduce a replacement data structure for mnt_fhstatus at this time, which is used by the new XDR decoders. In addition to documenting that the status value returned by the XDR decoders is always an errno, this new structure will be expanded in subsequent patches. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
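A table-driven mapping of the kind described above can be sketched as follows. This is an illustrative Python version, not the kernel's table: the status numbers are the MNT3 values from RFC 1813, the errno pairing is an assumption made for the example, and MNT3ERR_NOTSUPP and MNT3ERR_SERVERFAULT are left out for brevity. `mnt3_status_to_errno` is a hypothetical name.

```python
import errno

# MNT3 status code (RFC 1813, subset) -> local errno value.
# The pairing below is an assumption for illustration.
MNT3_ERRNO = {
    0: 0,                    # MNT3_OK
    1: errno.EPERM,          # MNT3ERR_PERM
    2: errno.ENOENT,         # MNT3ERR_NOENT
    5: errno.EIO,            # MNT3ERR_IO
    13: errno.EACCES,        # MNT3ERR_ACCES
    20: errno.ENOTDIR,       # MNT3ERR_NOTDIR
    22: errno.EINVAL,        # MNT3ERR_INVAL
    63: errno.ENAMETOOLONG,  # MNT3ERR_NAMETOOLONG
}


def mnt3_status_to_errno(status: int) -> int:
    """Return 0 on success or a negative errno; unknown codes become -EACCES."""
    return -MNT3_ERRNO.get(status, errno.EACCES)
```

Mapping unrecognized codes to -EACCES keeps callers from ever seeing a raw wire value that happens to collide with an unrelated local errno.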
2009-06-17	NFS: remove unused function in fs/nfs/mount_clnt.c	Chuck Lever
	Clean up: remove xdr_encode_dirpath() now that it has been replaced. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17	NFS: Use xdr_stream-based XDR encoder for MNT's dirpath argument	Chuck Lever
	Check the length of the supplied dirpath, and see that it fits properly in the RPC buffer. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
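The two checks mentioned above (path length, then fit within the remaining buffer) can be sketched for an XDR string, which is a big-endian 32-bit length followed by the bytes, zero-padded to a multiple of four. This is a Python illustration, not the kernel encoder: `encode_dirpath` and `space_left` are hypothetical names, and `MNTPATHLEN` uses the 1024-byte limit from the MNT protocol definition.

```python
import struct

MNTPATHLEN = 1024  # maximum dirpath length in the MNT protocol


def encode_dirpath(dirpath: bytes, space_left: int) -> bytes:
    """Encode `dirpath` as an XDR string, or raise if it cannot be sent."""
    length = len(dirpath)
    if length > MNTPATHLEN:
        raise ValueError("dirpath too long")
    padding = -length % 4  # bytes needed to reach a 4-byte boundary
    needed = 4 + length + padding
    if needed > space_left:
        raise ValueError("dirpath does not fit in the RPC buffer")
    return struct.pack(">I", length) + dirpath + b"\x00" * padding
```

Checking `needed` rather than `length` matters because the length word and the padding also consume buffer space.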
2009-06-17	NFS: Clean up MNT program definitions	Chuck Lever
	Clean up: Relocate MNT program procedure number definitions to the only file that uses them. Relocate the version number definitions, which are shared, to nfs.h. Remove duplicate program number definitions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17	lockd: Don't bother with RPC ping for NSM upcalls	Chuck Lever
	Cut NSM upcall RPC traffic in half -- don't do a NULL call first. The cases where a ping would be helpful are rare. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17	lockd: Update NSM state from SM_MON replies	Chuck Lever
	When rpc.statd starts up in user space at boot time, it attempts to write the latest NSM local state number into /proc/sys/fs/nfs/nsm_local_state. If lockd.ko isn't loaded yet (as is the case in most configurations), that file doesn't exist, thus the kernel's NSM state remains set to its initial value of zero during lockd operation. This is a problem because rpc.statd and lockd use the NSM state number to prevent repeated lock recovery on rebooted hosts. If lockd sends a zero NSM state, but then a delayed SM_NOTIFY with a real NSM state number is received, there is no way for lockd or rpc.statd to distinguish that stale SM_NOTIFY from an actual reboot. Thus lock recovery could be performed after the rebooted host has already started reclaiming locks, and those locks will be lost. We could change /etc/init.d/nfslock so it always modprobes lockd.ko before starting rpc.statd. However, if lockd.ko is ever unloaded and reloaded, we are back at square one, since the NSM state is not preserved across an unload/reload cycle. This may happen frequently on clients that use automounter. A period of NFS inactivity causes lockd.ko to be unloaded, and the kernel loses its NSM state setting. Instead, let's use the fact that rpc.statd plants the local system's NSM state in every SM_MON (and SM_UNMON) reply. lockd performs a synchronous SM_MON upcall to the local rpc.statd _before_ sending its first NLM request to a new remote. This would permit rpc.statd to provide the current NSM state to lockd, even after lockd.ko had been unloaded and reloaded. Note that NLMPROC_LOCK arguments are constructed before the nsm_monitor() call, so we have to rearrange argument construction very slightly to make this all work out. And, the kernel appears to treat NSM state as a u32 (see struct nlm_args and nsm_res). Make nsm_local_state a u32 as well, to ensure we don't get bogus comparison results. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17	NFS: Fix false error return from nfs_callback_up() if ipv6.ko is not available	Chuck Lever
	Clear "ret" if the error return from svc_create_xprt(AF_INET6) was -EAFNOSUPPORT. Otherwise, callback start-up will succeed, but nfs_callback_up() will return -EAFNOSUPPORT anyway, and the first NFSv4 mount attempt after a reboot will fail. Bug introduced by commit f738f517 in 2.6.30-rc1. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17	NFS: Return error code from nfs_callback_up() to user space	Chuck Lever
	If the kernel cannot start the NFSv4 callback service during a mount request, it returns -ENOMEM to user space, resulting in this message: mount.nfs4: Cannot allocate memory Adjust nfs_alloc_client() and nfs_get_client() to pass NFSv4 callback start-up errors back to user space so a less mysterious error message can be displayed by the mount command. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17	NFS: Do not display the setting of the "intr" mount option	Chuck Lever
	The "intr" mount option has been deprecated for a while, but /proc/mounts continues to display "nointr" whether "intr" or "nointr" has been specified for a mount point. Since these options do not have any effect, simply do not display them. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17	NFS: add support for splice writes	Suresh Jayaraman
	Adds support for splice writes. It effectively calls generic_file_splice_write() to do the writes. We need not worry about O_APPEND case as the combination of splice() writes and O_APPEND is disallowed. This patch propagates NFS write errors back to the caller. The number of bytes written via splice are being added to NFSIO_NORMALWRITTENBYTES as these are effectively cached writes. Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17	Merge commit 'linux-pnfs/nfs41-for-2.6.31' into nfsv41-for-2.6.31	Trond Myklebust

2009-06-17	jbd2: clean up jbd2_journal_try_to_free_buffers()	Hisashi Hifumi
	This patch reverts 3f31fddf, which is no longer needed because if a race between freeing buffer and committing transaction functionality occurs and dio gets error, currently dio falls back to buffered IO due to the commit 6ccfa806. Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Cc: Mingming Cao <cmm@us.ibm.com> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2009-06-18	md/raid5: correctly update sync_completed when we reach max_resync	NeilBrown
	At the end of reshape_request we update curr_resync_completed if we are about to pause due to reaching resync_max. However we update it to the wrong value. We need to add the "reshape_sectors" that have just been reshaped. Signed-off-by: NeilBrown <neilb@suse.de>
2009-06-18	md/raid5: add missing call to schedule() after prepare_to_wait()	Dan Williams
	In the unlikely event that reshape progresses past the current request while it is waiting for a stripe we need to schedule() before retrying for 2 reasons: 1/ Prevent list corruption from duplicated list_add() calls without intervening list_del(). 2/ Give the reshape code a chance to make some progress to resolve the conflict. Cc: <stable@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
2009-06-18	md/linear: use call_rcu to free obsolete 'conf' structures.	NeilBrown
	Currently, when we update the 'conf' structure while adding a drive to a linear array, we keep the old version around until the array is finally stopped, as it is not safe to free it immediately. Now that we have rcu protection on all accesses to 'conf', we can use call_rcu to free it more promptly. Signed-off-by: NeilBrown <neilb@suse.de>
2009-06-18	md linear: Protecting mddev with rcu locks to avoid races	SandeepKsinha
	Due to the lack of memory ordering guarantees, we may have races around mddev->conf. In particular, the correct contents of the structure we get from dereferencing ->private might not be visible to this CPU yet, and they might not be correct w.r.t mddev->raid_disks. This patch addresses the problem using rcu protection to avoid such race conditions. Signed-off-by: SandeepKsinha <sandeepksinha@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
2009-06-18	md: Move check for bitmap presence to personality code.	Andre Noll
	If the superblock of a component device indicates the presence of a bitmap but the corresponding raid personality does not support bitmaps (raid0, linear, multipath, faulty), then something is seriously wrong and we'd better refuse to run such an array. Currently, this check is performed while the superblocks are examined, i.e. before entering personality code. Therefore the generic md layer must know which raid levels support bitmaps and which do not. This patch avoids this layer violation without adding identical code to various personalities. This is accomplished by introducing a new public function to md.c, md_check_no_bitmap(), which replaces the hard-coded checks in the superblock loading functions. A call to md_check_no_bitmap() is added to the ->run method of each personality which does not support bitmaps and assembly is aborted if at least one component device contains a bitmap. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
2009-06-18	md: remove chunksize rounding from common code.	NeilBrown
	It is easiest to round sizes to multiples of chunk size in the personality code for those personalities which care. Those personalities now do the rounding, so we can remove that function from common code. Also remove the upper bound on the size of a chunk, and the lower bound on the size of a device (1 chunk), neither of which really buy us anything. Signed-off-by: NeilBrown <neilb@suse.de>
2009-06-18	md: raid0/linear: ensure device sizes are rounded to chunk size.	NeilBrown
	This is currently ensured by common code, but it is more reliable to ensure it where it is needed in personality code. All the other personalities that care already round the size to the chunk_size. raid0 and linear are the only hold-outs. Signed-off-by: NeilBrown <neilb@suse.de>
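The rounding referred to above is simple arithmetic, sketched here in Python. The sketch assumes sizes are counted in sectors and rounded down to a whole number of chunks; `round_down_to_chunk` is a hypothetical name, not a kernel function.

```python
def round_down_to_chunk(dev_sectors: int, chunk_sectors: int) -> int:
    """Return the largest multiple of `chunk_sectors` not exceeding `dev_sectors`."""
    if chunk_sectors <= 0:
        raise ValueError("chunk size must be positive")
    # Modulo works for any chunk size, not only powers of two.
    return dev_sectors - (dev_sectors % chunk_sectors)
```

For example, a 1000-sector device with 128-sector chunks contributes 896 sectors; the trailing partial chunk is unused.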
2009-06-18	md: move assignment of ->utime so that it never gets skipped.	NeilBrown
	Currently the assignment to utime gets skipped for 'external' metadata. So move it to the top of the function so that it always gets effected. This is of largely cosmetic interest. Nothing actually depends on ->utime being right for external arrays. "mdadm --monitor" does use it for 0.90 and 1.x arrays, but with mdadm-3.0, this is not important for external metadata. Signed-off-by: NeilBrown <neilb@suse.de>
2009-06-18	md: Push down reconstruction log message to personality code.	Andre Noll
	Currently, the md layer checks in analyze_sbs() if the raid level supports reconstruction (mddev->level >= 1) and if reconstruction is in progress (mddev->recovery_cp != MaxSector). Move that printk into the personality code of those raid levels that care (levels 1, 4, 5, 6, 10). Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
2009-06-18	md: merge reconfig and check_reshape methods.	NeilBrown
	The difference between these two methods is artificial. Both check that a pending reshape is valid, and perform any aspect of it that can be done immediately. 'reconfig' handles chunk size and layout. 'check_reshape' handles raid_disks. So make them just one method. Signed-off-by: NeilBrown <neilb@suse.de>
2009-06-18	md: remove unnecessary arguments from ->reconfig method.	NeilBrown
	Passing the new layout and chunksize as args is not necessary as the mddev has fields for new_chunk and new_layout. This is preparation for combining the check_reshape and reconfig methods. Signed-off-by: NeilBrown <neilb@suse.de>
2009-06-18	md: raid5: check stripe cache is large enough in start_reshape	NeilBrown
	In reshape cases that do not change the number of devices, start_reshape is called without first calling check_reshape. Currently, the check that the stripe_cache is large enough is only done in check_reshape. It should be in start_reshape too. Signed-off-by: NeilBrown <neilb@suse.de>
2009-06-18	md: raid0: chunk_sectors cleanups.	NeilBrown
	Following the conversion to chunk_sectors, there is room for cleaning up a little. Signed-off-by: NeilBrown <neilb@suse.de>
2009-06-18	md: fix some comments.	Andre Noll
	1/ Raid5 has learned to take over also raid4 and raid6 arrays. 2/ new_chunk in mdp_superblock_1 is in sectors, not bytes. Signed-off-by: NeilBrown <neilb@suse.de>
2009-06-18	md/raid5: Use is_power_of_2() in raid5_reconfig()/raid6_reconfig().	Andre Noll
	Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
2009-06-18	md: convert conf->chunk_size and conf->prev_chunk to sectors.	Andre Noll
	This kills some more shifts. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
2009-06-18	md: Convert mddev->new_chunk to sectors.	Andre Noll
	A straight-forward conversion which gets rid of some multiplications/divisions/shifts. The patch also introduces a couple of new ones, most of which are due to conf->chunk_size still being represented in bytes. This will be cleaned up in subsequent patches. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>