summaryrefslogtreecommitdiff
path: root/net/rds
AgeCommit message (Collapse)Author
2009-07-20RDS: Fix completion notifications on blocking socketsAndy Grover
Completion or congestion notifications were not being checked if the socket went to sleep. This patch fixes that. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20RDS/IB: Drop connection when a fatal QP event is receivedAndy Grover
Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20RDS/IB: Disable flow control in sysctl and explain whyAndy Grover
Backwards compatibility with rds 3.0 causes protocol- based flow control to be disabled as a side-effect. I don't want to pull out FC support from the IB transport but I do want to document and keep the sysctl consistent if possible. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20RDS/IB: Move tx/rx ring init and refill to laterAndy Grover
Since RDS 3.0 and 3.1 have different packet formats, we need to wait until after protocol negotiation is complete to layout the rx buffers. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20RDS: Don't set c_version in __rds_conn_create()Andy Grover
Protocol negotiation is logically a property of the transports, so rds core need not set it. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20RDS/IB: Rename byte_len to data_len to enhance readabilityAndy Grover
Of course len is in bytes. Calling it data_len hopefully indicates a little better what the variable is actually for. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20RDS/RDMA: Fix cut-n-paste errors in printks in rdma_transport.cAndy Grover
Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20RDS/IB: Fix printk to indicate remote IP, not localAndy Grover
Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20RDS/IB: Handle connections using RDS 3.0 wire protocolAndy Grover
The big differences between RDS 3.0 and 3.1 are protocol-level flow control, and with 3.1 the header is in front of the data. The header always ends up in the header buffer, and the data goes in the data page. In 3.0 our "header" is a trailer, and will end up either in the data page, the header buffer, or split across the two. Since 3.1 is backwards- compatible with 3.0, we need to continue to support these cases. This patch does that -- if using RDS 3.0 wire protocol, it will copy the header from wherever it ended up into the header buffer. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20RDS/IB: Improve RDS protocol version checkingAndy Grover
RDS on IB uses privdata to do protocol version negotiation. Apparently the IB stack will return a larger privdata buffer than the struct we were expecting. Just to be extra-sure, this patch adds some checks in this area. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20RDS: Set retry_count to 2 and make modifiable via modparamAndy Grover
This will be default cause IB connections to failover faster, but allow a longer retry count to be used if desired. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-24percpu: use DEFINE_PER_CPU_SHARED_ALIGNED()Tejun Heo
There are a few places where ___cacheline_aligned* is used with DEFINE_PER_CPU(). Use DEFINE_PER_CPU_SHARED_ALIGNED() instead. DEFINE_PER_CPU_SHARED_ALIGNED() applies alignment only on SMPs. While all other converted places used _in_smp variant or only get compiled for SMP, net/rds used unconditional ____cacheline_aligned. I don't see any reason these data structures should be aligned on UP and thus converted together. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Andy Grover <andy.grover@oracle.com>
2009-05-18Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/scsi/fcoe/fcoe.c
2009-04-21FRV: Fix the section attribute on UP DECLARE_PER_CPU()David Howells
In non-SMP mode, the variable section attribute specified by DECLARE_PER_CPU() does not agree with that specified by DEFINE_PER_CPU(). This means that architectures that have a small data section references relative to a base register may throw up linkage errors due to too great a displacement between where the base register points and the per-CPU variable. On FRV, the .h declaration says that the variable is in the .sdata section, but the .c definition says it's actually in the .data section. The linker throws up the following errors: kernel/built-in.o: In function `release_task': kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o To fix this, DECLARE_PER_CPU() should simply apply the same section attribute as does DEFINE_PER_CPU(). However, this is made slightly more complex by virtue of the fact that there are several variants on DEFINE, so these need to be matched by variants on DECLARE. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-09ERR_PTR() dereference in net/rds/ib.cDan Carpenter
rdma_create_id() doesn't return NULL, only ERR_PTR(). Found by smatch (http://repo.or.cz/w/smatch.git). Compile tested. regards, dan carpenter Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-09ERR_PTR() dereference in net/rds/iw.cDan Carpenter
rdma_create_id() returns ERR_PTR() not null. Found by smatch (http://repo.or.cz/w/smatch.git). Compile tested. regards, dan carpenter Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-09rds: use kmem_cache_zalloc instead of kmem_cache_alloc/memsetWei Yongjun
Use kmem_cache_zalloc instead of kmem_cache_alloc/memset. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-09RDS: remove unused #include <version.h>Huang Weiyi
Remove unused #include <version.h> in net/rds/af_rds.c. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-09RDS: use get_user_pages_fast()Andy Grover
Use the new function that is simpler and faster. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-09RDS: Establish connection before parsing CMSGsAndy Grover
The first message to a remote node should prompt a new connection. Even an RDMA op via CMSG. Therefore move CMSG parsing to after connection establishment. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-09RDS: Fix ordering in a conditionalAndy Grover
Putting the constant first is a supposed "best practice" that actually makes the code harder to read. Thanks to Roland Dreier for finding a bug in this "simple, obviously correct" patch. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-09RDS/IW+IB: Allow max credit advertise window.Steve Wise
Fix hack that restricts the credit advertisement to 127. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-09RDS/IW+IB: Set the RDS_LL_SEND_FULL bit when we're throttled.Steve Wise
The RDS_LL_SEND_FULL bit should be set when we stop transmitted due to flow control. Otherwise the send worker will keep trying as opposed to sleeping until we unthrottle. Saves CPU. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-09RDS: Correct some iw references in rdma_transport.cAndy Grover
Had some lingering instances of _iw_ variable names from when the listen code was centralized into rdma_transport.c Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-09RDS/IW+IB: Set recv ring low water mark to 1/2 full.Steve Wise
Currently the recv ring low water mark is 1/4 the depth. Performance measurements show that this limits iWARP throughput by flow controlling the rds-stress senders. Setting it to 1/2 seems to max the T3 performance. I tried even higher levels but that didn't help and it started to increase the rds thread cpu utilization. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-02RDS: Use spinlock to protect 64b value update on 32b archsAndy Grover
We have a 64bit value that needs to be set atomically. This is easy and quick on all 64bit archs, and can also be done on x86/32 with set_64bit() (uses cmpxchg8b). However other 32b archs don't have this. I actually changed this to the current state in preparation for mainline because the old way (using a spinlock on 32b) resulted in unsightly #ifdefs in the code. But obviously, being correct takes precedence. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-02RDS: Rewrite connection cleanup, fixing oops on rmmodAndy Grover
This fixes a bug where a connection was unexpectedly not on *any* list while being destroyed. It also cleans up some code duplication and regularizes some function names. * Grab appropriate lock in conn_free() and explain in comment * Ensure via locking that a conn is never not on either a dev's list or the nodev list * Add rds_xx_remove_conn() to match rds_xx_add_conn() * Make rds_xx_add_conn() return void * Rename remove_{,nodev_}conns() to destroy_{,nodev_}conns() and unify their implementation in a helper function * Document lock ordering as nodev conn_lock before dev_conn_lock Reported-by: Yosef Etigin <yosefe@voltaire.com> Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-02RDS: Fix m_rs_lock deadlockAndy Grover
rs_send_drop_to() is called during socket close. If it takes m_rs_lock without disabling interrupts, then rds_send_remove_from_sock() can run from the rx completion handler and thus deadlock. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-03rds: fix iband RDMA dependenciesRandy Dunlap
Fix RDS Infiniband dependencies for RDMA so that these build errors won't happen: ERROR: "rdma_accept" [net/rds/rds.ko] undefined! ERROR: "rdma_destroy_id" [net/rds/rds.ko] undefined! ERROR: "rdma_connect" [net/rds/rds.ko] undefined! ERROR: "rdma_destroy_qp" [net/rds/rds.ko] undefined! ERROR: "rdma_listen" [net/rds/rds.ko] undefined! ERROR: "rdma_notify" [net/rds/rds.ko] undefined! ERROR: "rdma_create_id" [net/rds/rds.ko] undefined! ERROR: "rdma_create_qp" [net/rds/rds.ko] undefined! ERROR: "rdma_bind_addr" [net/rds/rds.ko] undefined! ERROR: "rdma_resolve_route" [net/rds/rds.ko] undefined! ERROR: "rdma_disconnect" [net/rds/rds.ko] undefined! ERROR: "rdma_reject" [net/rds/rds.ko] undefined! ERROR: "rdma_resolve_addr" [net/rds/rds.ko] undefined! Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-02rds: Fix build on powerpc.David S. Miller
As reported by Stephen Rothwell. > Today's linux-next build (powerpc allyesconfig) failed like this: > > net/rds/cong.c: In function 'rds_cong_set_bit': > net/rds/cong.c:284: error: implicit declaration of function 'generic___set_le_bit' > net/rds/cong.c: In function 'rds_cong_clear_bit': > net/rds/cong.c:298: error: implicit declaration of function 'generic___clear_le_bit' > net/rds/cong.c: In function 'rds_cong_test_bit': > net/rds/cong.c:309: error: implicit declaration of function 'generic_test_le_bit' Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26RDS: Kconfig and MakefileAndy Grover
Add RDS Kconfig and Makefile, and modify net/'s to add us to the build. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26RDS: Common RDMA transport codeAndy Grover
Although most of IB and iWARP are separated from each other, there is some common code required to handle their shared CM listen port. This code listens for CM events and then dispatches the event to the appropriate transport, either IB or iWARP. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26RDS: Add iWARP supportAndy Grover
Support for iWARP NICs is implemented as a separate RDS transport from IB. The code, however, is very similar to IB (it was forked, basically.) so let's keep it in one changeset. The reason for this duplicationis that despite its similarity to IB, there are a number of places where it has different semantics. iwarp zcopy support is still under development, and giving it its own sandbox ensures that IB code isn't disrupted while iwarp changes. Over time these transports will re-converge. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26RDS/IB: Stats and sysctlsAndy Grover
IB-specific stats and sysctls. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26RDS/IB: Receive datagrams via IBAndy Grover
Header parsing, ring refill. It puts the incoming data into an rds_incoming struct, which is passed up to rds-core. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26RDS/IB: Implement IB-specific datagram send.Andy Grover
Specific to IB is a credits-based flow control mechanism, in addition to the expected usage of the IB API to package outgoing data into work requests. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26RDS/IB: Implement RDMA ops using FMRsAndy Grover
Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26RDS/IB: Ring-handling code.Andy Grover
Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26RDS/IB: Infiniband transportAndy Grover
Registers as an RDS transport and an IB client, and uses IB CM API to allocate ids, queue pairs, and the rest of that fun stuff. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26RDS: RDMA supportAndy Grover
Some transports may support RDMA features. This handles the non-transport-specific parts, like pinning user pages and tracking mapped regions. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26RDS: recv.cAndy Grover
Upon receiving a datagram from the transport, RDS parses the headers and potentially queues an ACK. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26RDS: send.cAndy Grover
This is the code to send an RDS datagram. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26RDS: Message parsingAndy Grover
Parsing of newly-received RDS message headers (including ext. headers) and copy-to/from-user routines. page.c implements a per-cpu page remainder cache, to reduce the number of allocations needed for small datagrams. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26RDS: sysctlsAndy Grover
RDS exposes a few tunable parameters via sysctls. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26RDS: loopbackAndy Grover
A simple rds transport to handle loopback connections. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26RDS: Connection handlingAndy Grover
While arguably the fact that the underlying transport needs a connection to convey RDS's datagrame reliably is not important to rds proper, the transports implemented so far (IB and TCP) have both been connection-oriented, and so the connection state machine-related code is in the common rds code. This patch also includes several work items, to handle connecting, sending, receiving, and shutdown. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26RDS: Info and statsAndy Grover
RDS currently generates a lot of stats that are accessible via the rds-info utility. This code implements the support for this. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26RDS: Transport codeAndy Grover
RDS supports multiple transports. While this initial submission only supports Infiniband transport, this abstraction allows others to be added. We're working on an iWARP transport, and also see UDP over DCB as another possibility. This code handles transport registration. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26RDS: Congestion-handling codeAndy Grover
RDS handles per-socket congestion by updating peers with a complete congestion map (8KB). This code keeps track of these maps for itself and ones received from peers. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26RDS: Main header fileAndy Grover
RDS's main data structure definitions and exported functions. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>