summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2014-02-14Bluetooth: Fix unreleased rfcomm_dev referencePeter Hurley
When RFCOMM_RELEASE_ONHUP is set, the rfcomm tty driver 'takes over' the initial rfcomm_dev reference created by the RFCOMMCREATEDEV ioctl. The assumption is that the rfcomm tty driver will release the rfcomm_dev reference when the tty is freed (in rfcomm_tty_cleanup()). However, if the tty is never opened, the 'take over' never occurs, so when RFCOMMRELEASEDEV ioctl is called, the reference is not released. Track the state of the reference 'take over' so that the release is guaranteed by either the RFCOMMRELEASEDEV ioctl or the rfcomm tty driver. Note that the synchronous hangup in rfcomm_release_dev() ensures that rfcomm_tty_install() cannot race with the RFCOMMRELEASEDEV ioctl. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Tested-By: Alexander Holler <holler@ahsoftware.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-14Bluetooth: Release rfcomm_dev only oncePeter Hurley
No logic prevents an rfcomm_dev from being released multiple times. For example, if the rfcomm_dev ref count is large due to pending tx, then multiple RFCOMMRELEASEDEV ioctls may mistakenly release the rfcomm_dev too many times. Note that concurrent ioctls are not required to create this condition. Introduce RFCOMM_DEV_RELEASED status bit which guarantees the rfcomm_dev can only be released once. NB: Since the flags are exported to userspace, introduce the status field to track state for which userspace should not be aware. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Tested-By: Alexander Holler <holler@ahsoftware.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-14Bluetooth: Exclude released devices from RFCOMMGETDEVLIST ioctlPeter Hurley
When enumerating RFCOMM devices in the rfcomm_dev_list, holding the rfcomm_dev_lock only guarantees the existence of the enumerated rfcomm_dev in memory, and not safe access to its state. Testing the device state (such as RFCOMM_TTY_RELEASED) does not guarantee the device will remain in that state for the subsequent access to the rfcomm_dev's fields, nor guarantee that teardown has not commenced. Obtain an rfcomm_dev reference for the duration of rfcomm_dev access. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Tested-By: Alexander Holler <holler@ahsoftware.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-14Bluetooth: Fix racy acquire of rfcomm_dev referencePeter Hurley
rfcomm_dev_get() can return a rfcomm_dev reference for a device for which destruction may be commencing. This can happen on tty destruction, which calls rfcomm_tty_cleanup(), the last port reference may have been released but RFCOMM_TTY_RELEASED was not set. The following race is also possible: CPU 0 | CPU 1 | rfcomm_release_dev rfcomm_dev_get | . spin_lock | . dev = __rfcomm_dev_get | . if dev | . if test_bit(TTY_RELEASED) | . | !test_and_set_bit(TTY_RELEASED) | tty_port_put <<<< last reference else | tty_port_get | The reference acquire is bogus because destruction will commence with the release of the last reference. Ignore the external state change of TTY_RELEASED and instead rely on the reference acquire itself to determine if the reference is valid. Cc: Jiri Slaby <jslaby@suse.cz> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Tested-By: Alexander Holler <holler@ahsoftware.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-14Revert "Bluetooth: Move rfcomm_get_device() before rfcomm_dev_activate()"Peter Hurley
This reverts commit e228b63390536f5b737056059a9a04ea016b1abf. This is the third of a 3-patch revert, together with Revert "Bluetooth: Remove rfcomm_carrier_raised()" and Revert "Bluetooth: Always wait for a connection on RFCOMM open()". Commit 4a2fb3ecc7467c775b154813861f25a0ddc11aa0, "Bluetooth: Always wait for a connection on RFCOMM open()" open-codes blocking on tty open(), rather than using the default behavior implemented by the tty port. The reasons for reverting that patch are detailed in that changelog; this patch restores required functionality for that revert. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Tested-By: Alexander Holler <holler@ahsoftware.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-14Revert "Bluetooth: Always wait for a connection on RFCOMM open()"Peter Hurley
This reverts commit 4a2fb3ecc7467c775b154813861f25a0ddc11aa0. This is the second of a 3-patch revert, together with Revert "Bluetooth: Remove rfcomm_carrier_raised()" and Revert "Bluetooth: Move rfcomm_get_device() before rfcomm_dev_activate()". Before commit cad348a17e170451ea8688b532a6ca3e98c63b60, Bluetooth: Implement .activate, .shutdown and .carrier_raised methods, tty_port_block_til_ready() was open-coded in rfcomm_tty_install() as part of the RFCOMM tty open(). Unfortunately, it did not implement non-blocking open nor CLOCAL open, but rather always blocked for carrier. This is not the expected or typical behavior for ttys, and prevents several common terminal programming idioms from working (eg., opening in non-blocking mode to initialize desired termios settings then re-opening for connection). Commit cad348a17e170451ea8688b532a6ca3e98c63b60, Bluetooth: Implement .activate, .shutdown and .carrier_raised methods, added the necessary tty_port methods to use the default tty_port_open(). However, this triggered two important user-space regressions. The first regression involves the complicated mechanism for reparenting the rfcomm tty device to the ACL link device which represents an open link to a specific bluetooth host. This regression causes ModemManager to conclude the rfcomm tty device does not front a modem so it makes no attempt to initialize an attached modem. This regression is caused by the lack of a device_move() if the dlc is already open (and not specifically related to the open-coded block_til_ready()). A more appropriate solution is submitted in "Bluetooth: Fix unsafe RFCOMM device parenting" and "Bluetooth: Fix RFCOMM parent device for reused dlc" The second regression involves "rfcomm bind" and wvdial (a ppp dialer). rfcomm bind creates a device node for a /dev/rfcomm<n>. wvdial opens that device in non-blocking mode (because it expects the connection to have already been established). In addition, subsequent writes to the rfcomm tty device fail (because the link is not yet connected; rfcomm connection begins with the actual tty open()). However, restoring the original behavior (in the patch which this reverts) was undesirable. Firstly, the original reporter notes that a trivial userspace "workaround" already exists: rfcomm connect, which creates the device node and establishes the expected connection. Secondly, the failed writes occur because the rfcomm tty driver does not buffer writes to an unconnected device; this contrasts with the dozen of other tty drivers (in fact, all of them) that do just that. The submitted patch "Bluetooth: Don't fail RFCOMM tty writes" corrects this. Thirdly, it was a long-standing bug to block on non-blocking open, which is re-fixed by revert. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Tested-By: Alexander Holler <holler@ahsoftware.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-14Revert "Bluetooth: Remove rfcomm_carrier_raised()"Peter Hurley
This reverts commit f86772af6a0f643d3e13eb3f4f9213ae0c333ee4. This is the first of a 3-patch revert, together with Revert "Bluetooth: Always wait for a connection on RFCOMM open()" and Revert "Bluetooth: Move rfcomm_get_device() before rfcomm_dev_activate()". Commit 4a2fb3ecc7467c775b154813861f25a0ddc11aa0, "Bluetooth: Always wait for a connection on RFCOMM open()" open-codes blocking on tty open(), rather than using the default behavior implemented by the tty port. The reasons for reverting that patch are detailed in that changelog; this patch restores required functionality for that revert. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Tested-By: Alexander Holler <holler@ahsoftware.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-14Bluetooth: Enable LE L2CAP CoC support by defaultJohan Hedberg
Now that the LE L2CAP Connection Oriented Channel support has undergone a decent amount of testing we can make it officially supported. This patch removes the enable_lecoc module parameter which was previously needed to enable support for LE L2CAP CoC. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-14appletalk: fix checkpatch error with indentwangweidong
checkpatch error: switch and case should be at the same indent. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-14appletalk: fix checkpatch errors with foo* bar|foo * barwangweidong
fix checkpatch errors below: ERROR: "foo* bar" should be "foo *bar" ERROR: "foo * bar" should be "foo *bar" Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-14appletalk: fix checkpatch errors with space required or prohibitedwangweidong
fix checkpatch errors while the space is required or prohibited Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-14tcp: add pacing_rate information into tcp_infoEric Dumazet
Add two new fields to struct tcp_info, to report sk_pacing_rate and sk_max_pacing_rate to monitoring applications, as ss from iproute2. User exported fields are 64bit, even if kernel is currently using 32bit fields. lpaa5:~# ss -i .. skmem:(r0,rb357120,t0,tb2097152,f1584,w1980880,o0,bl0) ts sack cubic wscale:6,6 rto:400 rtt:0.875/0.75 mss:1448 cwnd:1 ssthresh:12 send 13.2Mbps pacing_rate 3336.2Mbps unacked:15 retrans:1/5448 lost:15 rcv_space:29200 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-14net: introduce netdev_alloc_pcpu_stats() for driversWANG Cong
There are many drivers calling alloc_percpu() to allocate pcpu stats and then initializing ->syncp. So just introduce a helper function for them. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-14net-sysfs: get_netdev_queue_index() cleanupEric Dumazet
Remove one inline keyword, and no need for a loop to find an index into a table. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-14netfilter: nf_nat_snmp_basic: fix duplicates in if/else branchesFX Le Bail
The solution was found by Patrick in 2.4 kernel sources. Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Francois-Xavier Le Bail <fx.lebail@yahoo.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-14netfilter: nft_meta: fix typo "CONFIG_NET_CLS_ROUTE"Paul Bolle
There are two checks for CONFIG_NET_CLS_ROUTE, but the corresponding Kconfig symbol was dropped in v2.6.39. Since the code guards access to dst_entry.tclassid it seems CONFIG_IP_ROUTE_CLASSID should be used instead. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-14netfilter: nft_reject_inet: fix unintended fall-through in switch-statatementPatrick McHardy
For IPv4 packets, we call both IPv4 and IPv6 reject. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-14sch_netem: replace magic numbers with enumerate in GE modelYang Yingliang
Replace some magic numbers which describe states of GE model loss generator with enumerate. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-14sch_netem: change some func's param from "struct Qdisc *" to "struct ↵Yang Yingliang
netem_sched_data *" In netem_change(), we have already get "struct netem_sched_data *q". Replace params of get_correlation() and other similar functions with "struct netem_sched_data *q". Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-14sch_netem: return errcode before setting paramsYang Yingliang
get_dist_table() and get_loss_clg() may be failed. These two functions should be called after setting the members of qdisc_priv(sch), or it will break the old settings while either of them is failed. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-14ipv4: ipconfig.c: add parentheses in an if statementFX Le Bail
Even if the 'time_before' macro expand with parentheses, the look is bad. Signed-off-by: Francois-Xavier Le Bail <fx.lebail@yahoo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13ipv4: ip_forward: perform skb->pkt_type check at the beginningDenis Kirjanov
Packets which have L2 address different from ours should be already filtered before entering into ip_forward(). Perform that check at the beginning to avoid processing such packets. Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13net: remove unnecessary return'sstephen hemminger
One of my pet coding style peeves is the practice of adding extra return; at the end of function. Kill several instances of this in network code. I suppose some coccinelle wizardy could do this automatically. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13net: sched: Cleanup PIE commentsVijay Subramanian
Fix incorrect comment reported by Norbert Kiesel. Edit another comment to add more details. Also add references to algorithm (IETF draft and paper) to top of file. Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com> CC: Mythili Prabhu <mysuryan@cisco.com> CC: Norbert Kiesel <nkiesel@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tcp: remove unused min_cwnd member of tcp_congestion_opsStanislav Fomichev
Commit 684bad110757 "tcp: use PRR to reduce cwin in CWR state" removed all calls to min_cwnd, so we can safely remove it. Also, remove tcp_reno_min_cwnd because it was only used for min_cwnd. Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13socket: replace some printk with pr_*Yang Yingliang
Prefer pr_*(...) to printk(KERN_* ...). Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: add node_lock protection to link lookup functionJon Paul Maloy
In an earlier commit, ("tipc: remove links list from bearer struct") we described three issues that need to be pre-emptively resolved before we can remove tipc_net_lock. Here we resolve issue a) described in that commit: "a) In access method #2, we access the link before taking the protecting node_lock. This will not work once net_lock is gone, so we will have to change the access order. We will deal with this in a later commit in this series." Here, we change that access order, by ensuring that the function link_find_link() returns only a safe reference for finding the link, i.e., a node pointer and an index into its 'links' array, not the link pointer itself. We also change all callers of this function to first take the node lock before they can check if there still is a valid link pointer at the returned index. Since the function now returns a node pointer rather than a link pointer, we rename it to the more appropriate 'tipc_link_find_owner(). Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: remove bearer_lock from tipc_bearer structYing Xue
After the earlier commits ("tipc: remove 'links' list from tipc_bearer struct") and ("tipc: introduce new spinlock to protect struct link_req"), there is no longer any need to protect struct link_req or or any link list by use of bearer_lock. Furthermore, we have eliminated the need for using bearer_lock during downcalls (send) from the link to the bearer, since we have ensured that bearers always have a longer life cycle that their associated links, and always contain valid data. So, the only need now for a lock protecting bearers is for guaranteeing consistency of the bearer list itself. For this, it is sufficient, at least for the time being, to continue applying 'net_lock´ in write mode. By removing bearer_lock we also pre-empt introduction of issue b) descibed in the previous commit "tipc: remove 'links' list from tipc_bearer struct": "b) When the outer protection from net_lock is gone, taking bearer_lock and node_lock in opposite order of method 1) and 2) will become an obvious deadlock hazard". Therefore, we now eliminate the bearer_lock spinlock. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: delay delete of link when failover is neededJon Paul Maloy
When a bearer is disabled, all its attached links are deleted. Ideally, we should do link failover to redundant links on other bearers, if there are any, in such cases. This would be consistent with current behavior when a link is reset, but not deleted. However, due to the complexity involved, and the (wrongly) perceived low demand for this feature, it was never implemented until now. We mark the doomed link for deletion with a new flag, but wait until the failover process is finished before we actually delete it. With the improved link tunnelling/failover code introduced earlier in this commit series, it is now easy to identify a spot in the code where the failover is finished and it is safe to delete the marked link. Moreover, the test for the flag and the deletion can be done synchronously, and outside the most time critical data path. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: changes to general packet reception algorithmJon Paul Maloy
We change the order of checking for destination users when processing incoming packets. By placing the checks for users that may potentially replace the processed buffer, i.e., CHANGEOVER_PROTOCOL and MSG_FRAGMENTER, in a separate step before we check for the true end users, we get rid of a label and a 'goto', at the same time making the code more comprehensible and easy to follow. This commit does not change any functionality, it is just a cosmetic code reshuffle. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: rename stack variables in function tipc_link_tunnel_rcvJon Paul Maloy
After the previous redesign of the tunnel reception algorithm and functions, we finalize it by renaming a couple of stack variables in tipc_tunnel_rcv(). This makes it more consistent with the naming scheme elsewhere in this part of the code. This change is purely cosmetic, with no functional changes. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: more cleanup of tunnelling reception functionJon Paul Maloy
We simplify and slim down the code in function tipc_tunnel_rcv() No impact on the users of this function. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: change signature of tunnelling reception functionJon Paul Maloy
After the earlier commits in this series related to the function tipc_link_tunnel_rcv(), we can now go further and simplify its signature. The function now consumes all DUPLICATE packets, and only returns such ORIGINAL packets that are ready for immediate delivery, i.e., no more link level protocol processing needs to be done by the caller. As a consequence, the the caller, tipc_rcv(), does not access the link pointer after call return, and it becomes unnecessary to pass a link pointer reference in the call. Instead, we now only pass it the tunnel link's owner node, which is sufficient to find the destination link for the tunnelled packet. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: change reception of tunnelled failover packetsJon Paul Maloy
When a link is reset, and there is a redundant link available, all sender sockets will steer their subsequent traffic through the remaining link. In order to guarantee preserved packet order and cardinality during the transition, we tunnel the failing link's send queue through the remaining link before we allow any sockets to use it. In this commit, we change the algorithm for receiving failover ("ORIGINAL_MSG") packets in tipc_link_tunnel_rcv(), at the same time delegating it to a new subfuncton, tipc_link_failover_rcv(). Instead of directly returning an extracted inner packet to the packet reception loop in tipc_rcv(), we first check if it is a message fragment, in which case we append it to the reset link's fragment chain. If the fragment chain is complete, we return the whole chain instead of the individual buffer, eliminating any need for the tipc_rcv() loop to do reassembly of tunneled packets. This change makes it possible to further simplify tipc_link_tunnel_rcv(), as well as the calling tipc_rcv() loop. We will do that in later commits. It also makes it possible to identify a single spot in the code where we can tell that a failover procedure is finished, something that is useful when we are deleting links after a failover. This will also be done in a later commit. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: change reception of tunnelled duplicate packetsJon Paul Maloy
When a second link to a destination comes up, some sender sockets will steer their subsequent traffic through the new link. In order to guarantee preserved packet order and cardinality for those sockets, we tunnel a duplicate of the old link's send queue through the new link before we open it for regular traffic. The last arriving packet copy, on whichever link, will be dropped at the receiving end based on the original sequence number, to ensure that only one copy is delivered to the end receiver. In this commit, we change the algorithm for receiving DUPLICATE_MSG packets, at the same time delegating it to a new subfunction, tipc_link_dup_rcv(). Instead of returning an extracted inner packet to the packet reception loop in tipc_rcv(), we just add it to the receiving (new) link's deferred packet queue. The packet will then be processed by that link when it receives its first non-tunneled packet, i.e., at latest when the changeover procedure is finished. Because tipc_link_tunnel_rcv()/tipc_link_dup_rcv() now is consuming all packets of type DUPLICATE_MSG, the calling tipc_rcv() function can omit testing for this. This in turn means that the current conditional jump to the label 'protocol_check' becomes redundant, and we can remove that label. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: remove 'links' list from tipc_bearer structYing Xue
In our ongoing effort to simplify the TIPC locking structure, we see a need to remove the linked list for tipc_links in the bearer. This can be explained as follows. Currently, we have three different ways to access a link, via three different lists/tables: 1: Via a node hash table: Used by the time-critical outgoing/incoming data paths. (e.g. link_send_sections_fast() and tipc_recv_msg() ): grab net_lock(read) find node from node hash table grab node_lock select link grab bearer_lock send_msg() release bearer_lock release node lock release net_lock 2: Via a global linked list for nodes: Used by configuration commands (link_cmd_set_value()) grab net_lock(read) find node and link from global node list (using link name) grab node_lock update link release node lock release net_lock (Same locking order as above. No problem.) 3: Via the bearer's linked link list: Used by notifications from interface (e.g. tipc_disable_bearer() ) grab net_lock(write) grab bearer_lock get link ptr from bearer's link list get node from link grab node_lock delete link release node lock release bearer_lock release net_lock (Different order from above, but works because we grab the outer net_lock in write mode first, excluding all other access.) The first major goal in our simplification effort is to get rid of the "big" net_lock, replacing it with rcu-locks when accessing the node list and node hash array. This will come in a later patch series. But to get there we first need to rewrite access methods ##2 and 3, since removal of net_lock would introduce three major problems: a) In access method #2, we access the link before taking the protecting node_lock. This will not work once net_lock is gone, so we will have to change the access order. We will deal with this in a later commit in this series, "tipc: add node lock protection to link found by link_find_link()". b) When the outer protection from net_lock is gone, taking bearer_lock and node_lock in opposite order of method 1) and 2) will become an obvious deadlock hazard. This is fixed in the commit ("tipc: remove bearer_lock from tipc_bearer struct") later in this series. c) Similar to what is described in problem a), access method #3 starts with using a link pointer that is unprotected by node_lock, in order to via that pointer find the correct node struct and lock it. Before we remove net_lock, this access order must be altered. This is what we do with this commit. We can avoid introducing problem problem c) by even here using the global node list to find the node, before accessing its links. When we loop though the node list we use the own bearer identity as search criteria, thus easily finding the links that are associated to the resetting/disabling bearer. It should be noted that although this method is somewhat slower than the current list traversal, it is in no way time critical. This is only about resetting or deleting links, something that must be considered relatively infrequent events. As a bonus, we can get rid of the mutual pointers between links and bearers. After this commit, pointer dependency go in one direction only: from the link to the bearer. This commit pre-empts introduction of problem c) as described above. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: redefine 'started' flag in struct link to bitmapYing Xue
Currently, the 'started' field in struct tipc_link represents only a binary state, 'started' or 'not started'. We need it to represent more link execution states in the coming commits in this series. Hence, we rename the field to 'flags', and define the current started/non-started state to be represented by the LSB bit of that field. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: move code for deleting links from bearer.c to link.cYing Xue
We break out the code for deleting attached links in the function bearer_disable(), and define a new function named tipc_link_delete_list() to do this job. This commit incurs no functional changes, but makes the code of function bearer_disable() cleaner. It is also a preparation for a more important change to the bearer code, in a subsequent commit in this series. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: move code for resetting links from bearer.c to link.cYing Xue
We break out the code for resetting attached links in the function tipc_reset_bearer(), and define a new function named tipc_link_reset_list() to do this job. This commit incurs no functional changes, but makes the code of function tipc_reset_bearer() cleaner. It is also a preparation for a more important change to the bearer code, in a subsequent commit in this series. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: stricter behavior of message reassembly functionJon Paul Maloy
The function tipc_link_recv_fragment(struct sk_buff **buf) currently leaves the value of the input buffer pointer undefined when it returns, except when the return code indicates that the reassembly is complete. This despite the fact that it always consumes the input buffer. Here, we enforce a stricter behavior by this function, ensuring that the returned buffer pointer is non-NULL if and only if the reassembly is complete. This makes it possible to test for the buffer pointer as criteria for successful reassembly. We also rename the function to tipc_link_frag_rcv(), which is both shorter and more in line with common naming practice in the network subsystem. Apart from the new name, these changes have no impact on current users of the function, but makes it more practical for use in some planned future commits. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: explicitly include core.h in addr.hAndreas Bofjäll
The inline functions in addr.h uses tipc_own_addr which is exported by core.h, but addr.h never actually includes it. It works because it is explicitly included where this is used, but it looks a bit strange. Include core.h in addr.h explicitly to make the dependency clearer. Signed-off-by: Andreas Bofjäll <andreas.bofjall@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13net: ip, ipv6: handle gso skbs in forwarding pathFlorian Westphal
Marcelo Ricardo Leitner reported problems when the forwarding link path has a lower mtu than the incoming one if the inbound interface supports GRO. Given: Host <mtu1500> R1 <mtu1200> R2 Host sends tcp stream which is routed via R1 and R2. R1 performs GRO. In this case, the kernel will fail to send ICMP fragmentation needed messages (or pkt too big for ipv6), as GSO packets currently bypass dstmtu checks in forward path. Instead, Linux tries to send out packets exceeding the mtu. When locking route MTU on Host (i.e., no ipv4 DF bit set), R1 does not fragment the packets when forwarding, and again tries to send out packets exceeding R1-R2 link mtu. This alters the forwarding dstmtu checks to take the individual gso segment lengths into account. For ipv6, we send out pkt too big error for gso if the individual segments are too big. For ipv4, we either send icmp fragmentation needed, or, if the DF bit is not set, perform software segmentation and let the output path create fragments when the packet is leaving the machine. It is not 100% correct as the error message will contain the headers of the GRO skb instead of the original/segmented one, but it seems to work fine in my (limited) tests. Eric Dumazet suggested to simply shrink mss via ->gso_size to avoid sofware segmentation. However it turns out that skb_segment() assumes skb nr_frags is related to mss size so we would BUG there. I don't want to mess with it considering Herbert and Eric disagree on what the correct behavior should be. Hannes Frederic Sowa notes that when we would shrink gso_size skb_segment would then also need to deal with the case where SKB_MAX_FRAGS would be exceeded. This uses sofware segmentation in the forward path when we hit ipv4 non-DF packets and the outgoing link mtu is too small. Its not perfect, but given the lack of bug reports wrt. GRO fwd being broken this is a rare case anyway. Also its not like this could not be improved later once the dust settles. Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Reported-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13net: core: introduce netif_skb_dev_featuresFlorian Westphal
Will be used by upcoming ipv4 forward path change that needs to determine feature mask using skb->dst->dev instead of skb->dev. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13sctp: optimize the sctp_sysctl_net_registerwangweidong
Here, when the net is init_net, we needn't to kmemdup the ctl_table again. So add a check for net. Also we can save some memory. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13sctp: fix a missed .data initializationwangweidong
As commit 3c68198e75111a90("sctp: Make hmac algorithm selection for cookie generation dynamic"), we miss the .data initialization. If we don't use the net_namespace, the problem that parts of the sysctl configuration won't be isolation and won't occur. In sctp_sysctl_net_register(), we register the sysctl for each net, in the for(), we use the 'table[i].data' as check condition, so when the 'i' is the index of sctp_hmac_alg, the data is NULL, then break. So add the .data initialization. Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13net: correct error path in rtnl_newlink()Cong Wang
I saw the following BUG when ->newlink() fails in rtnl_newlink(): [ 40.240058] kernel BUG at net/core/dev.c:6438! this is due to free_netdev() is not supposed to be called before netdev is completely unregistered, therefore it is not correct to call free_netdev() here, at least for ops->newlink!=NULL case, many drivers call it in ->destructor so that rtnl_unlock() will take care of it, we probably don't need to do anything here. Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: fix message corruption bug for deferred packetsErik Hugne
If a packet received on a link is out-of-sequence, it will be placed on a deferred queue and later reinserted in the receive path once the preceding packets have been processed. The problem with this is that it will be subject to the buffer adjustment from link_recv_buf_validate twice. The second adjustment for 20 bytes header space will corrupt the packet. We solve this by tagging the deferred packets and bail out from receive buffer validation for packets that have already been subjected to this. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
2014-02-13cgroup: drop @skip_css from cgroup_taskset_for_each()Tejun Heo
If !NULL, @skip_css makes cgroup_taskset_for_each() skip the matching css. The intention of the interface is to make it easy to skip css's (cgroup_subsys_states) which already match the migration target; however, this is entirely unnecessary as migration taskset doesn't include tasks which are already in the target cgroup. Drop @skip_css from cgroup_taskset_for_each(). Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Daniel Borkmann <dborkman@redhat.com>
2014-02-13Bluetooth: Use connection parameters if anyAndre Guedes
This patch changes hci_connect_le() so it uses the connection parameters specified for the certain device. If no parameters were configured, we use the default values. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>