summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2012-08-24sfc: Stop TX queues before they fill upBen Hutchings
We now have a definite upper bound on the number of descriptors per skb; use that to stop the queue when the next packet might not fit. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-08-24sfc: Refactor struct efx_tx_buffer to use a flags fieldBen Hutchings
Add a flags field to struct efx_tx_buffer, replacing the continuation and map_single booleans. Since a single descriptor cannot be both a TSO header and the last descriptor for an skb, unionise efx_tx_buffer::{skb,tsoh} and add flags for validity of these fields. Clear all flags in free buffers (whereas previously the continuation flag would be set). Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-08-24tcp: fix cwnd reduction for non-sack recoveryYuchung Cheng
The cwnd reduction in fast recovery is based on the number of packets newly delivered per ACK. For non-sack connections every DUPACK signifies a packet has been delivered, but the sender mistakenly skips counting them for cwnd reduction. The fix is to compute newly_acked_sacked after DUPACKs are accounted in sacked_out for non-sack connections. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Nandita Dukkipati <nanditad@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-24team: do not allow to add VLAN challenged port when vlan is usedJiri Pirko
Reported-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-24vlan: add helper which can be called to see if device is used by vlanJiri Pirko
also, remove unused vlan_info definition from header CC: Patrick McHardy <kaber@trash.net> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-24team: don't print warn message on -ESRCH during event sendJiri Pirko
When no one is listening on NL socket, -ESRCH is returned and warning message is printed. This message is confusing people and in fact has no meaning. So do not print it in this case. Reported-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-24netlink: fix possible spoofing from non-root processesPablo Neira Ayuso
Non-root user-space processes can send Netlink messages to other processes that are well-known for being subscribed to Netlink asynchronous notifications. This allows ilegitimate non-root process to send forged messages to Netlink subscribers. The userspace process usually verifies the legitimate origin in two ways: a) Socket credentials. If UID != 0, then the message comes from some ilegitimate process and the message needs to be dropped. b) Netlink portID. In general, portID == 0 means that the origin of the messages comes from the kernel. Thus, discarding any message not coming from the kernel. However, ctnetlink sets the portID in event messages that has been triggered by some user-space process, eg. conntrack utility. So other processes subscribed to ctnetlink events, eg. conntrackd, know that the event was triggered by some user-space action. Neither of the two ways to discard ilegitimate messages coming from non-root processes can help for ctnetlink. This patch adds capability validation in case that dst_pid is set in netlink_sendmsg(). This approach is aggressive since existing applications using any Netlink bus to deliver messages between two user-space processes will break. Note that the exception is NETLINK_USERSOCK, since it is reserved for netlink-to-netlink userspace communication. Still, if anyone wants that his Netlink bus allows netlink-to-netlink userspace, then they can set NL_NONROOT_SEND. However, by default, I don't think it makes sense to allow to use NETLINK_ROUTE to communicate two processes that are sending no matter what information that is not related to link/neighbouring/routing. They should be using NETLINK_USERSOCK instead for that. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-24w5300: using eth_hw_addr_random() for random MAC and set device flagWei Yongjun
Using eth_hw_addr_random() to generate a random Ethernet address (MAC) to be used by a net device and set addr_assign_type. Not need to duplicating its implementation. spatch with a semantic match is used to found this problem. (http://coccinelle.lip6.fr/) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-24w5100: using eth_hw_addr_random() for random MAC and set device flagWei Yongjun
Using eth_hw_addr_random() to generate a random Ethernet address (MAC) to be used by a net device and set addr_assign_type. Not need to duplicating its implementation. spatch with a semantic match is used to found this problem. (http://coccinelle.lip6.fr/) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-24wimax/i2400m: use is_zero_ether_addr() instead of memcmp()Wei Yongjun
Using is_zero_ether_addr() instead of directly use memcmp() to determine if the ethernet address is all zeros. spatch with a semantic match is used to found this problem. (http://coccinelle.lip6.fr/) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-24stmmac: add header inclusion protectionRayagond Kokatanur
This patch adds "#ifndef __<header>_H" for protecting header from double inclusion. Signed-off-by: Rayagond Kokatanur <rayagond@vayavyalabs.com> Hacked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-24net: Set device operstate at registration timeBen Hutchings
The operstate of a device is initially IF_OPER_UNKNOWN and is updated asynchronously by linkwatch after each change of carrier state reported by the driver. The default carrier state of a net device is on, and this will never be changed on drivers that do not support carrier detection, thus the operstate remains IF_OPER_UNKNOWN. For devices that do support carrier detection, the driver must set the carrier state to off initially, then poll the hardware state when the device is opened. However, we must not activate linkwatch for a unregistered device, and commit b473001 ('net: Do not fire linkwatch events until the device is registered.') ensured that we don't. But this means that the operstate for many devices that support carrier detection remains IF_OPER_UNKNOWN when it should be IF_OPER_DOWN. The same issue exists with the dormant state. The proper initialisation sequence, avoiding a race with opening of the device, is: rtnl_lock(); rc = register_netdevice(dev); if (rc) goto out_unlock; netif_carrier_off(dev); /* or netif_dormant_on(dev) */ rtnl_unlock(); but it seems silly that this should have to be repeated in so many drivers. Further, the operstate seen immediately after opening the device may still be IF_OPER_UNKNOWN due to the asynchronous nature of linkwatch. Commit 22604c8 ('net: Fix for initial link state in 2.6.28') attempted to fix this by setting the operstate synchronously, but it was reverted as it could lead to deadlock. This initialises the operstate synchronously at registration time only. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-24net/fsl: introduce Freescale 10G MDIO driverTimur Tabi
Similar to fsl_pq_mdio.c, this driver is for the 10G MDIO controller on Freescale Frame Manager Ethernet controllers. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-24cls_cgroup: Allow classifier cgroups to have their classid reset to 0Neil Horman
The network classifier cgroup initalizes each cgroups instance classid value to 0. However, the sock_update_classid function only updates classid's in sockets if the tasks cgroup classid is not zero, and if it differs from the current classid. The later check is to prevent cache line dirtying, but the former is detrimental, as it prevents resetting a classid for a cgroup to 0. While this is not a common action, it has administrative usefulness (if the admin wants to disable classification of a certain group temporarily for instance). Easy fix, just remove the zero check. Tested successfully by myself Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-24Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
2012-08-24ipv4: take rt_uncached_lock only if neededEric Dumazet
Multicast traffic allocates dst with DST_NOCACHE, but dst is not inserted into rt_uncached_list. This slowdown multicast workloads on SMP because rt_uncached_lock is contended. Change the test before taking the lock to actually check the dst was inserted into rt_uncached_list. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-24Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-mergeDavid S. Miller
Antonio Quartulli says: ==================== Included changes: - a set of codestyle rearrangements/fixes - new feature to early detect new joining (mesh-unaware) clients - a minor fix for the gw-feature - substitution of shift operations with the BIT() macro - reorganization of the main batman-adv structure (struct batadv_priv) - some more (very) minor cleanups and fixes =================== Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-24Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2012-08-24can: sja1000_platform: fix wrong flag IRQF_SHARED for interrupt sharingSven Schmitt
The sja1000 platform driver wrongly assumes that a shared IRQ is indicated with the IRQF_SHARED flag in irq resource flags. This patch changes the driver to handle the correct flag IORESOURCE_IRQ_SHAREABLE instead. There are no mainline users of the platform driver which wrongly make use of IRQF_SHARED. Signed-off-by: Sven Schmitt <sven.schmitt@volkswagen.de> Acked-by: Yegor Yefremov <yegorslists@googlemail.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2012-08-24can: softing: Fix potential memory leak in softing_load_fw()Alexey Khoroshilov
Do not leak memory by updating pointer with potentially NULL realloc return value. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2012-08-24sfc: Fix reporting of IPv4 full filters through ethtoolBen Hutchings
ETHTOOL_GRXCLSRULE returns filters for a TCP/IPv4 or UDP/IPv4 4-tuple with source and destination swapped. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-08-23packet: fix broken build.Rami Rosen
This patch fixes a broken build due to a missing header: ... CC net/ipv4/proc.o In file included from include/net/net_namespace.h:15, from net/ipv4/proc.c:35: include/net/netns/packet.h:11: error: field 'sklist_lock' has incomplete type ... The lock of netns_packet has been replaced by a recent patch to be a mutex instead of a spinlock, but we need to replace the header file to be linux/mutex.h instead of linux/spinlock.h as well. See commit 0fa7fa98dbcc2789409ed24e885485e645803d7f: packet: Protect packet sk list with mutex (v2) patch, Signed-off-by: Rami Rosen <rosenr@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-23af_packet: match_fanout_group() can be staticFengguang Wu
cc: Eric Leblond <eric@regit.org> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-23net: reinstate rtnl in call_netdevice_notifiers()Eric Dumazet
Eric Biederman pointed out that not holding RTNL while calling call_netdevice_notifiers() was racy. This patch is a direct transcription his feedback against commit 0115e8e30d6fc (net: remove delay at device dismantle) Thanks Eric ! Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Gao feng <gaofeng@cn.fujitsu.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-23Merge branch 'for-john' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
2012-08-23Merge branch 'for-john' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
2012-08-23batman-adv: Start new development cycleSven Eckelmann
Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-08-23batman-adv: change interface_rx to get orig nodeAntonio Quartulli
In order to understand where a broadcast packet is coming from and use this information to detect not yet announced clients, this patch modifies the interface_rx() function by passing a new argument: the orig node corresponding to the node that originated the received packet (if known). This new argument if not NULL for broadcast packets only (other packets does not have source field). Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-08-23batman-adv: detect not yet announced clientsAntonio Quartulli
With the current TT mechanism a new client joining the network is not immediately able to communicate with other hosts because its MAC address has not been announced yet. This situation holds until the first OGM containing its joining event will be spread over the mesh network. This behaviour can be acceptable in networks where the originator interval is a small value (e.g. 1sec) but if that value is set to an higher time (e.g. 5secs) the client could suffer from several malfunctions like DHCP client timeouts, etc. This patch adds an early detection mechanism that makes nodes in the network able to recognise "not yet announced clients" by means of the broadcast packets they emitted on connection (e.g. ARP or DHCP request). The added client will then be confirmed upon receiving the OGM claiming it or purged if such OGM is not received within a fixed amount of time. Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-08-23batman-adv: Reduce accumulated length of simple statementsSven Eckelmann
Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-08-23batman-adv: Don't break statements after assignment operatorSven Eckelmann
Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-08-23batman-adv: Use BIT(x) macro to calculate bit positionsSven Eckelmann
Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-08-23batman-adv: Drop tt queries with foreign destMartin Hundebøll
When enabling promiscuous mode, tt queries for other hosts might be received. Before this patch, "foreign" tt queries were processed like any other query and thus forwarded to its destination again and thereby causing a loop. This patch adds a check to drop foreign tt queries. Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-08-23batman-adv: Move batadv_check_unicast_packet()Martin Hundebøll
batadv_check_unicast_packet() is needed in batadv_recv_tt_query(), so move the former to before the latter. Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-08-23batman-adv: Split batadv_priv in sub-structures for featuresSven Eckelmann
The structure batadv_priv grows everytime a new feature is introduced. It gets hard to find the parts of the struct that belongs to a specific feature. This becomes even harder by the fact that not every feature uses a prefix in the member name. The variables for bridge loop avoidence, gateway handling, translation table and visualization server are moved into separate structs that are included in the bat_priv main struct. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-08-23batman-adv: check batadv_orig_hash_add_if() return codeSimon Wunderlich
If this call fails, some of the orig_nodes spaces may have been resized for the increased number of interface, and some may not. If we would just continue with the larger number of interfaces, this would lead to access to not allocated memory later. We better check the return code, and don't add the interface if no memory is available. OTOH, keeping some of the orig_nodes with too much memory allocated should hurt no one (except for a few too many bytes allocated). Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-08-23batman-adv: fix typos in commentsAntonio Quartulli
the word millisecond is misspelled in several comments. This patch fixes it. Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-08-23batman-adv: add reference counting for type batadv_tt_orig_list_entryAntonio Quartulli
The batadv_tt_orig_list_entry structure didn't have any refcounting mechanism so far. This patch introduces it and makes the structure being usable in much more complex context. Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-08-23batman-adv: remove a misleading commentJonathan Corbet
As much as I'm happy to see LWN links sprinkled through the kernel by the dozen, this one in particular reflects a very old state of reality; the associated comment is now incorrect. So just delete it. Signed-off-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-08-23batman-adv: convert remaining packet counters to per_cpu_ptr() infrastructureMarek Lindner
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Acked-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-08-23batman-adv: rename bridge loop avoidance claim typesSimon Wunderlich
for consistency reasons within the code and with the documentation, we should always call it "claim" and "unclaim". Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-08-23batman-adv: correct comments in bridge loop avoidanceSimon Wunderlich
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-08-23batman-adv: Add the backbone gateway list to debugfsSimon Wunderlich
This is especially useful if there are no claims yet, but we still want to know which gateways are using bridge loop avoidance in the network. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-08-23batman-adv: move function arguments on one lineAntonio Quartulli
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-08-22packet: Protect packet sk list with mutex (v2)Pavel Emelyanov
Change since v1: * Fixed inuse counters access spotted by Eric In patch eea68e2f (packet: Report socket mclist info via diag module) I've introduced a "scheduling in atomic" problem in packet diag module -- the socket list is traversed under rcu_read_lock() while performed under it sk mclist access requires rtnl lock (i.e. -- mutex) to be taken. [152363.820563] BUG: scheduling while atomic: crtools/12517/0x10000002 [152363.820573] 4 locks held by crtools/12517: [152363.820581] #0: (sock_diag_mutex){+.+.+.}, at: [<ffffffff81a2dcb5>] sock_diag_rcv+0x1f/0x3e [152363.820613] #1: (sock_diag_table_mutex){+.+.+.}, at: [<ffffffff81a2de70>] sock_diag_rcv_msg+0xdb/0x11a [152363.820644] #2: (nlk->cb_mutex){+.+.+.}, at: [<ffffffff81a67d01>] netlink_dump+0x23/0x1ab [152363.820693] #3: (rcu_read_lock){.+.+..}, at: [<ffffffff81b6a049>] packet_diag_dump+0x0/0x1af Similar thing was then re-introduced by further packet diag patches (fanount mutex and pgvec mutex for rings) :( Apart from being terribly sorry for the above, I propose to change the packet sk list protection from spinlock to mutex. This lock currently protects two modifications: * sklist * prot inuse counters The sklist modifications can be just reprotected with mutex since they already occur in a sleeping context. The inuse counters modifications are trickier -- the __this_cpu_-s are used inside, thus requiring the caller to handle the potential issues with contexts himself. Since packet sockets' counters are modified in two places only (packet_create and packet_release) we only need to protect the context from being preempted. BH disabling is not required in this case. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-22mdio: translation of MMD EEE registers to/from ethtool settingsAllan, Bruce W
The helper functions which translate IEEE MDIO Manageable Device (MMD) Energy-Efficient Ethernet (EEE) registers 3.20, 7.60 and 7.61 to and from the comparable ethtool supported/advertised settings will be needed by drivers other than those in PHYLIB (e.g. e1000e in a follow-on patch). In the same fashion as similar translation functions in linux/mii.h, move these functions from the PHYLIB core to the linux/mdio.h header file so the code will not have to be duplicated in each driver needing MMD-to-ethtool (and vice-versa) translations. The function and some variable names have been renamed to be more descriptive. Not tested on the only hardware that currently calls the related functions, stmmac, because I don't have access to any. Has been compile tested and the translations have been tested on a locally modified version of e1000e. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-22af_packet: use define instead of constantdanborkmann@iogearbox.net
Instead of using a hard-coded value for the status variable, it would make the code more readable to use its destined define from linux/if_packet.h. Signed-off-by: daniel.borkmann@tik.ee.ethz.ch Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-22rds: Don't disable BH on BH contextYing Xue
Since we have already in BH context when *_write_space(), *_data_ready() as well as *_state_change() are called, it's unnecessary to disable BH. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-22bonding: support for IPv6 transmit hashingJohn Eaglesham
Currently the "bonding" driver does not support load balancing outgoing traffic in LACP mode for IPv6 traffic. IPv4 (and TCP or UDP over IPv4) are currently supported; this patch adds transmit hashing for IPv6 (and TCP or UDP over IPv6), bringing IPv6 up to par with IPv4 support in the bonding driver. In addition, bounds checking has been added to all transmit hashing functions. The algorithm chosen (xor'ing the bottom three quads of the source and destination addresses together, then xor'ing each byte of that result into the bottom byte, finally xor'ing with the last bytes of the MAC addresses) was selected after testing almost 400,000 unique IPv6 addresses harvested from server logs. This algorithm had the most even distribution for both big- and little-endian architectures while still using few instructions. Its behavior also attempts to closely match that of the IPv4 algorithm. The IPv6 flow label was intentionally not included in the hash as it appears to be unset in the vast majority of IPv6 traffic sampled, and the current algorithm not using the flow label already offers a very even distribution. Fragmented IPv6 packets are handled the same way as fragmented IPv4 packets, ie, they are not balanced based on layer 4 information. Additionally, IPv6 packets with intermediate headers are not balanced based on layer 4 information. In practice these intermediate headers are not common and this should not cause any problems, and the alternative (a packet-parsing loop and look-up table) seemed slow and complicated for little gain. Tested-by: John Eaglesham <linux@8192.net> Signed-off-by: John Eaglesham <linux@8192.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-22ipv6: gre: fix ip6gre_err()Eric Dumazet
ip6gre_err() miscomputes grehlen (sizeof(ipv6h) is 4 or 8, not 40 as expected), and should take into account 'offset' parameter. Also uses pskb_may_pull() to cope with some fragged skbs Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Dmitry Kozlov <xeb@mail.ru> Signed-off-by: David S. Miller <davem@davemloft.net>