summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2010-03-21pktgen node allocationRobert Olsson
Here is patch to manipulate packet node allocation and implicitly how packets are DMA'd etc. The flag NODE_ALLOC enables the function and numa_node_id(); when enabled it can also be explicitly controlled via a new node parameter Tested this with 10 Intel 82599 ports w. TYAN S7025 E5520 CPU's. Was able to TX/DMA ~80 Gbit/s to Ethernet wires. Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-21net: dev_getfirstbyhwtype() optimizationEric Dumazet
Use RCU to avoid RTNL use in dev_getfirstbyhwtype() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-21net: snmp mib cleanupEric Dumazet
There is no point to align or pad mibs to cache lines, they are per cpu allocated with a 8 bytes alignment anyway. This wastes space for no gain. This patch removes __SNMP_MIB_ALIGN__ Since SNMP mibs contain "unsigned long" fields only, we can relax the allocation alignment from "unsigned long long" to "unsigned long" Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-21atm: Use kasprintfEric Dumazet
Use kasprintf in atm_proc_dev_register() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-21net: speedup netdev_set_master()Eric Dumazet
We currently force a synchronize_net() in netdev_set_master() This seems necessary only when a slave had a master and we dismantle it. In the other case ("ifenslave bond0 ethO"), we dont need this long delay. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-21tcp: Add SNMP counter for DEFER_ACCEPTEric Dumazet
Its currently hard to diagnose when ACK frames are dropped because an application set TCP_DEFER_ACCEPT on its listening socket. See http://bugzilla.kernel.org/show_bug.cgi?id=15507 This patch adds a SNMP value, named TCPDeferAcceptDrop netstat -s | grep TCPDeferAcceptDrop TCPDeferAcceptDrop: 0 This counter is incremented every time we drop a pure ACK frame received by a socket in SYN_RECV state because its SYNACK retrans count is lower than defer_accept value. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-21bonding: flush unicast and multicast lists when changing typeJiri Pirko
After the type change, addresses in unicast and multicast lists wouldn't make sense, not to mention possible different lenghts. So flush both lists here. Note "dev_addr_discard" will be very soon replaced by "dev_mc_flush" (once mc_list conversion will be done). Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-21net: rtnetlink: ignore NETDEV_PRE_TYPE_CHANGE in rtnetlink_event()Patrick McHardy
Ignore the new NETDEV_PRE_TYPE_CHANGE event in rtnetlink_event() since there have been no changes userspace needs to be notified of. Also add a comment to the netdev notifier event definitions to remind people to update the exclusion list when adding new event types. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-21Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6
2010-03-21net: suppress lockdep-RCU false positive in FIB trie.Paul E. McKenney
Allow fib_find_node() to be called either under rcu_read_lock() protection or with RTNL held. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-21SUNRPC: Fix a potential memory leak in auth_gssTrond Myklebust
The function alloc_enc_pages() currently fails to release the pointer rqstp->rq_enc_pages in the error path. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: stable@kernel.org
2010-03-21Bluetooth: Fix kernel crash on L2CAP stress testsAndrei Emeltchenko
Added very simple check that req buffer has enough space to fit configuration parameters. Shall be enough to reject packets with configuration size more than req buffer. Crash trace below [ 6069.659393] Unable to handle kernel paging request at virtual address 02000205 [ 6069.673034] Internal error: Oops: 805 [#1] PREEMPT ... [ 6069.727172] PC is at l2cap_add_conf_opt+0x70/0xf0 [l2cap] [ 6069.732604] LR is at l2cap_recv_frame+0x1350/0x2e78 [l2cap] ... [ 6070.030303] Backtrace: [ 6070.032806] [<bf1c2880>] (l2cap_add_conf_opt+0x0/0xf0 [l2cap]) from [<bf1c6624>] (l2cap_recv_frame+0x1350/0x2e78 [l2cap]) [ 6070.043823] r8:dc5d3100 r7:df2a91d6 r6:00000001 r5:df2a8000 r4:00000200 [ 6070.050659] [<bf1c52d4>] (l2cap_recv_frame+0x0/0x2e78 [l2cap]) from [<bf1c8408>] (l2cap_recv_acldata+0x2bc/0x350 [l2cap]) [ 6070.061798] [<bf1c814c>] (l2cap_recv_acldata+0x0/0x350 [l2cap]) from [<bf0037a4>] (hci_rx_task+0x244/0x478 [bluetooth]) [ 6070.072631] r6:dc647700 r5:00000001 r4:df2ab740 [ 6070.077362] [<bf003560>] (hci_rx_task+0x0/0x478 [bluetooth]) from [<c006b9fc>] (tasklet_action+0x78/0xd8) [ 6070.087005] [<c006b984>] (tasklet_action+0x0/0xd8) from [<c006c160>] Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com> Acked-by: Gustavo F. Padovan <gustavo@padovan.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-03-21Bluetooth: Convert debug files to actually use debugfs instead of sysfsMarcel Holtmann
Some of the debug files ended up wrongly in sysfs, because at that point of time, debugfs didn't exist. Convert these files to use debugfs and also seq_file. This patch converts all of these files at once and then removes the exported symbol for the Bluetooth sysfs class. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-03-21Bluetooth: Fix potential bad memory access with sysfs filesMarcel Holtmann
When creating a high number of Bluetooth sockets (L2CAP, SCO and RFCOMM) it is possible to scribble repeatedly on arbitrary pages of memory. Ensure that the content of these sysfs files is always less than one page. Even if this means truncating. The files in question are scheduled to be moved over to debugfs in the future anyway. Based on initial patches from Neil Brown and Linus Torvalds Reported-by: Neil Brown <neilb@suse.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-03-20ipv6: Fix bug in ipv6_chk_same_addr().David S. Miller
hlist_for_each_entry(p...) will not necessarily initialize 'p' to anything if the hlist is empty. GCC notices this and emits a warning. Just return true explicitly when we hit a match, and return false is we fall out of the loop without one. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-20ipv6: Reduce timer events for addrconf_verify().YOSHIFUJI Hideaki
This patch reduces timer events while keeping accuracy by rounding our timer and/or batching several address validations in addrconf_verify(). addrconf_verify() is called at earliest timeout among interface addresses' timeouts, but at maximum ADDR_CHECK_FREQUENCY (120 secs). In most cases, all of timeouts of interface addresses are long enough (e.g. several hours or days vs 2 minutes), this timer is usually called every ADDR_CHECK_FREQUENCY, and it is okay to be lazy. (Note this timer could be eliminated if all code paths which modifies variables related to timeouts call us manually, but it is another story.) However, in other least but important cases, we try keeping accuracy. When the real interface address timeout is coming, and the timeout is just before the rounded timeout, we accept some error. When a timeout has been reached, we also try batching other several events in very near future. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-20IPv6: addrconf cleanup addrconf_verifystephen hemminger
The variable regen_advance is only used in the privacy case. Move it to simplify code and eliminate ifdef's Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-20addrconf: checkpatch fixesStephen Hemminger
Fix some of the checkpatch complaints. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-20IPv6: addrconf cleanupsStephen Hemminger
Some minor stuff, reformat comments and add whitespace for clarity Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-20ipv6: convert idev_list to list macrosstephen hemminger
Convert to list macro's for the list of addresses per interface in IPv6. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-20ipv6: user better hash for addrconfstephen hemminger
The existing hash function has a couple of issues: * it is hardwired to 16 for IN6_ADDR_HSIZE * limited to 256 and callers using int * use jhash2 rather than some old BSD algorithm No need for random seed since this is local only (based on assigned addresses) table. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-20IPv6: convert addrconf hash list to RCUstephen hemminger
Convert from reader/writer lock to RCU and spinlock for addrconf hash list. Adds an additional helper macro for hlist_for_each_entry_continue_rcu to handle the continue case. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-20ipv6: convert addrconf list to hliststephen hemminger
Using hash list macros, simplifies code and helps later RCU. This patch includes some initialization that is not strictly necessary, since an empty hlist node/list is all zero; and list is in BSS and node is allocated with kzalloc. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-20ipv6: convert temporary address list to list macrosstephen hemminger
Use list macros instead of open coded linked list. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-20Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
2010-03-20netfilter: ctnetlink: fix reliable event delivery if message building failsPablo Neira Ayuso
This patch fixes a bug that allows to lose events when reliable event delivery mode is used, ie. if NETLINK_BROADCAST_SEND_ERROR and NETLINK_RECV_NO_ENOBUFS socket options are set. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-20netlink: fix NETLINK_RECV_NO_ENOBUFS in netlink_set_err()Pablo Neira Ayuso
Currently, ENOBUFS errors are reported to the socket via netlink_set_err() even if NETLINK_RECV_NO_ENOBUFS is set. However, that should not happen. This fixes this problem and it changes the prototype of netlink_set_err() to return the number of sockets that have set the NETLINK_RECV_NO_ENOBUFS socket option. This return value is used in the next patch in these bugfix series. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-20NET_DMA: free skbs periodicallySteven J. Magnani
Under NET_DMA, data transfer can grind to a halt when userland issues a large read on a socket with a high RCVLOWAT (i.e., 512 KB for both). This appears to be because the NET_DMA design queues up lots of memcpy operations, but doesn't issue or wait for them (and thus free the associated skbs) until it is time for tcp_recvmesg() to return. The socket hangs when its TCP window goes to zero before enough data is available to satisfy the read. Periodically issue asynchronous memcpy operations, and free skbs for ones that have completed, to prevent sockets from going into zero-window mode. Signed-off-by: Steven J. Magnani <steve@digidescorp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-19tcp: Fix tcp_mark_head_lost() with packets == 0Lennart Schulte
A packet is marked as lost in case packets == 0, although nothing should be done. This results in a too early retransmitted packet during recovery in some cases. This small patch fixes this issue by returning immediately. Signed-off-by: Lennart Schulte <lennart.schulte@nets.rwth-aachen.de> Signed-off-by: Arnd Hannemann <hannemann@nets.rwth-aachen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-19net: ipmr/ip6mr: fix potential out-of-bounds vif_table accessPatrick McHardy
mfc_parent of cache entries is used to index into the vif_table and is initialised from mfcctl->mfcc_parent. This can take values of to 2^16-1, while the vif_table has only MAXVIFS (32) entries. The same problem affects ip6mr. Refuse invalid values to fix a potential out-of-bounds access. Unlike the other validity checks, this is checked in ipmr_mfc_add() instead of the setsockopt handler since its unused in the delete path and might be uninitialized. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-19TCP: check min TTL on received ICMP packetsstephen hemminger
This adds RFC5082 checks for TTL on received ICMP packets. It adds some security against spoofed ICMP packets disrupting GTSM protected sessions. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-19ipv6: Remove redundant dst NULL check in ip6_dst_checkHerbert Xu
As the only path leading to ip6_dst_check makes an indirect call through dst->ops, dst cannot be NULL in ip6_dst_check. This patch removes this check in case it misleads people who come across this code. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-19ipv4: check rt_genid in dst_checkTimo Teräs
Xfrm_dst keeps a reference to ipv4 rtable entries on each cached bundle. The only way to renew xfrm_dst when the underlying route has changed, is to implement dst_check for this. This is what ipv6 side does too. The problems started after 87c1e12b5eeb7b30b4b41291bef8e0b41fc3dde9 ("ipsec: Fix bogus bundle flowi") which fixed a bug causing xfrm_dst to not get reused, until that all lookups always generated new xfrm_dst with new route reference and path mtu worked. But after the fix, the old routes started to get reused even after they were expired causing pmtu to break (well it would occationally work if the rtable gc had run recently and marked the route obsolete causing dst_check to get called). Signed-off-by: Timo Teras <timo.teras@iki.fi> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-19rename new rfkill sysfs knobsflorian@mickler.org
This patch renames the (never officially released) sysfs-knobs "blocked_hw" and "blocked_sw" to "hard" and "soft", as the hardware vs software conotation is misleading. It also gets rid of not needed locks around u32-read-access. Signed-off-by: Florian Mickler <florian@mickler.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-03-19netfilter: remove unused headers in net/ipv4/netfilter/nf_nat_h323.cZhitong Wang
Remove unused headers in net/ipv4/netfilter/nf_nat_h323.c Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-03-19netfilter: remove unused headers in net/ipv6/netfilter/ip6t_LOG.cZhitong Wang
Remove unused headers in net/ipv6/netfilter/ip6t_LOG.c Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-03-18net: Potential null skb->dev dereferenceEric Dumazet
When doing "ifenslave -d bond0 eth0", there is chance to get NULL dereference in netif_receive_skb(), because dev->master suddenly becomes NULL after we tested it. We should use ACCESS_ONCE() to avoid this (or rcu_dereference()) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-18tcp: Fix OOB POLLIN avoidance.Alexandra Kossovsky
From: Alexandra.Kossovsky@oktetlabs.ru Fixes kernel bugzilla #15541 Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-18net: forbid underlaying devices to change its typeJiri Pirko
It's not desired for underlaying devices to change type. At the time, there is for example possible to have bond with changed type from Ethernet to Infiniband as a port of a bridge. This patch fixes this. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-18bonding: check return value of nofitier when changing typeJiri Pirko
This patch adds the possibility to refuse the bonding type change for other subsystems (such as for example bridge, vlan, etc.) Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-18net: rename notifier defines for netdev type changeJiri Pirko
Since generally there could be more netdevices changing type other than bonding, making this event type name "bonding-unrelated" Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-18rps: Fixed build with CONFIG_SMP not enabled.Tom Herbert
Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-18Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6Linus Torvalds
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFS: ensure bdi_unregister is called on mount failure. NFS: Avoid a deadlock in nfs_release_page NFSv4: Don't ignore the NFS_INO_REVAL_FORCED flag in nfs_revalidate_inode() nfs4: Make the v4 callback service hidden nfs: fix unlikely memory leak rpc client can not deal with ENOSOCK, so translate it into ENOCONN
2010-03-18netfilter: xt extensions: use pr_<level>Jan Engelhardt
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-18netfilter: xtables: replace custom duprintf with pr_debugJan Engelhardt
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-18netfilter: xtables: do not print any messages on ENOMEMJan Engelhardt
ENOMEM is a very obvious error code (cf. EINVAL), so I think we do not really need a warning message. Not to mention that if the allocation fails, the user is most likely going to get a stack trace from slab already. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-18netfilter: xtables: remove almost-unused xt_match_param.data memberJan Engelhardt
This member is taking up a "long" per match, yet is only used by one module out of the roughly 90 modules, ip6t_hbh. ip6t_hbh can be restructured a little to accomodate for the lack of the .data member. This variant uses checking the par->match address, which should avoid having to add two extra functions, including calls, i.e. (hbh_mt6: call hbhdst_mt6(skb, par, NEXTHDR_OPT), dst_mt6: call hbhdst_mt6(skb, par, NEXTHDR_DEST)) Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-18netfilter: xtables: make use of caller family rather than match familyJan Engelhardt
The matches can have .family = NFPROTO_UNSPEC, and though that is not the case for the touched modules, it seems better to just use the nfproto from the caller. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-18netfilter: xtables: resort osf kconfig textJan Engelhardt
Restore alphabetical ordering of the list and put the xt_osf option into its 'right' place again. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-18netfilter: xtables: limit xt_mac to ethernet devicesJan Engelhardt
I do not see a point of allowing the MAC module to work with devices that don't possibly have one, e.g. various tunnel interfaces such as tun and sit. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>