path: root/net
Age    Commit message    Author
2013-12-31crush: new SET_CHOOSE_LEAF_TRIES commandIlya Dryomov
Explicitly control the number of sample attempts, and allow the number of tries in the recursive call to be explicitly controlled via the rule. This is important because the amount of time we want to spend looking for a solution may be rule dependent (e.g., higher for the wide indep pool than the rep pools). (We should do the same for the other tunables, by the way!) Reflects ceph.git commit c43c893be872f709c787bc57f46c0e97876ff681. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-31crush: pass parent r value for indep callIlya Dryomov
Pass down the parent's 'r' value so that we will sample different values in the recursive call when the parent tries multiple times. This avoids doing useless work (calling multiple times and trying the same values). Reflects ceph.git commit 2731d3030d7a3e80922b7f1b7756f9a4a124bac5. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-31crush: clarify numrep vs endposIlya Dryomov
Pass numrep (the width of the result) separately from the number of results we want *this* iteration. This makes things less awkward when we do a recursive call (for chooseleaf) and want only one item. Reflects ceph.git commit 1b567ee08972f268c11b43fc881e57b5984dd08b. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-31crush: strip firstn conditionals out of crush_choose, renameIlya Dryomov
Now that indep is handled by crush_choose_indep, rename crush_choose to crush_choose_firstn and remove all the conditionals. This ends up stripping out *lots* of code. Note that it *also* makes it obvious that the shenanigans we were playing with r' for uniform buckets were broken for firstn mode. This appears to have happened waaaay back in commit dae8bec9 (or earlier)... 2007. Reflects ceph.git commit 94350996cb2035850bcbece6a77a9b0394177ec9. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-31crush: add note about r in recursive chooseIlya Dryomov
Reflects ceph.git commit 4551fee9ad89d0427ed865d766d0d44004d3e3e1. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-31crush: use breadth-first search for indep modeIlya Dryomov
Reflects ceph.git commit 86e978036a4ecbac4c875e7c00f6c5bbe37282d3. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-31crush: return CRUSH_ITEM_UNDEF for failed placements with indepIlya Dryomov
For firstn mode, if we fail to make a valid placement choice, we just continue and return a short result to the caller. For indep mode, however, we need to make the position stable, and return an undefined value on failed placements to avoid shifting later results to the left. Reflects ceph.git commit b1d4dd4eb044875874a1d01c01c7d766db5d0a80. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
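The difference in result shape between the two modes can be sketched as follows. This is a minimal illustration, not the kernel's CRUSH code: `try_place`, `place_firstn`, `place_indep` and the `ITEM_UNDEF` sentinel value are hypothetical names and values chosen for the example.

```c
#include <stddef.h>

#define ITEM_UNDEF (-1)  /* placeholder sentinel for this sketch only */

/* Hypothetical placement attempt: always fails (returns 0) for position 1. */
static int try_place(size_t pos, int *item)
{
    if (pos == 1)
        return 0;
    *item = 100 + (int)pos;
    return 1;
}

/* firstn-style: skip failed positions, so the result may be short. */
static size_t place_firstn(int *out, size_t want)
{
    size_t n = 0;

    for (size_t pos = 0; pos < want; pos++) {
        int item;

        if (try_place(pos, &item))
            out[n++] = item;
    }
    return n;
}

/* indep-style: always fill `want` slots; a failure keeps its position. */
static size_t place_indep(int *out, size_t want)
{
    for (size_t pos = 0; pos < want; pos++) {
        int item;

        out[pos] = try_place(pos, &item) ? item : ITEM_UNDEF;
    }
    return want;
}
```

With three requested positions, the firstn-style helper returns two items packed to the left, while the indep-style helper returns three slots with the failed one marked, so later results do not shift.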
2013-12-31crush: eliminate CRUSH_MAX_SET result size limitationIlya Dryomov
This is only present to size the temporary scratch arrays that we put on the stack. Let the caller allocate them as they wish and remove the limitation. Reflects ceph.git commit 1cfe140bf2dab99517589a82a916f4c75b9492d1. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-31crush: fix some commentsIlya Dryomov
Reflects ceph.git commit 3cef755428761f2481b1dd0e0fbd0464ac483fc5. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-31crush: reduce scope of some local variablesIlya Dryomov
Reflects ceph.git commit e7d47827f0333c96ad43d257607fb92ed4176550. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-31crush: factor out (trivial) crush_destroy_rule()Ilya Dryomov
Reflects ceph.git commit 43a01c9973c4b83f2eaa98be87429941a227ddde. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-31crush: pass weight vector size to map functionIlya Dryomov
Pass the size of the weight vector into crush_do_rule() to ensure that we don't access values past the end. This can happen if the caller misbehaves and passes a weight vector that is smaller than max_devices. Currently the monitor tries to prevent that from happening, but this will gracefully tolerate previous bad osdmaps that got into this state. It's also a bit more defensive. Reflects ceph.git commit 5922e2c2b8335b5e46c9504349c3a55b7434c01a. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
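The kind of bounds check this enables might look like the sketch below. It is a simplified illustration under stated assumptions: `device_is_out` is a hypothetical helper, and treating a missing weight as "out" is an assumption of the example rather than a statement about the actual patch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether device `item` should be treated as
 * "out", given a weight vector holding `weight_max` entries. An index past
 * the end of the vector is never read; such a device is simply reported
 * as out (an assumption made for this sketch).
 */
static int device_is_out(const uint32_t *weight, size_t weight_max,
                         size_t item)
{
    if (item >= weight_max)
        return 1;               /* no weight known for this device */
    return weight[item] == 0;   /* zero weight means "out" */
}
```

A caller that passes a vector shorter than the number of devices then gets a defined answer instead of an out-of-bounds read.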
2013-12-31libceph: update ceph_features.hIlya Dryomov
This updates ceph_features.h so that it has all feature bits defined in ceph.git. In the interim since the last update, ceph.git crossed the "32 feature bits" point, and the addition of the 33rd bit wasn't handled correctly. The work-around is squashed into this commit and reflects ceph.git commit 053659d05e0349053ef703b414f44965f368b9f0. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-31libceph: all features fields must be u64Ilya Dryomov
In preparation for ceph_features.h update, change all features fields from unsigned int/u32 to u64. (ceph.git has ~40 feature bits at this point.) Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
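Why the field width matters can be shown with a small sketch. The names below (`FEATURE_EXAMPLE_BIT33`, `has_feature_u64`, `has_feature_after_u32`) are hypothetical and are not taken from ceph_features.h; the point is only that any feature bit above bit 31 is silently dropped when a features value passes through a 32-bit field.

```c
#include <stdint.h>

/* Hypothetical feature flag living in the 33rd bit (bit index 32). */
#define FEATURE_EXAMPLE_BIT33 (1ULL << 32)

/* Check a feature bit on a full 64-bit features value. */
static int has_feature_u64(uint64_t features, uint64_t bit)
{
    return (features & bit) == bit;
}

/* Same check after the value has been stored in a 32-bit field. */
static int has_feature_after_u32(uint64_t features, uint64_t bit)
{
    uint32_t narrowed = (uint32_t)features;  /* upper 32 bits are lost */

    return ((uint64_t)narrowed & bit) == bit;
}
```

Low bits survive the narrowing, which is why such a truncation can go unnoticed until a peer starts advertising a feature above bit 31.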
2013-12-31Merge tag 'v3.13-rc6' into for-3.14/coreJens Axboe
Needed to bring blk-mq up to date, since changes have been going in since for-3.14/core was established. Fixup merge issues related to the immutable biovec changes. Signed-off-by: Jens Axboe <axboe@kernel.dk>
Conflicts:
    block/blk-flush.c
    fs/btrfs/check-integrity.c
    fs/btrfs/extent_io.c
    fs/btrfs/scrub.c
    fs/logfs/dev_bdev.c
2013-12-30cfg80211/mac80211: correct qos-map lockingJohannes Berg
Since the RTNL can't always be held, use wdev/sdata locking for the qos-map dereference in mac80211. This requires cfg80211 to consistently lock it, which it was missing in one place. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-12-30netfilter: nft_reject: support for IPv6 and TCP resetEric Leblond
This patch moves nft_reject_ipv4 to nft_reject and adds support for the IPv6 protocol. It uses functions included in nf_reject.h to implement reject by TCP reset. The code has to be built as a module if NF_TABLES_IPV6 is also a module, to avoid a compilation error due to the use of IPv6 functions. This has been done in Kconfig by using the construct:

    depends on NF_TABLES_IPV6 || !NF_TABLES_IPV6

This looks a bit odd syntactically but works as intended.
Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-12-30netfilter: REJECT: separate reusable codeEric Leblond
This patch prepares the addition of TCP reset support in the nft_reject module by moving reusable code into a header file. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-12-29net: rose: restore old recvmsg behaviorFlorian Westphal
recvmsg handler in net/rose/af_rose.c performs size-check ->msg_namelen. After commit f3d3342602f8bcbf37d7c46641cb9bca7618eb1c (net: rework recvmsg handler msg_name and msg_namelen logic), we now always take the else branch due to namelen being initialized to 0. Digging in netdev-vger-cvs git repo shows that msg_namelen was initialized with a fixed-size since at least 1995, so the else branch was never taken. Compile tested only. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-29tipc: fix deadlock during socket releaseYing Xue
A deadlock might occur if the name table is withdrawn in the socket release routine while packets are still being received from the bearer.

        CPU0                          CPU1
T0:     recv_msg()                    release()
T1:     tipc_recv_msg()               tipc_withdraw()
T2:     [grab node lock]              [grab port lock]
T3:     tipc_link_wakeup_ports()      tipc_nametbl_withdraw()
T4:     [grab port lock]*             named_cluster_distribute()
T5:     wakeupdispatch()              tipc_link_send()
T6:                                   [grab node lock]*

The opposite order of taking the port lock and the node lock on the two paths above may result in a deadlock. If the socket lock, instead of the port lock, is used to protect the port instance in tipc_withdraw(), the reversed lock ordering is eliminated, and the deadlock with it.
Reported-by: Lars Everbrand <lars.everbrand@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-29lro: remove dead codestephen hemminger
Remove leftover code that is not used anywhere in current tree. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-29tcp: make local functions staticstephen hemminger
The following are only used in one file: tcp_connect_init tcp_set_rto Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-29netfilter: select NFNETLINK when enabling NF_TABLESEric Leblond
In Kconfig, nf_tables depends on NFNETLINK, so whether nf_tables can be built as a module or into the kernel depends on the state of NFNETLINK in the kernel config. If someone wants to build nf_tables into the kernel, NFNETLINK must also be built into the kernel. But NFNETLINK cannot be set in the menu, so it is necessary to toggle other nfnetlink subsystems such as logging and nfacct to see the nf_tables switch. This patch changes the dependency from 'depends on' to 'select' in Kconfig, so that nf_tables can be set to build as a module or into the kernel independently. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-12-29Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-mergeDavid S. Miller
Included changes:
- reset netfilter-bridge state when removing the batman-adv header from an incoming packet. This prevents netfilter bridge from being fooled when the same packet enters a bridge twice (or more): the first time within the batman-adv header and the second time without.
- adjust the packet layout to prevent any architecture from adding padding bytes. All the structs sent over the wire now have a size that is a multiple of 4 bytes (unless pack(2) is used).
- fix access to the inner vlan_eth header when reading the VID in the rx path.
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-29Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

This patchset contains four nf_tables fixes, one IPVS fix due to missing updates in the interaction with the new seqadj conntrack extension that was added to support the netfilter synproxy code, and a couple of one-liners to fix netnamespace netfilter issues. More specifically, they are:

* Fix ipv6_find_hdr() call without offset being explicitly initialized in nft_exthdr, as required by that function, from Daniel Borkmann.
* Fix oops in nfnetlink_log when using netns and unloading the kernel module, from Gao feng.
* Fix BUG_ON in nf_ct_timestamp extension after netns is destroyed, from Helmut Schaa.
* Fix crash in IPVS due to missing sequence adjustment extension being allocated in the conntrack, from Jesper Dangaard Brouer.
* Add a bugtrap that emits a warning if you dereference the sequence adjustment conntrack area when it is not available; this should help to catch similar invalid dereferences in the Netfilter tree, also from Jesper.
* Fix incomplete dumping of sets in nf_tables when retrieving by family, from me.
* Fix oops when updating the table state (dormant <-> active) and having user (not base) chains, from me.
* Fix wrong validation in set element data that results in returning -EINVAL when using the nf_tables dictionary feature with mappings, also from me.

We don't usually have this many fixes by this time (as we're already in -rc5 of the development cycle), although half of them are related to nf_tables, which is relatively new, and I believe the holidays have also delayed the flight of bugfixes to mainstream a bit.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-28ipv4: ping make local stuff staticStephen Hemminger
Don't export ping_table or ping_v4_sendmsg. Both are only used inside ping code. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-28ipv4: remove unused functionStephen Hemminger
inetpeer_invalidate_family defined but never used Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-28arp: make arp_invalidate staticStephen Hemminger
Don't export arp_invalidate, only used in arp.c Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-28ipv4: make fib_detect_death staticStephen Hemminger
Make the fib_detect_death function static; it is only used in one file. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-28netfilter: nf_tables: fix wrong datatype in nft_validate_data_load()Pablo Neira Ayuso
This patch fixes dictionary mappings, e.g.:

    add rule ip filter input meta dnat set tcp dport map { 22 => 1.1.1.1, 23 => 2.2.2.2 }

The kernel was returning -EINVAL in nft_validate_data_load() since the type of the set element data that is passed was the real userspace datatype instead of NFT_DATA_VALUE. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-12-28batman-adv: fix vlan header accessAntonio Quartulli
When batadv_get_vid() is invoked in interface_rx() the batman-adv header has already been removed, therefore the header_len argument has to be 0. Introduced by c018ad3de61a1dc4194879a53e5559e094aa7b1a ("batman-adv: add the VLAN ID attribute to the TT entry") Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2013-12-28batman-adv: clean nf state when removing protocol headerAntonio Quartulli
If an interface enslaved into batman-adv is a bridge (or a virtual interface built on top of a bridge) the nf_bridge member of the skbs reaching the soft-interface is filled with the state about "netfilter bridge" operations. Then, if one of such skbs is locally delivered, the nf_bridge member should be cleaned up to avoid that the old state could mess up with other "netfilter bridge" operations when entering a second bridge. This is needed because batman-adv is an encapsulation protocol. However at the moment skb->nf_bridge is not released at all leading to bogus "netfilter bridge" behaviours. Fix this by cleaning the netfilter state of the skb before it gets delivered to the upper layer in interface_rx(). Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2013-12-28netfilter: nf_tables: remove nft_meta_targetPablo Neira Ayuso
In e035b77 ("netfilter: nf_tables: nft_meta module get/set ops"), we got the meta target merged into the existing meta expression. So let's get rid of this dead code now that we fully support that feature. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-12-28netfilter: nf_tables: nft_meta module get/set opsArturo Borrero Gonzalez
This patch adds kernel support for the meta expression in get/set flavour. The set operation indicates that a given packet has to be set with a property, currently one of mark, priority, nftrace. The get op is what was already working: evaluate the given packet property. In the nftrace case, the value is always 1. This behaviour is copied from net/netfilter/xt_TRACE.c. The NFTA_META_DREG and NFTA_META_SREG attributes are mutually exclusive. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-12-28batman-adv: fix alignment for batadv_tvlv_tt_changeAntonio Quartulli
Make struct batadv_tvlv_tt_change a multiple of 4 bytes long to avoid padding on any architecture. Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2013-12-28batman-adv: fix size of batadv_bla_claim_dstSimon Wunderlich
Since this is a mac address and always 48 bit, and we can assume that it is always aligned to 2-byte boundaries, add a pack(2) pragma. Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2013-12-28batman-adv: fix size of batadv_icmp_headerAntonio Quartulli
struct batadv_icmp_header currently has a size of 17, which will be padded to 20 on some architectures. Fix this by unrolling the header into the parent structures. Moreover keep the ICMP parsing functions as generic as they are now by using a stub icmp_header struct during packet parsing. Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2013-12-28batman-adv: fix header alignment by unrolling batadv_headerSimon Wunderlich
The size of the batadv_header of 3 is problematic on some architectures which automatically pad all structures to a 32 bit boundary. To not lose performance by packing this struct, better embed it into the various host structures. Reported-by: Russell King <linux@arm.linux.org.uk> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2013-12-28batman-adv: fix alignment for batadv_coded_packetSimon Wunderlich
The compiler may decide to pad the structure, in which case it does not have the expected size of 46 bytes. Fix this by moving it into the pragma pack(2) part of the code. Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
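The effect of a pack(2) region on structure size can be sketched as follows. The two structs are hypothetical and much smaller than the batman-adv packet definitions; the unpacked size noted in the comment is what common ABIs produce, not something the C standard guarantees.

```c
#include <stdint.h>

/* Default layout: the compiler typically aligns `seqno` to 4 bytes,
 * inserting 2 padding bytes after `type` (8 bytes total on common ABIs). */
struct example_unpacked {
    uint16_t type;
    uint32_t seqno;
};

#pragma pack(2)
/* With 2-byte packing, `seqno` sits directly after `type`. */
struct example_packed {
    uint16_t type;
    uint32_t seqno;
};
#pragma pack()

/* Catch any layout drift at compile time (C11). */
_Static_assert(sizeof(struct example_packed) == 6,
               "packed size must match the wire format");
```

A compile-time size assertion of this kind is one way to make sure an on-wire structure keeps its expected size on every architecture.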
2013-12-28netfilter: nf_tables: fix oops when updating table with user chainsPablo Neira Ayuso
This patch fixes a crash while trying to deactivate a table that contains user chains. You can reproduce it via:

% nft add table table1
% nft add chain table1 chain1
% nft-table-upd ip table1 dormant

[ 253.021026] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
[ 253.021114] IP: [<ffffffff8134cebd>] nf_register_hook+0x35/0x6f
[ 253.021167] PGD 30fa5067 PUD 30fa2067 PMD 0
[ 253.021208] Oops: 0000 [#1] SMP
[...]
[ 253.023305] Call Trace:
[ 253.023331] [<ffffffffa0885020>] nf_tables_newtable+0x11c/0x258 [nf_tables]
[ 253.023385] [<ffffffffa0878592>] nfnetlink_rcv_msg+0x1f4/0x226 [nfnetlink]
[ 253.023438] [<ffffffffa0878418>] ? nfnetlink_rcv_msg+0x7a/0x226 [nfnetlink]
[ 253.023491] [<ffffffffa087839e>] ? nfnetlink_bind+0x45/0x45 [nfnetlink]
[ 253.023542] [<ffffffff8134b47e>] netlink_rcv_skb+0x3c/0x88
[ 253.023586] [<ffffffffa0878973>] nfnetlink_rcv+0x3af/0x3e4 [nfnetlink]
[ 253.023638] [<ffffffff813fb0d4>] ? _raw_read_unlock+0x22/0x34
[ 253.023683] [<ffffffff8134af17>] netlink_unicast+0xe2/0x161
[ 253.023727] [<ffffffff8134b29a>] netlink_sendmsg+0x304/0x332
[ 253.023773] [<ffffffff8130d250>] __sock_sendmsg_nosec+0x25/0x27
[ 253.023820] [<ffffffff8130fb93>] sock_sendmsg+0x5a/0x7b
[ 253.023861] [<ffffffff8130d5d5>] ? copy_from_user+0x2a/0x2c
[ 253.023905] [<ffffffff8131066f>] ? move_addr_to_kernel+0x35/0x60
[ 253.023952] [<ffffffff813107b3>] SYSC_sendto+0x119/0x15c
[ 253.023995] [<ffffffff81401107>] ? sysret_check+0x1b/0x56
[ 253.024039] [<ffffffff8108dc30>] ? trace_hardirqs_on_caller+0x140/0x1db
[ 253.024090] [<ffffffff8120164e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 253.024141] [<ffffffff81310caf>] SyS_sendto+0x9/0xb
[ 253.026219] [<ffffffff814010e2>] system_call_fastpath+0x16/0x1b

Reported-by: Alex Wei <alex.kern.mentor@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-12-28netfilter: nf_tables: fix dumping with large number of setsPablo Neira Ayuso
If no table name is specified, the dumping of the existing sets may be incomplete with a sufficiently large number of sets and tables. This patch fixes the missing reset of the cursors after finding the location of the last object that was included in the previous multi-part message. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
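The general pattern behind this kind of bug can be sketched with a resumable, nested dump. This is an illustration only and not the nf_tables code: `nsets`, `struct dump_cursor`, `dump_some` and `dump_all` are hypothetical, and the per-call `budget` stands in for the space available in one multi-part message.

```c
#include <stddef.h>

#define NTABLES 3

/* Number of sets in each hypothetical table. */
static const size_t nsets[NTABLES] = { 4, 2, 3 };

struct dump_cursor {
    size_t table;   /* table to resume in */
    size_t set;     /* set index to resume at within that table */
};

/*
 * Emit up to `budget` sets starting at the cursor and return how many were
 * emitted. The cursor is updated so the next call resumes where this one
 * stopped.
 */
static size_t dump_some(struct dump_cursor *cur, size_t budget)
{
    size_t emitted = 0;
    size_t start = cur->set;

    for (size_t t = cur->table; t < NTABLES; t++) {
        for (size_t s = start; s < nsets[t]; s++) {
            if (emitted == budget) {
                cur->table = t;
                cur->set = s;
                return emitted;
            }
            emitted++;
        }
        /* The saved set index applies only to the table we resumed in.
         * Without this reset, later tables would skip their first sets. */
        start = 0;
    }
    cur->table = NTABLES;
    cur->set = 0;
    return emitted;
}

/* Run the dump to completion and return the total number of sets seen. */
static size_t dump_all(size_t budget)
{
    struct dump_cursor cur = { 0, 0 };
    size_t total = 0, n;

    while ((n = dump_some(&cur, budget)) > 0)
        total += n;
    return total;
}
```

If the `start = 0` reset is dropped, a dump that resumes at set index 3 of the first table would also begin at index 3 in every following table, and the total would come up short once there are enough sets to need more than one message.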
2013-12-27ipv6: release dst properly in ipip6_tunnel_xmitLi RongQing
If a dst is not attached anywhere, it should be released before exiting ipip6_tunnel_xmit(); otherwise the dst is leaked. Fixes: 61c1db7fae21 ("ipv6: sit: add GSO/TSO support") Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-27ieee802154: space prohibited before that close parenthesisWeilong Chen
Fix checkpatch error with space. Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-27llc: "foo* bar" should be "foo *bar"Weilong Chen
Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-27net_sched: act: Dont increment refcnt on replaceJamal Hadi Salim
This is a bug fix. The existing code tries to kill many birds with one stone: handling binding of actions to filters, new actions, and replacing of action attributes. A simple test case to illustrate:

XXXX
moja@fe1:~$ sudo tc actions add action drop index 12
moja@fe1:~$ actions get action gact index 12
action order 1: gact action drop
 random type none pass val 0
 index 12 ref 1 bind 0
moja@fe1:~$ sudo tc actions replace action ok index 12
moja@fe1:~$ actions get action gact index 12
action order 1: gact action drop
 random type none pass val 0
 index 12 ref 2 bind 0
XXXX

The above shows the refcount being wrongly incremented on replace. There are more complex scenarios with binding of actions to filters, left out here, that did not work either.
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-27rds: prevent dereference of a NULL deviceSasha Levin
Binding might result in a NULL device, which is dereferenced causing this BUG:

[ 1317.260548] BUG: unable to handle kernel NULL pointer dereference at 0000000000000974
[ 1317.261847] IP: [<ffffffff84225f52>] rds_ib_laddr_check+0x82/0x110
[ 1317.263315] PGD 418bcb067 PUD 3ceb21067 PMD 0
[ 1317.263502] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 1317.264179] Dumping ftrace buffer:
[ 1317.264774] (ftrace buffer empty)
[ 1317.265220] Modules linked in:
[ 1317.265824] CPU: 4 PID: 836 Comm: trinity-child46 Tainted: G W 3.13.0-rc4-next-20131218-sasha-00013-g2cebb9b-dirty #4159
[ 1317.267415] task: ffff8803ddf33000 ti: ffff8803cd31a000 task.ti: ffff8803cd31a000
[ 1317.268399] RIP: 0010:[<ffffffff84225f52>] [<ffffffff84225f52>] rds_ib_laddr_check+0x82/0x110
[ 1317.269670] RSP: 0000:ffff8803cd31bdf8 EFLAGS: 00010246
[ 1317.270230] RAX: 0000000000000000 RBX: ffff88020b0dd388 RCX: 0000000000000000
[ 1317.270230] RDX: ffffffff8439822e RSI: 00000000000c000a RDI: 0000000000000286
[ 1317.270230] RBP: ffff8803cd31be38 R08: 0000000000000000 R09: 0000000000000000
[ 1317.270230] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[ 1317.270230] R13: 0000000054086700 R14: 0000000000a25de0 R15: 0000000000000031
[ 1317.270230] FS: 00007ff40251d700(0000) GS:ffff88022e200000(0000) knlGS:0000000000000000
[ 1317.270230] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1317.270230] CR2: 0000000000000974 CR3: 00000003cd478000 CR4: 00000000000006e0
[ 1317.270230] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1317.270230] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000090602
[ 1317.270230] Stack:
[ 1317.270230] 0000000054086700 5408670000a25de0 5408670000000002 0000000000000000
[ 1317.270230] ffffffff84223542 00000000ea54c767 0000000000000000 ffffffff86d26160
[ 1317.270230] ffff8803cd31be68 ffffffff84223556 ffff8803cd31beb8 ffff8800c6765280
[ 1317.270230] Call Trace:
[ 1317.270230] [<ffffffff84223542>] ? rds_trans_get_preferred+0x42/0xa0
[ 1317.270230] [<ffffffff84223556>] rds_trans_get_preferred+0x56/0xa0
[ 1317.270230] [<ffffffff8421c9c3>] rds_bind+0x73/0xf0
[ 1317.270230] [<ffffffff83e4ce62>] SYSC_bind+0x92/0xf0
[ 1317.270230] [<ffffffff812493f8>] ? context_tracking_user_exit+0xb8/0x1d0
[ 1317.270230] [<ffffffff8119313d>] ? trace_hardirqs_on+0xd/0x10
[ 1317.270230] [<ffffffff8107a852>] ? syscall_trace_enter+0x32/0x290
[ 1317.270230] [<ffffffff83e4cece>] SyS_bind+0xe/0x10
[ 1317.270230] [<ffffffff843a6ad0>] tracesys+0xdd/0xe2
[ 1317.270230] Code: 00 8b 45 cc 48 8d 75 d0 48 c7 45 d8 00 00 00 00 66 c7 45 d0 02 00 89 45 d4 48 89 df e8 78 49 76 ff 41 89 c4 85 c0 75 0c 48 8b 03 <80> b8 74 09 00 00 01 74 06 41 bc 9d ff ff ff f6 05 2a b6 c2 02
[ 1317.270230] RIP [<ffffffff84225f52>] rds_ib_laddr_check+0x82/0x110
[ 1317.270230] RSP <ffff8803cd31bdf8>
[ 1317.270230] CR2: 0000000000000974

Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-27ipvs: correct usage/allocation of seqadj ext in ipvsJesper Dangaard Brouer
The IPVS FTP helper ip_vs_ftp could trigger an OOPS in nf_ct_seqadj_set, after commit 41d73ec053d2 (netfilter: nf_conntrack: make sequence number adjustments usuable without NAT). This is because the seqadj ext is now allocated dynamically, and the IPVS code didn't handle this situation. Fix this in the IPVS nfct code by invoking the alloc function nfct_seqadj_ext_add(). Fixes: 41d73ec053d2 (netfilter: nf_conntrack: make sequence number adjustments usuable without NAT) Suggested-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2013-12-27netfilter: WARN about wrong usage of sequence number adjustmentsJesper Dangaard Brouer
Since commit 41d73ec053d2 (netfilter: nf_conntrack: make sequence number adjustments usuable without NAT), the sequence number extension is dynamically allocated. Instead of dying, give a WARN splash in case of wrong usage of the seqadj code, e.g. when forgetting to allocate via nfct_seqadj_ext_add(). Wrong usage has been seen in the IPVS code path. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2013-12-27ipvs: Remove unused variable ret from sync_thread_master()Geert Uytterhoeven
net/netfilter/ipvs/ip_vs_sync.c: In function 'sync_thread_master': net/netfilter/ipvs/ip_vs_sync.c:1640:8: warning: unused variable 'ret' [-Wunused-variable] Commit 35a2af94c7ce7130ca292c68b1d27fcfdb648f6b ("sched/wait: Make the __wait_event*() interface more friendly") changed how the interruption state is returned. However, sync_thread_master() ignores this state, now causing a compile warning. According to Julian Anastasov <ja@ssi.bg>, this behavior is OK: "Yes, your patch looks ok to me. In the past we used ssleep() but IPVS users were confused why IPVS threads increase the load average. So, we switched to _interruptible calls and later the socket polling was added." Document this, as requested by Peter Zijlstra, to avoid precious developers disappearing in this pitfall in the future. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2013-12-26sch_tbf: add TBF_BURST/TBF_PBURST attributeYang Yingliang
When burst is set to 1514 with a low rate in userspace, the kernel ends up with a burst value smaller than 1514, which doesn't work. Userspace converts burst to a buffer value, and precision can be lost in that conversion, so some bytes are missing when the kernel converts the buffer back to burst. This patch adds two new attributes so that burst/mtu can be sent to the kernel directly, avoiding the loss. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
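How a bytes-to-time-to-bytes round trip can lose a byte is shown in the sketch below. It is a simplified model under stated assumptions, not the actual tc or sch_tbf conversion code: `TICKS_PER_SEC`, `bytes_to_ticks`, `ticks_to_bytes` and `burst_round_trip` are hypothetical, and the rate of 7000 bytes per second is chosen only because it does not divide evenly.

```c
#include <stdint.h>

#define TICKS_PER_SEC 1000000ULL  /* assumed time resolution for this sketch */

/* Time, in ticks, needed to send `bytes` at `rate` bytes per second. */
static uint64_t bytes_to_ticks(uint64_t bytes, uint64_t rate)
{
    return bytes * TICKS_PER_SEC / rate;   /* truncating division */
}

/* Inverse conversion: bytes that can be sent in `ticks` at `rate`. */
static uint64_t ticks_to_bytes(uint64_t ticks, uint64_t rate)
{
    return ticks * rate / TICKS_PER_SEC;   /* truncates again */
}

/* Convert a burst to a time value and back, as two separate parties would. */
static uint64_t burst_round_trip(uint64_t burst, uint64_t rate)
{
    return ticks_to_bytes(bytes_to_ticks(burst, rate), rate);
}
```

In this model, a burst of 1514 bytes at 7000 bytes per second comes back as 1513, one byte short of a full Ethernet frame, whereas a rate that divides evenly (for example 1000) round-trips exactly. Passing the byte count directly avoids the intermediate conversion altogether.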