summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2008-11-25x86, bts: base in-kernel ds interface on handlesMarkus Metzger
Impact: generalize the DS code to shared buffers Change the in-kernel ds.h interface to identify the tracer via a handle returned on ds_request_~(). Tracers used to be identified via their task_struct. The changes are required to allow DS to be shared between different tasks, which is needed for perfmon2 and for ftrace. For ptrace, the handle is stored in the traced task's task_struct. This should probably go into a (arch-specific) ptrace context some time. Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-25hrtimer: removing all ur callback modesPeter Zijlstra
Impact: cleanup, move all hrtimer processing into hardirq context This is an attempt at removing some of the hrtimer complexity by reducing the number of callback modes to 1. This means that all hrtimer callback functions will be ran from HARD-irq context. I went through all the 30 odd hrtimer callback functions in the kernel and saw only one that I'm not quite sure of, which is the one in net/can/bcm.c - hence I'm CC-ing the folks responsible for that code. Furthermore, the hrtimer core now calls callbacks directly with IRQs disabled in case you try to enqueue an expired timer. If this timer is a periodic timer (which should use hrtimer_forward() to advance its time) then it might be possible to end up in an inf. recursive loop due to the fact that hrtimer_forward() doesn't round up to the next timer granularity, and therefore keeps on calling the callback - obviously this needs a fix. Aside from that, this seems to compile and actually boot on my dual core test box - although I'm sure there are some bugs in, me not hitting any makes me certain :-) Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-25Merge branch 'master' of ↵David Vrabel
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 into for-upstream
2008-11-25DCB: fix kconfig optionJeff Kirsher
Since the netlink option for DCB is necessary to actually be useful, simplified the Kconfig option. In addition, added useful help text for the Kconfig option. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25netdev: add HAVE_NET_DEVICE_OPSStephen Hemminger
As a concession to vendors who have to deal with one source for different kernel versions, add a HAVE_NET_DEVICE_OPS so they don't end up hard coding ifdef against kernel version. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25tracing, tty: fix warnings caused by branch tracing and tty_kref_get()Ingo Molnar
Stephen Rothwell reported tht this warning started triggering in linux-next: In file included from init/main.c:27: include/linux/tty.h: In function ‘tty_kref_get’: include/linux/tty.h:330: warning: ‘______f’ is static but declared in inline function ‘tty_kref_get’ which is not static Which gcc emits for 'extern inline' functions that nevertheless define static variables. Change it to 'static inline', which is the norm in the kernel anyway. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-24tcp: add some mibs to track collapsingIlpo Järvinen
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24tcp: Try to restore large SKBs while SACK processingIlpo Järvinen
During SACK processing, most of the benefits of TSO are eaten by the SACK blocks that one-by-one fragment SKBs to MSS sized chunks. Then we're in problems when cleanup work for them has to be done when a large cumulative ACK comes. Try to return back to pre-split state already while more and more SACK info gets discovered by combining newly discovered SACK areas with the previous skb if that's SACKed as well. This approach has a number of benefits: 1) The processing overhead is spread more equally over the RTT 2) Write queue has less skbs to process (affect everything which has to walk in the queue past the sacked areas) 3) Write queue is consistent whole the time, so no other parts of TCP has to be aware of this (this was not the case with some other approach that was, well, quite intrusive all around). 4) Clean_rtx_queue can release most of the pages using single put_page instead of previous PAGE_SIZE/mss+1 calls In case a hole is fully filled by the new SACK block, we attempt to combine the next skb too which allows construction of skbs that are even larger than what tso split them to and it handles hole per on every nth patterns that often occur during slow start overshoot pretty nicely. Though this to be really useful also a retransmission would have to get lost since cumulative ACKs advance one hole at a time in the most typical case. TODO: handle upwards only merging. That should be rather easy when segment is fully sacked but I'm leaving that as future work item (it won't make very large difference anyway since this current approach already covers quite a lot of normal cases). I was earlier thinking of some sophisticated way of tracking timestamps of the first and the last segment but later on realized that it won't be that necessary at all to store the timestamp of the last segment. The cases that can occur are basically either: 1) ambiguous => no sensible measurement can be taken anyway 2) non-ambiguous is due to reordering => having the timestamp of the last segment there is just skewing things more off than does some good since the ack got triggered by one of the holes (besides some substle issues that would make determining right hole/skb even harder problem). Anyway, it has nothing to do with this change then. I choose to route some abnormal looking cases with goto noop, some could be handled differently (eg., by stopping the walking at that skb but again). In general, they either shouldn't happen at all or are rare enough to make no difference in practice. In theory this change (as whole) could cause some macroscale regression (global) because of cache misses that are taken over the round-trip time but it gets very likely better because of much less (local) cache misses per other write queue walkers and the big recovery clearing cumulative ack. Worth to note that these benefits would be very easy to get also without TSO/GSO being on as long as the data is in pages so that we can merge them. Currently I won't let that happen because DSACK splitting at fragment that would mess up pcounts due to sk_can_gso in tcp_set_skb_tso_segs. Once DSACKs fragments gets avoided, we have some conditions that can be made less strict. TODO: I will probably have to convert the excessive pointer passing to struct sacktag_state... :-) My testing revealed that considerable amount of skbs couldn't be shifted because they were cloned (most likely still awaiting tx reclaim)... [The rest is considering future work instead since I got repeatably EFAULT to tcpdump's recvfrom when I added pskb_expand_head to deal with clones, so I separated that into another, later patch] ...To counter that, I gave up on the fifth advantage: 5) When growing previous SACK block, less allocs for new skbs are done, basically a new alloc is needed only when new hole is detected and when the previous skb runs out of frags space ...which now only happens of if reclaim is fast enough to dispose the clone before the SACK block comes in (the window is RTT long), otherwise we'll have to alloc some. With clones being handled I got these numbers (will be somewhat worse without that), taken with fine-grained mibs: TCPSackShifted 398 TCPSackMerged 877 TCPSackShiftFallback 320 TCPSACKCOLLAPSEFALLBACKGSO 0 TCPSACKCOLLAPSEFALLBACKSKBBITS 0 TCPSACKCOLLAPSEFALLBACKSKBDATA 0 TCPSACKCOLLAPSEFALLBACKBELOW 0 TCPSACKCOLLAPSEFALLBACKFIRST 1 TCPSACKCOLLAPSEFALLBACKPREVBITS 318 TCPSACKCOLLAPSEFALLBACKMSS 1 TCPSACKCOLLAPSEFALLBACKNOHEAD 0 TCPSACKCOLLAPSEFALLBACKSHIFT 0 TCPSACKCOLLAPSENOOPSEQ 0 TCPSACKCOLLAPSENOOPSMALLPCOUNT 0 TCPSACKCOLLAPSENOOPSMALLLEN 0 TCPSACKCOLLAPSEHOLE 12 Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24netfilter: xtables: add missing const qualifier to xt_tgchk_paramJan Engelhardt
When entryinfo was a standalone parameter to functions, it used to be "const void *". Put the const back in. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24User namespaces: set of cleanups (v2)Serge Hallyn
The user_ns is moved from nsproxy to user_struct, so that a struct cred by itself is sufficient to determine access (which it otherwise would not be). Corresponding ecryptfs fixes (by David Howells) are here as well. Fix refcounting. The following rules now apply: 1. The task pins the user struct. 2. The user struct pins its user namespace. 3. The user namespace pins the struct user which created it. User namespaces are cloned during copy_creds(). Unsharing a new user_ns is no longer possible. (We could re-add that, but it'll cause code duplication and doesn't seem useful if PAM doesn't need to clone user namespaces). When a user namespace is created, its first user (uid 0) gets empty keyrings and a clean group_info. This incorporates a previous patch by David Howells. Here is his original patch description: >I suggest adding the attached incremental patch. It makes the following >changes: > > (1) Provides a current_user_ns() macro to wrap accesses to current's user > namespace. > > (2) Fixes eCryptFS. > > (3) Renames create_new_userns() to create_user_ns() to be more consistent > with the other associated functions and because the 'new' in the name is > superfluous. > > (4) Moves the argument and permission checks made for CLONE_NEWUSER to the > beginning of do_fork() so that they're done prior to making any attempts > at allocation. > > (5) Calls create_user_ns() after prepare_creds(), and gives it the new creds > to fill in rather than have it return the new root user. I don't imagine > the new root user being used for anything other than filling in a cred > struct. > > This also permits me to get rid of a get_uid() and a free_uid(), as the > reference the creds were holding on the old user_struct can just be > transferred to the new namespace's creator pointer. > > (6) Makes create_user_ns() reset the UIDs and GIDs of the creds under > preparation rather than doing it in copy_creds(). > >David >Signed-off-by: David Howells <dhowells@redhat.com> Changelog: Oct 20: integrate dhowells comments 1. leave thread_keyring alone 2. use current_user_ns() in set_user() Signed-off-by: Serge Hallyn <serue@us.ibm.com>
2008-11-24futex: make clock selectable for FUTEX_WAIT_BITSETThomas Gleixner
FUTEX_WAIT_BITSET could be used instead of FUTEX_WAIT by setting the bit set to FUTEX_BITSET_MATCH_ANY, but FUTEX_WAIT uses CLOCK_REALTIME while FUTEX_WAIT_BITSET uses CLOCK_MONOTONIC. Add a flag to select CLOCK_REALTIME for FUTEX_WAIT_BITSET so glibc can replace the FUTEX_WAIT logic which needs to do gettimeofday() calls before and after the syscall to convert the absolute timeout to a relative timeout for FUTEX_WAIT. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Ulrich Drepper <drepper@redhat.com>
2008-11-24Merge branch 'linus' into core/futexesThomas Gleixner
2008-11-24sched: convert remaining old-style cpumask operatorsRusty Russell
Impact: Trivial API conversion NR_CPUS -> nr_cpu_ids cpumask_t -> struct cpumask sizeof(cpumask_t) -> cpumask_size() cpumask_a = cpumask_b -> cpumask_copy(&cpumask_a, &cpumask_b) cpu_set() -> cpumask_set_cpu() first_cpu() -> cpumask_first() cpumask_of_cpu() -> cpumask_of() cpus_* -> cpumask_* There are some FIXMEs where we all archs to complete infrastructure (patches have been sent): cpu_coregroup_map -> cpu_coregroup_mask node_to_cpumask* -> cpumask_of_node There is also one FIXME where we pass an array of cpumasks to partition_sched_domains(): this implies knowing the definition of 'struct cpumask' and the size of a cpumask. This will be fixed in a future patch. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-24sched: convert nohz_cpu_mask to cpumask_var_t.Rusty Russell
Impact: (future) size reduction for large NR_CPUS. Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves space for small nr_cpu_ids but big CONFIG_NR_CPUS. cpumask_var_t is just a struct cpumask for !CONFIG_CPUMASK_OFFSTACK. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-24sched: convert struct sched_group/sched_domain cpumask_ts to variable bitmapsRusty Russell
Impact: (future) size reduction for large NR_CPUS. We move the 'cpumask' member of sched_group to the end, so when we kmalloc it we can do a minimal allocation: saves space for small nr_cpu_ids but big CONFIG_NR_CPUS. Similar trick for 'span' in sched_domain. This isn't quite as good as converting to a cpumask_var_t, as some sched_groups are actually static, but it's safer: we don't have to figure out where to call alloc_cpumask_var/free_cpumask_var. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-24sched: wrap sched_group and sched_domain cpumask accesses.Rusty Russell
Impact: trivial wrap of member accesses This eases the transition in the next patch. We also get rid of a temporary cpumask in find_idlest_cpu() thanks to for_each_cpu_and, and sched_balance_self() due to getting weight before setting sd to NULL. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-24Merge branches 'sched/core', 'core/core' and 'tracing/core' into cpus4096Ingo Molnar
2008-11-24Merge branches 'tracing/branch-tracer', 'tracing/fastboot', ↵Ingo Molnar
'tracing/ftrace', 'tracing/function-return-tracer', 'tracing/power-tracer', 'tracing/powerpc', 'tracing/ring-buffer', 'tracing/stack-tracer' and 'tracing/urgent' into tracing/core
2008-11-24Merge branches 'core/debug', 'core/futexes', 'core/locking', 'core/rcu', ↵Ingo Molnar
'core/signal', 'core/urgent' and 'core/xen' into core/core
2008-11-24Merge branch 'sched/rt' into sched/coreIngo Molnar
2008-11-23eth: Declare an optimized compare_ether_addr_64bits() functionEric Dumazet
Linus mentioned we could try to perform long word operations, even on potentially unaligned addresses, on x86 at least. David mentioned the HAVE_EFFICIENT_UNALIGNED_ACCESS test to handle this on all arches that have efficient unailgned accesses. I tried this idea and got nice assembly on 32 bits: 158: 33 82 38 01 00 00 xor 0x138(%edx),%eax 15e: 33 8a 34 01 00 00 xor 0x134(%edx),%ecx 164: c1 e0 10 shl $0x10,%eax 167: 09 c1 or %eax,%ecx 169: 74 0b je 176 <eth_type_trans+0x87> And very nice assembly on 64 bits of course (one xor, one shl) Nice oprofile improvement in eth_type_trans(), 0.17 % instead of 0.41 %, expected since we remove 8 instructions on a fast path. This patch implements a compare_ether_addr_64bits() function, that uses the CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS ifdef to efficiently perform the 6 bytes comparison on all capable arches. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-23dccp: Set per-connection CCIDs via socket optionsGerrit Renker
With this patch, TX/RX CCIDs can now be changed on a per-connection basis, which overrides the defaults set by the global sysctl variables for TX/RX CCIDs. To make full use of this facility, the remaining patches of this patch set are needed, which track dependencies and activate negotiated feature values. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-23irq.h: remove padding from irq_desc on 64bitsRichard Kennedy
Impact: reduce struct irq_desc size struct irq_desc: reorder to remove padding on 64bits shrinks irq_desc to 128 bytes which saves data space & cache lines On a generic x86_64/SMP build this reduces the reported data size by 64k. Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23tracing/stack-tracer: introduce CONFIG_USER_STACKTRACE_SUPPORTTörök Edwin
Impact: cleanup User stack tracing is just implemented for x86, but it is not x86 specific. Introduce a generic config flag, that is currently enabled only for x86. When other arches implement it, they will have to SELECT USER_STACKTRACE_SUPPORT. Signed-off-by: Török Edwin <edwintorok@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23tracing/stack-tracer: fix style issuesTörök Edwin
Impact: cleanup Signed-off-by: Török Edwin <edwintorok@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23ftrace: add ftrace_off_permanentSteven Rostedt
Impact: add new API to disable all of ftrace on anomalies It case of a serious anomaly being detected (like something caught by lockdep) it is a good idea to disable all tracing immediately, without grabing any locks. This patch adds ftrace_off_permanent that disables the tracers, function tracing and ring buffers without a way to enable them again. This should only be used when something serious has been detected. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23ring-buffer: add tracing_off_permanentSteven Rostedt
Impact: feature to permanently disable ring buffer This patch adds a API to the ring buffer code that will permanently disable the ring buffer from ever recording. This should only be called when some serious anomaly is detected, and the system may be in an unstable state. When that happens, shutting down the recording to the ring buffers may be appropriate. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23trace: profile all if conditionalsSteven Rostedt
Impact: feature to profile if statements This patch adds a branch profiler for all if () statements. The results will be found in: /debugfs/tracing/profile_branch For example: miss hit % Function File Line ------- --------- - -------- ---- ---- 0 1 100 x86_64_start_reservations head64.c 127 0 1 100 copy_bootdata head64.c 69 1 0 0 x86_64_start_kernel head64.c 111 32 0 0 set_intr_gate desc.h 319 1 0 0 reserve_ebda_region head.c 51 1 0 0 reserve_ebda_region head.c 47 0 1 100 reserve_ebda_region head.c 42 0 0 X maxcpus main.c 165 Miss means the branch was not taken. Hit means the branch was taken. The percent is the percentage the branch was taken. This adds a significant amount of overhead and should only be used by those analyzing their system. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23trace: consolidate unlikely and likely profilerSteven Rostedt
Impact: clean up to make one profiler of like and unlikely tracer The likely and unlikely profiler prints out the file and line numbers of the annotated branches that it is profiling. It shows the number of times it was correct or incorrect in its guess. Having two different files or sections for that matter to tell us if it was a likely or unlikely is pretty pointless. We really only care if it was correct or not. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23trace: remove extra assign in branch checkSteven Rostedt
Impact: clean up of branch check The unlikely/likely profiler does an extra assign of the f.line. This is not needed since it is already calculated at compile time. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23irq.h: fix missing/extra kernel-docRandy Dunlap
Impact: fix kernel-doc build Fix missing & excess irq.h kernel-doc: Warning(include/linux/irq.h:182): No description found for parameter 'irq' Warning(include/linux/irq.h:182): Excess struct/union/enum/typedef member 'affinity_entry' description in 'irq_desc' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23Merge commit 'v2.6.28-rc6' into irq/urgentIngo Molnar
2008-11-23vfs, seqfile: make mangle_path() globalTörök Edwin
Impact: expose new VFS API make mangle_path() available, as per the suggestions of Christoph Hellwig and Al Viro: http://lkml.org/lkml/2008/11/4/338 Signed-off-by: Török Edwin <edwintorok@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23tracing: add support for userspace stacktraces in tracing/iter_ctrlTörök Edwin
Impact: add new (default-off) tracing visualization feature Usage example: mount -t debugfs nodev /sys/kernel/debug cd /sys/kernel/debug/tracing echo userstacktrace >iter_ctrl echo sched_switch >current_tracer echo 1 >tracing_enabled .... run application ... echo 0 >tracing_enabled Then read one of 'trace','latency_trace','trace_pipe'. To get the best output you can compile your userspace programs with frame pointers (at least glibc + the app you are tracing). Signed-off-by: Török Edwin <edwintorok@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23tracing/function-return-tracer: clean up task start/exit callbacksIngo Molnar
Impact: cleanup Eliminate #ifdefs in core code by using empty inline functions. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23tracing/function-return-tracer: store return stack into task_struct and ↵Frederic Weisbecker
allocate it dynamically Impact: use deeper function tracing depth safely Some tests showed that function return tracing needed a more deeper depth of function calls. But it could be unsafe to store these return addresses to the stack. So these arrays will now be allocated dynamically into task_struct of current only when the tracer is activated. Typical scheme when tracer is activated: - allocate a return stack for each task in global list. - fork: allocate the return stack for the newly created task - exit: free return stack of current - idle init: same as fork I chose a default depth of 50. I don't have overruns anymore. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23Merge branches 'tracing/profiling', 'tracing/options' and 'tracing/urgent' ↵Ingo Molnar
into tracing/core
2008-11-21netdevice hdlc: Convert directly reference of netdev->privWang Chen
For killing directly reference of netdev->priv, use netdev->ml_priv to replace it. Because the private pvc data comes from add_pvc() and can't be allocated in alloc_netdev(). Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Acked-by: Krzysztof Halasa <khc@pm.waw.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-20DCB: Add support for DCB BCNAlexander Duyck
Adds an interface to configure the Backward Congestion Notification (BCN) feature. In a BCN capabale network, congestion notifications from congested points out in the network can cause the end station limit the rate of a given traffic flow. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-20DCB: Add interface to query the state of PFC feature.Alexander Duyck
Adds a netlink interface for Data Center Bridging (DCB) to get and set the enable state of the Priority Flow Control (PFC) feature. Primarily, this is a way to turn off PFC in the driver while DCB remains enabled. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-20DCB: Add interface to query # of TCs supported by deviceAlexander Duyck
Adds interface for Data Center Bridging (DCB) to query (and set if supported) the number of traffic classes currently supported by the device for the two (DCB) features: priority groups (PG) and priority flow control (PFC). Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-20DCB: Add interface to query for the DCB capabilities of an device.Alexander Duyck
Adds to the netlink interface for Data Center Bridging (DCB), allowing the DCB capabilities supported by a device to be queried. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-20ixgbe: this patch adds support for DCB to the kernel and ixgbe driverAlexander Duyck
This adds support for Data Center Bridging (DCB) features in the ixgbe driver and adds an rtnetlink interface for configuring DCB to the kernel. The DCB feature support included are Priority Grouping (PG) - which allows bandwidth guarantees to be allocated to groups to traffic based on the 802.1q priority, and Priority Based Flow Control (PFC) - which introduces a new MAC control PAUSE frame which works at granularity of the 802.1p priority instead of the link (IEEE 802.3x). Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-20hippi: convert driver to net_device_opsStephen Hemminger
Convert the HIPPI infrastructure for use with net_device_ops. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-20fddi: convert to new network device opsStephen Hemminger
Similar to ethernet. Convert infrastructure and the one lone FDDI driver (for the one lone user of that hardware??). Compile tested only. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-20netdev: add more functions to netdevice opsStephen Hemminger
This patch moves neigh_setup and hard_start_xmit into the network device ops structure. For bisection, fix all the previously converted drivers as well. Bonding driver took the biggest hit on this. Added a prefetch of the hard_start_xmit in the fast path to try and reduce any impact this would have. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-20Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/ixgbe/ixgbe_main.c include/net/mac80211.h net/phonet/af_phonet.c
2008-11-20netfilter: ip{,6}t_policy.h should include xp_policy.hAndy Whitcroft
It seems that all of the include/netfilter_{ipv4,ipv6}/{ipt,ip6t}_*.h which share constants include the corresponding include/netfilter/xp_*.h files. Neither ipt_policy.h not ip6t_policy.h do. Make these consistant with the norm. Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-11-20pkt_sched: add DRR schedulerPatrick McHardy
Add classful DRR scheduler as a more flexible replacement for SFQ. The main difference to the algorithm described in "Efficient Fair Queueing using Deficit Round Robin" is that this implementation doesn't drop packets from the longest queue on overrun because its classful and limits are handled by each individual child qdisc. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-20netlink: avoid memset of 0 bytes sparse warningPatrick McHardy
A netlink attribute padding of zero triggers this sparse warning: include/linux/netlink.h:245:8: warning: memset with byte count of 0 Avoid the memset when the size parameter is constant and requires no padding. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>