linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2010-07-24	net sched: fix race in mirred device removal	stephen hemminger
	This fixes hang when target device of mirred packet classifier action is removed. If a mirror or redirection action is configured to cause packets to go to another device, the classifier holds a ref count, but was assuming the adminstrator cleaned up all redirections before removing. The fix is to add a notifier and cleanup during unregister. The new list is implicitly protected by RTNL mutex because it is held during filter add/delete as well as notifier. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-24	sysfs: add attribute to indicate hw address assignment type	Stefan Assmann
	Add addr_assign_type to struct net_device and expose it via sysfs. This new attribute has the purpose of giving user-space the ability to distinguish between different assignment types of MAC addresses. For example user-space can treat NICs with randomly generated MAC addresses differently than NICs that have permanent (locally assigned) MAC addresses. For the former udev could write a persistent net rule by matching the device path instead of the MAC address. There's also the case of devices that 'steal' MAC addresses from slave devices. In which it is also be beneficial for user-space to be aware of the fact. This patch also introduces a helper function to assist adoption of drivers that generate MAC addresses randomly. Signed-off-by: Stefan Assmann <sassmann@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-24	Merge branch 'bugzilla-16396' into release	Len Brown

2010-07-24	ACPI / Sleep: Allow the NVS saving to be skipped during suspend to RAM	Rafael J. Wysocki
	Commit 2a6b69765ad794389f2fc3e14a0afa1a995221c2 (ACPI: Store NVS state even when entering suspend to RAM) caused the ACPI suspend code save the NVS area during suspend and restore it during resume unconditionally, although it is known that some systems need to use acpi_sleep=s4_nonvs for hibernation to work. To allow the affected systems to avoid saving and restoring the NVS area during suspend to RAM and resume, introduce kernel command line option acpi_sleep=nonvs and make acpi_sleep=s4_nonvs work as its alias temporarily (add acpi_sleep=s4_nonvs to the feature removal file). Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16396 . Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reported-and-tested-by: tomas m <tmezzadra@gmail.com> Signed-off-by: Len Brown <len.brown@intel.com>
2010-07-24	of: remove of_default_bus_ids	Jonas Bonn
	This list used was by only two platforms with all other platforms defining an own list of valid bus id's to pass to of_platform_bus_probe. This patch: i) copies the default list to the two platforms that depended on it (powerpc) ii) remove the usage of of_default_bus_ids in of_platform_bus_probe iii) removes the definition of the list from all architectures that defined it Passing a NULL 'matches' parameter to of_platform_bus_probe is still valid; the function returns no error in that case as the NULL value is equivalent to an empty list. Signed-off-by: Jonas Bonn <jonas@southpole.se> [grant.likely@secretlab.ca: added __initdata annotations, warn on and return error on missing match table, and fix whitespace errors] Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-07-24	of/device: Replace of_device with platform_device in includes and core code	Grant Likely
	of_device is currently just an #define alias to platform_device until it gets removed entirely. This patch removes references to it from the include directories and the core drivers/of code. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: David S. Miller <davem@davemloft.net>
2010-07-24	of: remove asm/of_device.h	Grant Likely
	It is mostly unused now. Sparc has a few defines left in it, but they can be moved to other headers. Removing this header means that new architectures adding CONFIG_OF support don't need to also add this header file. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: David S. Miller <davem@davemloft.net>
2010-07-24	of: remove asm/of_platform.h	Grant Likely
	Only thing left in it is of_instantiate_rtc() which can be moved to asm/prom.h on PowerPC and is unused in microblaze. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: David S. Miller <davem@davemloft.net>
2010-07-24	of/platform: remove all of_bus_type and of_platform_bus_type references	Grant Likely
	Both of_bus_type and of_platform_bus_type are just #define aliases for the platform bus. This patch removes all references to them and switches to the of_register_platform_driver()/of_unregister_platform_driver() API for registering. Subsequent patches will convert each user of of_register_platform_driver() into plain platform_drivers without the of_platform_driver shim. At which point the of_register_platform_driver()/of_unregister_platform_driver() functions can be removed. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: David S. Miller <davem@davemloft.net>
2010-07-24	of: Merge of_platform_bus_type with platform_bus_type	Grant Likely
	of_platform_bus was being used in the same manner as the platform_bus. The only difference being that of_platform_bus devices are generated from data in the device tree, and platform_bus devices are usually statically allocated in platform code. Having them separate causes the problem of device drivers having to be registered twice if it was possible for the same device to appear on either bus. This patch removes of_platform_bus_type and registers all of_platform bus devices and drivers on the platform bus instead. A previous patch made the of_device structure an alias for the platform_device structure, and a shim is used to adapt of_platform_drivers to the platform bus. After all of of_platform_bus drivers are converted to be normal platform drivers, the shim code can be removed. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: David S. Miller <davem@davemloft.net>
2010-07-24	Merge commit 'v2.6.35-rc6' into devicetree/next	Grant Likely
	Conflicts: arch/sparc/kernel/prom_64.c
2010-07-23	Merge branch 'master' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 Conflicts: drivers/net/wireless/iwlwifi/iwl-commands.h
2010-07-23	Merge branch 'merge' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: vmlinux.lds: fix .data..init_task output section (fix popwerpc boot) powerpc: Fix erroneous lmb->memblock conversions powerpc/mm: Add some debug output when hash insertion fails powerpc/mm: Fix bugs in huge page hashing powerpc/mm: Move around testing of _PAGE_PRESENT in hash code powerpc/mm: Handle hypervisor pte insert failure in __hash_page_huge powerpc/kexec: Fix boundary case for book-e kexec memory limits
2010-07-23	Merge branch 'perf-fixes-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf tools: Fix fallback to cplus_demangle() when bfd_demangle() is not available perf annotate: Fix handling of goto labels that are valid hex numbers tracing: Properly align linker defined symbols perf symbols: Fix directory descriptor leaking perf: Fix various display bugs with parent filtering
2010-07-23	timer: Added usleep[_range] timer	Patrick Pannuto
	usleep[_range] are finer precision implementations of msleep and are designed to be drop-in replacements for udelay where a precise sleep / busy-wait is unnecessary. They also allow an easy interface to specify slack when a precise (ish) wakeup is unnecessary to help minimize wakeups Signed-off-by: Patrick Pannuto <ppannuto@codeaurora.org> Cc: akinobu.mita@gmail.com Cc: sboyd@codeaurora.org Acked-by: Arjan van de Ven <arjan@linux.intel.com> LKML-Reference: <4C44CDD2.1070708@codeaurora.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-07-23	xt_quota: report initial quota value instead of current value to userspace	Changli Gao
	We should copy the initial value to userspace for iptables-save and to allow removal of specific quota rules. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-23	firewire: cdev: improve FW_CDEV_IOC_ALLOCATE	Stefan Richter
	In both the ieee1394 stack and the firewire stack, the core treats kernelspace drivers better than userspace drivers when it comes to CSR address range allocation: The former may request a register to be placed automatically at a free spot anywhere inside a specified address range. The latter may only request a register at a fixed offset. Hence, userspace drivers which do not require a fixed offset potentially need to implement a retry loop with incremented offset in each retry until the kernel does not fail allocation with EBUSY. This awkward procedure is not fundamentally necessary as the core already provides a superior allocation API to kernelspace drivers. Therefore change the ioctl() ABI by addition of a region_end member in the existing struct fw_cdev_allocate. Userspace and kernelspace APIs work the same way now. There is a small cost to pay by clients though: If client source code is required to compile with older kernel headers too, then any use of the new member fw_cdev_allocate.region_end needs to be enclosed by #ifdef/#endif directives. However, any client program that seriously wants to use address range allocations will require a kernel of cdev ABI version >= 4 at runtime and a linux/firewire-cdev.h header of >= 4 anyway. This is because v4 brings FW_CDEV_EVENT_REQUEST2. The only client program in which build-time compatibility with struct fw_cdev_allocate as found in older kernel headers makes sense is libraw1394. (libraw1394 uses the older broken FW_CDEV_EVENT_REQUEST to implement a makeshift, incorrect transaction responder that does at least work somewhat in many simple scenarios, relying on guesswork by libraw1394 and by libraw1394 based applications. Plus, address range allocation and transaction responder is only one of many features that libraw1394 needs to provide, and these other features need to work with kernel and kernel-headers as old as possible. Any new linux/firewire-cdev.h based client that implements a transaction responder should never attempt to do it like libraw1394; instead it should make a header and kernel of v4 or later a hard requirement.) While we are at it, update the struct fw_cdev_allocate documentation to better reflect the recent fw_cdev_event_request2 ABI addition. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2010-07-23	firewire: cdev: add PHY pinging	Stefan Richter
	This extends the FW_CDEV_IOC_SEND_PHY_PACKET ioctl() for /dev/fw* to be useful for ping time measurements. One application for it would be gap count optimization in userspace that is based on ping times rather than hop count. (The latter is implemented in firewire-core itself but is not applicable to beta PHYs that act as repeater.) Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2010-07-23	firewire: cdev: add PHY packet reception	Stefan Richter
	Add an FW_CDEV_IOC_RECEIVE_PHY_PACKETS ioctl() and FW_CDEV_EVENT_PHY_PACKET_RECEIVED poll()/read() event for /dev/fw. This can be used to get information from remote PHYs by remote access PHY packets. This is also the 2nd half of the functionality (the receive part) to support a userspace implementation of a VersaPHY transaction layer. Safety considerations: - PHY packets are generally broadcasts, hence some kind of elevated privileges should be required of a process to be able to listen in on PHY packets. This implementation assumes that a process that is allowed to open the /dev/fw of a local node does have this privilege. There was an inconclusive discussion about introducing POSIX capabilities as a means to check for user privileges for these kinds of operations. Other limitations: - PHY packet reception may be switched on by ioctl() but cannot be switched off again. It would be trivial to provide an off switch, but this is not worth the code. The client should simply close() the fd then, or just ignore further events. - For sake of simplicity of API and kernel-side implementation, no filter per packet content is provided. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2010-07-23	firewire: cdev: add PHY packet transmission	Stefan Richter
	Add an FW_CDEV_IOC_SEND_PHY_PACKET ioctl() for /dev/fw* which can be used to implement bus management related functionality in userspace. This is also half of the functionality (the transmit part) that is needed to support a userspace implementation of a VersaPHY transaction layer. Safety considerations: - PHY packets are generally broadcasts and may have interesting effects on PHYs and the bus, e.g. make asynchronous arbitration impossible due to too low gap count. Hence some kind of elevated privileges should be required of a process to be able to send PHY packets. This implementation assumes that a process that is allowed to open the /dev/fw* of a local node does have this privilege. There was an inconclusive discussion about introducing POSIX capabilities as a means to check for user privileges for these kinds of operations. - The kernel does not check integrity of the supplied packet data. That would be far too much code, considering the many kinds of PHY packets. A process which got the privilege to send these packets is trusted to do it correctly. Just like with the other "send packet" ioctls, a non-blocking API is chosen; i.e. the ioctl may return even before AT DMA started. After transmission, an event for poll()/read() is enqueued. Most users are going to need a blocking API, but a blocking userspace wrapper is easy to implement, and the second of the two existing libraw1394 calls raw1394_phy_packet_write() and raw1394_start_phy_packet_write() can be better supported that way. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2010-07-23	firewire: cdev: some clarifications to the API documentation	Stefan Richter
	Response events: - are generated on more occasions than their documentation claimed. CSR allocation: - An already occupied CSR can be determined from errno==EBUSY. Bus resets: - Note that FW_CDEV_IOC_INITIATE_BUS_RESET is nonblocking and that the client is not required to observe a grace period since kernels 2.6.36+ will enforce it now (commit 02d37bed). - The possible values of fw_cdev_initiate_bus_reset.type are listed in the kerneldoc comment already. - Clarify that an application that uses FW_CDEV_IOC_ADD_DESCRIPTOR and FW_CDEV_IOC_REMOVE_DESCRIPTOR does not have to issue a bus reset. Isochronous I/O contexts: - At most one can be created per open file descriptor. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2010-07-23	firewire: normalize status values in packet callbacks	Stefan Richter
	core-transaction.c transmit_complete_callback() and close_transaction() expect packet callback status to be an ACK or RCODE, and ACKs get translated to RCODEs for transaction callbacks. An old comment on the packet callback API (been there from the initial submission of the stack) and the dummy_driver implementation of send_request/send_response deviated from this as they also included -ERRNO in the range of status values. Let's narrow status values down to ACK and RCODE to prevent surprises. RCODE_CANCELLED is chosen as the dummy_driver's RCODE as its meaning of "transaction timed out" comes closest to what happens when a transaction coincides with card removal. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2010-07-23	slow-work: kill it	Tejun Heo
	slow-work doesn't have any user left. Kill it. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David Howells <dhowells@redhat.com>
2010-07-23	netfilter: add xt_cpu match	Eric Dumazet
	In some situations a CPU match permits a better spreading of connections, or select targets only for a given cpu. With Remote Packet Steering or multiqueue NIC and appropriate IRQ affinities, we can distribute trafic on available cpus, per session. (all RX packets for a given flow is handled by a given cpu) Some legacy applications being not SMP friendly, one way to scale a server is to run multiple copies of them. Instead of randomly choosing an instance, we can use the cpu number as a key so that softirq handler for a whole instance is running on a single cpu, maximizing cache effects in TCP/UDP stacks. Using NAT for example, a four ways machine might run four copies of server application, using a separate listening port for each instance, but still presenting an unique external port : iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 0 \ -j REDIRECT --to-port 8080 iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 1 \ -j REDIRECT --to-port 8081 iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 2 \ -j REDIRECT --to-port 8082 iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 3 \ -j REDIRECT --to-port 8083 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-23	quota: Use mark_inode_dirty_sync instead of mark_inode_dirty	Jan Kara
	Quota code never touches file data. It just modifies i_blocks + i_bytes of inodes and inode flags of quota files. So use mark_inode_dirty_sync instead of mark_inode_dirty. Signed-off-by: Jan Kara <jack@suse.cz>
2010-07-23	IPVS: make FTP work with full NAT support	Hannes Eder
	Use nf_conntrack/nf_nat code to do the packet mangling and the TCP sequence adjusting. The function 'ip_vs_skb_replace' is now dead code, so it is removed. To SNAT FTP, use something like: % iptables -t nat -A POSTROUTING -m ipvs --vaddr 192.168.100.30/32 \ --vport 21 -j SNAT --to-source 192.168.10.10 and for the data connections in passive mode: % iptables -t nat -A POSTROUTING -m ipvs --vaddr 192.168.100.30/32 \ --vportctl 21 -j SNAT --to-source 192.168.10.10 using '-m state --state RELATED' would also works. Make sure the kernel modules ip_vs_ftp, nf_conntrack_ftp, and nf_nat_ftp are loaded. [ up-port and minor fixes by Simon Horman <horms@verge.net.au> ] Signed-off-by: Hannes Eder <heder@google.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-23	netfilter: xt_ipvs (netfilter matcher for IPVS)	Hannes Eder
	This implements the kernel-space side of the netfilter matcher xt_ipvs. [ minor fixes by Simon Horman <horms@verge.net.au> ] Signed-off-by: Hannes Eder <heder@google.com> Signed-off-by: Simon Horman <horms@verge.net.au> [ Patrick: added xt_ipvs.h to Kbuild ] Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-23	Merge branch 'tip/perf/core' of ↵	Ingo Molnar
	git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core
2010-07-23	vmlinux.lds: fix .data..init_task output section (fix popwerpc boot)	Sam Ravnborg
	The .data..init_task output section was missing a load offset causing a popwerpc target to fail to boot. Sean MacLennan tracked it down to the definition of INIT_TASK_DATA_SECTION(). There are only two users of INIT_TASK_DATA_SECTION() in the kernel today: cris and popwerpc. cris do not support relocatable kernels and is thus not impacted by this change. Fix INIT_TASK_DATA_SECTION() to specify load offset like all other output sections. Reported-by: Sean MacLennan <smaclennan@pikatech.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-23	nilfs2: add feature set fields to super block	Ryusuke Konishi
	This adds three new fields to nilfs_super_block structure, compatible feature set, readonly-compatible feature set, and incompatible feature set in order to prepare for future disk format modifications. The role of these fields conforms to those of ext3 or other filesystems. Most important flags are the incompatible feature set; it is used to refuse to mount the filesystem which sets an incompatible feature the kernel doesn't know about. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2010-07-23	nilfs2: clarify byte offset in super block format	Ryusuke Konishi
	This inserts comments indicating hexadecimal offset in declaration of nilfs_super_block structure so that people can know offset of its fields without counting from the head. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2010-07-22	xen: Add suspend/resume support for PV on HVM guests.	Stefano Stabellini
	Suspend/resume requires few different things on HVM: the suspend hypercall is different; we don't need to save/restore memory related settings; except the shared info page and the callback mechanism. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-07-22	xen: Xen PCI platform device driver.	Stefano Stabellini
	Add the xen pci platform device driver that is responsible for initializing the grant table and xenbus in PV on HVM mode. Few changes to xenbus and grant table are necessary to allow the delayed initialization in HVM mode. Grant table needs few additional modifications to work in HVM mode. The Xen PCI platform device raises an irq every time an event has been delivered to us. However these interrupts are only delivered to vcpu 0. The Xen PCI platform interrupt handler calls xen_hvm_evtchn_do_upcall that is a little wrapper around __xen_evtchn_do_upcall, the traditional Xen upcall handler, the very same used with traditional PV guests. When running on HVM the event channel upcall is never called while in progress because it is a normal Linux irq handler (and we cannot switch the irq chip wholesale to the Xen PV ones as we are running QEMU and might have passed in PCI devices), therefore we cannot be sure that evtchn_upcall_pending is 0 when returning. For this reason if evtchn_upcall_pending is set by Xen we need to loop again on the event channels set pending otherwise we might loose some event channel deliveries. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-07-22	x86/xen: event channels delivery on HVM.	Sheng Yang
	Set the callback to receive evtchns from Xen, using the callback vector delivery mechanism. The traditional way for receiving event channel notifications from Xen is via the interrupts from the platform PCI device. The callback vector is a newer alternative that allow us to receive notifications on any vcpu and doesn't need any PCI support: we allocate a vector exclusively to receive events, in the vector handler we don't need to interact with the vlapic, therefore we avoid a VMEXIT. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-07-22	xen: Add support for HVM hypercalls.	Jeremy Fitzhardinge
	Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2010-07-22	Merge branch 'bugzilla-15886' into release	Len Brown

2010-07-22	drm: use workqueue instead of slow-work	Tejun Heo
	Workqueue can now handle high concurrency. Convert drm_crtc_helper to use system_nrt_wq instead of slow-work. The conversion is mostly straight forward. One difference is that drm_helper_hpd_irq_event() no longer blocks and can be called from any context. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David Airlie <airlied@linux.ie> Cc: dri-devel@lists.freedesktop.org
2010-07-22	fscache: drop references to slow-work	Tejun Heo
	fscache no longer uses slow-work. Drop references to it. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David Howells <dhowells@redhat.com>
2010-07-22	fscache: convert operation to use workqueue instead of slow-work	Tejun Heo
	Make fscache operation to use only workqueue instead of combination of workqueue and slow-work. FSCACHE_OP_SLOW is dropped and FSCACHE_OP_FAST is renamed to FSCACHE_OP_ASYNC and uses newly added fscache_op_wq workqueue to execute op->processor(). fscache_operation_init_slow() is dropped and fscache_operation_init() now takes @processor argument directly. * Unbound workqueue is used. * fscache_retrieval_work() is no longer necessary as OP_ASYNC now does the equivalent thing. * sysctl fscache.operation_max_active added to control concurrency. The default value is nr_cpus clamped between 2 and WQ_UNBOUND_MAX_ACTIVE. * debugfs support is dropped for now. Tracing API based debug facility is planned to be added. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David Howells <dhowells@redhat.com>
2010-07-22	fscache: convert object to use workqueue instead of slow-work	Tejun Heo
	Make fscache object state transition callbacks use workqueue instead of slow-work. New dedicated unbound CPU workqueue fscache_object_wq is created. get/put callbacks are renamed and modified to take @object and called directly from the enqueue wrapper and the work function. While at it, make all open coded instances of get/put to use fscache_get/put_object(). * Unbound workqueue is used. * work_busy() output is printed instead of slow-work flags in object debugging outputs. They mean basically the same thing bit-for-bit. * sysctl fscache.object_max_active added to control concurrency. The default value is nr_cpus clamped between 4 and WQ_UNBOUND_MAX_ACTIVE. * slow_work_sleep_till_thread_needed() is replaced with fscache private implementation fscache_object_sleep_till_congested() which waits on fscache_object_wq congestion. * debugfs support is dropped for now. Tracing API based debug facility is planned to be added. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David Howells <dhowells@redhat.com>
2010-07-22	ACPI: skip checking BM_STS if the BIOS doesn't ask for it	Len Brown
	It turns out that there is a bit in the _CST for Intel FFH C3 that tells the OS if we should be checking BM_STS or not. Linux has been unconditionally checking BM_STS. If the chip-set is configured to enable BM_STS, it can retard or completely prevent entry into deep C-states -- as illustrated by turbostat: http://userweb.kernel.org/~lenb/acpi/utils/pmtools/turbostat/ ref: Intel Processor Vendor-Specific ACPI Interface Specification table 4 "_CST FFH GAS Field Encoding" Bit 1: Set to 1 if OSPM should use Bus Master avoidance for this C-state https://bugzilla.kernel.org/show_bug.cgi?id=15886 Signed-off-by: Len Brown <len.brown@intel.com>
2010-07-22	net: RTA_MARK addition	Eric Dumazet
	Add a new rt attribute, RTA_MARK, and use it in rt_fill_info()/inet_rtm_getroute() to support following commands : ip route get 192.168.20.110 mark NUMBER ip route get 192.168.20.108 from 192.168.20.110 iif eth1 mark NUMBER ip route list cache [192.168.20.110] mark NUMBER Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-22	workqueue: fix how cpu number is stored in work->data	Tejun Heo
	Once a work starts execution, its data contains the cpu number it was on instead of pointing to cwq. This is added by commit 7a22ad75 (workqueue: carry cpu number in work data once execution starts) to reliably determine the work was last on even if the workqueue itself was destroyed inbetween. Whether data points to a cwq or contains a cpu number was distinguished by comparing the value against PAGE_OFFSET. The assumption was that a cpu number should be below PAGE_OFFSET while a pointer to cwq should be above it. However, on architectures which use separate address spaces for user and kernel spaces, this doesn't hold as PAGE_OFFSET is zero. Fix it by using an explicit flag, WORK_STRUCT_CWQ, to mark what the data field contains. If the flag is set, it's pointing to a cwq; otherwise, it contains a cpu number. Reported on s390 and microblaze during linux-next testing. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Sachin Sant <sachinp@in.ibm.com> Reported-by: Michal Simek <michal.simek@petalogix.com> Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Tested-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Tested-by: Michal Simek <monstr@monstr.eu>
2010-07-22	macvtap: Limit packet queue length	Herbert Xu
	Mark Wagner reported OOM symptoms when sending UDP traffic over a macvtap link to a kvm receiver. This appears to be caused by the fact that macvtap packet queues are unlimited in length. This means that if the receiver can't keep up with the rate of flow, then we will hit OOM. Of course it gets worse if the OOM killer then decides to kill the receiver. This patch imposes a cap on the packet queue length, in the same way as the tuntap driver, using the device TX queue length. Please note that macvtap currently has no way of giving congestion notification, that means the software device TX queue cannot be used and packets will always be dropped once the macvtap driver queue fills up. This shouldn't be a great problem for the scenario where macvtap is used to feed a kvm receiver, as the traffic is most likely external in origin so congestion notification can't be applied anyway. Of course, if anybody decides to complain about guest-to-guest UDP packet loss down the track, then we may have to revisit this. Incidentally, this patch also fixes a real memory leak when macvtap_get_queue fails. Chris Wright noticed that for this patch to work, we need a non-zero TX queue length. This patch includes his work to change the default macvtap TX queue length to 500. Reported-by: Mark Wagner <mwagner@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Chris Wright <chrisw@sous-sol.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-22	Merge branch 'for_linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb: sysrq,kdb: Use __handle_sysrq() for kdb's sysrq function debug_core,kdb: fix kgdb_connected bit set in the wrong place Fix merge regression from external kdb to upstream kdb repair gdbstub to match the gdbserial protocol specification kdb: break out of kdb_ll() when command is terminated
2010-07-22	CAN: Add Flexcan CAN controller driver	Marc Kleine-Budde
	This core is found on some Freescale SoCs and also some Coldfire SoCs. Support for Coldfire is missing though at the moment as they have an older revision of the core which does not have RX FIFO support. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Acked-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2010-07-22	x86 cpufreq, perf: Make trace_power_frequency cpufreq driver independent	Thomas Renninger
	and fix the broken case if a core's frequency depends on others. trace_power_frequency was only implemented in a rather ungeneric way in acpi-cpufreq driver's target() function only. -> Move the call to trace_power_frequency to cpufreq.c:cpufreq_notify_transition() where CPUFREQ_POSTCHANGE notifier is triggered. This will support power frequency tracing by all cpufreq drivers. trace_power_frequency did not trace frequency changes correctly when the userspace governor was used or when CPU cores' frequency depend on each other. -> Moving this into the CPUFREQ_POSTCHANGE notifier and pass the cpu which gets switched automatically fixes this. Robert Schoene provided some important fixes on top of my initial quick shot version which are integrated in this patch: - Forgot some changes in power_end trace (TP_printk/variable names) - Variable dummy in power_end must now be cpu_id - Use static 64 bit variable instead of unsigned int for cpu_id [akpm@linux-foundation.org: build fix] Signed-off-by: Thomas Renninger <trenn@suse.de> Cc: davej@codemonkey.org.uk Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Dave Jones <davej@codemonkey.org.uk> Acked-by: Arjan van de Ven <arjan@infradead.org> Cc: Robert Schoene <robert.schoene@tu-dresden.de> Tested-by: Robert Schoene <robert.schoene@tu-dresden.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2010-07-21	sysrq,kdb: Use __handle_sysrq() for kdb's sysrq function	Jason Wessel
	The kdb code should not toggle the sysrq state in case an end user wants to try and resume the normal kernel execution. Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2010-07-21	net: remove last uses of __attribute__((packed))	Gustavo F. Padovan
	Network code uses the __packed macro instead of __attribute__((packed)). Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-21	irda: Use __packed annotation instead IRDA_PACKED macro	Gustavo F. Padovan
	Remove IRDA_PACKED macro, which maps to __attribute__((packed)). IRDA is one of the last users of __attribute__((packet)). Networking code uses __packed now. Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi> Signed-off-by: David S. Miller <davem@davemloft.net>