summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-04-19Merge branch 'for-davem' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== A few stragglers hoping for 3.9, somewhat delayed due to my travels... On the mac80211 bits, Johannes says: "Sadly, I have another pull request -- the idle handling fix broke LED handling in some cases." and: "Yet one more! This fixes a fairly important/annoying bug -- when roaming between multiple APs of the same network, the system could get stuck thinking it was connected to the old one while it really wasn't." On top of that... Arend sends a brcmfmac patch that removes advertising a feature that isn't actually fully supported, and a brcmsmac patch that rearranges code to request firmware at IFF_UP to play more nicely with being built into the kernel. Felix gives us a minor ath9k_htc fix to support the newly released open source firmware, and an ath9k_hw initvals fix to improve device stability. Rafał Miłecki provides a fix for an ssb regression that caused a serious performance problem with b43. Zefir Kurtisi offers an ath9k fix to change some kmalloc flags to allow the DFS detector to be called in softirq context. Please let me know if there are problems. If these don't make 3.9, I'll just pull them into wireless-next -- just let me know if you want to do it that way! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19mei: revamp mei_amthif_irq_read_messageTomas Winkler
Rename the function to mei_amthif_irq_read_msg and change parameters order Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19tcp: call tcp_replace_ts_recent() from tcp_ack()Eric Dumazet
commit bd090dfc634d (tcp: tcp_replace_ts_recent() should not be called from tcp_validate_incoming()) introduced a TS ecr bug in slow path processing. 1 A > B P. 1:10001(10000) ack 1 <nop,nop,TS val 1001 ecr 200> 2 B < A . 1:1(0) ack 1 win 257 <sack 9001:10001,TS val 300 ecr 1001> 3 A > B . 1:1001(1000) ack 1 win 227 <nop,nop,TS val 1002 ecr 200> 4 A > B . 1001:2001(1000) ack 1 win 227 <nop,nop,TS val 1002 ecr 200> (ecr 200 should be ecr 300 in packets 3 & 4) Problem is tcp_ack() can trigger send of new packets (retransmits), reflecting the prior TSval, instead of the TSval contained in the currently processed incoming packet. Fix this by calling tcp_replace_ts_recent() from tcp_ack() after the checks, but before the actions. Reported-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19staging: comedi: drivers: free_irq() in comedi_legacy_detach()H Hartley Sweeten
All the legacy comedi drivers now call comedi_legacy_detach() either directly or as part of their (*detach). Move the free_irq() into comedi_legacy_detach() so that the drivers don't have to deal with it. For drivers that then only call comedi_legacy_detach() in their private (*detach), remove the private function and use the helper directly for the (*detach). The amplc_pc236 and ni_labpc drivers are hybrid legacy/PCI drivers. In the detach of a PCI board free_irq() still needs to be handled by the driver. The pcl724 and pcl726 drivers currently have the free_irq() #ifdef'ed out. The comedi_legacy_detach() function sanity checks that the irq has been requested before freeing it so they are safe to convert. For aesthetic reasons, move the #ifdef unused chunk in the pcl816 driver up to the previous #ifdef unused block. The pcmio and pcmuio drivers request multiple irqs and handle the freeing of them. Remove the 'dev->irq = irq[0]' in those drivers so that comedi_legacy_detach() does not attempt to free the irq. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Reviewed-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19staging: comedi: drivers: use comedi_legacy_detach()H Hartley Sweeten
Use comedi_legacy_detach() to release the I/O region requested by these drivers. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Reviewed-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19staging: comedi: skel: use comedi_legacy_detach()H Hartley Sweeten
Update the skeleton driver to use the new comedi_legacy_detach() helper in the (*detach) to release the I/O region. Also, update the comment about it. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Reviewed-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19staging: comedi: amplc_dio200: use comedi_legacy_detach()H Hartley Sweeten
The I/O region used by this driver is always requested using comedi_request_region(). The devpriv->io union is only used by the common code shared by the legacy and PCI drivers. Use the new comedi_legacy_detach() helper in the (*detach) to release the I/O region requested by this driver. That function will handle the proper sanity checking before releasing the resource. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Reviewed-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19staging: comedi: drivers: use comedi_legacy_detach() in simple driversH Hartley Sweeten
Use the new comedi_legacy_detach() helper in the (*detach) to release the I/O region requested by these drivers. Since the (*detach) for these drivers only releases the region, remove the private (*detach) functions and use comedi_legacy_detach() directly for the (*detach). Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Reviewed-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19staging: comedi: das1800: use comedi_legacy_detach()H Hartley Sweeten
Use the new comedi_legacy_detach() helper in the (*detach) to release the first I/O region requested by this driver. An additional I/O region is requested for some of the boards this driver supports. The iobase for that region is stored in the private data so that the (*detach) knows it needs to be released. Remove the extra cleanup in the (*attach) that releases the first region. For aesthetics, move the release of the additional region in the (*detach) so it follows the (*attach) order. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Reviewed-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19staging: comedi: das16m1: check for subdev_8255_init() failureH Hartley Sweeten
Make sure to check if subdev_8255_init() fails and propogate the error code. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Reviewed-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19staging: comedi: das16m1: use comedi_legacy_detach()H Hartley Sweeten
Use the new comedi_legacy_detach() helper in the (*detach) to release the first I/O region requested by this driver. An additional I/O region is requested by this driver for the 8255 device. Save the iobase for that region in the private data so that the (*detach) knows it needs to be released. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Reviewed-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19staging: comedi: das16: use comedi_legacy_detach()H Hartley Sweeten
Use the new comedi_legacy_detach() helper in the (*detach) to release the first I/O region requested by this driver. An additional I/O region is requested for some of the boards this driver supports. Save the iobase for that region in the private data so that the (*detach) knows it needs to be released. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Reviewed-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19staging: comedi: pcl812: use comedi_legacy_detach()H Hartley Sweeten
This driver does not follow the standard (*attach) (*detach) flow of the other drivers in comedi. Comedi drivers do not 'cleanup' any of the allocations made during the (*attach) if failures are encountered. If the (*attach) fails, the comedi core will call the (*detach) to handle any clenaup. In this driver, the function free_resources() handles all the cleanup. Remove the calls to this function during the (*attach). Since the (*detach) is then the only caller, remove the function and just put all the cleanup in the (*detach) function. Use the new comedi_legacy_detach() helper in the (*detach) to release the I/O region. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Reviewed-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19staging: comedi: drivers: introduce comedi_legacy_detach()H Hartley Sweeten
This function is intended to be used by the comedi legacy (ISA) drivers either directly as the (*detach) function or as a helper in the drivers private (*detach) function. Modify the comedi_request_region() helper so that it stores the 'len' of the region as well as the 'start' after the region has been successfuly allocated by request_region() in __comedi_request_region(). This region will then be automatically released detach of the driver by the comedi_legacy_detach() helper. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Reviewed-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19staging: rts5129: re-use kbasename()Andy Shevchenko
The custom filename function mostly repeats the kernel's kbasename. This patch simplifies it. The updated filename() will not check for the '\' in the filenames. It seems redundant in Linux. The __FILE__ macro always defined if we compile an existing file. Thus, NULL check is not needed there as well. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19staging: net: remove pc300 driverDaniel Borkmann
To quote the TODO from staging/net/: PC300: The driver is very broken and cannot work with the current TTY layer. It is inevitable to convert it to the new TTY API. If no one steps in to adopt the driver, it will be removed in the 3.7 release. Nothing has changed since more than _one_ year on this driver, thus just remove it since we already moved past 3.7. If somebody steps up and does a whole rework, he/she, of course, is free to resubmit it. Since this is the only one in the net directory, we can remove it as well. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19mei: revamp hbm state machineTomas Winkler
1. Rename init_clients_state to hbm_state and use MEI_HBM_ prefix for HBM states 2. Remove recvd_msg and use hbm state for synchronizing hbm protocol has successful start. We can wake up the hbm event from start response handler and remove the hack from the interrupt thread 3. mei_hbm_start_wait function encapsulate start completion waiting Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19USB: ti_usb_3410_5052: kill custom closing_waitJohan Hovold
Kill custom closing_wait implementation and let the tty-layer handle it instead. Note that the port drain-delay is set to three characters to keep the 20ms delay after wait_until_sent at low baudrates (1200 baud) during close. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19USB: ti_usb_3410_5052: remove redundant drain from break_ctlJohan Hovold
Remove redundant drain, which has already been handled by the tty-layer, from break_ctl. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19USB: ti_usb_3410_5052: query hardware-buffer status in chars_in_bufferJohan Hovold
Query hardware-buffer status in chars_in_buffer should the write fifo be empty. This is needed to make the tty layer wait for hardware buffers to drain on close. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19USB: ti_usb_3410_5052: remove lsr from port dataJohan Hovold
The line status register is only polled so let's not keep a possibly outdated value in the port data. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19USB: ti_usb_3410_5052: move write-fifo flushing to closeJohan Hovold
Move write-fifo flushing from ti_drain to close where it belongs. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19USB: io_ti: remove redundant wait_until_sentJohan Hovold
Remove redundant wait_until_sent, which has already been handled by the tty-layer, from break_ctl. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19USB: io_ti: fix TIOCGSERIALJohan Hovold
Fix regression introduced by commit f40d78155 ("USB: io_ti: kill custom closing_wait implementation") which made TIOCGSERIAL return the wrong value for closing_wait. Signed-off-by: Johan Hovold <jhovold@gmail.com> Cc: stable <stable@vger.kernel.org> # 3.9 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19USB: usbtmc: remove unnecessary memory allocationMing Lei
Inside usbtmc_ioctl_clear_out_halt()/usbtmc_ioctl_clear_in_halt(), usb_clear_halt() needn't any buffer to pass in, so remove the unnecessary memory allocation. Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19usbatm: fix potential NULL pointer dereferenceWei Yongjun
The dereference to 'instance' in the debug code should be moved below the NULL test. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Duncan Sands <baldrick@free.fr> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19mtdchar: remove no-longer-used vma helpersLinus Torvalds
With the conversion to vm_iomap_memory(), these vma helpers are no longer used. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-19vm: convert snd_pcm_lib_mmap_iomem() to vm_iomap_memory() helperLinus Torvalds
This is my example conversion of a few existing mmap users. The pcm mmap case is one of the more straightforward ones. Acked-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-19vm: convert fb_mmap to vm_iomap_memory() helperLinus Torvalds
This is my example conversion of a few existing mmap users. The fb_mmap() case is a good example because it is a bit more complicated than some: fb_mmap() mmaps one of two different memory areas depending on the page offset of the mmap (but happily there is never any mixing of the two, so the helper function still works). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-19vm: convert mtdchar mmap to vm_iomap_memory() helperLinus Torvalds
This is my example conversion of a few existing mmap users. The mtdchar case is actually disabled right now (and stays disabled), but I did it because it showed up on my "git grep", and I was familiar with the code due to fixing an overflow problem in the code in commit 9c603e53d380 ("mtdchar: fix offset overflow detection"). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-19vm: convert HPET mmap to vm_iomap_memory() helperLinus Torvalds
This is my example conversion of a few existing mmap users. The HPET case is simple, widely available, and easy to test (Clemens Ladisch sent a trivial test-program for it). Test-program-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-19Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: "Two more small fixups to the wacom driver" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: wacom - fix "can not retrieve extra class descriptor" for DTH2242 Input: wacom - DTH2242 Grip Pen id was off by one bit
2013-04-19Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse build fix from Miklos Szeredi: "This fixes android builds. The patch appears large, but is just search & replace." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: fix type definitions in uapi header
2013-04-19Input: wacom - fix "can not retrieve extra class descriptor" for DTH2242Ping Cheng
Same as Cintiq 24HDT, DTH2242 has two interfaces sharing one configuration. This patch ignores the second interface. Signed-off-by: Ping Cheng <pingc@wacom.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2013-04-19Input: wacom - DTH2242 Grip Pen id was off by one bitPing Cheng
Signed-off-by: Ping Cheng <pingc@wacom.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2013-04-19xen: resolve section mismatch warnings in xen-acpi-processorBen Guthro
The following resolves a section mismatch warning below in xen-acpi-processor introduced by 3fac10145b766a2244422788f62dc35978613fd8 [13/13] xen: Re-upload processor PM data to hypervisor after S3 resume (v2) Warning: WARNING: drivers/xen/built-in.o(.text+0x2056a): Section mismatch in reference from the function xen_upload_processor_pm_data() to the function .init.text:read_acpi_id() The function xen_upload_processor_pm_data() references the function __init read_acpi_id(). This is often because xen_upload_processor_pm_data lacks a __init annotation or the annotation of read_acpi_id is wrong. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Ben Guthro <benjamin.guthro@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-04-19pinctrl/pinconfig: add debug interfaceLaurent Meunier
This update adds a debugfs interface to modify a pin configuration for a given state in the pinctrl map. This allows to modify the configuration for a non-active state, typically sleep state. This configuration is not applied right away, but only when the state will be entered. This solution is mandated for us by HW validation: in order to test and verify several pin configurations during sleep without recompiling the software. Change log in this patch set; Take into account latest feedback from Stephen Warren: - stale comments update - improved code efficiency and readibility - limit size of global variable pinconf_dbg_conf - remove req_type as it can easily be added later when add/delete requests support is implemented Signed-off-by: Laurent Meunier <laurent.meunier@st.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2013-04-19arm64: Define readq and writeq for driver module usingChen Gang
when compiling with allmodconfig, CONFIG_64BIT=y the file drivers/base/regmap/regmap-mmio.c will use readq and writeq so we need implement these functions. Signed-off-by: Chen Gang <gang.chen@asianux.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-04-19Merge tag 'edac_amd_f16h' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/ras Pull AMD F16h support for amd64_edac from Borislav Petkov. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-19regulator: palmas: Don't update tstep register for SMPS3 and SMPS7Axel Lin
SMPS3 and SMPS7 do not have tstep_addr setting, so current code actually writes 0 to smps12_ctl (offset is 0) register when set_ramp_delay callback is called for SMPS3 and SMPS7. Signed-off-by: Axel Lin <axel.lin@ingics.com> Acked-by: Laxman Dewangan <ldewangan@nvidia.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2013-04-19amd64_edac: Add Family 16h supportAravind Gopalakrishnan
Add code to handle DRAM ECC errors decoding for Fam16h. Tested on Fam16h with ECC turned on using the mce_amd_inj facility and works fine. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> [ Boris: cleanups and clarifications ] Signed-off-by: Borislav Petkov <bp@suse.de>
2013-04-19mutex: Back out architecture specific check for negative mutex countWaiman Long
Linus suggested that probably all the supported architectures can allow a negative mutex count without incorrect behavior, so we can then back out the architecture specific change and allow the mutex count to go to any negative number. That should further reduce contention for non-x86 architecture. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Waiman Long <Waiman.Long@hp.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Chandramouleeswaran Aswin <aswin@hp.com> Cc: Davidlohr Bueso <davidlohr.bueso@hp.com> Cc: Norton Scott J <scott.norton@hp.com> Cc: Rik van Riel <riel@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: David Howells <dhowells@redhat.com> Cc: Dave Jones <davej@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1366226594-5506-5-git-send-email-Waiman.Long@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-19mutex: Queue mutex spinners with MCS lock to reduce cacheline contentionWaiman Long
The current mutex spinning code (with MUTEX_SPIN_ON_OWNER option turned on) allow multiple tasks to spin on a single mutex concurrently. A potential problem with the current approach is that when the mutex becomes available, all the spinning tasks will try to acquire the mutex more or less simultaneously. As a result, there will be a lot of cacheline bouncing especially on systems with a large number of CPUs. This patch tries to reduce this kind of contention by putting the mutex spinners into a queue so that only the first one in the queue will try to acquire the mutex. This will reduce contention and allow all the tasks to move forward faster. The queuing of mutex spinners is done using an MCS lock based implementation which will further reduce contention on the mutex cacheline than a similar ticket spinlock based implementation. This patch will add a new field into the mutex data structure for holding the MCS lock. This expands the mutex size by 8 bytes for 64-bit system and 4 bytes for 32-bit system. This overhead will be avoid if the MUTEX_SPIN_ON_OWNER option is turned off. The following table shows the jobs per minute (JPM) scalability data on an 8-node 80-core Westmere box with a 3.7.10 kernel. The numactl command is used to restrict the running of the fserver workloads to 1/2/4/8 nodes with hyperthreading off. +-----------------+-----------+-----------+-------------+----------+ | Configuration | Mean JPM | Mean JPM | Mean JPM | % Change | | | w/o patch | patch 1 | patches 1&2 | 1->1&2 | +-----------------+------------------------------------------------+ | | User Range 1100 - 2000 | +-----------------+------------------------------------------------+ | 8 nodes, HT off | 227972 | 227237 | 305043 | +34.2% | | 4 nodes, HT off | 393503 | 381558 | 394650 | +3.4% | | 2 nodes, HT off | 334957 | 325240 | 338853 | +4.2% | | 1 node , HT off | 198141 | 197972 | 198075 | +0.1% | +-----------------+------------------------------------------------+ | | User Range 200 - 1000 | +-----------------+------------------------------------------------+ | 8 nodes, HT off | 282325 | 312870 | 332185 | +6.2% | | 4 nodes, HT off | 390698 | 378279 | 393419 | +4.0% | | 2 nodes, HT off | 336986 | 326543 | 340260 | +4.2% | | 1 node , HT off | 197588 | 197622 | 197582 | 0.0% | +-----------------+-----------+-----------+-------------+----------+ At low user range 10-100, the JPM differences were within +/-1%. So they are not that interesting. The fserver workload uses mutex spinning extensively. With just the mutex change in the first patch, there is no noticeable change in performance. Rather, there is a slight drop in performance. This mutex spinning patch more than recovers the lost performance and show a significant increase of +30% at high user load with the full 8 nodes. Similar improvements were also seen in a 3.8 kernel. The table below shows the %time spent by different kernel functions as reported by perf when running the fserver workload at 1500 users with all 8 nodes. +-----------------------+-----------+---------+-------------+ | Function | % time | % time | % time | | | w/o patch | patch 1 | patches 1&2 | +-----------------------+-----------+---------+-------------+ | __read_lock_failed | 34.96% | 34.91% | 29.14% | | __write_lock_failed | 10.14% | 10.68% | 7.51% | | mutex_spin_on_owner | 3.62% | 3.42% | 2.33% | | mspin_lock | N/A | N/A | 9.90% | | __mutex_lock_slowpath | 1.46% | 0.81% | 0.14% | | _raw_spin_lock | 2.25% | 2.50% | 1.10% | +-----------------------+-----------+---------+-------------+ The fserver workload for an 8-node system is dominated by the contention in the read/write lock. Mutex contention also plays a role. With the first patch only, mutex contention is down (as shown by the __mutex_lock_slowpath figure) which help a little bit. We saw only a few percents improvement with that. By applying patch 2 as well, the single mutex_spin_on_owner figure is now split out into an additional mspin_lock figure. The time increases from 3.42% to 11.23%. It shows a great reduction in contention among the spinners leading to a 30% improvement. The time ratio 9.9/2.33=4.3 indicates that there are on average 4+ spinners waiting in the spin_lock loop for each spinner in the mutex_spin_on_owner loop. Contention in other locking functions also go down by quite a lot. The table below shows the performance change of both patches 1 & 2 over patch 1 alone in other AIM7 workloads (at 8 nodes, hyperthreading off). +--------------+---------------+----------------+-----------------+ | Workload | mean % change | mean % change | mean % change | | | 10-100 users | 200-1000 users | 1100-2000 users | +--------------+---------------+----------------+-----------------+ | alltests | 0.0% | -0.8% | +0.6% | | five_sec | -0.3% | +0.8% | +0.8% | | high_systime | +0.4% | +2.4% | +2.1% | | new_fserver | +0.1% | +14.1% | +34.2% | | shared | -0.5% | -0.3% | -0.4% | | short | -1.7% | -9.8% | -8.3% | +--------------+---------------+----------------+-----------------+ The short workload is the only one that shows a decline in performance probably due to the spinner locking and queuing overhead. Signed-off-by: Waiman Long <Waiman.Long@hp.com> Reviewed-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Chandramouleeswaran Aswin <aswin@hp.com> Cc: Norton Scott J <scott.norton@hp.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: David Howells <dhowells@redhat.com> Cc: Dave Jones <davej@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1366226594-5506-4-git-send-email-Waiman.Long@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-19mutex: Make more scalable by doing less atomic operationsWaiman Long
In the __mutex_lock_common() function, an initial entry into the lock slow path will cause two atomic_xchg instructions to be issued. Together with the atomic decrement in the fast path, a total of three atomic read-modify-write instructions will be issued in rapid succession. This can cause a lot of cache bouncing when many tasks are trying to acquire the mutex at the same time. This patch will reduce the number of atomic_xchg instructions used by checking the counter value first before issuing the instruction. The atomic_read() function is just a simple memory read. The atomic_xchg() function, on the other hand, can be up to 2 order of magnitude or even more in cost when compared with atomic_read(). By using atomic_read() to check the value first before calling atomic_xchg(), we can avoid a lot of unnecessary cache coherency traffic. The only downside with this change is that a task on the slow path will have a tiny bit less chance of getting the mutex when competing with another task in the fast path. The same is true for the atomic_cmpxchg() function in the mutex-spin-on-owner loop. So an atomic_read() is also performed before calling atomic_cmpxchg(). The mutex locking and unlocking code for the x86 architecture can allow any negative number to be used in the mutex count to indicate that some tasks are waiting for the mutex. I am not so sure if that is the case for the other architectures. So the default is to avoid atomic_xchg() if the count has already been set to -1. For x86, the check is modified to include all negative numbers to cover a larger case. The following table shows the jobs per minutes (JPM) scalability data on an 8-node 80-core Westmere box with a 3.7.10 kernel. The numactl command is used to restrict the running of the high_systime workloads to 1/2/4/8 nodes with hyperthreading on and off. +-----------------+-----------+------------+----------+ | Configuration | Mean JPM | Mean JPM | % Change | | | w/o patch | with patch | | +-----------------+-----------------------------------+ | | User Range 1100 - 2000 | +-----------------+-----------------------------------+ | 8 nodes, HT on | 36980 | 148590 | +301.8% | | 8 nodes, HT off | 42799 | 145011 | +238.8% | | 4 nodes, HT on | 61318 | 118445 | +51.1% | | 4 nodes, HT off | 158481 | 158592 | +0.1% | | 2 nodes, HT on | 180602 | 173967 | -3.7% | | 2 nodes, HT off | 198409 | 198073 | -0.2% | | 1 node , HT on | 149042 | 147671 | -0.9% | | 1 node , HT off | 126036 | 126533 | +0.4% | +-----------------+-----------------------------------+ | | User Range 200 - 1000 | +-----------------+-----------------------------------+ | 8 nodes, HT on | 41525 | 122349 | +194.6% | | 8 nodes, HT off | 49866 | 124032 | +148.7% | | 4 nodes, HT on | 66409 | 106984 | +61.1% | | 4 nodes, HT off | 119880 | 130508 | +8.9% | | 2 nodes, HT on | 138003 | 133948 | -2.9% | | 2 nodes, HT off | 132792 | 131997 | -0.6% | | 1 node , HT on | 116593 | 115859 | -0.6% | | 1 node , HT off | 104499 | 104597 | +0.1% | +-----------------+------------+-----------+----------+ At low user range 10-100, the JPM differences were within +/-1%. So they are not that interesting. AIM7 benchmark run has a pretty large run-to-run variance due to random nature of the subtests executed. So a difference of less than +-5% may not be really significant. This patch improves high_systime workload performance at 4 nodes and up by maintaining transaction rates without significant drop-off at high node count. The patch has practically no impact on 1 and 2 nodes system. The table below shows the percentage time (as reported by perf record -a -s -g) spent on the __mutex_lock_slowpath() function by the high_systime workload at 1500 users for 2/4/8-node configurations with hyperthreading off. +---------------+-----------------+------------------+---------+ | Configuration | %Time w/o patch | %Time with patch | %Change | +---------------+-----------------+------------------+---------+ | 8 nodes | 65.34% | 0.69% | -99% | | 4 nodes | 8.70% | 1.02% | -88% | | 2 nodes | 0.41% | 0.32% | -22% | +---------------+-----------------+------------------+---------+ It is obvious that the dramatic performance improvement at 8 nodes was due to the drastic cut in the time spent within the __mutex_lock_slowpath() function. The table below show the improvements in other AIM7 workloads (at 8 nodes, hyperthreading off). +--------------+---------------+----------------+-----------------+ | Workload | mean % change | mean % change | mean % change | | | 10-100 users | 200-1000 users | 1100-2000 users | +--------------+---------------+----------------+-----------------+ | alltests | +0.6% | +104.2% | +185.9% | | five_sec | +1.9% | +0.9% | +0.9% | | fserver | +1.4% | -7.7% | +5.1% | | new_fserver | -0.5% | +3.2% | +3.1% | | shared | +13.1% | +146.1% | +181.5% | | short | +7.4% | +5.0% | +4.2% | +--------------+---------------+----------------+-----------------+ Signed-off-by: Waiman Long <Waiman.Long@hp.com> Reviewed-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Chandramouleeswaran Aswin <aswin@hp.com> Cc: Norton: Scott J <scott.norton@hp.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: David Howells <dhowells@redhat.com> Cc: Dave Jones <davej@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1366226594-5506-3-git-send-email-Waiman.Long@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-19mutex: Move mutex spinning code from sched/core.c back to mutex.cWaiman Long
As mentioned by Ingo, the SCHED_FEAT_OWNER_SPIN scheduler feature bit was really just an early hack to make with/without mutex-spinning testable. So it is no longer necessary. This patch removes the SCHED_FEAT_OWNER_SPIN feature bit and move the mutex spinning code from kernel/sched/core.c back to kernel/mutex.c which is where they should belong. Signed-off-by: Waiman Long <Waiman.Long@hp.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Chandramouleeswaran Aswin <aswin@hp.com> Cc: Davidlohr Bueso <davidlohr.bueso@hp.com> Cc: Norton Scott J <scott.norton@hp.com> Cc: Rik van Riel <riel@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: David Howells <dhowells@redhat.com> Cc: Dave Jones <davej@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1366226594-5506-2-git-send-email-Waiman.Long@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-18cgroup: fix broken file xattrsLi Zefan
We should store file xattrs in struct cfent instead of struct cftype, because cftype is a type while cfent is object instance of cftype. For example each cgroup has a tasks file, and each tasks file is associated with a uniq cfent, but all those files share the same struct cftype. Alexey Kodanev reported a crash, which can be reproduced: # mount -t cgroup -o xattr /sys/fs/cgroup # mkdir /sys/fs/cgroup/test # setfattr -n trusted.value -v test_value /sys/fs/cgroup/tasks # rmdir /sys/fs/cgroup/test # umount /sys/fs/cgroup oops! In this case, simple_xattrs_free() will free the same struct simple_xattrs twice. tj: Dropped unused local variable @cft from cgroup_diput(). Cc: <stable@vger.kernel.org> # 3.8.x Reported-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2013-04-18HID: appleir: add support for Apple ir devicesBenjamin Tissoires
This driver was originally written by James McKenzie, updated by Greg Kroah-Hartman, further updated by Bastien Nocera, with suspend support added. I ported it to the HID subsystem, in order to simplify it a litle and allow lirc to use it through hiddev. More recent versions of the IR receiver are also supported through a patch by Alex Karpenko. The patch also adds support for the 2nd and 5th generation of the controller, and the menu key on newer brushed metal remotes. Tested-by: Fabien André <fabien.andre@gmail.com> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-04-18usb: storage: Fix link errorJoe Perches
Fix allmodconfig link error introduced by commit 75b9130e8a ("usb: storage: Add usb_stor_dbg, reduce object size") Export the symbol usb_stor_dbg. Add export.h Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-18Merge branch 'userns-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux Pull user-namespace fixes from Andy Lutomirski. * 'userns-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux: userns: Changing any namespace id mappings should require privileges userns: Check uid_map's opener's fsuid, not the current fsuid userns: Don't let unprivileged users trick privileged users into setting the id_map
2013-04-18Revert "[media] mfd: Add chip properties handling code for SI476X MFD"Mauro Carvalho Chehab
This reverts commit a118e9c122969a924061b48b6cc6bb73ccc8de4c. Requested-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>