Age | Commit message | Author
2010-05-24be2net: Bug fix in init code in probeSarveshwar Bandi
PCI function reset needs to be invoked after the fw init ioctl is issued. Signed-off-by: Sarveshwar Bandi <sarveshwarb@serverengines.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-24net/dccp: expansion of error code sizeYoichi Yuasa
The EDQUOT value on MIPS is 1133 (0x46d), which is larger than a u8 can hold, so the error code field is widened. Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
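To illustrate why the narrow field is a problem, here is a minimal user-space sketch of what happens when 1133 is stored in 8 bits. The function names are hypothetical, and the 16-bit replacement width is an assumption; the commit message only says the size was expanded.

```c
#include <assert.h>
#include <stdint.h>

/* Value quoted in the commit message: EDQUOT on MIPS. */
#define MIPS_EDQUOT 1133u

/* Hypothetical helper: an 8-bit field keeps only the low byte of the code. */
uint8_t store_in_u8(unsigned int err)
{
    return (uint8_t)err;
}

/* Hypothetical helper: a 16-bit field (assumed width) holds the full value. */
uint16_t store_in_u16(unsigned int err)
{
    assert(err <= UINT16_MAX);
    return (uint16_t)err;
}
```

With these helpers, `store_in_u8(MIPS_EDQUOT)` yields 109 (0x6d), a different error number, while `store_in_u16(MIPS_EDQUOT)` preserves 1133.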
2010-05-24Merge branch 'master' into develRussell King
2010-05-24Merge branch 'devel-stable' into develRussell King
2010-05-24Merge branch 'for-rmk/samsung3' of git://git.fluff.org/bjdooks/linux into devel-stableRussell King
Conflicts: arch/arm/Kconfig
2010-05-24ARM: 6141/1: Add audio support part in arch/arm/mach-w90x900wanzongshun
Add audio support part in arch/arm/mach-w90x900 Signed-off-by: Wan ZongShun <mcuos.com@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-05-24lib/atomic64_test: fix missing include of linux/kernel.hPeter Huewe
Fix a build-failure (http://kisskb.ellerman.id.au/kisskb/buildresult/2601239/) by adding the missing include file (linux/kernel.h) for printk and KERN_INFO. Signed-off-by: Peter Huewe <peterhuewe@gmx.de> LKML-Reference: <201005241913.o4OJDKdf010884@imap1.linux-foundation.org> Cc: Luca Barbieri <luca@luca-barbieri.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-05-24x86: remove last traces of quicklist usagePeter Zijlstra
We still have a stray quicklist header included even though we axed quicklist usage quite a while back. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <201005241913.o4OJDJe9010881@imap1.linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-05-24x86, setup: Phoenix BIOS fixup is needed on Dell Inspiron Mini 1012Gabor Gombas
The low-memory corruption checker triggers during suspend/resume, so we need to reserve the low 64k. Don't be fooled by the BIOS identifying itself as "Dell Inc."; it is still a Phoenix BIOS. [ hpa: I think we blacklist almost every BIOS in existence. We should either change this to a whitelist or just make it unconditional. ] Signed-off-by: Gabor Gombas <gombasg@digikabel.hu> LKML-Reference: <201005241913.o4OJDIMM010877@imap1.linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@kernel.org>
2010-05-24x86: "nosmp" command line option should force the system into UP modeJan Beulich
Bits set in cpu_possible_mask prior to the execution of prefill_possible_map() (i.e. when parsing ACPI or MPS tables) would prevent the SMP alternatives logic from switching to UP mode and would cause unnecessary setup of per-CPU data for CPUs that can never come online. Additionally, without CONFIG_HOTPLUG_CPU, disabled CPUs can never come online, and hence setting cpu_possible_mask bits for them is again a simple waste of resources. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <201005241913.o4OJDH3Z010874@imap1.linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-05-24arch/x86/pci: use kasprintfJulia Lawall
kasprintf combines kmalloc and sprintf, and takes care of the size calculation itself. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression a,flag;
expression list args;
statement S;
@@
  a =
- \(kmalloc\|kzalloc\)(...,flag)
+ kasprintf(flag,args)
  <... when != a
  if (a == NULL || ...) S
  ...>
- sprintf(a,args);
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk> LKML-Reference: <201005241913.o4OJDG3R010871@imap1.linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
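As a rough user-space analogue of what kasprintf does (measure, allocate, then format), the sketch below uses only standard C. The name alloc_sprintf is hypothetical and this is not the kernel implementation; in particular there is no gfp flag argument here.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical user-space stand-in for kasprintf(): compute the required
 * length first, allocate exactly that much, then format into the buffer.
 * Returns NULL on formatting or allocation failure; caller frees the result.
 */
char *alloc_sprintf(const char *fmt, ...)
{
    va_list ap, ap2;
    int len;
    char *buf;

    assert(fmt != NULL);

    va_start(ap, fmt);
    va_copy(ap2, ap);                   /* second copy for the real format pass */
    len = vsnprintf(NULL, 0, fmt, ap);  /* first pass: size calculation only */
    va_end(ap);

    if (len < 0) {
        va_end(ap2);
        return NULL;
    }

    buf = malloc((size_t)len + 1);      /* +1 for the terminating NUL */
    if (buf != NULL)
        vsnprintf(buf, (size_t)len + 1, fmt, ap2);
    va_end(ap2);
    return buf;
}
```

The point of the conversion is that the caller no longer has to guess a buffer size by hand, which removes one class of overflow bugs from the kmalloc-plus-sprintf pattern.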
2010-05-24x86, apic: ack all pending irqs when crashed/on kexecKerstin Jonsson
When the SMP kernel decides to crash_kexec() the local APICs may have pending interrupts in their vector tables. The setup routine for the local APIC has a deficient mechanism for clearing these interrupts: it safely handles only interrupts that have already been dispatched to the local core for servicing (the ISR register), and it does not consider lower-priority queued interrupts stored in the IRR register. If you have more than one pending interrupt within the same 32 bit word in the LAPIC vector table registers you may find yourself entering the IO APIC setup with pending interrupts left in the LAPIC. This is a situation for which the IO APIC setup is not prepared. Depending on which interrupt vectors are stuck in the APIC tables, your system may show various degrees of malfunction. That was the reason why check_timer() failed in our system: the timer interrupts were blocked by pending interrupts from the old kernel when routed through the IO APIC. Additional comment from Jiri Bohac: ============== If this should go into stable release, I'd add some kind of limit on the number of iterations, just to be safe from hard to debug lock-ups: +if (loops++ > MAX_LOOPS) { + printk("LAPIC pending clean-up") + break; +} while (queued); with MAX_LOOPS something like 1E9 this would leave plenty of time for the pending IRQs to be cleared and would still cause at most a second of delay if the loop were to lock up for whatever reason. [trenn@suse.de: V2: Use tsc if avail to bail out after 1 sec due to possible virtual apic_read calls which may take rather long (suggested by: Avi Kivity <avi@redhat.com>) If no tsc is available bail out quickly after cpu_khz, if we broke out too early and still have irqs pending (which should never happen?) we still get a WARN_ON... 
V3: - Fixed indentation -> checkpatch clean - max_loops must be signed V4: - Fix typo, mixed up tsc and ntsc in first rdtscll() call V5: Adjust WARN_ON() condition to also catch error in cpu_has_tsc case] Cc: <jbohac@novell.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Kerstin Jonsson <kerstin.jonsson@ericsson.com> Cc: Avi Kivity <avi@redhat.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Tested-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Thomas Renninger <trenn@suse.de> LKML-Reference: <201005241913.o4OJDGWM010865@imap1.linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-05-24ARM: 5939/1: ARM: Add option CMDLINE_FORCE to force usage of the in-kernel cmdlineAlexander Holler
Add an option to force usage of the in-kernel cmdline even if the boot loader passes another command string to the kernel. Useful if someone cannot or does not want to change the command-line options of the boot loader but is able to change the kernel. Signed-off-by: Alexander Holler <holler@ahsoftware.de> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-05-24ARM: 6140/1: silence a bogus sparse warning in unwind.cAlexander Shishkin
The check for compilers that are believed to miscompile unwind tables clearly has nothing to do with sparse (which does not define the necessary macros anyway), so simply silence it. Signed-off-by: Alexander Shishkin <virtuoso@slind.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-05-24ARM: mach-at91: duplicated includeAndrea Gelmini
arch/arm/mach-at91/board-sam9m10g45ek.c: mach/hardware.h is included more than once Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-05-24ARM: arch/arm/nwfpe/fpsr.h: Checkpatch cleanupAndrea Gelmini
arch/arm/nwfpe/fpsr.h:33: ERROR: trailing whitespace Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-05-24ARM: arch/arm/mach-shark/pci.c: Checkpatch cleanupAndrea Gelmini
arch/arm/mach-shark/pci.c:19: ERROR: trailing statements should be on next line arch/arm/mach-shark/pci.c:20: ERROR: trailing statements should be on next line arch/arm/mach-shark/pci.c:21: ERROR: trailing statements should be on next line arch/arm/mach-shark/pci.c:24: WARNING: externs should be avoided in .c files arch/arm/mach-shark/pci.c:28: WARNING: please, no space before tabs Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-05-24ARM: arch/arm/nwfpe/ChangeLog: Checkpatch cleanupAndrea Gelmini
arch/arm/nwfpe/ChangeLog:75: ERROR: trailing whitespace Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-05-24ARM: arch/arm/mach-sa1100/leds.c: Checkpatch cleanupAndrea Gelmini
arch/arm/mach-sa1100/leds.c:21: ERROR: code indent should use tabs where possible arch/arm/mach-sa1100/leds.c:21: WARNING: please, no space before tabs arch/arm/mach-sa1100/leds.c:22: ERROR: code indent should use tabs where possible arch/arm/mach-sa1100/leds.c:22: WARNING: please, no space before tabs arch/arm/mach-sa1100/leds.c:24: ERROR: code indent should use tabs where possible arch/arm/mach-sa1100/leds.c:24: WARNING: please, no space before tabs Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-05-24ARM: arch/arm/mach-h720x/common.h: Checkpatch cleanupAndrea Gelmini
arch/arm/mach-h720x/common.h:17: WARNING: space prohibited between function name and open parenthesis '(' arch/arm/mach-h720x/common.h:23: WARNING: space prohibited between function name and open parenthesis '(' Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-05-24ARM: arch/arm/mach-footbridge/ebsa285-pci.c: Checkpatch cleanupAndrea Gelmini
arch/arm/mach-footbridge/ebsa285-pci.c:22: ERROR: switch and case should be at the same indent Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-05-24ARM: arch/arm/mach-clps711x/Makefile.boot: Checkpatch cleanupAndrea Gelmini
arch/arm/mach-clps711x/Makefile.boot:2: ERROR: trailing whitespace Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-05-24ARM: arch/arm/boot/bootp/bootp.lds: Checkpatch cleanupAndrea Gelmini
arch/arm/boot/bootp/bootp.lds:22: ERROR: trailing whitespace Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-05-24ARM: SPEAR6xx: remove duplicated #includeHuang Weiyi
Remove duplicated #include('s) in arch/arm/mach-spear6xx/spear6xx.c Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-05-24ath9k: Fix rx of mcast/bcast frames in PS mode with auto sleepVasanthakumar Thiagarajan
The functionality that keeps the device awake until it has finished receiving any mcast/bcast frames pending on the AP should also be applied to hardware that supports the auto sleep feature. This patch fixes frequent failures in ARP resolution when it is initiated by the other end. Currently auto sleep is enabled only for ar9003 in ath9k. Signed-off-by: Vasanthakumar Thiagarajan <vasanth@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-05-24wireless: fix sta_info.h kernel-doc warningsRandy Dunlap
Fix sta_info.h kernel-doc warnings: Warning(net/mac80211/sta_info.h:164): No description found for parameter 'tid_active_rx[STA_TID_NUM]' Warning(net/mac80211/sta_info.h:164): Excess struct/union/enum/typedef member 'tid_state_rx' description in 'sta_ampdu_mlme' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-05-24wireless: fix mac80211.h kernel-doc warningsRandy Dunlap
Fix kernel-doc warnings in mac80211.h: Warning(include/net/mac80211.h:838): No description found for parameter 'ap_addr' Warning(include/net/mac80211.h:1726): No description found for parameter 'get_survey' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-05-24iwlwifi: testing the wrong variable in iwl_add_bssid_station()Dan Carpenter
The intent here is to test that "sta_id_r" is a valid pointer. We do this same test later on in the function. Btw iwl_add_bssid_station() is called from two places and "sta_id_r" is a valid pointer from both callers. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-05-24ath9k_htc: rare leak in ath9k_hif_usb_alloc_tx_urbs()Dan Carpenter
This is obviously a small picky thing. The original error handling code doesn't free the most recent allocations which haven't been added to the hif_dev->tx.tx_buf list yet. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Sujith <Sujith.Manoharan@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-05-24ath9k_htc: dereferencing before check in hif_usb_tx_cb()Dan Carpenter
After c11d8f89d3b7: "ath9k_htc: Simplify TX URB management" we no longer assume that tx_buf is a non-null pointer. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Sujith <Sujith.Manoharan@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-05-24rt2x00: Fix rt2800usb TX descriptor writing.Gertjan van Wingerde
The recent changes to skb handling introduced a bug in the rt2800usb TX descriptor writing whereby the length of the USB packet wasn't calculated correctly. Found via code inspection, as the devices themselves didn't seem to mind. Signed-off-by: Gertjan van Wingerde <gwingerde@gmail.com> Acked-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-05-24rt2x00: Fix failed SLEEP->AWAKE and AWAKE->SLEEP transitions.Gertjan van Wingerde
(Based on a patch created by Ondrej Zary) In some circumstances the Ralink devices do not properly go to sleep or wake up, with timeouts occurring. Fix this by retrying the request that tells the device to wake up or sleep. Signed-off-by: Gertjan van Wingerde <gwingerde@gmail.com> Acked-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-05-24Revert "ath9k: Group Key fix for VAPs"John W. Linville
This reverts commit 03ceedea972a82d343fa5c2528b3952fa9e615d5. This patch was reported to cause a regression in which connectivity is lost and cannot be reestablished after a suspend/resume cycle. Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-05-24wireless: update gfp/slab.h includesTejun Heo
Implicit slab.h inclusion via percpu.h is about to go away. Make sure gfp.h or slab.h is included as necessary. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-05-24rt2x00: don't use to_pci_dev in rt2x00pci_uninitializeHelmut Schaa
Don't use to_pci_dev in rt2x00pci_uninitialize to get the allocated irq as it won't work for platform devices (SoC). Instead, use the irq field that's already used everywhere else. Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com> Acked-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-05-24ath5k: consistently use rx_bufsize for RX DMABruno Randolf
We should use the same buffer size we set up for DMA also in the hardware descriptor. Previously we used common->rx_bufsize for setting up the DMA mapping, but used skb_tailroom(skb) for the size we tell the hardware in the descriptor itself. The problem is that skb_tailroom(skb) can give us a larger value than the size we set up for DMA before. This allows the hardware to write into memory locations not set up for DMA. In practice this should rarely happen because all packets should be smaller than the maximum 802.11 packet size. On the tested platform rx_bufsize is 2528, and we allocated an skb of 2559 bytes length (including padding for cache alignment) but skb_tailroom() was 2592. Just consistently use rx_bufsize for all RX DMA memory sizes. Also use the return value of the descriptor setup function. Cc: stable@kernel.org Signed-off-by: Bruno Randolf <br1@einfach.org> Reviewed-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
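The mismatch described above can be shown with the numbers from the message. This is a simplified model, not driver code: struct rx_buf and the helper names are hypothetical, and real code works with struct sk_buff and DMA mapping calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one RX buffer. */
struct rx_buf {
    size_t dma_mapped_len; /* size given to the DMA mapping (rx_bufsize) */
    size_t tailroom;       /* free space the skb reports (skb_tailroom) */
};

/* Old behaviour: descriptor length taken from the skb tailroom. */
size_t desc_len_old(const struct rx_buf *b)
{
    return b->tailroom;
}

/* Fixed behaviour: descriptor length matches the DMA mapping. */
size_t desc_len_fixed(const struct rx_buf *b)
{
    return b->dma_mapped_len;
}

/* The hardware may only write within the region that was mapped for DMA. */
int desc_len_is_safe(const struct rx_buf *b, size_t desc_len)
{
    assert(b != NULL);
    return desc_len <= b->dma_mapped_len;
}
```

With rx_bufsize 2528 and a tailroom of 2592, the old length would let the hardware write up to 64 bytes past the mapped region, while the fixed length stays inside it.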
2010-05-24slub: move kmem_cache_node into its own cachelineAlexander Duyck
This patch is meant to improve the performance of SLUB by moving the local kmem_cache_node lock into its own cacheline separate from kmem_cache. This is accomplished by simply removing the local_node when NUMA is enabled. On my system with 2 nodes I saw around a 5% performance increase w/ hackbench times dropping from 6.2 seconds to 5.9 seconds on average. I suspect the performance gain would increase as the number of nodes increases, but I do not currently have the data to back that up. Bugzilla-Reference: http://bugzilla.kernel.org/show_bug.cgi?id=15713 Cc: <stable@kernel.org> Reported-by: Alex Shi <alex.shi@intel.com> Tested-by: Alex Shi <alex.shi@intel.com> Acked-by: Yanmin Zhang <yanmin_zhang@linux.intel.com> Acked-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-05-24HID: Add the GYR4101US USB ID to hid-gyrationCory Maccarrone
This change adds in the USB product ID for the Gyration GYR4101US USB media center remote control. This remote is similar enough to the other two devices that this driver can be used without any other changes to get full support for the remote. Signed-off-by: Cory Maccarrone <darkstar6262@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-05-24Merge branch 'delayed-logging-for-2.6.35' into for-linusAlex Elder
2010-05-24[SCSI] ipr: improve interrupt service routine performanceWayne Boyer
During performance testing on P7 machines it was observed that the interrupt service routine was doing unnecessary MMIO operations. This patch rearranges the logic of the routine and moves some of the code out of the main routine. The result is that there are now fewer MMIO operations in the performance path of the code. Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com> Acked-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-05-24[SCSI] ipr: set the data list length in the request control blockWayne Boyer
In bring up testing for the new 64 bit adapters, the first read command failed after loading the driver. The cause was that the command requires more than one scatter gather element and the corresponding code to set the data list length in the request control block was missing. This patch adds the correct assignment. Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com> Acked-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-05-24[SCSI] ipr: fix a register read to use the correct address for 64 bit adaptersWayne Boyer
Fix ipr_reset_enable_ioa() to read the correct IOA to host interrupt register address for 64 bit adapters. We need to read the lower 32 bits, not the upper 32 bits. Also change the write of the 64 bit mask value to a single writeq instead of two writel calls. Finally, use the correct u8 type for the type field in the ipr_resource_entry structure. Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com> Acked-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-05-24[SCSI] ipr: include the resource path in the IOA status area structureWayne Boyer
The IOA status area now includes the new resource path field for 64 bit adapters. This patch changes the driver to fix the ioasa structure and to use the correct structure definition based on the type of adapter. Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com> Acked-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-05-24[SCSI] ipr: implement fixes for 64 bit adapter supportWayne Boyer
Implement some small fixes for 64 bit support that were preventing the adapter from becoming operational. Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com> Acked-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-05-24[SCSI] be2iscsi: correct return value in mgmt_invalidate_icds()Dan Carpenter
This function should return 0 on error. Returning -1 would cause a crash. Also there is an extra space before the newline character and a missing space between the "for" and the "mgmt_invalidate_icds". I put the string on one line. The current version of checkpatch.pl complains that the line is too long, but it makes grepping easier. Signed-off-by: Dan Carpenter <error27@gmail.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-05-24xfs: Ensure inode allocation buffers are fully replayedDave Chinner
With delayed logging, we can get inode allocation buffers in the same transaction as inode unlink buffers. We don't currently mark inode allocation buffers in the log, so inode unlink buffers take precedence over allocation buffers. The result is that when they are combined into the same checkpoint, only the unlinked inode chain fields are replayed, resulting in uninitialised inode buffers being detected when the next inode modification is replayed. To fix this, we need to ensure that we do not set the inode buffer flag in the buffer log item format flags if the inode allocation has not already hit the log. To avoid requiring a change to log recovery, we really need to make this a modification that relies only on in-memory state. We can do this by checking during buffer log formatting (while the CIL cannot be flushed) if we are still in the same sequence when we commit the unlink transaction as the inode allocation transaction. If we are, then we do not add the inode buffer flag to the buffer log format item flags. This means the entire buffer will be replayed, not just the unlinked fields. We do this while CIL flushes are locked out to ensure that we don't race with the sequence numbers changing and hence fail to put the inode buffer flag in the buffer format flags when we really need to. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-05-24xfs: enable background pushing of the CILDave Chinner
If we let the CIL grow without bound, it will grow large enough to violate recovery constraints (must be at least one complete transaction in the log at all times) or take forever to write out through the log buffers. Hence we need a check during asynchronous transactions as to whether the CIL needs to be pushed. We track the amount of log space the CIL consumes, so it is relatively simple to limit it on a pure size basis. Make the limit the minimum of just under half the log size (recovery constraint) or 8MB of log space (which is an awful lot of metadata). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
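The size limit described above (the smaller of just under half the log and 8MB) can be sketched as below. The function names are hypothetical, and taking "just under half" to mean half the log size minus one byte is an assumption made purely for illustration; the message does not give the exact margin.

```c
#include <assert.h>

/* 8MB cap on CIL log space, as stated in the commit message. */
#define CIL_HARD_CAP (8ULL * 1024 * 1024)

/*
 * Hypothetical helper: compute the space threshold above which a
 * background push of the CIL would be triggered. "Just under half" is
 * modelled here as (log_size / 2) - 1, which is an assumption.
 */
unsigned long long cil_push_threshold(unsigned long long log_size)
{
    assert(log_size >= 2);
    unsigned long long half = log_size / 2 - 1;
    return half < CIL_HARD_CAP ? half : CIL_HARD_CAP;
}

/* Hypothetical check run on transaction commit. */
int cil_needs_push(unsigned long long used, unsigned long long log_size)
{
    return used > cil_push_threshold(log_size);
}
```

For a 10MB log the recovery constraint dominates and the threshold is just under 5MB; for a 128MB log the 8MB cap applies instead.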
2010-05-24xfs: forced unmounts need to push the CILDave Chinner
If the filesystem is being shut down and there is no log error, the current code forces out the current log buffers. This code now needs to push the CIL before it forces out the log buffers to achieve the same result. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-05-24xfs: Introduce delayed logging core codeDave Chinner
The delayed logging code only changes in-memory structures and as such can be enabled and disabled with a mount option. Add the mount option and emit a warning that this is an experimental feature that should not be used in production yet. We also need infrastructure to track committed items that have not yet been written to the log. This is what the Committed Item List (CIL) is for. The log item also needs to be extended to track the current log vector, the associated memory buffer and its location in the Committed Item List. Extend the log item and log vector structures to enable this tracking. To maintain the current log format for transactions with delayed logging, we need to introduce a checkpoint transaction and a context for tracking each checkpoint from initiation to transaction completion. This includes adding a log ticket for tracking log space required/used by the context checkpoint. To track all the changes we need an io vector array per log item, rather than a single array for the entire transaction. Using the new log vector structure for this requires two passes - the first to allocate the log vector structures and chain them together, and the second to fill them out. This log vector chain can then be passed to the CIL for formatting, pinning and insertion into the CIL. Formatting of the log vector chain is relatively simple - it's just a loop over the iovecs on each log vector, but it is made slightly more complex because we re-write the iovec after the copy to point back at the memory buffer we just copied into. This code also needs to pin log items. If the log item is not already tracked in this checkpoint context, then it needs to be pinned. Otherwise it is already pinned and we don't need to pin it again. The only other complexity is calculating the amount of new log space the formatting has consumed. 
This needs to be accounted to the transaction in progress, and the accounting is made more complex because we also need to steal space from it for log metadata in the checkpoint transaction. Calculate all this at insert time and update all the tickets, counters, etc correctly. Once we've formatted all the log items in the transaction, attach the busy extents to the checkpoint context so the busy extents live until checkpoint completion and can be processed at that point in time. Transactions can then be freed at this point in time. Now we need to issue checkpoints - we are tracking the amount of log space used by the items in the CIL, so we can trigger background checkpoints when the space usage gets to a certain threshold. Otherwise, checkpoints need to be triggered when a log synchronisation point is reached - a log force event. Because the log write code already handles chained log vectors, writing the transaction is trivial, too. Construct a transaction header, add it to the head of the chain and write it into the log, then issue a commit record write. Then we can release the checkpoint log ticket and attach the context to the log buffer so it can be called during I/O completion to complete the checkpoint. We also need to allow for synchronising multiple in-flight checkpoints. This is needed for two things - the first is to ensure that checkpoint commit records appear in the log in the correct sequence order (so they are replayed in the correct order). The second is so that xfs_log_force_lsn() operates correctly and only flushes and/or waits for the specific sequence it was provided with. To do this we need a wait variable and a list tracking the checkpoint commits in progress. We can walk this list and wait for the checkpoints to change state or complete easily, and this provides the necessary synchronisation for correct operation in both cases. 
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-05-24xfs: Delayed logging design documentationDave Chinner
Document the design of the delayed logging implementation. This includes assumptions made, dead ends followed, the reasoning behind the structuring of the code, the layout of various structures, how things fit together, traps and pit-falls avoided, etc. This is all too much to document in the code itself, so do it in a separate file. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>