summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2011-01-10Merge branch 'staging-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6 * 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6: (510 commits) staging: speakup: fix failure handling staging: usbip: remove double giveback of URB Staging: batman-adv: Remove batman-adv from staging Staging: hv: Use only one txf buffer per channel and kmalloc/GFP_KERNEL on initialize staging: hv: remove unneeded osd_schedule_callback staging: hv: convert channel_mgmt.c to not call osd_schedule_callback staging: hv: convert vmbus_on_msg_dpc to not call osd_schedule_callback staging: brcm80211: Fix WL_<type> logging macros Staging: IIO: DDS: AD9833 / AD9834 driver Staging: IIO: dds.h convenience macros Staging: IIO: Direct digital synthesis abi documentation staging: brcm80211: Convert ETHER_TYPE_802_1X to ETH_P_PAE staging: brcm80211: Remove unused ETHER_TYPE_<foo> #defines staging: brcm80211: Remove ETHER_HDR_LEN, use ETH_HLEN staging: brcm80211: Convert ETHER_ADDR_LEN to ETH_ALEN staging: brcm80211: Convert ETHER_IS<FOO> to is_<foo>_ether_addr staging: brcm80211: Remove unused ether_<foo> #defines and struct staging: brcm80211: Convert ETHER_IS_MULTI to is_multicast_ether_addr staging: brcm80211: Remove unused #defines ETHER_<foo>_LOCALADDR Staging: comedi: Fix checkpatch.pl issues in file s526.c ... Fix up trivial conflict in drivers/video/udlfb.c
2011-01-10perf evsel: Support perf_evsel__open(cpus > 1 && threads > 1)Arnaldo Carvalho de Melo
And a test for it: [acme@felicio linux]$ perf test 1: vmlinux symtab matches kallsyms: Ok 2: detect open syscall event: Ok 3: detect open syscall event on all cpus: Ok [acme@felicio linux]$ Translating C the test does: 1. generates different number of open syscalls on each CPU by using sched_setaffinity 2. Verifies that the expected number of events is generated on each CPU It works as expected. LKML-Reference: <new-submission> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-10caif: don't set connection request param size before copying dataDan Rosenberg
The size field should not be set until after the data is successfully copied in. Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-10Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86: (36 commits) sony-laptop: support new hotkeys on the P, Z and EC series platform/x86: Consistently select LEDS Kconfig options sony-laptop: fix sparse non-ANSI function warning intel_ips: fix sparse non-ANSI function warning Support KHLB2 in the compal laptop driver acer-wmi: Enabled Acer Launch Manager mode [PATCH] intel_pmic_gpio: modify EOI handling following change of kernel irq subsystem ACPI Thinkpad: We must always call va_end() after va_start() but do not do so in thinkpad_acpi.c::acpi_evalf() acer-wmi: Initialize wlan/bluetooth/wwan rfkill software block state acer-wmi: Detect the WiFi/Bluetooth/3G devices available acer-wmi: Add 3G rfkill sysfs file acer-wmi: Add acer wmi hotkey events support platform/x86: Kconfig: Replace select by depends on ACPI_WMI ideapad: pass ideapad_priv as argument (part 2) ideapad: pass ideapad_priv as argument (part 1) ideapad: add markups, unify comments and return result when init ideapad: add hotkey support ideapad: let camera power control entry under platform driver ideapad: add platform driver for ideapad fujitsu-laptop: fix compiler warning on pnp_ids ...
2011-01-10Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68knommu: Need to check __get_user()/__put_user() result m68knommu: signal.c __user annotations m68knommu: Equivalent of "m68k: handle new gcc's" m68knommu: f_pcr has been gone since headers' merge m68knommu: Don't lose state if sigframe setup fails m68knommu: Handle multiple pending signals m68knommu: Switch to saner sigsuspend m68knommu: Don't bother with SA_ONESHOT m68k: Check __get_user()/__put_user() return value m68k: Missing syscall_trace() on sigreturn m68k: Fix stack mangling logics in sigreturn m68k: If we fail to set sigframe up, just leave regs alone... m68k: Don't lose state if sigframe setup fails m68k: Simplify the singlestepping handling in signals m68k: Switch to saner sigsuspend() m68k: Resetting sa_handler in local copy of k_sigaction is pointless m68k/sun3: Kill pte_unmap() warnings
2011-01-10Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] Avoid array overflow if there are too many cpus in SRAT table [IA64] Remove unlikely from cpu_is_offline [IA64] irq_ia64, use set_irq_chip [IA64] perfmon: Change vmalloc to vzalloc and drop memset. [IA64] eliminate race condition in smp_flush_tlb_mm
2011-01-10Merge branch 'for-torvalds' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson * 'for-torvalds' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson: ux500: allow 5500 and 8500 to be built together ux500: modem_irq is only for 5500 ux500: dynamic SOC detection ux500: rename MOP board Kconfig ux500: remove build-time changing macros
2011-01-10Merge branch 'msm-smp' of git://codeaurora.org/quic/kernel/davidb/linux-msmLinus Torvalds
* 'msm-smp' of git://codeaurora.org/quic/kernel/davidb/linux-msm: msm: add SMP support for msm msm: hotplug: support cpu hotplug on msm msm: timer: SMP timer support for msm msm: scm-boot: Support for setting cold/warm boot addresses msm: Secure Channel Manager (SCM) support
2011-01-10cxgb4vf: fix mailbox data/control coherency domain raceCasey Leedom
For the VFs, the Mailbox Data "registers" are actually backed by T4's "MA" interface rather than PL Registers (as is the case for the PFs). Because these are in different coherency domains, the write to the VF's PL-register-backed Mailbox Control can race in front of the writes to the MA-backed VF Mailbox Data "registers". So we need to do a read-back on at least one byte of the VF Mailbox Data registers before doing the write to the VF Mailbox Control register. Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-10qlcnic: change module parameter permissionsamit salecha
o Updating module parameter after driver load is not supported except auto_fw_reset parameter. Changing these parameter after driver load, can have weird result. o Update driver version to 5.0.15. Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-10qlcnic: fix ethtool diagnostics testSony Chacko
IRQ diag test was getting executed only when both register test and link test passed. The test should get executed if ETH_TEST_FL_OFFLINE flag is set. Signed-off-by: Sony Chacko <sony.chacko@qlogic.com> Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-10qlcnic: fix flash fw version readamit salecha
Driver is reading flash fw version from defined address, this address may be invalid. Indeed Driver should read address for fw version through flash layout table. Flash layout table has defined region and address for fw version address should be read from fw image region. Driver has check for old firmware, this bug can cause driver load fail. This patch will try to read fw version from flash image region, if that fails, read from defined address. Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-10phonet: some signedness bugsDan Carpenter
Dan Rosenberg pointed out that there were some signed comparison bugs in the phonet protocol. http://marc.info/?l=full-disclosure&m=129424528425330&w=2 The problem is that we check for array overflows but "protocol" is signed and we don't check for array underflows. If you have already have CAP_SYS_ADMIN then you could use the bugs to get root, or someone could cause an oops by mistake. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-10netdev: bfin_mac: let boards set vlan masksMike Frysinger
Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-10netdev: bfin_mac: disable hardware checksum if writeback cache is enabledSonic Zhang
With writeback caches, corrupted RX packets will be sent up the stack without any error markings. Signed-off-by: Sonic Zhang <sonic.zhang@analog.com> Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-10netdev: bfin_mac: drop unused Mac dataMike Frysinger
We don't use this local "Mac" data anywhere (since we rely on the netdev's storage), so punt it. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-10netdev: bfin_mac: mark setup_system_regs as staticMike Frysinger
No need for this to be exported since it is only used in this driver. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-10netdev: bfin_mac: clean up printk messagesMike Frysinger
Use netdev_* and pr_* helper funcs for output rather than printk. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-10i2c: Constify i2c_client where possibleJean Delvare
Helper functions for I2C and SMBus transactions don't modify the i2c_client that is passed to them, so it can be marked const. Signed-off-by: Jean Delvare <khali@linux-fr.org>
2011-01-10i2c-algo-bit: Complain about masters which can't read SCLJean Delvare
The I2C specification explicitly describes both SDA and SCL as bidirectional lines. An I2C master with a read-only SCL is thus not compliant. If a slow slave stretches the clock, errors will happen, so the bus can't be considered as reliable. Signed-off-by: Jean Delvare <khali@linux-fr.org>
2011-01-10i2c-algo-bit: Refactor adapter registrationJean Delvare
Use a function pointer to decide whether to call i2c_add_adapter or i2c_add_numbered_adapter. This makes the code more compact than the current strategy of having the common code in a separate function. Signed-off-by: Jean Delvare <khali@linux-fr.org>
2011-01-10i2c: Add generic I2C multiplexer using GPIO APIPeter Korsgaard
Add an i2c mux driver providing access to i2c bus segments using a hardware MUX sitting on a master bus and controlled through gpio pins. E.G. something like: ---------- ---------- Bus segment 1 - - - - - | | SCL/SDA | |-------------- | | | |------------| | | | | | Bus segment 2 | | | Linux | GPIO 1..N | MUX |--------------- Devices | |------------| | | | | | | | Bus segment M | | | |---------------| | ---------- ---------- - - - - - SCL/SDA of the master I2C bus is multiplexed to bus segment 1..M according to the settings of the GPIO pins 1..N. Signed-off-by: Peter Korsgaard <peter.korsgaard@barco.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2011-01-10i2c-nforce2: Remove unnecessary cast of pci_get_drvdataJoe Perches
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2011-01-10i2c-i801: Include <linux/slab.h>Ben Hutchings
Commit 5a0e3ad6af8660be21ca98a971cd00f331318c05 added direct inclusion of <linux/slab.h> to those source files that appeared to need it, but somehow missed this. On most architectures <linux/slab.h> is still indirectly included, but there are exceptions such as alpha. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2011-01-10IB/srp: consolidate hot-path variables into cache linesDavid Dillow
Put the variables accessed together in the hot-path into common cachelines, and separate them by RW vs RO to avoid false dirtying. We keep a local copy of the lkey and rkey in the target to avoid traversing pointers (and associated cache lines) to find them. Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: David Dillow <dillowda@ornl.gov>
2011-01-10IB/srp: stop sharing the host lock with SCSIBart Van Assche
We don't need protection against the SCSI stack, so use our own lock to allow parallel progress on separate CPUs. Signed-off-by: Bart Van Assche <bvanassche@acm.org> [ broken out and small cleanups by David Dillow ] Signed-off-by: David Dillow <dillowda@ornl.gov>
2011-01-10IB/srp: reduce lock coverage of command completionBart Van Assche
We only need the lock to cover list and credit manipulations, so push those into srp_remove_req() and update the call chains. We reorder the request removal and command completion in srp_process_rsp() to avoid the SCSI mid-layer sending another command before we've released our request and added any credits returned by the target. This prevents us from returning HOST_BUSY unneccesarily. Signed-off-by: Bart Van Assche <bvanassche@acm.org> [ broken out, small cleanups, and modified to avoid potential extraneous HOST_BUSY returns by David Dillow ] Signed-off-by: David Dillow <dillowda@ornl.gov>
2011-01-10IB/srp: reduce local coverage for command submission and EHBart Van Assche
We only need locks to protect our lists and number of credits available. By pre-consuming the credit for the request, we can reduce our lock coverage to just those areas. If we don't actually send the request, we'll need to put the credit back into the pool. Signed-off-by: Bart Van Assche <bvanassche@acm.org> [ broken out and small cleanups by David Dillow ] Signed-off-by: David Dillow <dillowda@ornl.gov>
2011-01-10IB/srp: don't move active requests to their own listBart Van Assche
We use req->scmnd != NULL to indicate an active request, so there's no need to keep a separate list for them. We can afford the array iteration during error handling, and dropping it gives us one less item that needs lock protection. Signed-off-by: Bart Van Assche <bvanassche@acm.org> [ broken out and small cleanups by David Dillow ] Signed-off-by: David Dillow <dillowda@ornl.gov>
2011-01-10staging: speakup: fix failure handlingWilliam Hubbs
fix the failure handling in kobjects and the main function so that we release the virtual keyboard if we exit due to another failure. Signed-off-by: William Hubbs <w.d.hubbs@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-01-10staging: usbip: remove double giveback of URBMárton Németh
In the vhci_urb_dequeue() function the TCP connection is checked twice. Each time when the TCP connection is closed the URB is unlinked and given back. Remove the second attempt of unlinking and giving back of the URB completely. This patch fixes the bug described at https://bugzilla.kernel.org/show_bug.cgi?id=24872 . Signed-off-by: Márton Németh <nm127@freemail.hu> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-01-10Merge branch 'bugfixes' into nfs-for-2.6.38Trond Myklebust
Conflicts: fs/nfs/nfs2xdr.c fs/nfs/nfs3xdr.c fs/nfs/nfs4xdr.c
2011-01-10NFS: Don't use vm_map_ram() in readdirTrond Myklebust
vm_map_ram() is not available on NOMMU platforms, and causes trouble on incoherrent architectures such as ARM when we access the page data through both the direct and the virtual mapping. The alternative is to use the direct mapping to access page data for the case when we are not crossing a page boundary, but to copy the data into a linear scratch buffer when we are accessing data that spans page boundaries. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Marc Kleine-Budde <mkl@pengutronix.de> Cc: stable@kernel.org [2.6.37]
2011-01-10Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (30 commits) MAINTAINERS: Add tomoyo-dev-en ML. SELinux: define permissions for DCB netlink messages encrypted-keys: style and other cleanup encrypted-keys: verify datablob size before converting to binary trusted-keys: kzalloc and other cleanup trusted-keys: additional TSS return code and other error handling syslog: check cap_syslog when dmesg_restrict Smack: Transmute labels on specified directories selinux: cache sidtab_context_to_sid results SELinux: do not compute transition labels on mountpoint labeled filesystems This patch adds a new security attribute to Smack called SMACK64EXEC. It defines label that is used while task is running. SELinux: merge policydb_index_classes and policydb_index_others selinux: convert part of the sym_val_to_name array to use flex_array selinux: convert type_val_to_struct to flex_array flex_array: fix flex_array_put_ptr macro to be valid C SELinux: do not set automatic i_ino in selinuxfs selinux: rework security_netlbl_secattr_to_sid SELinux: standardize return code handling in selinuxfs.c SELinux: standardize return code handling in selinuxfs.c SELinux: standardize return code handling in policydb.c ...
2011-01-10netfilter: x_tables: dont block BH while reading countersEric Dumazet
Using "iptables -L" with a lot of rules have a too big BH latency. Jesper mentioned ~6 ms and worried of frame drops. Switch to a per_cpu seqlock scheme, so that taking a snapshot of counters doesnt need to block BH (for this cpu, but also other cpus). This adds two increments on seqlock sequence per ipt_do_table() call, its a reasonable cost for allowing "iptables -L" not block BH processing. Reported-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Patrick McHardy <kaber@trash.net> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-01-10Input: add Austria Microsystem AS5011 joystick driverFabien Marteau
This is driver for EasyPoint AS5011 2 axis joystick chip. This chip is plugged on an I2C bus. Tested on ARM processor (i.MX27). Signed-off-by: Fabien Marteau <fabien.marteau@armadeus.com> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2011-01-10ext2: Resolve 'dereferencing pointer to incomplete type' when enabling ↵Josh Hunt
EXT2_XATTR_DEBUG When I enable EXT2_XATTR_DEBUG in fs/ext2/xattr.c I get a build error stating the following: CC fs/ext2/xattr.o fs/ext2/xattr.c: In function 'ext2_xattr_cache_insert': fs/ext2/xattr.c:841: error: dereferencing pointer to incomplete type fs/ext2/xattr.c:846: error: dereferencing pointer to incomplete type make[2]: *** [fs/ext2/xattr.o] Error 1 make[1]: *** [fs/ext2] Error 2 make: *** [fs] Error 2 These lines reference ext2_xattr_cache->c_entry_count which is defined in struct mb_cache. struct mb_cache is currently only defined in fs/mbcache.c. Moving struct mb_cache definition to include/linux/mbcache.h to resolve the issue. Signed-off-by: Josh Hunt <johunt@akamai.com> Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-10ext3: Remove redundant unlikely()Tobias Klauser
IS_ERR() already implies unlikely(), so it can be omitted here. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-10ext2: Remove redundant unlikely()Tobias Klauser
IS_ERR() already implies unlikely(), so it can be omitted here. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-10ext3: speed up file creates by optimizing rec_len functionsEric Sandeen
The addition of 64k block capability in the rec_len_from_disk and rec_len_to_disk functions added a bit of math overhead which slows down file create workloads needlessly when the architecture cannot even support 64k blocks, thanks to page size limits. Similar changes already exist in the ext4 codebase. The directory entry checking can also be optimized a bit by sprinkling in some unlikely() conditions to move the error handling out of line. bonnie++ sequential file creates on a 512MB ramdisk speeds up from about 77,000/s to about 82,000/s, about a 6% improvement. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-10ext2: speed up file creates by optimizing rec_len functionsEric Sandeen
The addition of 64k block capability in the rec_len_from_disk and rec_len_to_disk functions added a bit of math overhead which slows down file create workloads needlessly when the architecture cannot even support 64k blocks, thanks to page size limits. The directory entry checking can also be optimized a bit by sprinkling in some unlikely() conditions to move the error handling out of line. bonnie++ sequential file creates on a 512MB ramdisk speeds up from about 2200/s to about 2500/s, about a 14% improvement. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-10ext3: Add more journal error checkNamhyung Kim
Check return value of ext3_journal_get_write_acccess() and ext3_journal_dirty_metadata(). Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-10ext3: Add journal error check in resize.cNamhyung Kim
Check return value of ext3_journal_get_write_access() and ext3_journal_dirty_metadata(). Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-10quota: Use %pV and __attribute__((format (printf in __quota_error and fix ↵Joe Perches
fallout Use %pV in __quota_error so a single printk can not be interleaved with other logging messages. Add __attribute__((format (printf, 3, 4))) so format and arguments can be verified by compiler. Make sure printk formats and arguments match. Block # needed a pointer dereference. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-10ext3: Add FITRIM handlingLukas Czerner
The ioctl takes fstrim_range structure (defined in include/linux/fs.h) as an argument specifying a range of filesystem to trim and the minimum size of an continguous extent to trim. After the FITRIM is done, the number of bytes passed from the filesystem down the block stack to the device for potential discard is stored in fstrim_range.len. This number is a maximum discard amount from the storage device's perspective, because FITRIM called repeatedly will keep sending the same sectors for discard. fstrim_range.len will report the same potential discard bytes each time, but only sectors which had been written to between the discards would actually be discarded by the storage device. Further, the kernel block layer reserves the right to adjust the discard ranges to fit raid stripe geometry, non-trim capable devices in a LVM setup, etc. These reductions would not be reflected in fstrim_range.len. Thus fstrim_range.len can give the user better insight on how much storage space has potentially been released for wear-leveling, but it needs to be one of only one criteria the userspace tools take into account when trying to optimize calls to FITRIM. Thanks to Greg Freemyer <greg.freemyer@gmail.com> for better commit message. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-10ext3: Add batched discard support for ext3Lukas Czerner
Walk through allocation groups and trim all free extents. It can be invoked through FITRIM ioctl on the file system. The main idea is to provide a way to trim the whole file system if needed, since some SSD's may suffer from performance loss after the whole device was filled (it does not mean that fs is full!). It search for free extents in allocation groups specified by Byte range start -> start+len. When the free extent is within this range, blocks are marked as used and then trimmed. Afterwards these blocks are marked as free in per-group bitmap. [JK: Fixed up error handling and trimming of a single group] Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-10ext4: don't pass entire map to check_eofblocks_flEric Sandeen
Since check_eofblocks_fl() only uses the m_lblk portion of the map structure, we may as well pass that directly, rather than passing the entire map, which IMHO obfuscates what parameters check_eofblocks_fl() cares about. Not a big deal, but seems tidier and less confusing, to me. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10ext4: fix memory leak in ext4_free_branchesTheodore Ts'o
Commit 40389687 moved a call to ext4_forget() out of ext4_free_branches and let ext4_free_blocks() handle calling bforget(). But that change unfortunately did not replace the call to ext4_forget() with brelse(), which was needed to drop the in-use count of the indirect block's buffer head, which lead to a memory leak when deleting files that used indirect blocks. Fix this. Thanks to Hugh Dickins for pointing this out. Cc: stable@kernel.org Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10ext4: remove ext4_mb_return_to_preallocation()Theodore Ts'o
This function was never implemented, except for a BUG_ON which was tripping when ext4 is run without a journal. The problem is that although the comment asserts that "truncate (which is the only way to free block) discards all preallocations", ext4_free_blocks() is also called in various error recovery paths when blocks have been allocated, but for various reasons, we were not able to use those data blocks (for example, because we ran out of memory while trying to manipulate the extent tree, or some other similar situation). In addition to the fact that this function isn't implemented except for the incorrect BUG_ON, the single caller of this function, ext4_free_blocks(), doesn't use it all if the journal is enabled. So remove the (stub) function entirely for now. If we decide it's better to add it back, it's only going to be useful with a relatively large number of code changes anyway. Google-Bug-Id: 3236408 Cc: Jiaying Zhang <jiayingz@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10ext4: flush the i_completed_io_list during ext4_truncateJiaying Zhang
Ted first found the bug when running 2.6.36 kernel with dioread_nolock mount option that xfstests #13 complained about wrong file size during fsck. However, the bug exists in the older kernels as well although it is somehow harder to trigger. The problem is that ext4_end_io_work() can happen after we have truncated an inode to a smaller size. Then when ext4_end_io_work() calls ext4_convert_unwritten_extents(), we may reallocate some blocks that have been truncated, so the inode size becomes inconsistent with the allocated blocks. The following patch flushes the i_completed_io_list during truncate to reduce the risk that some pending end_io requests are executed later and convert already truncated blocks to initialized. Note that although the fix helps reduce the problem a lot there may still be a race window between vmtruncate() and ext4_end_io_work(). The fundamental problem is that if vmtruncate() is called without either i_mutex or i_alloc_sem held, it can race with an ongoing write request so that the io_end request is processed later when the corresponding blocks have been truncated. Ted and I have discussed the problem offline and we saw a few ways to fix the race completely: a) We guarantee that i_mutex lock and i_alloc_sem write lock are both hold whenever vmtruncate() is called. The i_mutex lock prevents any new write requests from entering writeback and the i_alloc_sem prevents the race from ext4_page_mkwrite(). Currently we hold both locks if vmtruncate() is called from do_truncate(), which is probably the most common case. However, there are places where we may call vmtruncate() without holding either i_mutex or i_alloc_sem. I would like to ask for other people's opinions on what locks are expected to be held before calling vmtruncate(). There seems a disagreement among the callers of that function. b) We change the ext4 write path so that we change the extent tree to contain the newly allocated blocks and update i_size both at the same time --- when the write of the data blocks is completed. c) We add some additional locking to synchronize vmtruncate() and ext4_end_io_work(). This approach may have performance implications so we need to be careful. All of the above proposals may require more substantial changes, so we may consider to take the following patch as a bandaid. Signed-off-by: Jiaying Zhang <jiayingz@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>