2011-01-07  fs: use RCU in shrink_dentry_list to reduce lock nesting  (Nick Piggin)
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  fs: reduce dcache_inode_lock width in lru scanning  (Nick Piggin)
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  fs: dcache reduce prune_one_dentry locking  (Nick Piggin)
prune_one_dentry can avoid quite a bit of locking in the common case where ancestors have an elevated refcount. Alternatively, we could have gone the other way and made fewer trylocks in the case where d_count goes to zero, but that is probably less common. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  fs: dcache reduce d_parent locking  (Nick Piggin)
Use RCU to simplify locking in dget_parent. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  fs: dcache rationalise dget variants  (Nick Piggin)
dget_locked was a shortcut to avoid the lazy lru manipulation when we already held dcache_lock (lru manipulation was relatively cheap at that point). However, now that the lru lock is an innermost one, we never hold it at any caller, so the lock cost can now be avoided. We already have a well-working lazy dcache LRU, so it should be fine to defer LRU manipulations to scan time. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  fs: dcache reduce dcache_inode_lock  (Nick Piggin)
dcache_inode_lock can be avoided in d_delete() and d_materialise_unique() in cases where it is not required. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  fs: dcache reduce locking in d_alloc  (Nick Piggin)
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  fs: dcache reduce dput locking  (Nick Piggin)
It is possible to run dput without taking data structure locks up-front. In many cases where we don't kill the dentry anyway, these locks are not required. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
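As an illustration of the fast path described above (a minimal sketch, not the real dput(); the helper name and the use of the d_count field are purely illustrative):

#include <linux/dcache.h>
#include <linux/spinlock.h>

/* Take only the dentry's own d_lock and bail out early while the dentry
 * stays referenced, so no global locks are touched on the common path. */
static void dput_fastpath_sketch(struct dentry *dentry)
{
	if (!dentry)
		return;

	spin_lock(&dentry->d_lock);
	if (dentry->d_count > 1) {
		dentry->d_count--;	/* still in use: nothing more to do */
		spin_unlock(&dentry->d_lock);
		return;
	}
	spin_unlock(&dentry->d_lock);
	/* Slow path: the count would reach zero, so the remaining locks
	 * (lru, alias list, parent) are taken before killing the dentry. */
}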
2011-01-07  fs: dcache avoid starvation in dcache multi-step operations  (Nick Piggin)
Long lived dcache "multi-step" operations which retry on rename seq can be starved with a lot of rename activity. If they fail after the 1st pass, take the rename_lock for writing to avoid further starvation. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
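A minimal sketch of the two-pass pattern this describes (walk_tree() is a hypothetical placeholder for the multi-step read-side operation; rename_lock is the dcache rename seqlock):

#include <linux/seqlock.h>

extern seqlock_t rename_lock;			/* dcache rename seqlock */

static void walk_tree(void) { /* hypothetical multi-step read-side walk */ }

static void multi_step_op(void)
{
	unsigned int seq;

	seq = read_seqbegin(&rename_lock);
	walk_tree();				/* 1st pass: lock-free */
	if (!read_seqretry(&rename_lock, seq))
		return;				/* no rename raced with us */

	write_seqlock(&rename_lock);		/* 2nd pass excludes renames */
	walk_tree();
	write_sequnlock(&rename_lock);
}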
2011-01-07  fs: dcache remove dcache_lock  (Nick Piggin)
dcache_lock no longer protects anything. Remove it. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  fs: Use rename lock and RCU for multi-step operations  (Nick Piggin)
The remaining use of dcache_lock is to allow atomic, multi-step read-side operations over the directory tree by excluding modifications to the tree, and to walk in the leaf->root direction in the tree, where we don't have a natural d_lock ordering. This could be accomplished by taking every d_lock, but that would mean a huge number of locks and actually gets very tricky. Solve this instead by using the rename seqlock for multi-step read-side operations, and retry in case of a rename so we don't walk up the wrong parent. Concurrent dentry insertions are not serialised against. Concurrent deletes are tricky when walking up the directory: our parent might have been deleted while we dropped locks, so we also need to check and retry for that. We can also use the rename lock in cases where livelock is a worry (this is introduced in a subsequent patch). Signed-off-by: Nick Piggin <npiggin@kernel.dk>
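A minimal sketch of a leaf-to-root walk using RCU plus the rename seqcount, as described above (illustrative only; the deleted-parent recheck mentioned in the text is omitted here):

#include <linux/dcache.h>
#include <linux/rcupdate.h>
#include <linux/seqlock.h>

extern seqlock_t rename_lock;

static int path_depth(struct dentry *dentry)
{
	struct dentry *d;
	unsigned int seq;
	int depth;

restart:
	depth = 0;
	rcu_read_lock();
	seq = read_seqbegin(&rename_lock);
	for (d = dentry; !IS_ROOT(d); d = d->d_parent)
		depth++;			/* d_parent stays valid under RCU */
	rcu_read_unlock();
	if (read_seqretry(&rename_lock, seq))
		goto restart;			/* a rename moved things under us */
	return depth;
}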
2011-01-07  fs: increase d_name lock coverage  (Nick Piggin)
Cover d_name with d_lock in more cases, where there may be concurrent modification to it. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  fs: scale inode alias list  (Nick Piggin)
Add a new lock, dcache_inode_lock, to protect the inode's i_dentry list from concurrent modification. d_alias is also protected by d_lock. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  fs: dcache scale subdirs  (Nick Piggin)
Protect d_subdirs and d_child with d_lock, except in filesystems that aren't using dcache_lock for these anyway (eg. using i_mutex). Note: if we change the locking rule in future so that ->d_child protection is provided only with ->d_parent->d_lock, it may allow us to reduce some locking. But it would be an exception to an otherwise regular locking scheme, so we'd have to see some good results. Probably not worthwhile. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
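A sketch of the lock ordering this rule implies when linking a child under a parent (illustrative only; the exact name of the child-list field has varied across kernel versions):

#include <linux/dcache.h>
#include <linux/list.h>
#include <linux/spinlock.h>

static void attach_child(struct dentry *parent, struct dentry *child)
{
	spin_lock(&parent->d_lock);				 /* protects d_subdirs */
	spin_lock_nested(&child->d_lock, DENTRY_D_LOCK_NESTED); /* protects d_child   */
	child->d_parent = parent;
	list_add(&child->d_u.d_child, &parent->d_subdirs);
	spin_unlock(&child->d_lock);
	spin_unlock(&parent->d_lock);
}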
2011-01-07  fs: dcache scale d_unhashed  (Nick Piggin)
Protect the d_unhashed(dentry) condition with d_lock. This means keeping the DCACHE_UNHASHED bit in sync with hash manipulations. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  fs: dcache scale dentry refcount  (Nick Piggin)
Make d_count non-atomic and protect it with d_lock. This allows us to ensure a 0 refcount dentry remains 0 without dcache_lock. It is also fairly natural when we start protecting many other dentry members with d_lock. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
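A minimal sketch of what the non-atomic refcount buys (hypothetical helper, not the real dget variant): "pin this dentry only if it is still live" becomes a plain check-and-increment under d_lock, with no global lock.

#include <linux/dcache.h>
#include <linux/spinlock.h>

static struct dentry *grab_if_live(struct dentry *dentry)
{
	spin_lock(&dentry->d_lock);
	if (!dentry->d_count) {			/* zero: it may already be dying */
		spin_unlock(&dentry->d_lock);
		return NULL;
	}
	dentry->d_count++;			/* plain int, protected by d_lock */
	spin_unlock(&dentry->d_lock);
	return dentry;
}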
2011-01-07  fs: dcache scale lru  (Nick Piggin)
Add a new lock, dcache_lru_lock, to protect the dcache LRU list from concurrent modification. d_lru is also protected by d_lock, which allows LRU lists to be accessed without the lru lock, using RCU in future patches. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  fs: dcache scale hash  (Nick Piggin)
Add a new lock, dcache_hash_lock, to protect the dcache hash table from concurrent modification. d_hash is also protected by d_lock. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  hostfs: simplify locking  (Nick Piggin)
Remove dcache_lock locking from hostfs filesystem, and move it into dcache helpers. All that is required is a coherent path name. Protection from concurrent modification of the namespace after path name generation is not provided in current code, because dcache_lock is dropped before the path is used. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  fs: change d_hash for rcu-walk  (Nick Piggin)
Change d_hash so it may be called from lock-free RCU lookups. See similar patch for d_compare for details. For in-tree filesystems, this is just a mechanical change. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  fs: change d_compare for rcu-walk  (Nick Piggin)
Change d_compare so it may be called from lock-free RCU lookups. This does put significant restrictions on what may be done from the callback, however there don't seem to have been any problems with in-tree fses. If some strange use case pops up that _really_ cannot cope with the rcu-walk rules, we can just add new rcu-unaware callbacks, which would cause name lookup to drop out of rcu-walk mode. For in-tree filesystems, this is just a mechanical change. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
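A sketch of what an rcu-walk-safe comparison boils down to: compare only the length and bytes handed in, never block, and never dereference anything that might be freed. This is a standalone helper, not the actual ->d_compare prototype of this series:

#include <linux/string.h>

static int name_eq(unsigned int len, const char *str,
		   unsigned int name_len, const char *name)
{
	if (len != name_len)
		return 1;		/* dcache convention: nonzero means no match */
	return memcmp(str, name, len) ? 1 : 0;
}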
2011-01-07  fs: name case update method  (Nick Piggin)
smbfs and ncpfs want to update a live dentry name in-place. Rather than have them open-code the locking, provide a documented dcache API. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  jfs: don't overwrite dentry name in d_revalidate  (Nick Piggin)
Use vfat's method for dealing with negative dentries to preserve case, rather than overwrite dentry name in d_revalidate, which is a bit ugly and also gets in the way of doing lock-free path walking. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  cifs: don't overwrite dentry name in d_revalidate  (Nick Piggin)
Use vfat's method for dealing with negative dentries to preserve case, rather than overwrite dentry name in d_revalidate, which is a bit ugly and also gets in the way of doing lock-free path walking. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  fs: change d_delete semantics  (Nick Piggin)
Change d_delete from a dentry deletion notification to a dentry caching advice, more like ->drop_inode. Require it to be constant and idempotent, and not to take d_lock. This is how all existing filesystems use the callback anyway. This makes fine-grained dentry locking of dput and dentry lru scanning much simpler. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  fs: dcache documentation cleanup  (Nick Piggin)
Remove redundant (and incorrect, since dcache RCU lookup) dentry locking documentation and point to the canonical document. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  config fs: avoid switching ->d_op on live dentry  (Nick Piggin)
Switching d_op on a live dentry is racy in general, so avoid it. In this case it is a negative dentry, which is safer, but there are still concurrent ops which may be called on d_op in that case (eg. d_revalidate). So in general a filesystem may not do this. Fix configfs so as not to do this. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  cgroup fs: avoid switching ->d_op on live dentry  (Nick Piggin)
Switching d_op on a live dentry is racy in general, so avoid it. In this case it is a negative dentry, which is safer, but there are still concurrent ops which may be called on d_op in that case (eg. d_revalidate). So in general a filesystem may not do this. Fix cgroupfs so as not to do this. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  fs: use fast counters for vfs caches  (Nick Piggin)
percpu_counter library generates quite nasty code, so unless you need to dynamically allocate counters or take fast approximate value, a simple per cpu set of counters is much better.

The percpu_counter can never be made to work as well, because it has an indirection from pointer to percpu memory, and it can't use direct this_cpu_inc interfaces because it doesn't use static PER_CPU data, so code will always be worse.

In the fastpath, it is the difference between this:

	incl %gs:nr_dentry	# nr_dentry

and this:

	movl	percpu_counter_batch(%rip), %edx	# percpu_counter_batch,
	movl	$1, %esi	#,
	movq	$nr_dentry, %rdi	#,
	call	__percpu_counter_add	# (plus I clobber registers)

__percpu_counter_add:
	pushq	%rbp	#
	movq	%rsp, %rbp	#,
	subq	$32, %rsp	#,
	movq	%rbx, -24(%rbp)	#,
	movq	%r12, -16(%rbp)	#,
	movq	%r13, -8(%rbp)	#,
	movq	%rdi, %rbx	# fbc, fbc
#APP
# 216 "/home/npiggin/usr/src/linux-2.6/arch/x86/include/asm/thread_info.h" 1
	movq %gs:kernel_stack,%rax	#, pfo_ret__
# 0 "" 2
#NO_APP
	incl	-8124(%rax)	# <variable>.preempt_count
	movq	32(%rdi), %r12	# <variable>.counters, tcp_ptr__
#APP
# 78 "lib/percpu_counter.c" 1
	add %gs:this_cpu_off, %r12	# this_cpu_off, tcp_ptr__
# 0 "" 2
#NO_APP
	movslq	(%r12),%r13	#* tcp_ptr__, tmp73
	movslq	%edx,%rax	# batch, batch
	addq	%rsi, %r13	# amount, count
	cmpq	%rax, %r13	# batch, count
	jge	.L27	#,
	negl	%edx	# tmp76
	movslq	%edx,%rdx	# tmp76, tmp77
	cmpq	%rdx, %r13	# tmp77, count
	jg	.L28	#,
.L27:
	movq	%rbx, %rdi	# fbc,
	call	_raw_spin_lock	#
	addq	%r13, 8(%rbx)	# count, <variable>.count
	movq	%rbx, %rdi	# fbc,
	movl	$0, (%r12)	#,* tcp_ptr__
	call	_raw_spin_unlock	#
.L29:
#APP
# 216 "/home/npiggin/usr/src/linux-2.6/arch/x86/include/asm/thread_info.h" 1
	movq %gs:kernel_stack,%rax	#, pfo_ret__
# 0 "" 2
#NO_APP
	decl	-8124(%rax)	# <variable>.preempt_count
	movq	-8136(%rax), %rax	#, D.14625
	testb	$8, %al	#, D.14625
	jne	.L32	#,
.L31:
	movq	-24(%rbp), %rbx	#,
	movq	-16(%rbp), %r12	#,
	movq	-8(%rbp), %r13	#,
	leave
	ret
	.p2align 4,,10
	.p2align 3
.L28:
	movl	%r13d, (%r12)	# count,*
	jmp	.L29	#
.L32:
	call	preempt_schedule	#
	.p2align 4,,6
	jmp	.L31	#
	.size	__percpu_counter_add, .-__percpu_counter_add
	.p2align 4,,15

Signed-off-by: Nick Piggin <npiggin@kernel.dk>
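For contrast, a minimal sketch of the "simple per cpu set of counters" alternative (the counter name is illustrative): a static DEFINE_PER_CPU variable bumped with this_cpu_inc(), and summed over all CPUs only when somebody reads it.

#include <linux/percpu.h>

static DEFINE_PER_CPU(long, nr_foo);		/* illustrative counter */

static inline void foo_inc(void)
{
	this_cpu_inc(nr_foo);			/* a single instruction on x86 */
}

static long foo_read(void)
{
	long sum = 0;
	int cpu;

	for_each_possible_cpu(cpu)
		sum += per_cpu(nr_foo, cpu);
	return sum < 0 ? 0 : sum;		/* per-cpu skew can make this go negative */
}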
2011-01-07  vfs: revert per-cpu nr_unused counters for dentry and inodes  (Nick Piggin)
The nr_unused counters count the number of objects on an LRU, and as such they are synchronized with LRU object insertion and removal and scanning, and protected under the LRU lock. Making it per-cpu does not actually get any concurrency improvements because of this lock, and summing the counter is much slower, and incrementing/decrementing it costs more code size and is slower too. These counters should stay per-LRU, which currently means global. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  kernel: kmem_ptr_validate considered harmful  (Nick Piggin)
This is a nasty and error-prone API. It is no longer used; remove it. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07  fs: d_validate fixes  (Nick Piggin)
d_validate has been broken for a long time. kmem_ptr_validate does not guarantee that a pointer can be dereferenced if it can go away at any time. Even rcu_read_lock doesn't help, because the pointer might be queued in RCU callbacks but not executed yet. So the parent cannot be checked, nor the name hashed. The dentry pointer cannot be touched until it can be verified under lock. Hashing simply cannot be used. Instead, verify the parent/child relationship by traversing the parent's d_child list. It's slow, but only ncpfs and the destaged smbfs care about it at this point. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
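A sketch of the validation idea (not the actual patch; the child-list field name is illustrative): never dereference the untrusted pointer, only compare it against the pointers found on the parent's child list while holding the parent's d_lock.

#include <linux/dcache.h>
#include <linux/list.h>
#include <linux/spinlock.h>

static int dentry_is_child_of(struct dentry *suspect, struct dentry *parent)
{
	struct dentry *d;
	int found = 0;

	spin_lock(&parent->d_lock);
	list_for_each_entry(d, &parent->d_subdirs, d_u.d_child) {
		if (d == suspect) {		/* pointer comparison only */
			found = 1;
			break;
		}
	}
	spin_unlock(&parent->d_lock);
	return found;
}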
2011-01-06  Merge branch 'next' into for-linus  (Dmitry Torokhov)
Conflicts: include/linux/input.h
2011-01-07  ACPI / PM: Blacklist Averatec machine known to require acpi_sleep=nonvs  (Rafael J. Wysocki)
Apparently, Averatec AV1020-ED2 does not resume correctly without acpi_sleep=nonvs, so add it to the ACPI sleep blacklist. References: https://bugzilla.kernel.org/show_bug.cgi?id=16396#c86 Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-07  ACPI / PM: Report wakeup events from buttons  (Rafael J. Wysocki)
Since ACPI buttons and lids can be configured to wake up the system from sleep states, report wakeup events from these devices. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-07  ACPI / PM: Drop special ACPI wakeup flags  (Rafael J. Wysocki)
Drop special ACPI wakeup flags, wakeup.state.enabled and wakeup.flags.always_enabled, that aren't necessary any more after we've started to use standard device wakeup flags for handling ACPI wakeup devices. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-07  ACPI / PM: Use device wakeup flags for handling ACPI wakeup devices  (Rafael J. Wysocki)
There are ACPI devices (buttons and the laptop lid) that can wake up the system from sleep states and have no "physical" companion devices. The ACPI subsystem uses two flags, wakeup.state.enabled and wakeup.flags.always_enabled, for handling those devices, but they are not accessible through the standard device wakeup infrastructure. User space can only control them via the /proc/acpi/wakeup interface that is not really convenient (e.g. the way in which devices are enabled to wake up the system is not portable between different systems, because it requires one to know the devices' "names" used in the system's ACPI tables). To address this problem, use standard device wakeup flags instead of the special ACPI flags for handling those devices. In particular, use device_set_wakeup_capable() to mark the ACPI wakeup devices during initialization and use device_set_wakeup_enable() to allow or disallow them to wake up the system from sleep states. Rework the /proc/acpi/wakeup interface to take these changes into account. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
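A minimal sketch of wiring a wakeup device into the standard framework as described (the helper and its caller are illustrative, not the patched ACPI code):

#include <linux/device.h>
#include <linux/pm_wakeup.h>

static int setup_wakeup(struct device *dev, bool user_wants_wakeup)
{
	device_set_wakeup_capable(dev, true);			  /* device may wake the system */
	return device_set_wakeup_enable(dev, user_wants_wakeup); /* policy, e.g. set from sysfs */
}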
2011-01-07  ACPI / PM: Do not enable multiple devices to wake up simultaneously  (Rafael J. Wysocki)
If a device is enabled to wake up the system from sleep states via /proc/acpi/wakeup and there are other devices associated with the same wakeup GPE, all of these devices are automatically enabled to wake up the system. This isn't correct, because the fact the GPE is shared need not imply that wakeup power has to be enabled for all the devices at the same time (i.e. it is possible that one device will have its wakeup power enabled and it will wake up the system from a sleep state if the shared wakeup GPE is enabled, while another device having its wakeup power disabled will not wake up the system even though the GPE is enabled). Rework acpi_system_write_wakeup_device() so that it only enables wakeup for one device at a time. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-07  ACPI / ACPICA: Fix global lock acquisition  (Rafael J. Wysocki)
There are two problems with the ACPICA's current implementation of the global lock acquisition. First, acpi_ev_global_lock_handler(), which in fact is an interface to the outside of the kernel, doesn't validate its input, so it only works correctly if the other side (i.e. the ACPI firmware) is fully specification-compliant (as far as the global lock is concerned). Unfortunately, that's known not to be the case on some systems (i.e. we get spurious global lock signaling interrupts without the pending flag set on some systems). Second, acpi_ev_global_lock_handler() attempts to acquire the global lock on behalf of a thread waiting for it without checking if there actually is such a thread. Both of these shortcomings need to be addressed to prevent all possible race conditions from happening. Rework acpi_ev_global_lock_handler() so that it doesn't try to acquire the global lock and make it signal the availability of the global lock to the waiting thread instead. Make sure that the availability of the global lock can only be signaled when there is a thread waiting for it and that it can't be signaled more than once in a row (to keep acpi_gbl_global_lock_semaphore in balance). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-07  ACPI: Use ioremap_cache()  (Len Brown)
Although the temporary boot-time ACPI table mappings were set up with CPU caching enabled, the permanent table mappings and AML run-time region memory accesses were set up with ioremap(), which on x86 is a synonym for ioremap_nocache(). Changing this to ioremap_cache() improves performance as seen when accessing the tables via acpidump, or /sys/firmware/acpi/tables. It should also improve AML run-time performance. No change on ia64. Reported-by: Jack Steiner <steiner@sgi.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
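A minimal sketch of the change's shape (the wrapper is illustrative; address and size are placeholders):

#include <linux/io.h>

static void __iomem *map_table(resource_size_t phys, unsigned long size)
{
	return ioremap_cache(phys, size);	/* was: ioremap(phys, size), i.e. uncached */
}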
2011-01-07  ACPI / PM: Make suspend_nvs_save() use acpi_os_map_memory()  (Rafael J. Wysocki)
It turns out that the NVS memory region that suspend_nvs_save() attempts to map has already been mapped by acpi_os_map_memory(), so suspend_nvs_save() should use acpi_os_map_memory() itself for mapping that memory, to avoid conflicts. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-07  ACPI / PM: Update file information and the list of includes in nvs.c  (Rafael J. Wysocki)
The file information and the list of includes in drivers/acpi/nvs.c are outdated, so update them. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-07  PM / ACPI: Move NVS saving and restoring code to drivers/acpi  (Rafael J. Wysocki)
The saving of the ACPI NVS area during hibernation and suspend and restoring it during the subsequent resume is entirely specific to ACPI, so move it to drivers/acpi and drop the CONFIG_SUSPEND_NVS configuration option which is redundant. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-07  radeon: consolidate asic-specific function decls for pre-r600  (Daniel Vetter)
Move them to radeon_asic.h together with the other asic specific stuff. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-01-07  vga_switcheroo: comparing too few characters in strncmp()  (Dan Carpenter)
This is a copy-and-paste bug. We should be comparing 4 characters here instead of 3. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-01-07  PM: Fix oops in suspend/hibernate code related to failing ioremap()  (Jiri Slaby)
When ioremap() fails (which might happen for some reason), we nicely oops in suspend_nvs_save() due to NULL dereference by memcpy() in there. Fail gracefully instead. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
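A sketch of the shape of such a fix (struct and field names are illustrative, not the actual nvs code): check the mapping before copying instead of dereferencing NULL.

#include <linux/errno.h>
#include <linux/io.h>

struct nvs_chunk {
	unsigned long	phys;		/* physical address of the region */
	size_t		size;
	void		*data;		/* preallocated kernel copy       */
	void __iomem	*kaddr;		/* mapping, may fail              */
};

static int nvs_chunk_save(struct nvs_chunk *c)
{
	c->kaddr = ioremap(c->phys, c->size);
	if (!c->kaddr)
		return -ENOMEM;		/* fail gracefully instead of oopsing */
	memcpy_fromio(c->data, c->kaddr, c->size);
	return 0;
}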
2011-01-07  drm/radeon/kms: add NI pci ids  (Alex Deucher)
Signed-off-by: Alex Deucher <alexdeucher@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-01-07  drm/radeon/kms: don't enable pcie gen2 on NI yet  (Alex Deucher)
Still needs to be implemented. Signed-off-by: Alex Deucher <alexdeucher@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-01-07  drm/radeon/kms: add radeon_asic struct for NI asics  (Alex Deucher)
Signed-off-by: Alex Deucher <alexdeucher@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-01-07  drm/radeon/kms/ni: load default sclk/mclk/vddc at pm init  (Alex Deucher)
The vbios only partially initializes the memory controller on NI, so now we need to load the MC ucode in the driver and set the default clocks once the ucode is loaded. Signed-off-by: Alex Deucher <alexdeucher@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>