summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2010-04-15ide: fix comment typo in ide.hSergei Shtylyov
Fix typo in the comment to the 'dma_mode' field of the 'struct ide_drive_s' introduced by the commit 3fccaa192b9501e79a57e02e62b6bf420d2b461e (ide: add drive->dma_mode field). Whilt at it, convert spaces to a tab in the declaration of the neighbouring 'dn' field... Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-15Merge branch 'linus' into sched/coreIngo Molnar
Merge reason: merge the latest fixes, update to -rc4. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-15net: CONFIG_SMP should be CONFIG_RPSChangli Gao
Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-15sh: intc: IRQ auto-distribution support.Paul Mundt
This implements support for hardware-managed IRQ balancing as implemented by SH-X3 cores (presently only hooked up for SH7786, but can probably be carried over to other SH-X3 cores, too). CPUs need to specify their distribution register along with the mask definitions, as these follow the same format. Peripheral IRQs that don't opt out of balancing will be automatically distributed at the whim of the hardware block, while each CPU needs to verify whether it is handling the IRQ or not, especially before clearing the mask. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-04-14perf: Store active software events in a hashlistFrederic Weisbecker
Each time a software event triggers, we need to walk through the entire list of events from the current cpu and task contexts to retrieve a running perf event that matches. We also need to check a matching perf event is actually counting. This walk is wasteful and makes the event fast path scaling down with a growing number of events running on the same contexts. To solve this, we store the running perf events in a hashlist to get an immediate access to them against their type:event_id when they trigger. v2: - Fix SWEVENT_HLIST_SIZE definition (and re-learn some basic maths along the way) - Only allocate hlist for online cpus, but keep track of the refcount on offline possible cpus too, so that we allocate it if needed when it becomes online. - Drop the kref use as it's not adapted to our tricks anymore. v3: - Fix bad refcount check (address instead of value). Thanks to Eric Dumazet who spotted this. - While exiting cpu, move the hlist release out of the IPI path to lock the hlist mutex sanely. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu>
2010-04-14ARM: 6033/1: ARM: MMCI: pass max frequency from platformLinus Walleij
This introduce the field f_max into the mmci_platform_data, making it possible to pass in a desired block clocking frequency from a board configuration. This is often more desirable than using a module parameter. We keep the module parameter as a fallback as well as the default frequency specified for this parameter if a parameter is not provided. This also adds some kerneldoc style documentation to the platform data struct in mmci.h. Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-04-14stmmac: new descriptor field for the driver's platformGiuseppe CAVALLARO
The new enh_desc is used for selecting the enhanced descriptors structure. There are several scenarios; some chips (mac10/100 or gmac) want to use the enhanced descriptors; others want the normal ones. For example, on ST platforms: MAC10/100 uses the normal desc structure and the GMAC uses the enhanced one. It can be useful to get this information from the platform. This could also be decided at run-time looking at the chip's ID number; but it could happen that chips with the same ID want to use different descriptor structure. Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-14rcu: Better explain the condition parameter of rcu_dereference_check()David Howells
Better explain the condition parameter of rcu_dereference_check() that describes the conditions under which the dereference is permitted to take place (and incorporate Yong Zhang's suggestion). This condition is only checked under lockdep proving. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: eric.dumazet@gmail.com LKML-Reference: <1270852752-25278-2-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-14rcu: Add rcu_access_pointer and rcu_dereference_protectedPaul E. McKenney
This patch adds variants of rcu_dereference() that handle situations where the RCU-protected data structure cannot change, perhaps due to our holding the update-side lock, or where the RCU-protected pointer is only to be fetched, not dereferenced. These are needed due to some performance concerns with using rcu_dereference() where it is not required, aside from the need for lockdep/sparse checking. The new rcu_access_pointer() primitive is for the case where the pointer is be fetch and not dereferenced. This primitive may be used without protection, RCU or otherwise, due to the fact that it uses ACCESS_ONCE(). The new rcu_dereference_protected() primitive is for the case where updates are prevented, for example, due to holding the update-side lock. This primitive does neither ACCESS_ONCE() nor smp_read_barrier_depends(), so can only be used when updates are somehow prevented. Suggested-by: David Howells <dhowells@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com Cc: eric.dumazet@gmail.com LKML-Reference: <1270852752-25278-1-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-14ARM: MXC: mxcmmc: work around a bug in the SDHC busy line handlingDaniel Mack
MX3 SoCs have a silicon bug which corrupts CRC calculation of multi-block transfers when connected SDIO peripheral doesn't drive the BUSY line as required by the specs. One way to prevent this is to only allow 1-bit transfers. Another way is playing tricks with the DMA engine, but this isn't mainline yet. So for now, we live with the performance drawback of 1-bit transfers until a nicer solution is found. This patch introduces a new host controller callback 'init_card' which is for now only called from mmc_sdio_init_card(). Signed-off-by: Daniel Mack <daniel@caiaq.de> Cc: Sascha Hauer <s.hauer@pengutronix.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Volker Ernst <volker.ernst@txtr.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Michał Mirosław <mirqus@gmail.com> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
2010-04-13Input: add driver for hampshire serial touchscreensAdam Bennett
Adds support for Hampshire TSHARC serial touchscreens. Implements Hampshire's 4-byte communication protocol. Signed-off-by: Adam Bennett <abennett72@gmail.com> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2010-04-13Input: add Analog Devices AD714x captouch input driverBryan Wu
AD7142 and AD7147 are integrated capacitance-to-digital converters (CDCs) with on-chip environmental calibration for use in systems requiring a novel user input method. The AD7142 and AD7147 can interface to external capacitance sensors implementing functions such as buttons, scrollwheels, sliders, touchpads and so on. The chips don't restrict the specific usage. Depending on the hardware connection, one special target board can include one or several these components. The platform_data for the device's "struct device" holds these information. The data-struct defined in head file descript the hardware feature of button/scrollwheel/slider/touchpad components on target boards, which need be filled in the arch/mach-/. As the result, the driver is independent of boards. It gets the components layout from the platform_data, registers related devices, fullfills the algorithms and state machines for these components and report related input events to up level. Signed-off-by: Bryan Wu <cooloney@kernel.org> Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Barry Song <21cnbao@gmail.com> Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2010-04-13Input: implement SysRq as a separate input handlerDmitry Torokhov
Instead of keeping SysRq support inside of legacy keyboard driver split it out into a separate input handler (filter). This stops most SysRq input events from leaking into evdev clients (some events, such as first SysRq scancode - not keycode - event, are still leaked into both legacy keyboard and evdev). [martinez.javier@gmail.com: fix compile error when CONFIG_MAGIC_SYSRQ is not defined] Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2010-04-13Merge branch 'for-linus' into nextDmitry Torokhov
2010-04-13ipv4: ipmr: support multiple tablesPatrick McHardy
This patch adds support for multiple independant multicast routing instances, named "tables". Userspace multicast routing daemons can bind to a specific table instance by issuing a setsockopt call using a new option MRT_TABLE. The table number is stored in the raw socket data and affects all following ipmr setsockopt(), getsockopt() and ioctl() calls. By default, a single table (RT_TABLE_DEFAULT) is created with a default routing rule pointing to it. Newly created pimreg devices have the table number appended ("pimregX"), with the exception of devices created in the default table, which are named just "pimreg" for compatibility reasons. Packets are directed to a specific table instance using routing rules, similar to how regular routing rules work. Currently iif, oif and mark are supported as keys, source and destination addresses could be supported additionally. Example usage: - bind pimd/xorp/... to a specific table: uint32_t table = 123; setsockopt(fd, IPPROTO_IP, MRT_TABLE, &table, sizeof(table)); - create routing rules directing packets to the new table: # ip mrule add iif eth0 lookup 123 # ip mrule add oif eth0 lookup 123 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-13ipv4: ipmr: convert struct mfc_cache to struct list_headPatrick McHardy
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-13ipv4: ipmr: remove net pointer from struct mfc_cachePatrick McHardy
Now that cache entries in unres_queue don't need to be distinguished by their network namespace pointer anymore, we can remove it from struct mfc_cache add pass the namespace as function argument to the functions that need it. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-13net: fib_rules: decouple address families from real address familiesPatrick McHardy
Decouple the address family values used for fib_rules from the real address families in socket.h. This allows to use fib_rules for code that is not a real address family without increasing AF_MAX/NPROTO. Values up to 127 are reserved for real address families and map directly to the corresponding AF value, values starting from 128 are for other uses. rtnetlink is changed to invoke the AF_UNSPEC dumpit/doit handlers for these families. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-13Merge branch 'master' into for-2.6.35Jens Axboe
Conflicts: block/blk-cgroup.c block/cfq-iosched.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-04-13genirq: Remove IRQF_DISABLED from core codeThomas Gleixner
Remove all code which is related to IRQF_DISABLED from the core kernel code. IRQF_DISABLED still exists as a flag, but becomes a NOOP and will be removed after a grace period. That way we can easily revert to the previous behaviour by just restoring the core code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Miller <davem@davemloft.net> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Linus Torvalds <torvalds@osdl.org> LKML-Reference: <20100326000405.991244690@linutronix.de>
2010-04-13genirq: Introduce request_any_context_irq()Marc Zyngier
Now that we enjoy threaded interrupts, we're starting to see irq_chip implementations (wm831x, pca953x) that make use of threaded interrupts for the controller, and nested interrupts for the client interrupt. It all works very well, with one drawback: Drivers requesting an IRQ must now know whether the handler will run in a thread context or not, and call request_threaded_irq() or request_irq() accordingly. The problem is that the requesting driver sometimes doesn't know about the nature of the interrupt, specially when the interrupt controller is a discrete chip (typically a GPIO expander connected over I2C) that can be connected to a wide variety of otherwise perfectly supported hardware. This patch introduces the request_any_context_irq() function that mostly mimics the usual request_irq(), except that it checks whether the irq level is configured as nested or not, and calls the right backend. On success, it also returns either IRQC_IS_HARDIRQ or IRQC_IS_NESTED. [ tglx: Made return value an enum, simplified code and made the export of request_any_context_irq GPL ] Signed-off-by: Marc Zyngier <maz@misterjones.org> Cc: <joachim.eastwood@jotron.com> LKML-Reference: <927ea285bd0c68934ddae1a47e44a9ba@localhost> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-04-13netfilter: ipv6: add IPSKB_REROUTED exclusion to NF_HOOK/POSTROUTING invocationJan Engelhardt
Similar to how IPv4's ip_output.c works, have ip6_output also check the IPSKB_REROUTED flag. It will be set from xt_TEE for cloned packets since Xtables can currently only deal with a single packet in flight at a time. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Acked-by: David S. Miller <davem@davemloft.net> [Patrick: changed to use an IP6SKB value instead of IPSKB] Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-04-13Restore __ALIGN_MASK()Alexey Dobriyan
Fix lib/bitmap.c compile failure due to __ALIGN_KERNEL changes. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-04-13time: Remove xtime_cacheJohn Stultz
With the earlier logarithmic time accumulation patch, xtime will now always be within one "tick" of the current time, instead of possibly half a second off. This removes the need for the xtime_cache value, which always stored the time at the last interrupt, so this patch cleans that up removing the xtime_cache related code. This patch also addresses an issue with an earlier version of this change, where xtime_cache was normalizing xtime, which could in some cases be not valid (ie: tv_nsec == NSEC_PER_SEC). This is fixed by handling the edge case in update_wall_time(). Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Petr Titěra <P.Titera@century.cz> LKML-Reference: <1270589451-30773-1-git-send-email-johnstul@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-04-13net: uninline skb_bond_should_drop()Eric Dumazet
skb_bond_should_drop() is too big to be inlined. This patch reduces kernel text size, and its compilation time as well (shrinking include/linux/netdevice.h) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-13Fix some #includes in CAN drivers (rebased for net-next-2.6)Hans J. Koch
In the current implementation, CAN drivers need to #include <linux/can.h> _before_ they #include <linux/can/dev.h>, which is both ugly and unnecessary. Fix this by including <linux/can.h> in <linux/can/dev.h> and remove the #include <linux/can.h> lines from drivers. Signed-off-by: Hans J. Koch <hjk@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-13skbuff: remove unused dev_consume_skb macro definitionAlexander Duyck
dev_consume_skb and kfree_skb_clean have no users and in the case of kfree_skb_clean could cause potential build issues since I cannot find where it is defined. Based on the patch in which it was introduced it appears to have been a bit of leftover code from an earlier version of the patch in which kfree_skb_clean was dropped in favor of consume_skb. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-13netfilter: xtables: make XT_ALIGN() usable in exported headers by exporting ↵Alexey Dobriyan
__ALIGN_KERNEL() XT_ALIGN() was rewritten through ALIGN() by commit 42107f5009da223daa800d6da6904d77297ae829 "netfilter: xtables: symmetric COMPAT_XT_ALIGN definition". ALIGN() is not exported in userspace headers, which created compile problem for tc(8) and will create problem for iptables(8). We can't export generic looking name ALIGN() but we can export less generic __ALIGN_KERNEL() (suggested by Ben Hutchings). Google knows nothing about __ALIGN_KERNEL(). COMPAT_XT_ALIGN() changed for symmetry. Reported-by: Andreas Henriksson <andreas@fatal.se> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-04-13packet: support for TX time stamps on RAW socketsRichard Cochran
Enable the SO_TIMESTAMPING socket infrastructure for raw packet sockets. We introduce PACKET_TX_TIMESTAMP for the control message cmsg_type. Similar support for UDP and CAN sockets was added in commit 51f31cabe3ce5345b51e4a4f82138b38c4d5dc91 Signed-off-by: Richard Cochran <richard.cochran@omicron.at> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-13sh: intc: userimask support.Paul Mundt
This adds support for hardware-assisted userspace irq masking for special priority levels. Due to the SR.IMASK interactivity, only some platforms implement this in hardware (including but not limited to SH-4A interrupt controllers, and ARM-based SH-Mobile CPUs). Each CPU needs to wire this up on its own, for now only SH7786 is wired up as an example. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-04-12NFSv4: fix delegated lockingTrond Myklebust
Arnaud Giersch reports that NFSv4 locking is broken when we hold a delegation since commit 8e469ebd6dc32cbaf620e134d79f740bf0ebab79 (NFSv4: Don't allow posix locking against servers that don't support it). According to Arnaud, the lock succeeds the first time he opens the file (since we cannot do a delegated open) but then fails after we start using delegated opens. The following patch fixes it by ensuring that locking behaviour is governed by a per-filesystem capability flag that is initially set, but gets cleared if the server ever returns an OPEN without the NFS4_OPEN_RESULT_LOCKTYPE_POSIX flag being set. Reported-by: Arnaud Giersch <arnaud.giersch@iut-bm.univ-fcomte.fr> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
2010-04-12security: remove dead hook acctEric Paris
Unused hook. Remove. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-04-12security: remove dead hook key_session_to_parentEric Paris
Unused hook. Remove. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-04-12security: remove dead hook task_setgroupsEric Paris
Unused hook. Remove. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-04-12security: remove dead hook task_setgidEric Paris
Unused hook. Remove. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-04-12security: remove dead hook task_setuidEric Paris
Unused hook. Remove. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-04-12security: remove dead hook cred_commitEric Paris
Unused hook. Remove. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-04-12security: remove dead hook inode_deleteEric Paris
Unused hook. Remove. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-04-12security: remove dead hook sb_post_pivotrootEric Paris
Unused hook. Remove. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-04-12security: remove dead hook sb_post_addmountEric Paris
Unused hook. Remove. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-04-12security: remove dead hook sb_post_remountEric Paris
Unused hook. Remove. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-04-12security: remove dead hook sb_umount_busyEric Paris
Unused hook. Remove. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-04-12security: remove dead hook sb_umount_closeEric Paris
Unused hook. Remove. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-04-12security: remove sb_check_sb hooksEric Paris
Unused hook. Remove it. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-04-11Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/stmmac/stmmac_main.c drivers/net/wireless/wl12xx/wl1271_cmd.c drivers/net/wireless/wl12xx/wl1271_main.c drivers/net/wireless/wl12xx/wl1271_spi.c net/core/ethtool.c net/mac80211/scan.c
2010-04-10firewire: cdev: comment fixletStefan Richter
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2010-04-10firewire: cdev: iso packet documentationClemens Ladisch
Add the missing documentation for iso packets. Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2010-04-09cpufreq: Unify sysfs attribute definition macrosBorislav Petkov
Multiple modules used to define those which are with identical functionality and were needlessly replicated among the different cpufreq drivers. Push them into the header and remove duplication. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1270065406-1814-7-git-send-email-bp@amd64.org> Reviewed-by: Thomas Renninger <trenn@suse.de> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-04-09Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (34 commits) cfq-iosched: Fix the incorrect timeslice accounting with forced_dispatch loop: Update mtime when writing using aops block: expose the statistics in blkio.time and blkio.sectors for the root cgroup backing-dev: Handle class_create() failure Block: Fix block/elevator.c elevator_get() off-by-one error drbd: lc_element_by_index() never returns NULL cciss: unlock on error path cfq-iosched: Do not merge queues of BE and IDLE classes cfq-iosched: Add additional blktrace log messages in CFQ for easier debugging i2o: Remove the dangerous kobj_to_i2o_device macro block: remove 16 bytes of padding from struct request on 64bits cfq-iosched: fix a kbuild regression block: make CONFIG_BLK_CGROUP visible Remove GENHD_FL_DRIVERFS block: Export max number of segments and max segment size in sysfs block: Finalize conversion of block limits functions block: Fix overrun in lcm() and move it to lib vfs: improve writeback_inodes_wb() paride: fix off-by-one test drbd: fix al-to-on-disk-bitmap for 4k logical_block_size ...
2010-04-09radix_tree_tag_get() is not as safe as the docs make out [ver #2]David Howells
radix_tree_tag_get() is not safe to use concurrently with radix_tree_tag_set() or radix_tree_tag_clear(). The problem is that the double tag_get() in radix_tree_tag_get(): if (!tag_get(node, tag, offset)) saw_unset_tag = 1; if (height == 1) { int ret = tag_get(node, tag, offset); may see the value change due to the action of set/clear. RCU is no protection against this as no pointers are being changed, no nodes are being replaced according to a COW protocol - set/clear alter the node directly. The documentation in linux/radix-tree.h, however, says that radix_tree_tag_get() is an exception to the rule that "any function modifying the tree or tags (...) must exclude other modifications, and exclude any functions reading the tree". The problem is that the next statement in radix_tree_tag_get() checks that the tag doesn't vary over time: BUG_ON(ret && saw_unset_tag); This has been seen happening in FS-Cache: https://www.redhat.com/archives/linux-cachefs/2010-April/msg00013.html To this end, remove the BUG_ON() from radix_tree_tag_get() and note in various comments that the value of the tag may change whilst the RCU read lock is held, and thus that the return value of radix_tree_tag_get() may not be relied upon unless radix_tree_tag_set/clear() and radix_tree_delete() are excluded from running concurrently with it. Reported-by: Romain DEGEZ <romain.degez@smartjog.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>