diff options
Diffstat (limited to 'Documentation')
22 files changed, 442 insertions, 201 deletions
diff --git a/Documentation/DocBook/filesystems.tmpl b/Documentation/DocBook/filesystems.tmpl index 5e87ad58c0b5..f51f28531b8d 100644 --- a/Documentation/DocBook/filesystems.tmpl +++ b/Documentation/DocBook/filesystems.tmpl @@ -82,6 +82,11 @@ </sect1> </chapter> + <chapter id="fs_events"> + <title>Events based on file descriptors</title> +!Efs/eventfd.c + </chapter> + <chapter id="sysfs"> <title>The Filesystem for Exporting Kernel Objects</title> !Efs/sysfs/file.c diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt index cfaac34c4557..6ef692667e2f 100644 --- a/Documentation/RCU/whatisRCU.txt +++ b/Documentation/RCU/whatisRCU.txt @@ -849,6 +849,37 @@ All: lockdep-checked RCU-protected pointer access See the comment headers in the source code (or the docbook generated from them) for more information. +However, given that there are no fewer than four families of RCU APIs +in the Linux kernel, how do you choose which one to use? The following +list can be helpful: + +a. Will readers need to block? If so, you need SRCU. + +b. What about the -rt patchset? If readers would need to block + in an non-rt kernel, you need SRCU. If readers would block + in a -rt kernel, but not in a non-rt kernel, SRCU is not + necessary. + +c. Do you need to treat NMI handlers, hardirq handlers, + and code segments with preemption disabled (whether + via preempt_disable(), local_irq_save(), local_bh_disable(), + or some other mechanism) as if they were explicit RCU readers? + If so, you need RCU-sched. + +d. Do you need RCU grace periods to complete even in the face + of softirq monopolization of one or more of the CPUs? For + example, is your code subject to network-based denial-of-service + attacks? If so, you need RCU-bh. + +e. Is your workload too update-intensive for normal use of + RCU, but inappropriate for other synchronization mechanisms? + If so, consider SLAB_DESTROY_BY_RCU. But please be careful! + +f. Otherwise, use RCU. + +Of course, this all assumes that you have determined that RCU is in fact +the right tool for your job. + 8. ANSWERS TO QUICK QUIZZES diff --git a/Documentation/devicetree/bindings/i2c/ce4100-i2c.txt b/Documentation/devicetree/bindings/i2c/ce4100-i2c.txt new file mode 100644 index 000000000000..569b16248514 --- /dev/null +++ b/Documentation/devicetree/bindings/i2c/ce4100-i2c.txt @@ -0,0 +1,93 @@ +CE4100 I2C +---------- + +CE4100 has one PCI device which is described as the I2C-Controller. This +PCI device has three PCI-bars, each bar contains a complete I2C +controller. So we have a total of three independent I2C-Controllers +which share only an interrupt line. +The driver is probed via the PCI-ID and is gathering the information of +attached devices from the devices tree. +Grant Likely recommended to use the ranges property to map the PCI-Bar +number to its physical address and to use this to find the child nodes +of the specific I2C controller. This were his exact words: + + Here's where the magic happens. Each entry in + ranges describes how the parent pci address space + (middle group of 3) is translated to the local + address space (first group of 2) and the size of + each range (last cell). In this particular case, + the first cell of the local address is chosen to be + 1:1 mapped to the BARs, and the second is the + offset from be base of the BAR (which would be + non-zero if you had 2 or more devices mapped off + the same BAR) + + ranges allows the address mapping to be described + in a way that the OS can interpret without + requiring custom device driver code. + +This is an example which is used on FalconFalls: +------------------------------------------------ + i2c-controller@b,2 { + #address-cells = <2>; + #size-cells = <1>; + compatible = "pci8086,2e68.2", + "pci8086,2e68", + "pciclass,ff0000", + "pciclass,ff00"; + + reg = <0x15a00 0x0 0x0 0x0 0x0>; + interrupts = <16 1>; + + /* as described by Grant, the first number in the group of + * three is the bar number followed by the 64bit bar address + * followed by size of the mapping. The bar address + * requires also a valid translation in parents ranges + * property. + */ + ranges = <0 0 0x02000000 0 0xdffe0500 0x100 + 1 0 0x02000000 0 0xdffe0600 0x100 + 2 0 0x02000000 0 0xdffe0700 0x100>; + + i2c@0 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "intel,ce4100-i2c-controller"; + + /* The first number in the reg property is the + * number of the bar + */ + reg = <0 0 0x100>; + + /* This I2C controller has no devices */ + }; + + i2c@1 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "intel,ce4100-i2c-controller"; + reg = <1 0 0x100>; + + /* This I2C controller has one gpio controller */ + gpio@26 { + #gpio-cells = <2>; + compatible = "ti,pcf8575"; + reg = <0x26>; + gpio-controller; + }; + }; + + i2c@2 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "intel,ce4100-i2c-controller"; + reg = <2 0 0x100>; + + gpio@26 { + #gpio-cells = <2>; + compatible = "ti,pcf8575"; + reg = <0x26>; + gpio-controller; + }; + }; + }; diff --git a/Documentation/devicetree/bindings/rtc/rtc-cmos.txt b/Documentation/devicetree/bindings/rtc/rtc-cmos.txt new file mode 100644 index 000000000000..7382989b3052 --- /dev/null +++ b/Documentation/devicetree/bindings/rtc/rtc-cmos.txt @@ -0,0 +1,28 @@ + Motorola mc146818 compatible RTC +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Required properties: + - compatible : "motorola,mc146818" + - reg : should contain registers location and length. + +Optional properties: + - interrupts : should contain interrupt. + - interrupt-parent : interrupt source phandle. + - ctrl-reg : Contains the initial value of the control register also + called "Register B". + - freq-reg : Contains the initial value of the frequency register also + called "Regsiter A". + +"Register A" and "B" are usually initialized by the firmware (BIOS for +instance). If this is not done, it can be performed by the driver. + +ISA Example: + + rtc@70 { + compatible = "motorola,mc146818"; + interrupts = <8 3>; + interrupt-parent = <&ioapic1>; + ctrl-reg = <2>; + freq-reg = <0x26>; + reg = <1 0x70 2>; + }; diff --git a/Documentation/devicetree/bindings/x86/ce4100.txt b/Documentation/devicetree/bindings/x86/ce4100.txt new file mode 100644 index 000000000000..b49ae593a60b --- /dev/null +++ b/Documentation/devicetree/bindings/x86/ce4100.txt @@ -0,0 +1,38 @@ +CE4100 Device Tree Bindings +--------------------------- + +The CE4100 SoC uses for in core peripherals the following compatible +format: <vendor>,<chip>-<device>. +Many of the "generic" devices like HPET or IO APIC have the ce4100 +name in their compatible property because they first appeared in this +SoC. + +The CPU node +------------ + cpu@0 { + device_type = "cpu"; + compatible = "intel,ce4100"; + reg = <0>; + lapic = <&lapic0>; + }; + +The reg property describes the CPU number. The lapic property points to +the local APIC timer. + +The SoC node +------------ + +This node describes the in-core peripherals. Required property: + compatible = "intel,ce4100-cp"; + +The PCI node +------------ +This node describes the PCI bus on the SoC. Its property should be + compatible = "intel,ce4100-pci", "pci"; + +If the OS is using the IO-APIC for interrupt routing then the reported +interrupt numbers for devices is no longer true. In order to obtain the +correct interrupt number, the child node which represents the device has +to contain the interrupt property. Besides the interrupt property it has +to contain at least the reg property containing the PCI bus address and +compatible property according to "PCI Bus Binding Revision 2.1". diff --git a/Documentation/devicetree/bindings/x86/interrupt.txt b/Documentation/devicetree/bindings/x86/interrupt.txt new file mode 100644 index 000000000000..7d19f494f19a --- /dev/null +++ b/Documentation/devicetree/bindings/x86/interrupt.txt @@ -0,0 +1,26 @@ +Interrupt chips +--------------- + +* Intel I/O Advanced Programmable Interrupt Controller (IO APIC) + + Required properties: + -------------------- + compatible = "intel,ce4100-ioapic"; + #interrupt-cells = <2>; + + Device's interrupt property: + + interrupts = <P S>; + + The first number (P) represents the interrupt pin which is wired to the + IO APIC. The second number (S) represents the sense of interrupt which + should be configured and can be one of: + 0 - Edge Rising + 1 - Level Low + 2 - Level High + 3 - Edge Falling + +* Local APIC + Required property: + + compatible = "intel,ce4100-lapic"; diff --git a/Documentation/devicetree/bindings/x86/timer.txt b/Documentation/devicetree/bindings/x86/timer.txt new file mode 100644 index 000000000000..c688af58e3bd --- /dev/null +++ b/Documentation/devicetree/bindings/x86/timer.txt @@ -0,0 +1,6 @@ +Timers +------ + +* High Precision Event Timer (HPET) + Required property: + compatible = "intel,ce4100-hpet"; diff --git a/Documentation/devicetree/booting-without-of.txt b/Documentation/devicetree/booting-without-of.txt index 28b1c9d3d351..55fd2623445b 100644 --- a/Documentation/devicetree/booting-without-of.txt +++ b/Documentation/devicetree/booting-without-of.txt @@ -13,6 +13,7 @@ Table of Contents I - Introduction 1) Entry point for arch/powerpc + 2) Entry point for arch/x86 II - The DT block format 1) Header @@ -225,6 +226,25 @@ it with special cases. cannot support both configurations with Book E and configurations with classic Powerpc architectures. +2) Entry point for arch/x86 +------------------------------- + + There is one single 32bit entry point to the kernel at code32_start, + the decompressor (the real mode entry point goes to the same 32bit + entry point once it switched into protected mode). That entry point + supports one calling convention which is documented in + Documentation/x86/boot.txt + The physical pointer to the device-tree block (defined in chapter II) + is passed via setup_data which requires at least boot protocol 2.09. + The type filed is defined as + + #define SETUP_DTB 2 + + This device-tree is used as an extension to the "boot page". As such it + does not parse / consider data which is already covered by the boot + page. This includes memory size, reserved ranges, command line arguments + or initrd address. It simply holds information which can not be retrieved + otherwise like interrupt routing or a list of devices behind an I2C bus. II - The DT block format ======================== diff --git a/Documentation/hwmon/jc42 b/Documentation/hwmon/jc42 index 0e76ef12e4c6..a22ecf48f255 100644 --- a/Documentation/hwmon/jc42 +++ b/Documentation/hwmon/jc42 @@ -51,7 +51,8 @@ Supported chips: * JEDEC JC 42.4 compliant temperature sensor chips Prefix: 'jc42' Addresses scanned: I2C 0x18 - 0x1f - Datasheet: - + Datasheet: + http://www.jedec.org/sites/default/files/docs/4_01_04R19.pdf Author: Guenter Roeck <guenter.roeck@ericsson.com> @@ -60,7 +61,11 @@ Author: Description ----------- -This driver implements support for JEDEC JC 42.4 compliant temperature sensors. +This driver implements support for JEDEC JC 42.4 compliant temperature sensors, +which are used on many DDR3 memory modules for mobile devices and servers. Some +systems use the sensor to prevent memory overheating by automatically throttling +the memory controller. + The driver auto-detects the chips listed above, but can be manually instantiated to support other JC 42.4 compliant chips. @@ -81,15 +86,19 @@ limits. The chip supports only a single register to configure the hysteresis, which applies to all limits. This register can be written by writing into temp1_crit_hyst. Other hysteresis attributes are read-only. +If the BIOS has configured the sensor for automatic temperature management, it +is likely that it has locked the registers, i.e., that the temperature limits +cannot be changed. + Sysfs entries ------------- temp1_input Temperature (RO) -temp1_min Minimum temperature (RW) -temp1_max Maximum temperature (RW) -temp1_crit Critical high temperature (RW) +temp1_min Minimum temperature (RO or RW) +temp1_max Maximum temperature (RO or RW) +temp1_crit Critical high temperature (RO or RW) -temp1_crit_hyst Critical hysteresis temperature (RW) +temp1_crit_hyst Critical hysteresis temperature (RO or RW) temp1_max_hyst Maximum hysteresis temperature (RO) temp1_min_alarm Temperature low alarm diff --git a/Documentation/hwmon/k10temp b/Documentation/hwmon/k10temp index 6526eee525a6..d2b56a4fd1f5 100644 --- a/Documentation/hwmon/k10temp +++ b/Documentation/hwmon/k10temp @@ -9,6 +9,8 @@ Supported chips: Socket S1G3: Athlon II, Sempron, Turion II * AMD Family 11h processors: Socket S1G2: Athlon (X2), Sempron (X2), Turion X2 (Ultra) +* AMD Family 12h processors: "Llano" +* AMD Family 14h processors: "Brazos" (C/E/G-Series) Prefix: 'k10temp' Addresses scanned: PCI space @@ -17,10 +19,14 @@ Supported chips: http://support.amd.com/us/Processor_TechDocs/31116.pdf BIOS and Kernel Developer's Guide (BKDG) for AMD Family 11h Processors: http://support.amd.com/us/Processor_TechDocs/41256.pdf + BIOS and Kernel Developer's Guide (BKDG) for AMD Family 14h Models 00h-0Fh Processors: + http://support.amd.com/us/Processor_TechDocs/43170.pdf Revision Guide for AMD Family 10h Processors: http://support.amd.com/us/Processor_TechDocs/41322.pdf Revision Guide for AMD Family 11h Processors: http://support.amd.com/us/Processor_TechDocs/41788.pdf + Revision Guide for AMD Family 14h Models 00h-0Fh Processors: + http://support.amd.com/us/Processor_TechDocs/47534.pdf AMD Family 11h Processor Power and Thermal Data Sheet for Notebooks: http://support.amd.com/us/Processor_TechDocs/43373.pdf AMD Family 10h Server and Workstation Processor Power and Thermal Data Sheet: @@ -34,7 +40,7 @@ Description ----------- This driver permits reading of the internal temperature sensor of AMD -Family 10h and 11h processors. +Family 10h/11h/12h/14h processors. All these processors have a sensor, but on those for Socket F or AM2+, the sensor may return inconsistent values (erratum 319). The driver diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 89835a4766a6..738c6fda3fb0 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -144,6 +144,11 @@ a fixed number of characters. This limit depends on the architecture and is between 256 and 4096 characters. It is defined in the file ./include/asm/setup.h as COMMAND_LINE_SIZE. +Finally, the [KMG] suffix is commonly described after a number of kernel +parameter values. These 'K', 'M', and 'G' letters represent the _binary_ +multipliers 'Kilo', 'Mega', and 'Giga', equalling 2^10, 2^20, and 2^30 +bytes respectively. Such letter suffixes can also be entirely omitted. + acpi= [HW,ACPI,X86] Advanced Configuration and Power Interface @@ -545,16 +550,20 @@ and is between 256 and 4096 characters. It is defined in the file Format: <first_slot>,<last_slot>,<port>,<enum_bit>[,<debug>] - crashkernel=nn[KMG]@ss[KMG] - [KNL] Reserve a chunk of physical memory to - hold a kernel to switch to with kexec on panic. + crashkernel=size[KMG][@offset[KMG]] + [KNL] Using kexec, Linux can switch to a 'crash kernel' + upon panic. This parameter reserves the physical + memory region [offset, offset + size] for that kernel + image. If '@offset' is omitted, then a suitable offset + is selected automatically. Check + Documentation/kdump/kdump.txt for further details. crashkernel=range1:size1[,range2:size2,...][@offset] [KNL] Same as above, but depends on the memory in the running system. The syntax of range is start-[end] where start and end are both a memory unit (amount[KMG]). See also - Documentation/kdump/kdump.txt for a example. + Documentation/kdump/kdump.txt for an example. cs89x0_dma= [HW,NET] Format: <dma> @@ -1262,10 +1271,9 @@ and is between 256 and 4096 characters. It is defined in the file 6 (KERN_INFO) informational 7 (KERN_DEBUG) debug-level messages - log_buf_len=n Sets the size of the printk ring buffer, in bytes. - Format: { n | nk | nM } - n must be a power of two. The default size - is set in the kernel config file. + log_buf_len=n[KMG] Sets the size of the printk ring buffer, + in bytes. n must be a power of two. The default + size is set in the kernel config file. logo.nologo [FB] Disables display of the built-in Linux logo. This may be used to provide more screen space for @@ -2436,6 +2444,10 @@ and is between 256 and 4096 characters. It is defined in the file <deci-seconds>: poll all this frequency 0: no polling (default) + threadirqs [KNL] + Force threading of all interrupt handlers except those + marked explicitely IRQF_NO_THREAD. + topology= [S390] Format: {off | on} Specify if the kernel should make use of the cpu diff --git a/Documentation/keys-request-key.txt b/Documentation/keys-request-key.txt index 09b55e461740..69686ad12c66 100644 --- a/Documentation/keys-request-key.txt +++ b/Documentation/keys-request-key.txt @@ -127,14 +127,15 @@ This is because process A's keyrings can't simply be attached to of them, and (b) it requires the same UID/GID/Groups all the way through. -====================== -NEGATIVE INSTANTIATION -====================== +==================================== +NEGATIVE INSTANTIATION AND REJECTION +==================================== Rather than instantiating a key, it is possible for the possessor of an authorisation key to negatively instantiate a key that's under construction. This is a short duration placeholder that causes any attempt at re-requesting -the key whilst it exists to fail with error ENOKEY. +the key whilst it exists to fail with error ENOKEY if negated or the specified +error if rejected. This is provided to prevent excessive repeated spawning of /sbin/request-key processes for a key that will never be obtainable. diff --git a/Documentation/keys.txt b/Documentation/keys.txt index e4dbbdb1bd96..6523a9e6f293 100644 --- a/Documentation/keys.txt +++ b/Documentation/keys.txt @@ -637,6 +637,9 @@ The keyctl syscall functions are: long keyctl(KEYCTL_INSTANTIATE, key_serial_t key, const void *payload, size_t plen, key_serial_t keyring); + long keyctl(KEYCTL_INSTANTIATE_IOV, key_serial_t key, + const struct iovec *payload_iov, unsigned ioc, + key_serial_t keyring); If the kernel calls back to userspace to complete the instantiation of a key, userspace should use this call to supply data for the key before the @@ -652,11 +655,16 @@ The keyctl syscall functions are: The payload and plen arguments describe the payload data as for add_key(). + The payload_iov and ioc arguments describe the payload data in an iovec + array instead of a single buffer. + (*) Negatively instantiate a partially constructed key. long keyctl(KEYCTL_NEGATE, key_serial_t key, unsigned timeout, key_serial_t keyring); + long keyctl(KEYCTL_REJECT, key_serial_t key, + unsigned timeout, unsigned error, key_serial_t keyring); If the kernel calls back to userspace to complete the instantiation of a key, userspace should use this call mark the key as negative before the @@ -669,6 +677,10 @@ The keyctl syscall functions are: that keyring, however all the constraints applying in KEYCTL_LINK apply in this case too. + If the key is rejected, future searches for it will return the specified + error code until the rejected key expires. Negating the key is the same + as rejecting the key with ENOKEY as the error code. + (*) Set the default request-key destination keyring. @@ -1062,6 +1074,13 @@ The structure has a number of fields, some of which are mandatory: viable. + (*) int (*vet_description)(const char *description); + + This optional method is called to vet a key description. If the key type + doesn't approve of the key description, it may return an error, otherwise + it should return 0. + + (*) int (*instantiate)(struct key *key, const void *data, size_t datalen); This method is called to attach a payload to a key during construction. @@ -1231,10 +1250,11 @@ hand the request off to (perhaps a path held in placed in another key by, for example, the KDE desktop manager). The program (or whatever it calls) should finish construction of the key by -calling KEYCTL_INSTANTIATE, which also permits it to cache the key in one of -the keyrings (probably the session ring) before returning. Alternatively, the -key can be marked as negative with KEYCTL_NEGATE; this also permits the key to -be cached in one of the keyrings. +calling KEYCTL_INSTANTIATE or KEYCTL_INSTANTIATE_IOV, which also permits it to +cache the key in one of the keyrings (probably the session ring) before +returning. Alternatively, the key can be marked as negative with KEYCTL_NEGATE +or KEYCTL_REJECT; this also permits the key to be cached in one of the +keyrings. If it returns with the key remaining in the unconstructed state, the key will be marked as being negative, it will be added to the session keyring, and an diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index 631ad2f1b229..f0d3a8026a56 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt @@ -21,6 +21,7 @@ Contents: - SMP barrier pairing. - Examples of memory barrier sequences. - Read memory barriers vs load speculation. + - Transitivity (*) Explicit kernel barriers. @@ -959,6 +960,63 @@ the speculation will be cancelled and the value reloaded: retrieved : : +-------+ +TRANSITIVITY +------------ + +Transitivity is a deeply intuitive notion about ordering that is not +always provided by real computer systems. The following example +demonstrates transitivity (also called "cumulativity"): + + CPU 1 CPU 2 CPU 3 + ======================= ======================= ======================= + { X = 0, Y = 0 } + STORE X=1 LOAD X STORE Y=1 + <general barrier> <general barrier> + LOAD Y LOAD X + +Suppose that CPU 2's load from X returns 1 and its load from Y returns 0. +This indicates that CPU 2's load from X in some sense follows CPU 1's +store to X and that CPU 2's load from Y in some sense preceded CPU 3's +store to Y. The question is then "Can CPU 3's load from X return 0?" + +Because CPU 2's load from X in some sense came after CPU 1's store, it +is natural to expect that CPU 3's load from X must therefore return 1. +This expectation is an example of transitivity: if a load executing on +CPU A follows a load from the same variable executing on CPU B, then +CPU A's load must either return the same value that CPU B's load did, +or must return some later value. + +In the Linux kernel, use of general memory barriers guarantees +transitivity. Therefore, in the above example, if CPU 2's load from X +returns 1 and its load from Y returns 0, then CPU 3's load from X must +also return 1. + +However, transitivity is -not- guaranteed for read or write barriers. +For example, suppose that CPU 2's general barrier in the above example +is changed to a read barrier as shown below: + + CPU 1 CPU 2 CPU 3 + ======================= ======================= ======================= + { X = 0, Y = 0 } + STORE X=1 LOAD X STORE Y=1 + <read barrier> <general barrier> + LOAD Y LOAD X + +This substitution destroys transitivity: in this example, it is perfectly +legal for CPU 2's load from X to return 1, its load from Y to return 0, +and CPU 3's load from X to return 0. + +The key point is that although CPU 2's read barrier orders its pair +of loads, it does not guarantee to order CPU 1's store. Therefore, if +this example runs on a system where CPUs 1 and 2 share a store buffer +or a level of cache, CPU 2 might have early access to CPU 1's writes. +General barriers are therefore required to ensure that all CPUs agree +on the combined order of CPU 1's and CPU 2's accesses. + +To reiterate, if your code requires transitivity, use general barriers +throughout. + + ======================== EXPLICIT KERNEL BARRIERS ======================== diff --git a/Documentation/networking/00-INDEX b/Documentation/networking/00-INDEX index fe5c099b8fc8..4edd78dfb362 100644 --- a/Documentation/networking/00-INDEX +++ b/Documentation/networking/00-INDEX @@ -40,8 +40,6 @@ decnet.txt - info on using the DECnet networking layer in Linux. depca.txt - the Digital DEPCA/EtherWORKS DE1?? and DE2?? LANCE Ethernet driver -dgrs.txt - - the Digi International RightSwitch SE-X Ethernet driver dmfe.txt - info on the Davicom DM9102(A)/DM9132/DM9801 fast ethernet driver. e100.txt @@ -50,8 +48,6 @@ e1000.txt - info on Intel's E1000 line of gigabit ethernet boards eql.txt - serial IP load balancing -ethertap.txt - - the Ethertap user space packet reception and transmission driver ewrk3.txt - the Digital EtherWORKS 3 DE203/4/5 Ethernet driver filter.txt @@ -104,8 +100,6 @@ tuntap.txt - TUN/TAP device driver, allowing user space Rx/Tx of packets. vortex.txt - info on using 3Com Vortex (3c590, 3c592, 3c595, 3c597) Ethernet cards. -wavelan.txt - - AT&T GIS (nee NCR) WaveLAN card: An Ethernet-like radio transceiver x25.txt - general info on X.25 development. x25-iface.txt diff --git a/Documentation/networking/Makefile b/Documentation/networking/Makefile index 5aba7a33aeeb..24c308dd3fd1 100644 --- a/Documentation/networking/Makefile +++ b/Documentation/networking/Makefile @@ -4,6 +4,8 @@ obj- := dummy.o # List of programs to build hostprogs-y := ifenslave +HOSTCFLAGS_ifenslave.o += -I$(objtree)/usr/include + # Tell kbuild to always build the programs always := $(hostprogs-y) diff --git a/Documentation/networking/dns_resolver.txt b/Documentation/networking/dns_resolver.txt index aefd1e681804..04ca06325b08 100644 --- a/Documentation/networking/dns_resolver.txt +++ b/Documentation/networking/dns_resolver.txt @@ -61,7 +61,6 @@ before the more general line given above as the first match is the one taken. create dns_resolver foo:* * /usr/sbin/dns.foo %k - ===== USAGE ===== @@ -104,6 +103,14 @@ implemented in the module can be called after doing: returned also. +=============================== +READING DNS KEYS FROM USERSPACE +=============================== + +Keys of dns_resolver type can be read from userspace using keyctl_read() or +"keyctl read/print/pipe". + + ========= MECHANISM ========= diff --git a/Documentation/rtc.txt b/Documentation/rtc.txt index 9104c1062084..250160469d83 100644 --- a/Documentation/rtc.txt +++ b/Documentation/rtc.txt @@ -178,38 +178,29 @@ RTC class framework, but can't be supported by the older driver. setting the longer alarm time and enabling its IRQ using a single request (using the same model as EFI firmware). - * RTC_UIE_ON, RTC_UIE_OFF ... if the RTC offers IRQs, it probably - also offers update IRQs whenever the "seconds" counter changes. - If needed, the RTC framework can emulate this mechanism. + * RTC_UIE_ON, RTC_UIE_OFF ... if the RTC offers IRQs, the RTC framework + will emulate this mechanism. - * RTC_PIE_ON, RTC_PIE_OFF, RTC_IRQP_SET, RTC_IRQP_READ ... another - feature often accessible with an IRQ line is a periodic IRQ, issued - at settable frequencies (usually 2^N Hz). + * RTC_PIE_ON, RTC_PIE_OFF, RTC_IRQP_SET, RTC_IRQP_READ ... these icotls + are emulated via a kernel hrtimer. In many cases, the RTC alarm can be a system wake event, used to force Linux out of a low power sleep state (or hibernation) back to a fully operational state. For example, a system could enter a deep power saving state until it's time to execute some scheduled tasks. -Note that many of these ioctls need not actually be implemented by your -driver. The common rtc-dev interface handles many of these nicely if your -driver returns ENOIOCTLCMD. Some common examples: +Note that many of these ioctls are handled by the common rtc-dev interface. +Some common examples: * RTC_RD_TIME, RTC_SET_TIME: the read_time/set_time functions will be called with appropriate values. - * RTC_ALM_SET, RTC_ALM_READ, RTC_WKALM_SET, RTC_WKALM_RD: the - set_alarm/read_alarm functions will be called. + * RTC_ALM_SET, RTC_ALM_READ, RTC_WKALM_SET, RTC_WKALM_RD: gets or sets + the alarm rtc_timer. May call the set_alarm driver function. - * RTC_IRQP_SET, RTC_IRQP_READ: the irq_set_freq function will be called - to set the frequency while the framework will handle the read for you - since the frequency is stored in the irq_freq member of the rtc_device - structure. Your driver needs to initialize the irq_freq member during - init. Make sure you check the requested frequency is in range of your - hardware in the irq_set_freq function. If it isn't, return -EINVAL. If - you cannot actually change the frequency, do not define irq_set_freq. + * RTC_IRQP_SET, RTC_IRQP_READ: These are emulated by the generic code. - * RTC_PIE_ON, RTC_PIE_OFF: the irq_set_state function will be called. + * RTC_PIE_ON, RTC_PIE_OFF: These are also emulated by the generic code. If all else fails, check out the rtc-test.c driver! diff --git a/Documentation/spinlocks.txt b/Documentation/spinlocks.txt index 178c831b907d..2e3c64b1a6a5 100644 --- a/Documentation/spinlocks.txt +++ b/Documentation/spinlocks.txt @@ -86,7 +86,7 @@ to change the variables it has to get an exclusive write lock. The routines look the same as above: - rwlock_t xxx_lock = RW_LOCK_UNLOCKED; + rwlock_t xxx_lock = __RW_LOCK_UNLOCKED(xxx_lock); unsigned long flags; @@ -196,25 +196,3 @@ appropriate: For static initialization, use DEFINE_SPINLOCK() / DEFINE_RWLOCK() or __SPIN_LOCK_UNLOCKED() / __RW_LOCK_UNLOCKED() as appropriate. - -SPIN_LOCK_UNLOCKED and RW_LOCK_UNLOCKED are deprecated. These interfere -with lockdep state tracking. - -Most of the time, you can simply turn: - static spinlock_t xxx_lock = SPIN_LOCK_UNLOCKED; -into: - static DEFINE_SPINLOCK(xxx_lock); - -Static structure member variables go from: - - struct foo bar { - .lock = SPIN_LOCK_UNLOCKED; - }; - -to: - - struct foo bar { - .lock = __SPIN_LOCK_UNLOCKED(bar.lock); - }; - -Declaration of static rw_locks undergo a similar transformation. diff --git a/Documentation/trace/ftrace-design.txt b/Documentation/trace/ftrace-design.txt index dc52bd442c92..79fcafc7fd64 100644 --- a/Documentation/trace/ftrace-design.txt +++ b/Documentation/trace/ftrace-design.txt @@ -247,6 +247,13 @@ You need very few things to get the syscalls tracing in an arch. - Support the TIF_SYSCALL_TRACEPOINT thread flags. - Put the trace_sys_enter() and trace_sys_exit() tracepoints calls from ptrace in the ptrace syscalls tracing path. +- If the system call table on this arch is more complicated than a simple array + of addresses of the system calls, implement an arch_syscall_addr to return + the address of a given system call. +- If the symbol names of the system calls do not match the function names on + this arch, define ARCH_HAS_SYSCALL_MATCH_SYM_NAME in asm/ftrace.h and + implement arch_syscall_match_sym_name with the appropriate logic to return + true if the function name corresponds with the symbol name. - Tag this arch as HAVE_SYSCALL_TRACEPOINTS. diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt index 557c1edeccaf..1ebc24cf9a55 100644 --- a/Documentation/trace/ftrace.txt +++ b/Documentation/trace/ftrace.txt @@ -80,11 +80,11 @@ of ftrace. Here is a list of some of the key files: tracers listed here can be configured by echoing their name into current_tracer. - tracing_enabled: + tracing_on: - This sets or displays whether the current_tracer - is activated and tracing or not. Echo 0 into this - file to disable the tracer or 1 to enable it. + This sets or displays whether writing to the trace + ring buffer is enabled. Echo 0 into this file to disable + the tracer or 1 to enable it. trace: @@ -202,10 +202,6 @@ Here is the list of current tracers that may be configured. to draw a graph of function calls similar to C code source. - "sched_switch" - - Traces the context switches and wakeups between tasks. - "irqsoff" Traces the areas that disable interrupts and saves @@ -273,39 +269,6 @@ format, the function name that was traced "path_put" and the parent function that called this function "path_walk". The timestamp is the time at which the function was entered. -The sched_switch tracer also includes tracing of task wakeups -and context switches. - - ksoftirqd/1-7 [01] 1453.070013: 7:115:R + 2916:115:S - ksoftirqd/1-7 [01] 1453.070013: 7:115:R + 10:115:S - ksoftirqd/1-7 [01] 1453.070013: 7:115:R ==> 10:115:R - events/1-10 [01] 1453.070013: 10:115:S ==> 2916:115:R - kondemand/1-2916 [01] 1453.070013: 2916:115:S ==> 7:115:R - ksoftirqd/1-7 [01] 1453.070013: 7:115:S ==> 0:140:R - -Wake ups are represented by a "+" and the context switches are -shown as "==>". The format is: - - Context switches: - - Previous task Next Task - - <pid>:<prio>:<state> ==> <pid>:<prio>:<state> - - Wake ups: - - Current task Task waking up - - <pid>:<prio>:<state> + <pid>:<prio>:<state> - -The prio is the internal kernel priority, which is the inverse -of the priority that is usually displayed by user-space tools. -Zero represents the highest priority (99). Prio 100 starts the -"nice" priorities with 100 being equal to nice -20 and 139 being -nice 19. The prio "140" is reserved for the idle task which is -the lowest priority thread (pid 0). - - Latency trace format -------------------- @@ -491,78 +454,10 @@ x494] <- /root/a.out[+0x4a8] <- /lib/libc-2.7.so[+0x1e1a6] latencies, as described in "Latency trace format". -sched_switch ------------- - -This tracer simply records schedule switches. Here is an example -of how to use it. - - # echo sched_switch > current_tracer - # echo 1 > tracing_enabled - # sleep 1 - # echo 0 > tracing_enabled - # cat trace - -# tracer: sched_switch -# -# TASK-PID CPU# TIMESTAMP FUNCTION -# | | | | | - bash-3997 [01] 240.132281: 3997:120:R + 4055:120:R - bash-3997 [01] 240.132284: 3997:120:R ==> 4055:120:R - sleep-4055 [01] 240.132371: 4055:120:S ==> 3997:120:R - bash-3997 [01] 240.132454: 3997:120:R + 4055:120:S - bash-3997 [01] 240.132457: 3997:120:R ==> 4055:120:R - sleep-4055 [01] 240.132460: 4055:120:D ==> 3997:120:R - bash-3997 [01] 240.132463: 3997:120:R + 4055:120:D - bash-3997 [01] 240.132465: 3997:120:R ==> 4055:120:R - <idle>-0 [00] 240.132589: 0:140:R + 4:115:S - <idle>-0 [00] 240.132591: 0:140:R ==> 4:115:R - ksoftirqd/0-4 [00] 240.132595: 4:115:S ==> 0:140:R - <idle>-0 [00] 240.132598: 0:140:R + 4:115:S - <idle>-0 [00] 240.132599: 0:140:R ==> 4:115:R - ksoftirqd/0-4 [00] 240.132603: 4:115:S ==> 0:140:R - sleep-4055 [01] 240.133058: 4055:120:S ==> 3997:120:R - [...] - - -As we have discussed previously about this format, the header -shows the name of the trace and points to the options. The -"FUNCTION" is a misnomer since here it represents the wake ups -and context switches. - -The sched_switch file only lists the wake ups (represented with -'+') and context switches ('==>') with the previous task or -current task first followed by the next task or task waking up. -The format for both of these is PID:KERNEL-PRIO:TASK-STATE. -Remember that the KERNEL-PRIO is the inverse of the actual -priority with zero (0) being the highest priority and the nice -values starting at 100 (nice -20). Below is a quick chart to map -the kernel priority to user land priorities. - - Kernel Space User Space - =============================================================== - 0(high) to 98(low) user RT priority 99(high) to 1(low) - with SCHED_RR or SCHED_FIFO - --------------------------------------------------------------- - 99 sched_priority is not used in scheduling - decisions(it must be specified as 0) - --------------------------------------------------------------- - 100(high) to 139(low) user nice -20(high) to 19(low) - --------------------------------------------------------------- - 140 idle task priority - --------------------------------------------------------------- - -The task states are: - - R - running : wants to run, may not actually be running - S - sleep : process is waiting to be woken up (handles signals) - D - disk sleep (uninterruptible sleep) : process must be woken up - (ignores signals) - T - stopped : process suspended - t - traced : process is being traced (with something like gdb) - Z - zombie : process waiting to be cleaned up - X - unknown - + overwrite - This controls what happens when the trace buffer is + full. If "1" (default), the oldest events are + discarded and overwritten. If "0", then the newest + events are discarded. ftrace_enabled -------------- @@ -607,10 +502,10 @@ an example: # echo irqsoff > current_tracer # echo latency-format > trace_options # echo 0 > tracing_max_latency - # echo 1 > tracing_enabled + # echo 1 > tracing_on # ls -ltr [...] - # echo 0 > tracing_enabled + # echo 0 > tracing_on # cat trace # tracer: irqsoff # @@ -715,10 +610,10 @@ is much like the irqsoff tracer. # echo preemptoff > current_tracer # echo latency-format > trace_options # echo 0 > tracing_max_latency - # echo 1 > tracing_enabled + # echo 1 > tracing_on # ls -ltr [...] - # echo 0 > tracing_enabled + # echo 0 > tracing_on # cat trace # tracer: preemptoff # @@ -863,10 +758,10 @@ tracers. # echo preemptirqsoff > current_tracer # echo latency-format > trace_options # echo 0 > tracing_max_latency - # echo 1 > tracing_enabled + # echo 1 > tracing_on # ls -ltr [...] - # echo 0 > tracing_enabled + # echo 0 > tracing_on # cat trace # tracer: preemptirqsoff # @@ -1026,9 +921,9 @@ Instead of performing an 'ls', we will run 'sleep 1' under # echo wakeup > current_tracer # echo latency-format > trace_options # echo 0 > tracing_max_latency - # echo 1 > tracing_enabled + # echo 1 > tracing_on # chrt -f 5 sleep 1 - # echo 0 > tracing_enabled + # echo 0 > tracing_on # cat trace # tracer: wakeup # @@ -1140,9 +1035,9 @@ ftrace_enabled is set; otherwise this tracer is a nop. # sysctl kernel.ftrace_enabled=1 # echo function > current_tracer - # echo 1 > tracing_enabled + # echo 1 > tracing_on # usleep 1 - # echo 0 > tracing_enabled + # echo 0 > tracing_on # cat trace # tracer: function # @@ -1180,7 +1075,7 @@ int trace_fd; [...] int main(int argc, char *argv[]) { [...] - trace_fd = open(tracing_file("tracing_enabled"), O_WRONLY); + trace_fd = open(tracing_file("tracing_on"), O_WRONLY); [...] if (condition_hit()) { write(trace_fd, "0", 1); @@ -1631,9 +1526,9 @@ If I am only interested in sys_nanosleep and hrtimer_interrupt: # echo sys_nanosleep hrtimer_interrupt \ > set_ftrace_filter # echo function > current_tracer - # echo 1 > tracing_enabled + # echo 1 > tracing_on # usleep 1 - # echo 0 > tracing_enabled + # echo 0 > tracing_on # cat trace # tracer: ftrace # @@ -1879,9 +1774,9 @@ different. The trace is live. # echo function > current_tracer # cat trace_pipe > /tmp/trace.out & [1] 4153 - # echo 1 > tracing_enabled + # echo 1 > tracing_on # usleep 1 - # echo 0 > tracing_enabled + # echo 0 > tracing_on # cat trace # tracer: function # diff --git a/Documentation/trace/kprobetrace.txt b/Documentation/trace/kprobetrace.txt index 5f77d94598dd..6d27ab8d6e9f 100644 --- a/Documentation/trace/kprobetrace.txt +++ b/Documentation/trace/kprobetrace.txt @@ -42,11 +42,25 @@ Synopsis of kprobe_events +|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.(**) NAME=FETCHARG : Set NAME as the argument name of FETCHARG. FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types - (u8/u16/u32/u64/s8/s16/s32/s64) and string are supported. + (u8/u16/u32/u64/s8/s16/s32/s64), "string" and bitfield + are supported. (*) only for return probe. (**) this is useful for fetching a field of data structures. +Types +----- +Several types are supported for fetch-args. Kprobe tracer will access memory +by given type. Prefix 's' and 'u' means those types are signed and unsigned +respectively. Traced arguments are shown in decimal (signed) or hex (unsigned). +String type is a special type, which fetches a "null-terminated" string from +kernel space. This means it will fail and store NULL if the string container +has been paged out. +Bitfield is another special type, which takes 3 parameters, bit-width, bit- +offset, and container-size (usually 32). The syntax is; + + b<bit-width>@<bit-offset>/<container-size> + Per-Probe Event Filtering ------------------------- |