summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2016-04-24net/mlx5e: Device's mtu field is u16 and not intSaeed Mahameed
For set/query MTU port firmware commands the MTU field is 16 bits, here I changed all the "int mtu" parameters of the functions wrapping those firmware commands to be u16. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24tcp-tso: do not split TSO packets at retransmit timeEric Dumazet
Linux TCP stack painfully segments all TSO/GSO packets before retransmits. This was fine back in the days when TSO/GSO were emerging, with their bugs, but we believe the dark age is over. Keeping big packets in write queues, but also in stack traversal has a lot of benefits. - Less memory overhead, because write queues have less skbs - Less cpu overhead at ACK processing. - Better SACK processing, as lot of studies mentioned how awful linux was at this ;) - Less cpu overhead to send the rtx packets (IP stack traversal, netfilter traversal, drivers...) - Better latencies in presence of losses. - Smaller spikes in fq like packet schedulers, as retransmits are not constrained by TCP Small Queues. 1 % packet losses are common today, and at 100Gbit speeds, this translates to ~80,000 losses per second. Losses are often correlated, and we see many retransmit events leading to 1-MSS train of packets, at the time hosts are already under stress. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24macsec: add missing macsec prefix in uapiSabrina Dubroca
I accidentally forgot some MACSEC_ prefixes in if_macsec.h. Fixes: dece8d2b78d1 ("uapi: add MACsec bits") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24switchdev: Adding complete operation to deferred switchdev opsElad Raz
When using switchdev deferred operation (SWITCHDEV_F_DEFER), the operation is executed in different context and the application doesn't have any way to get the operation real status. Adding a completion callback fixes that. This patch adds fields to switchdev_attr and switchdev_obj "complete_priv" field which is used by the "complete" callback. Application can set a complete function which will be called once the operation executed. Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24jbd2: add support for avoiding data writes during transaction commitsJan Kara
Currently when filesystem needs to make sure data is on permanent storage before committing a transaction it adds inode to transaction's inode list. During transaction commit, jbd2 writes back all dirty buffers that have allocated underlying blocks and waits for the IO to finish. However when doing writeback for delayed allocated data, we allocate blocks and immediately submit the data. Thus asking jbd2 to write dirty pages just unnecessarily adds more work to jbd2 possibly writing back other redirtied blocks. Add support to jbd2 to allow filesystem to ask jbd2 to only wait for outstanding data writes before committing a transaction and thus avoid unnecessary writes. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-04-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree, mostly from Florian Westphal to sort out the lack of sufficient validation in x_tables and connlabel preparation patches to add nf_tables support. They are: 1) Ensure we don't go over the ruleset blob boundaries in mark_source_chains(). 2) Validate that target jumps land on an existing xt_entry. This extra sanitization comes with a performance penalty when loading the ruleset. 3) Introduce xt_check_entry_offsets() and use it from {arp,ip,ip6}tables. 4) Get rid of the smallish check_entry() functions in {arp,ip,ip6}tables. 5) Make sure the minimal possible target size in x_tables. 6) Similar to #3, add xt_compat_check_entry_offsets() for compat code. 7) Check that standard target size is valid. 8) More sanitization to ensure that the target_offset field is correct. 9) Add xt_check_entry_match() to validate that matches are well-formed. 10-12) Three patch to reduce the number of parameters in translate_compat_table() for {arp,ip,ip6}tables by using a container structure. 13) No need to return value from xt_compat_match_from_user(), so make it void. 14) Consolidate translate_table() so it can be used by compat code too. 15) Remove obsolete check for compat code, so we keep consistent with what was already removed in the native layout code (back in 2007). 16) Get rid of target jump validation from mark_source_chains(), obsoleted by #2. 17) Introduce xt_copy_counters_from_user() to consolidate counter copying, and use it from {arp,ip,ip6}tables. 18,22) Get rid of unnecessary explicit inlining in ctnetlink for dump functions. 19) Move nf_connlabel_match() to xt_connlabel. 20) Skip event notification if connlabel did not change. 21) Update of nf_connlabels_get() to make the upcoming nft connlabel support easier. 23) Remove spinlock to read protocol state field in conntrack. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23Merge branch 'fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal Pull thermal fixes from Eduardo Valentin: "Specifics in this pull request: - Fixes in mediatek and OF thermal drivers - Fixes in power_allocator governor - More fixes of unsigned to int type change in thermal_core.c. These change have been CI tested using KernelCI bot. \o/" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal: thermal: fix Mediatek thermal controller build thermal: consistently use int for trip temp thermal: fix mtk_thermal build dependency thermal: minor mtk_thermal.c cleanups thermal: power_allocator: req_range multiplication should be a 64 bit type thermal: of: add __init attribute
2016-04-23xfrm: align nlattr properly when neededNicolas Dichtel
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23libnl: add nla_put_u64_64bit() helperNicolas Dichtel
With this function, nla_data() is aligned on a 64-bit area. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23libnl: nla_put_msecs(): align on a 64-bit areaNicolas Dichtel
nla_data() is now aligned on a 64-bit area. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23libnl: nla_put_s64(): align on a 64-bit areaNicolas Dichtel
nla_data() is now aligned on a 64-bit area. In fact, there is no user of this function. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23libnl: nla_put_net64(): align on a 64-bit areaNicolas Dichtel
nla_data() is now aligned on a 64-bit area. The temporary function nla_put_be64_32bit() is removed in this patch. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23libnl: nla_put_be64(): align on a 64-bit areaNicolas Dichtel
nla_data() is now aligned on a 64-bit area. A temporary version (nla_put_be64_32bit()) is added for nla_put_net64(). This function is removed in the next patch. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23libnl: nla_put_le64(): align on a 64-bit areaNicolas Dichtel
nla_data() is now aligned on a 64-bit area. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts were two cases of simple overlapping changes, nothing serious. In the UDP case, we need to add a hlist_add_tail_rcu() to linux/rculist.h, because we've moved UDP socket handling away from using nulls lists. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23Merge tag 'asm-generic-4.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic update from Arnd Bergmann: "Here is one patch to wire up the preadv/pwritev system calls in the generic system call table, which is required for all architectures that were merged in the last few years, including arm64. Usually these get merged along with the syscall implementation or one of the architecture trees, but this time that did not happen. Andre and Christoph both sent a version of this patch, I picked the one I got first" * tag 'asm-generic-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: generic syscalls: wire up preadv2 and pwritev2 syscalls
2016-04-23iio:imu:mpu6050: enhance mounting matrix supportGregor Boirie
Add a new rotation matrix sysfs attribute compliant with IIO core mounting matrix API. Matrix is retrieved from "in_anglvel_mount_matrix" and "in_accel_mount_matrix" sysfs attributes. It is declared into mpu6050 DTS entry as a "mount-matrix" property. Old interface is kept for backward userspace compatibility and may be retrieved from legacy platform_data mechanism only. Signed-off-by: Gregor Boirie <gregor.boirie@parrot.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2016-04-23iio:ak8975: add mounting matrix supportGregor Boirie
Expose a rotation matrix to indicate userspace the chip orientation with respect to the overall hardware system. Matrix is retrieved from "in_mount_matrix". It is declared into ak8975 DTS entry as a "mount-matrix" property. Signed-off-by: Gregor Boirie <gregor.boirie@parrot.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2016-04-23iio:core: mounting matrix supportGregor Boirie
Expose a rotation matrix to indicate userspace the chip placement with respect to the overall hardware system. This is needed to adjust coordinates sampled from a sensor chip when its position deviates from the main hardware system. Final coordinates computation is delegated to userspace since: * computation may involve floating point arithmetics ; * it allows an application to combine adjustments with arbitrary transformations. This 3 dimentional space rotation matrix is expressed as 3x3 array of strings to support floating point numbers. It may be retrieved from a "[<dir>_][<type>_]mount_matrix" sysfs attribute file. It is declared into a device / driver specific DTS property or platform data. Signed-off-by: Gregor Boirie <gregor.boirie@parrot.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2016-04-23generic syscalls: wire up preadv2 and pwritev2 syscallsAndre Przywara
These new syscalls are implemented as generic code, so enable them for architectures like arm64 which use the generic syscall table. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2016-04-23Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "Misc fixes: pvqspinlocks: - an instrumentation fix futexes: - preempt-count vs pagefault_disable decouple corner case fix - futex requeue plist race window fix - futex UNLOCK_PI transaction fix for a corner case" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: asm-generic/futex: Re-enable preemption in futex_atomic_cmpxchg_inatomic() futex: Acknowledge a new waiter in counter before plist futex: Handle unlock_pi race gracefully locking/pvqspinlock: Fix division by zero in qstat_read()
2016-04-23sched/fair: Correctly handle nohz ticks CPU load accountingFrederic Weisbecker
Ticks can happen while the CPU is in dynticks-idle or dynticks-singletask mode. In fact "nohz" or "dynticks" only mean that we exit the periodic mode and we try to minimize the ticks as much as possible. The nohz subsystem uses a confusing terminology with the internal state "ts->tick_stopped" which is also available through its public interface with tick_nohz_tick_stopped(). This is a misnomer as the tick is instead reduced with the best effort rather than stopped. In the best case the tick can indeed be actually stopped but there is no guarantee about that. If a timer needs to fire one second later, a tick will fire while the CPU is in nohz mode and this is a very common scenario. Now this confusion happens to be a problem with CPU load updates: cpu_load_update_active() doesn't handle nohz ticks correctly because it assumes that ticks are completely stopped in nohz mode and that cpu_load_update_active() can't be called in dynticks mode. When that happens, the whole previous tickless load is ignored and the function just records the load for the current tick, ignoring potentially long idle periods behind. In order to solve this, we could account the current load for the previous nohz time but there is a risk that we account the load of a task that got freshly enqueued for the whole nohz period. So instead, lets record the dynticks load on nohz frame entry so we know what to record in case of nohz ticks, then use this record to account the tickless load on nohz ticks and nohz frame end. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Christoph Lameter <cl@linux.com> Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1460555812-25375-3-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-23sched/fair: Gather CPU load functions under a more conventional namespaceFrederic Weisbecker
The CPU load update related functions have a weak naming convention currently, starting with update_cpu_load_*() which isn't ideal as "update" is a very generic concept. Since two of these functions are public already (and a third is to come) that's enough to introduce a more conventional naming scheme. So let's do the following rename instead: update_cpu_load_*() -> cpu_load_update_*() Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Christoph Lameter <cl@linux.com> Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1460555812-25375-2-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-23Merge tag 'v4.6-rc4' into sched/core, to refresh the treeIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-23perf/core: Add ::write_backward attribute to perf eventWang Nan
This patch introduces 'write_backward' bit to perf_event_attr, which controls the direction of a ring buffer. After set, the corresponding ring buffer is written from end to beginning. This feature is design to support reading from overwritable ring buffer. Ring buffer can be created by mapping a perf event fd. Kernel puts event records into ring buffer, user tooling like perf fetch them from address returned by mmap(). To prevent racing between kernel and tooling, they communicate to each other through 'head' and 'tail' pointers. Kernel maintains 'head' pointer, points it to the next free area (tail of the last record). Tooling maintains 'tail' pointer, points it to the tail of last consumed record (record has already been fetched). Kernel determines the available space in a ring buffer using these two pointers to avoid overwrite unfetched records. By mapping without 'PROT_WRITE', an overwritable ring buffer is created. Different from normal ring buffer, tooling is unable to maintain 'tail' pointer because writing is forbidden. Therefore, for this type of ring buffers, kernel overwrite old records unconditionally, works like flight recorder. This feature would be useful if reading from overwritable ring buffer were as easy as reading from normal ring buffer. However, there's an obscure problem. The following figure demonstrates a full overwritable ring buffer. In this figure, the 'head' pointer points to the end of last record, and a long record 'E' is pending. For a normal ring buffer, a 'tail' pointer would have pointed to position (X), so kernel knows there's no more space in the ring buffer. However, for an overwritable ring buffer, kernel ignore the 'tail' pointer. (X) head . | . V +------+-------+----------+------+---+ |A....A|B.....B|C........C|D....D| | +------+-------+----------+------+---+ Record 'A' is overwritten by event 'E': head | V +--+---+-------+----------+------+---+ |.E|..A|B.....B|C........C|D....D|E..| +--+---+-------+----------+------+---+ Now tooling decides to read from this ring buffer. However, none of these two natural positions, 'head' and the start of this ring buffer, are pointing to the head of a record. Even the full ring buffer can be accessed by tooling, it is unable to find a position to start decoding. The first attempt tries to solve this problem AFAIK can be found from [1]. It makes kernel to maintain 'tail' pointer: updates it when ring buffer is half full. However, this approach introduces overhead to fast path. Test result shows a 1% overhead [2]. In addition, this method utilizes no more tham 50% records. Another attempt can be found from [3], which allows putting the size of an event at the end of each record. This approach allows tooling to find records in a backward manner from 'head' pointer by reading size of a record from its tail. However, because of alignment requirement, it needs 8 bytes to record the size of a record, which is a huge waste. Its performance is also not good, because more data need to be written. This approach also introduces some extra branch instructions to fast path. 'write_backward' is a better solution to this problem. Following figure demonstrates the state of the overwritable ring buffer when 'write_backward' is set before overwriting: head | V +---+------+----------+-------+------+ | |D....D|C........C|B.....B|A....A| +---+------+----------+-------+------+ and after overwriting: head | V +---+------+----------+-------+---+--+ |..E|D....D|C........C|B.....B|A..|E.| +---+------+----------+-------+---+--+ In each situation, 'head' points to the beginning of the newest record. From this record, tooling can iterate over the full ring buffer and fetch records one by one. The only limitation that needs to be considered is back-to-back reading. Due to the non-deterministic of user programs, it is impossible to ensure the ring buffer keeps stable during reading. Consider an extreme situation: tooling is scheduled out after reading record 'D', then a burst of events come, eat up the whole ring buffer (one or multiple rounds). When the tooling process comes back, reading after 'D' is incorrect now. To prevent this problem, we need to find a way to ensure the ring buffer is stable during reading. ioctl(PERF_EVENT_IOC_PAUSE_OUTPUT) is suggested because its overhead is lower than ioctl(PERF_EVENT_IOC_ENABLE). By carefully verifying 'header' pointer, reader can avoid pausing the ring-buffer. For example: /* A union of all possible events */ union perf_event event; p = head = perf_mmap__read_head(); while (true) { /* copy header of next event */ fetch(&event.header, p, sizeof(event.header)); /* read 'head' pointer */ head = perf_mmap__read_head(); /* check overwritten: is the header good? */ if (!verify(sizeof(event.header), p, head)) break; /* copy the whole event */ fetch(&event, p, event.header.size); /* read 'head' pointer again */ head = perf_mmap__read_head(); /* is the whole event good? */ if (!verify(event.header.size, p, head)) break; p += event.header.size; } However, the overhead is high because: a) In-place decoding is not safe. Copying-verifying-decoding is required. b) Fetching 'head' pointer requires additional synchronization. (From Alexei Starovoitov: Even when this trick works, pause is needed for more than stability of reading. When we collect the events into overwrite buffer we're waiting for some other trigger (like all cpu utilization spike or just one cpu running and all others are idle) and when it happens the buffer has valuable info from the past. At this point new events are no longer interesting and buffer should be paused, events read and unpaused until next trigger comes.) This patch utilizes event's default overflow_handler introduced previously. perf_event_output_backward() is created as the default overflow handler for backward ring buffers. To avoid extra overhead to fast path, original perf_event_output() becomes __perf_event_output() and marked '__always_inline'. In theory, there's no extra overhead introduced to fast path. Performance testing: Calling 3000000 times of 'close(-1)', use gettimeofday() to check duration. Use 'perf record -o /dev/null -e raw_syscalls:*' to capture system calls. In ns. Testing environment: CPU : Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz Kernel : v4.5.0 MEAN STDVAR BASE 800214.950 2853.083 PRE1 2253846.700 9997.014 PRE2 2257495.540 8516.293 POST 2250896.100 8933.921 Where 'BASE' is pure performance without capturing. 'PRE1' is test result of pure 'v4.5.0' kernel. 'PRE2' is test result before this patch. 'POST' is test result after this patch. See [4] for the detailed experimental setup. Considering the stdvar, this patch doesn't introduce performance overhead to the fast path. [1] http://lkml.iu.edu/hypermail/linux/kernel/1304.1/04584.html [2] http://lkml.iu.edu/hypermail/linux/kernel/1307.1/00535.html [3] http://lkml.iu.edu/hypermail/linux/kernel/1512.0/01265.html [4] http://lkml.kernel.org/g/56F89DCD.1040202@huawei.com Signed-off-by: Wang Nan <wangnan0@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: <acme@kernel.org> Cc: <pi3orama@163.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/r/1459865478-53413-1-git-send-email-wangnan0@huawei.com [ Fixed the changelog some more. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-23Merge branch 'perf/urgent' into perf/core, to resolve conflictIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-23lockdep: Fix lock_chain::base sizePeter Zijlstra
lock_chain::base is used to store an index into the chain_hlocks[] array, however that array contains more elements than can be indexed using the u16. Change the lock_chain structure to use a bitfield to encode the data it needs and add BUILD_BUG_ON() assertions to check the fields are wide enough. Also, for DEBUG_LOCKDEP, assert that we don't run out of elements of that array; as that would wreck the collision detectoring. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alfredo Alvarez Fernandez <alfredoalvarezfernandez@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160330093659.GS3408@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-22libnvdimm, pmem, pfn: make pmem_rw_bytes generic and refactor pfn setupDan Williams
In preparation for providing an alternative (to block device) access mechanism to persistent memory, convert pmem_rw_bytes() to nsio_rw_bytes(). This allows ->rw_bytes() functionality without requiring a 'struct pmem_device' to be instantiated. In other words, when ->rw_bytes() is in use i/o is driven through 'struct nd_namespace_io', otherwise it is driven through 'struct pmem_device' and the block layer. This consolidates the disjoint calls to devm_exit_badblocks() and devm_memunmap() into a common devm_nsio_disable() and cleans up the init path to use a unified pmem_attach_disk() implementation. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-04-22libnvdimm, blk: move i/o infrastructure to nd_namespace_blkDan Williams
Consolidate the information for issuing i/o to a blk-namespace, and eliminate some pointer chasing. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-04-22time: Introduce do_sys_settimeofday64()Baolin Wang
The do_sys_settimeofday() function uses a timespec, which is not year 2038 safe on 32bit systems. Thus this patch introduces do_sys_settimeofday64(), which allows us to transition users of do_sys_settimeofday() to using 64bit time types. Cc: Prarit Bhargava <prarit@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Baolin Wang <baolin.wang@linaro.org> [jstultz: Include errno-base.h to avoid build issue on some arches] Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-04-22security: Introduce security_settime64()Baolin Wang
security_settime() uses a timespec, which is not year 2038 safe on 32bit systems. Thus this patch introduces the security_settime64() function with timespec64 type. We also convert the cap_settime() helper function to use the 64bit types. This patch then moves security_settime() to the header file as an inline helper function so that existing users can be iteratively converted. None of the existing hooks is using the timespec argument and therefor the patch is not making any functional changes. Cc: Serge Hallyn <serge.hallyn@canonical.com>, Cc: James Morris <james.l.morris@oracle.com>, Cc: "Serge E. Hallyn" <serge@hallyn.com>, Cc: Paul Moore <pmoore@redhat.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Kees Cook <keescook@chromium.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Reviewed-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Baolin Wang <baolin.wang@linaro.org> [jstultz: Reworded commit message] Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-04-22clocksource: Add missing include of of.h.David Lechner
This header uses OF_DELCARE_1 which is defined in linux/of.h. This fixes getting unhelpful compiler error messages about missing ')' before a string constant. Cc: Prarit Bhargava <prarit@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: David Lechner <david@lechnology.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-04-22Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull drm fixes from Dave Airlie: "i915, nouveau and amdgpu/radeon fixes in this: nouveau: Two fixes, one for a regression with dithering and one for a bug hit by the userspace drivers. i915: A few fixes, mostly things heading for stable, two important skylake GT3/4 hangs. radeon/amdgpu: Some audio, suspend/resume and some runtime PM fixes, along with two patches to harden the userptr ABI a bit" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (24 commits) drm: Loongson-3 doesn't fully support wc memory drm/nouveau/gr/gf100: select a stream master to fixup tfb offset queries amdgpu/uvd: add uvd fw version for amdgpu drm/amdgpu: forbid mapping of userptr bo through radeon device file drm/radeon: forbid mapping of userptr bo through radeon device file drm/amdgpu: bump the afmt limit for CZ, ST, Polaris drm/amdgpu: use defines for CRTCs and AMFT blocks drm/dp/mst: Validate port in drm_dp_payload_send_msg() drm/nouveau/kms: fix setting of default values for dithering properties drm/radeon: print a message if ATPX dGPU power control is missing Revert "drm/radeon: disable runtime pm on PX laptops without dGPU power control" drm/amdgpu/acp: fix resume on CZ systems with AZ audio drm/radeon: add a quirk for a XFX R9 270X drm/radeon: print pci revision as well as pci ids on driver load drm/i915: Use fw_domains_put_with_fifo() on HSW drm/i915: Force ringbuffers to not be at offset 0 drm/i915: Adjust size of PIPE_CONTROL used for gen8 render seqno write drm/i915/skl: Fix spurious gpu hang with gt3/gt4 revs drm/i915/skl: Fix rc6 based gpu/system hang drm/i915/userptr: Hold mmref whilst calling get-user-pages ...
2016-04-22Merge tag 'sound-4.6-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Again a relatively calm week without surprise: most of fixes are about HD-audio, including fixes for Cirrus codec regression and a race over regmap access. Although both change are slightly unintuitive, the risk of further breakage is quite low, I hope. Other than that, all the rest are trivial" * tag 'sound-4.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda - Fix possible race on regmap bypass flip ALSA: pcxhr: Fix missing mutex unlock ALSA: hda - add PCI ID for Intel Broxton-T ALSA: hda - Keep powering up ADCs on Cirrus codecs ALSA: hda/realtek - Add ALC3234 headset mode for Optiplex 9020m ALSA - hda: hdmi check NULL pointer in hdmi_set_chmap ALSA: hda - Don't trust the reported actual power state
2016-04-22Bluetooth: Add defines for SPI and I2CJohan Hedberg
Extend the set of possible HCI bus types with SPI and I2C. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-04-22i2c: mux: drop old unused i2c-mux apiPeter Rosin
All i2c mux users are using an explicit i2c mux core, drop support for implicit i2c mux cores. Signed-off-by: Peter Rosin <peda@axentia.se> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2016-04-22i2c: mux: add common data for every i2c-mux instancePeter Rosin
All i2c-muxes have a parent adapter and one or many child adapters. A mux also has some means of selection. Previously, this was stored per child adapter, but it is only needed to keep track of this per mux. Add an i2c mux core, that keeps track of this consistently. Also add some glue for users of the old interface, which will create one implicit mux core per child adapter. Signed-off-by: Peter Rosin <peda@axentia.se> Tested-by: Antti Palosaari <crope@iki.fi> Tested-by: Crestez Dan Leonard <leonard.crestez@intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2016-04-22ARM: tegra: Add DT binding for Tegra186 GPIO controllersStephen Warren
Tegra186 contains two separate but mostly similar GPIO controllers. Register layout differs significantly from previous Tegra generations, and so a new binding is required to describe them in device tree. This patch adds that binding. Signed-off-by: Stephen Warren <swarren@nvidia.com> Acked-by: Rob Herring <robh@kernel.org> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Thierry Reding <treding@nvidia.com>
2016-04-22ARM: tegra: Fix naming in GPIO DT binding headerStephen Warren
According to the Tegra TRM, GPIOs are aggregated into /ports/ of 8 GPIOs, not into /banks/. Fix <dt-bindings/gpio/tegra-gpio.h> to correctly reflect this naming convention. While this seems like silly churn, it will become slightly more important once we introduce the GPIO binding for upcoming Tegra chips. Signed-off-by: Stephen Warren <swarren@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Thierry Reding <treding@nvidia.com>
2016-04-22mm: replace open coded page to virt conversion with page_to_virt()Ard Biesheuvel
The open coded conversion from struct page address to virtual address in lowmem_page_address() involves an intermediate conversion step to pfn number/physical address. Since the placement of the struct page array relative to the linear mapping may be completely independent from the placement of physical RAM (as is that case for arm64 after commit dfd55ad85e 'arm64: vmemmap: use virtual projection of linear region'), the conversion to physical address and back again should factor out of the equation, but unfortunately, the shifting and pointer arithmetic involved prevent this from happening, and the resulting calculation essentially subtracts the address of the start of physical memory and adds it back again, in a way that prevents the compiler from optimizing it away. Since the start of physical memory is not a build time constant on arm64, the resulting conversion involves an unnecessary memory access, which we would like to get rid of. So replace the open coded conversion with a call to page_to_virt(), and use the open coded conversion as its default definition, to be overriden by the architecture, if desired. The existing arch specific definitions of page_to_virt are all equivalent to this default definition, so by itself this patch is a no-op. Acked-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-04-22x86, drivers/pnpbios: Replace paravirt_enabled() check with legacy device checkLuis R. Rodriguez
Since we are removing paravirt_enabled() replace it with a logical equivalent. Even though PNPBIOS is x86 specific we add an arch-specific type call, which can be implemented by any architecture to show how other legacy attribute devices can later be also checked for with other ACPI legacy attribute flags. This implicates the first ACPI 5.2.9.3 IA-PC Boot Architecture ACPI_FADT_LEGACY_DEVICES flag device, and shows how to add more. The reason pnpbios gets a defined structure and as such uses a different approach than the RTC legacy quirk is that ACPI has a respective RTC flag, while pnpbios does not. We fold the pnpbios quirk under ACPI_FADT_LEGACY_DEVICES ACPI flag use case, and use a struct of possible devices to enable future extensions of this. As per 0-day, this bumps the vmlinux size using i386-tinyconfig as follows: TOTAL TEXT init.text x86_early_init_platform_quirks() +32 +28 +28 +28 That's 4 byte overhead total, the rest is cleared out on init as its all __init text. v2: split out subarch handlng on switch to make it easier later to add other subarchs. The 'fall-through' switch handling can be confusing and we'll remove it later when we add handling for X86_SUBARCH_CE4100. v3: document vmlinux size impact as per 0-day, and also explain why pnpbios is treated differently than the RTC legacy feature. Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: andrew.cooper3@citrix.com Cc: andriy.shevchenko@linux.intel.com Cc: bigeasy@linutronix.de Cc: boris.ostrovsky@oracle.com Cc: david.vrabel@citrix.com Cc: ffainelli@freebox.fr Cc: george.dunlap@citrix.com Cc: glin@suse.com Cc: jgross@suse.com Cc: jlee@suse.com Cc: josh@joshtriplett.org Cc: julien.grall@linaro.org Cc: konrad.wilk@oracle.com Cc: kozerkov@parallels.com Cc: lenb@kernel.org Cc: lguest@lists.ozlabs.org Cc: linux-acpi@vger.kernel.org Cc: lv.zheng@intel.com Cc: matt@codeblueprint.co.uk Cc: mbizon@freebox.fr Cc: rjw@rjwysocki.net Cc: robert.moore@intel.com Cc: rusty@rustcorp.com.au Cc: tiwai@suse.de Cc: toshi.kani@hp.com Cc: xen-devel@lists.xensource.com Link: http://lkml.kernel.org/r/1460592286-300-12-git-send-email-mcgrof@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-22soc: renesas: rcar-sysc: Make rcar_sysc_power_is_off() staticGeert Uytterhoeven
As of commit b12ff41658171f53 ("ARM: shmobile: r8a7779: Remove legacy PM Domain remainings"), rcar_sysc_power_is_off() is no longer used from SoC-specific code. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
2016-04-22soc: renesas: Move pm-rcar to drivers/soc/renesas/rcar-syscGeert Uytterhoeven
Move the pm-rcar driver from arch/arm/mach-shmobile/ to drivers/soc/renesas/, and its header file to include/linux/soc/renesas/, so it can be shared between arm32 (R-Car H1 and Gen2) and arm64 (R-Car Gen3). Rename it to rcar-sysc as it's really a driver for the R-Car System Controller (SYSC). Kill the intermediate PM_RCAR config symbol, as it's not user configurable anymore, and to prepare for SoC-specific make rules. Add the missing #include <linux/types.h> to rcar-sysc.h, which was exposed by different include order. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
2016-04-22Merge tag 'clk-renesas-for-v4.7-tag2' of ↵Simon Horman
git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-drivers into rcar-sysc-for-v4.7 clk: renesas: R-Car SYSC PM Domain Preparation - Export the CPG/MSSR and CPG/MSTP attach/detach_dev callbacks, so they can be called by the R-Car SYSC PM Domain driver.
2016-04-22locking/rwsem: Provide down_write_killable()Michal Hocko
Now that all the architectures implement the necessary glue code we can introduce down_write_killable(). The only difference wrt. regular down_write() is that the slow path waits in TASK_KILLABLE state and the interruption by the fatal signal is reported as -EINTR to the caller. Signed-off-by: Michal Hocko <mhocko@suse.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chris Zankel <chris@zankel.net> Cc: David S. Miller <davem@davemloft.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Cc: Signed-off-by: Jason Low <jason.low2@hp.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-ia64@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: linux-xtensa@linux-xtensa.org Cc: sparclinux@vger.kernel.org Link: http://lkml.kernel.org/r/1460041951-22347-12-git-send-email-mhocko@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-22drm/mode: move framebuffer reference into object.Dave Airlie
This is the initial code to add references to some mode objects. In the future we need to start reference counting connectors so firstly I want to reorganise the code so the framebuffer ref counting uses the same paths. This patch shouldn't change any functionality, just moves the kref. [airlied: move kerneldoc as well] Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-22drm/mode: introduce wrapper to read framebuffer refcount.Dave Airlie
Avoids drivers knowing where the kref is stored. [airlied: add kerneldoc] Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-22PM / Domains: Remove ->save|restore_state() callbacksUlf Hansson
As a part of the ongoing consolidation of genpd, it's become questionable whether clients actually needs to be able to assign their own set of ->save|restore_state() callbacks. Currently all users copes fine with the default callbacks, so let's remove the configuration option and stick to the default ones. This enables further clarifications of the related code and let's also rename pm_genpd_default_save|restore_state() into __genpd_runtime_suspend|resume() to apply the rule of static functionnames in genpd. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-22PM / Domains: Rename stop_ok to suspend_ok for the genpd governorUlf Hansson
The genpd governor validates the latency constraints to find out whether it's acceptable to runtime suspend a device. Earlier this validation was made to know whether it was okay to invoke the ->stop() callback for the device, hence the governor used the name "stop_ok" for the related variables. To clarify the code around this, let's rename these variables from "stop_ok" to "suspend_ok". Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-22drm: Loongson-3 doesn't fully support wc memoryHuacai Chen
Signed-off-by: Huacai Chen <chenhc@lemote.com> Cc: stable@vger.kernel.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>