summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2018-03-13bsg: split handling of SCSI CDBs vs transport requeuesChristoph Hellwig
The current BSG design tries to shoe-horn the transport-specific passthrough commands into the overall framework for SCSI passthrough requests. This has a couple problems: - each passthrough queue has to set the QUEUE_FLAG_SCSI_PASSTHROUGH flag despite not dealing with SCSI commands at all. Because of that these queues could also incorrectly accept SCSI commands from in-kernel users or through the legacy SCSI_IOCTL_SEND_COMMAND ioctl. - the real SCSI bsg queues also incorrectly accept bsg requests of the BSG_SUB_PROTOCOL_SCSI_TRANSPORT type - the bsg transport code is almost unredable because it tries to reuse different SCSI concepts for its own purpose. This patch instead adds a new bsg_ops structure to handle the two cases differently, and thus solves all of the above problems. Another side effect is that the bsg-lib queues also don't need to embedd a struct scsi_request anymore. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-13bsg-lib: remove bsg_job.reqChristoph Hellwig
Users of the bsg-lib interface should only use the bsg_job data structure and not know about implementation details of it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-13bsg-lib: introduce a timeout field in struct bsg_jobChristoph Hellwig
The zfcp driver wants to know the timeout for a bsg job, so add a field to struct bsg_job for it in preparation of not exposing the request to the bsg-lib users. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-13perf/core: Implement fast breakpoint modification via _IOC_MODIFY_ATTRIBUTESMilind Chabbi
Problem and motivation: Once a breakpoint perf event (PERF_TYPE_BREAKPOINT) is created, there is no flexibility to change the breakpoint type (bp_type), breakpoint address (bp_addr), or breakpoint length (bp_len). The only option is to close the perf event and configure a new breakpoint event. This inflexibility has a significant performance overhead. For example, sampling-based, lightweight performance profilers (and also concurrency bug detection tools), monitor different addresses for a short duration using PERF_TYPE_BREAKPOINT and change the address (bp_addr) to another address or change the kind of breakpoint (bp_type) from "write" to a "read" or vice-versa or change the length (bp_len) of the address being monitored. The cost of these modifications is prohibitive since it involves unmapping the circular buffer associated with the perf event, closing the perf event, opening another perf event and mmaping another circular buffer. Solution: The new ioctl flag for perf events, PERF_EVENT_IOC_MODIFY_ATTRIBUTES, introduced in this patch takes a pointer to a struct perf_event_attr as an argument to update an old breakpoint event with new address, type, and size. This facility allows retaining a previous mmaped perf events ring buffer and avoids having to close and reopen another perf event. This patch supports only changing PERF_TYPE_BREAKPOINT event type; future implementations can extend this feature. The patch replicates some of its functionality of modify_user_hw_breakpoint() in kernel/events/hw_breakpoint.c. modify_user_hw_breakpoint cannot be called directly since perf_event_ctx_lock() is already held in _perf_ioctl(). Evidence: Experiments show that the baseline (not able to modify an already created breakpoint) costs an order of magnitude (~10x) more than the suggested optimization (having the ability to dynamically modifying a configured breakpoint via ioctl). When the breakpoints typically do not trap, the speedup due to the suggested optimization is ~10x; even when the breakpoints always trap, the speedup is ~4x due to the suggested optimization. Testing: tests posted at https://github.com/linux-contrib/perf_event_modify_bp demonstrate the performance significance of this patch. Tests also check the functional correctness of the patch. Signed-off-by: Milind Chabbi <chabbi.milind@gmail.com> [ Using modify_user_hw_breakpoint_check function. ] [ Reformated PERF_EVENT_IOC_*, so the values are all in one column. ] Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oleg Nesterov <onestero@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20180312134548.31532-8-jolsa@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-13tracing: Unify the "boot" and "mono" tracing clocksThomas Gleixner
Unify the "boot" and "mono" tracing clocks and document the new behaviour. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kevin Easton <kevin@guarana.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Salyzyn <salyzyn@android.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20180301165150.489635255@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-13hrtimer: Unify MONOTONIC and BOOTTIME clock behaviorThomas Gleixner
Now that th MONOTONIC and BOOTTIME clocks are indentical remove all the special casing. The user space visible interfaces still support both clocks, but their behavior is identical. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kevin Easton <kevin@guarana.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Salyzyn <salyzyn@android.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20180301165150.410218515@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-13timekeeping: Remove boot time specific codeThomas Gleixner
Now that the MONOTONIC and BOOTTIME clocks are the same, remove all the special handling from timekeeping. Keep wrappers for the existing users of the *boot* timekeeper interfaces. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kevin Easton <kevin@guarana.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Salyzyn <salyzyn@android.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20180301165150.236279497@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-13timekeeping: Add the new CLOCK_MONOTONIC_ACTIVE clockThomas Gleixner
The planned change to unify the behaviour of the MONOTONIC and BOOTTIME clocks vs. suspend removes the ability to retrieve the active non-suspended time of a system. Provide a new CLOCK_MONOTONIC_ACTIVE clock which returns the active non-suspended time of the system via clock_gettime(). This preserves the old behaviour of CLOCK_MONOTONIC before the BOOTTIME/MONOTONIC unification. This new clock also allows applications to detect programmatically that the MONOTONIC and BOOTTIME clocks are identical. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kevin Easton <kevin@guarana.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Salyzyn <salyzyn@android.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20180301165149.965235774@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-13f2fs: support hot file extensionChao Yu
This patch supports to recognize hot file extension in f2fs, so that we can allocate proper hot segment location for its data, which can lead to better hot/cold seperation in filesystem. In addition, we changes a bit on query/add/del operation method for extension_list sysfs entry as below: - Query: cat /sys/fs/f2fs/<disk>/extension_list - Add: echo 'extension' > /sys/fs/f2fs/<disk>/extension_list - Del: echo '!extension' > /sys/fs/f2fs/<disk>/extension_list - Add: echo '[h/c]extension' > /sys/fs/f2fs/<disk>/extension_list - Del: echo '[h/c]!extension' > /sys/fs/f2fs/<disk>/extension_list - [h] means add/del hot file extension - [c] means add/del cold file extension Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-03-13f2fs: expose extension_list sysfs entryChao Yu
This patch adds a sysfs entry 'extension_list' to support query/add/del item in extension list. Query: cat /sys/fs/f2fs/<device>/extension_list Add: echo 'extension' > /sys/fs/f2fs/<device>/extension_list Del: echo '!extension' > /sys/fs/f2fs/<device>/extension_list Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-03-13f2fs: support large nat bitmapChao Yu
Previously, we will store all nat version bitmap in checkpoint pack block, so our total node entry number has a limitation which caused total node number can not exceed (3900 * 8) block * 455 node/block = 14196000. So that once user wants to create more nodes in large size image, it becomes a bottleneck, that's unreasonable. This patch detects the new layout of nat/sit version bitmap in image in order to enable supporting large nat bitmap. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-03-13f2fs: don't put dentry page in pagecache into highmemYunlong Song
Previous dentry page uses highmem, which will cause panic in platforms using highmem (such as arm), since the address space of dentry pages from highmem directly goes into the decryption path via the function fscrypt_fname_disk_to_usr. But sg_init_one assumes the address is not from highmem, and then cause panic since it doesn't call kmap_high but kunmap_high is triggered at the end. To fix this problem in a simple way, this patch avoids to put dentry page in pagecache into highmem. Signed-off-by: Yunlong Song <yunlong.song@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> [Jaegeuk Kim: fix coding style] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-03-12clk: divider: read-only divider can propagate rate changeJerome Brunet
When a divider clock has CLK_DIVIDER_READ_ONLY set, it means that the register shall be left un-touched, but it does not mean the clock should stop rate propagation if CLK_SET_RATE_PARENT is set This is properly handled in qcom clk-regmap-divider but it was not in the generic divider To fix this situation, introduce a new helper function divider_ro_round_rate, on the same model as divider_round_rate. Fixes: e6d5e7d90be9 ("clk-divider: Fix READ_ONLY when divider > 1") Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Tested-By: David Lechner <david@lechnology.com> Signed-off-by: Michael Turquette <mturquette@baylibre.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-03-12clk: fix mux clock documentationJerome Brunet
The mux documentation mentions the non-existing parameter width instead of mask, so just sed this. The table field is missing in the documentation of clk_mux. Add a small blurb explaining what it is Fixes: 9d9f78ed9af0 ("clk: basic clock hardware types") Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Signed-off-by: Michael Turquette <mturquette@baylibre.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-03-12clk: mux: add helper function for index/value translationJerome Brunet
Add helper functions for the translation between parent index and register value in the generic multiplexer function. The purpose of this change is avoid duplicating the code in other clock providers, using the same generic logic. Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Signed-off-by: Michael Turquette <mturquette@baylibre.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-03-12clk: divider: export clk_div_mask() helperJerome Brunet
Export clk_div_mask() in clk-provider header so every clock providers derived from the generic clock divider may share the definition instead of redefining it. Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Signed-off-by: Michael Turquette <mturquette@baylibre.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-03-12Merge remote-tracking branches 'regmap/topic/debugfs' and ↵Mark Brown
'regmap/topic/mmio-clk' into regmap-next
2018-03-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for your net tree, they are: 1) Fixed hashtable representation doesn't support timeout flag, skip it otherwise rules to add elements from the packet fail bogusly fail with EOPNOTSUPP. 2) Fix bogus error with 32-bits ebtables userspace and 64-bits kernel, patch from Florian Westphal. 3) Sanitize proc names in several x_tables extensions, also from Florian. 4) Add sanitization to ebt_among wormhash logic, from Florian. 5) Missing release of hook array in flowtable. ====================
2018-03-12direct-io: Remove unused DIO_SKIP_DIO_COUNT logicNikolay Borisov
This flag was added by fe0f07d08ee3 ("direct-io: only inc/deci inode->i_dio_count for file systems") as means to optimise the atomic modificaiton of the variable for blockdevices. However with the advent of 542ff7bf18c6 ("block: new direct I/O implementation") it became unused. So let's remove it. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-12direct-io: Remove unused DIO_ASYNC_EXTEND flagNikolay Borisov
This flag was added by 6039257378e4 ("direct-io: add flag to allow aio writes beyond i_size") to support XFS. However, with the rework of XFS' DIO's path to use iomap in acdda3aae146 ("xfs: use iomap_dio_rw") it became redundant. So let's remove it. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-12sock_diag: request _diag module only when the family or proto has been ↵Xin Long
registered Now when using 'ss' in iproute, kernel would try to load all _diag modules, which also causes corresponding family and proto modules to be loaded as well due to module dependencies. Like after running 'ss', sctp, dccp, af_packet (if it works as a module) would be loaded. For example: $ lsmod|grep sctp $ ss $ lsmod|grep sctp sctp_diag 16384 0 sctp 323584 5 sctp_diag inet_diag 24576 4 raw_diag,tcp_diag,sctp_diag,udp_diag libcrc32c 16384 3 nf_conntrack,nf_nat,sctp As these family and proto modules are loaded unintentionally, it could cause some problems, like: - Some debug tools use 'ss' to collect the socket info, which loads all those diag and family and protocol modules. It's noisy for identifying issues. - Users usually expect to drop sctp init packet silently when they have no sense of sctp protocol instead of sending abort back. - It wastes resources (especially with multiple netns), and SCTP module can't be unloaded once it's loaded. ... In short, it's really inappropriate to have these family and proto modules loaded unexpectedly when just doing debugging with inet_diag. This patch is to introduce sock_load_diag_module() where it loads the _diag module only when it's corresponding family or proto has been already registered. Note that we can't just load _diag module without the family or proto loaded, as some symbols used in _diag module are from the family or proto module. v1->v2: - move inet proto check to inet_diag to avoid a compiling err. v2->v3: - define sock_load_diag_module in sock.c and export one symbol only. - improve the changelog. Reported-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Phil Sutter <phil@nwl.cc> Acked-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-12net: phy: Tell caller result of phy_change()Brad Mouring
In 664fcf123a30e (net: phy: Threaded interrupts allow some simplification) the phy_interrupt system was changed to use a traditional threaded interrupt scheme instead of a workqueue approach. With this change, the phy status check moved into phy_change, which did not report back to the caller whether or not the interrupt was handled. This means that, in the case of a shared phy interrupt, only the first phydev's interrupt registers are checked (since phy_interrupt() would always return IRQ_HANDLED). This leads to interrupt storms when it is a secondary device that's actually the interrupt source. Signed-off-by: Brad Mouring <brad.mouring@ni.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-12perf/core: Optimize ctx_sched_out()Peter Zijlstra
When an event group contains more events than can be scheduled on the hardware, iterating the full event group for ctx_sched_out is a waste of time. Keep track of the events that got programmed on the hardware, such that we can iterate this smaller list in order to schedule them out. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: Dmitri Prokhorov <Dmitry.Prohorov@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Valery Cherepennikov <valery.cherepennikov@intel.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-12perf/core: Remove perf_event::group_entryPeter Zijlstra
Now that all the grouping is done with RB trees, we no longer need group_entry and can replace the whole thing with sibling_list. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: Dmitri Prokhorov <Dmitry.Prohorov@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Valery Cherepennikov <valery.cherepennikov@intel.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-12perf/cor: Use RB trees for pinned/flexible groupsAlexey Budankov
Change event groups into RB trees sorted by CPU and then by a 64bit index, so that multiplexing hrtimer interrupt handler would be able skipping to the current CPU's list and ignore groups allocated for the other CPUs. New API for manipulating event groups in the trees is implemented as well as adoption on the API in the current implementation. pinned_group_sched_in() and flexible_group_sched_in() API are introduced to consolidate code enabling the whole group from pinned and flexible groups appropriately. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: Dmitri Prokhorov <Dmitry.Prohorov@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Valery Cherepennikov <valery.cherepennikov@intel.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/372f9c8b-0cfe-4240-e44d-83d863d40813@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-12Merge tag 'pxa-for-4.17' of https://github.com/rjarzmik/linux into next/socArnd Bergmann
Pull "This is the pxa changes for v4.17 cycle" from Robert Jarzmik: - minor changes for property API - clock API fix for ULPI driver warning It exceptionally contains a merge from the mtd tree from Boris to prevent any merge conflicts in the PXA tree. * tag 'pxa-for-4.17' of https://github.com/rjarzmik/linux: ARM: pxa/raumfeld: use PROPERTY_ENTRY_U32() directly ARM: pxa: ulpi: fix ulpi timeout and slowpath warn ARM: pxa: cm-x300: remove inline directive ARM: pxa: fix static checker warning in pxa3xx-ulpi MAINTAINERS: remove entry for deleted pxa3xx_nand driver arm: dts: pxa: use reworked NAND controller driver dt-bindings: mtd: remove pxa3xx NAND controller documentation mtd: nand: remove useless fields from pxa3xx NAND platform data mtd: nand: remove deprecated pxa3xx_nand driver mtd: nand: use Marvell reworked NAND controller driver with all platforms
2018-03-12Merge branch 'x86/pti' into x86/mm, to pick up dependenciesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-12sched/core: Remove TASK_ALLPeter Zijlstra
It's unused: $ git grep "\<TASK_ALL\>" | wc -l 1 ... and it is also dangerous, kill the bugger. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180227160510.10829-1-bigeasy@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-12efi: Use efi_mm in x86 as well as ARMSai Praneeth
Presently, only ARM uses mm_struct to manage EFI page tables and EFI runtime region mappings. As this is the preferred approach, let's make this data structure common across architectures. Specially, for x86, using this data structure improves code maintainability and readability. Tested-by: Bhupesh Sharma <bhsharma@redhat.com> [ardb: don't #include the world to get a declaration of struct mm_struct] Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Lee, Chun-Yi <jlee@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Cc: Ricardo Neri <ricardo.neri@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20180312084500.10764-2-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-11soc: mediatek: avoid hardcoded value with bus_prot_maskSean Wang
use a meaningful definition for bus_prot_mask instead of just hardcoded for it. Signed-off-by: Sean Wang <sean.wang@mediatek.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2018-03-11netfilter: x_tables: add and use xt_check_proc_nameFlorian Westphal
recent and hashlimit both create /proc files, but only check that name is 0 terminated. This can trigger WARN() from procfs when name is "" or "/". Add helper for this and then use it for both. Cc: Eric Dumazet <eric.dumazet@gmail.com> Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Reported-by: <syzbot+0502b00edac2a0680b61@syzkaller.appspotmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-03-11leds: Extends disk trigger for reads and writesLinus Walleij
This adds two new disk triggers for triggering on reads and writes respectively, named "disk-read" and "disk-write". The use case comes from working on the D-Link DNS-313 NAS box. This features an RGB LED for disk activity. with these two triggers I can couple the green LED to read activity and the red LED to write activity, which gives the appropriate user feedback about what is happening on the disk. When tested it gave exactly the feedback desired. The in-kernel interface is simply changed to pass a bool indicating if the activity is write activity and update each trigger (and the composite "disk-activity" trigger) depending on what is passed in. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
2018-03-11Merge branch 'for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: - Miscellaneous fixes, perhaps most notably removing obsolete code whose only purpose in life was to gather information for the now-removed RCU debugfs facility. Other notable changes include removing NO_HZ_FULL_ALL in favor of the nohz_full kernel boot parameter, minor optimizations for expedited grace periods, some added tracing, creating an RCU-specific workqueue using Tejun's new WQ_MEM_RECLAIM flag, and several cleanups to code and comments. - SRCU cleanups and optimizations. - Torture-test updates, perhaps most notably the adding of ARMv8 support, but also including numerous cleanups and usability fixes. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-10ring-buffer: Add nesting for adding events within eventsSteven Rostedt (VMware)
The ring-buffer code has recusion protection in case tracing ends up tracing itself, the ring-buffer will detect that it was called at the same context (normal, softirq, interrupt or NMI), and not continue to record the event. With the histogram synthetic events, they are called while tracing another event at the same context. The recusion protection triggers because it detects tracing at the same context and stops it. Add ring_buffer_nest_start() and ring_buffer_nest_end() that will notify the ring buffer that a trace is about to happen within another trace and that it is intended, and not to trigger the recursion blocking. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-03-10tracing: Give event triggers access to ring_buffer_eventTom Zanussi
The ring_buffer event can provide a timestamp that may be useful to various triggers - pass it into the handlers for that purpose. Link: http://lkml.kernel.org/r/6de592683b59fa70ffa5d43d0109896623fc1367.1516069914.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-03-10ring-buffer: Redefine the unimplemented RINGBUF_TYPE_TIME_STAMPTom Zanussi
RINGBUF_TYPE_TIME_STAMP is defined but not used, and from what I can gather was reserved for something like an absolute timestamp feature for the ring buffer, if not a complete replacement of the current time_delta scheme. This code redefines RINGBUF_TYPE_TIME_STAMP to implement absolute time stamps. Another way to look at it is that it essentially forces extended time_deltas for all events. The motivation for doing this is to enable time_deltas that aren't dependent on previous events in the ring buffer, making it feasible to use the ring_buffer_event timetamps in a more random-access way, for purposes other than serial event printing. To set/reset this mode, use tracing_set_timestamp_abs() from the previous interface patch. Link: http://lkml.kernel.org/r/477b362dba1ce7fab9889a1a8e885a62c472f041.1516069914.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-03-10ring-buffer: Add interface for setting absolute time stampsTom Zanussi
Define a new function, tracing_set_time_stamp_abs(), which can be used to enable or disable the use of absolute timestamps rather than time deltas for a trace array. Only the interface is added here; a subsequent patch will add the underlying implementation. Link: http://lkml.kernel.org/r/ce96119de44c7fe0ee44786d15254e9b493040d3.1516069914.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Baohong Liu <baohong.liu@intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-03-10Merge branch 'linus' into locking/core, to pick up fixes and dependenciesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-10timekeeping/ntp: Determine the multiplier directly from NTP tick lengthMiroslav Lichvar
When the length of the NTP tick changes significantly, e.g. when an NTP/PTP application is correcting the initial offset of the clock, a large value may accumulate in the NTP error before the multiplier converges to the correct value. It may then take a very long time (hours or even days) before the error is corrected. This causes the clock to have an unstable frequency offset, which has a negative impact on the stability of synchronization with precise time sources (e.g. NTP/PTP using hardware timestamping or the PTP KVM clock). Use division to determine the correct multiplier directly from the NTP tick length and replace the iterative approach. This removes the last major source of the NTP error. The only remaining source is now limited resolution of the multiplier, which is corrected by adding 1 to the multiplier when the system clock is behind the NTP time. Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Stephen Boyd <stephen.boyd@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1520620971-9567-3-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-09mn10300: Remove the architectureDavid Howells
Remove the MN10300 arch as the hardware is defunct. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David Howells <dhowells@redhat.com> cc: Masahiro Yamada <yamada.masahiro@socionext.com> cc: linux-am33-list@redhat.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-09Merge tag 'pci-v4.16-fixes-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - fix sparc build issue when OF_IRQ not enabled (Guenter Roeck) - fix enumeration of devices below switches on DesignWare-based controllers (Koen Vandeputte) * tag 'pci-v4.16-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: dwc: Fix enumeration end when reaching root subordinate PCI: Move of_irq_parse_and_map_pci() declaration under OF_IRQ
2018-03-09net: introduce IFF_NO_RX_HANDLERPaolo Abeni
Some network devices - notably ipvlan slave - are not compatible with any kind of rx_handler. Currently the hook can be installed but any configuration (bridge, bond, macsec, ...) is nonfunctional. This change allocates a priv_flag bit to mark such devices and explicitly forbid installing a rx_handler if such bit is set. The new bit is used by ipvlan slave device. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09usb: core: hcd: remove support for initializing a single PHYMartin Blumenstingl
With the new PHY wrapper in place we can now handle multiple PHYs. Remove the code which handles only one generic PHY as this is now covered (with support for multiple PHYs as well as suspend/resume support) by the new PHY wrapper. Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Tested-by: Neil Armstrong <narmstrong@baylibre.con> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09usb: core: hcd: integrate the PHY wrapper into the HCD coreMartin Blumenstingl
This integrates the PHY wrapper into the core hcd infrastructure. Multiple PHYs which are part of the HCD's device tree node are now managed (= powered on/off when needed), by the new usb_phy_roothub code. Suspend and resume is also supported, however not for runtime/auto-suspend (which is triggered for example when no devices are connected to the USB bus). This is needed on some SoCs (for example Amlogic Meson GXL) because if the PHYs are disabled during auto-suspend then devices which are plugged in afterwards are not seen by the host. One example where this is required is the Amlogic GXL and GXM SoCs: They are using a dwc3 USB controller with up to three ports enabled on the internal roothub. Each port has it's own PHY which must be enabled (if one of the PHYs is left disabled then none of the USB ports works at all). The new logic works on the Amlogic GXL and GXM SoCs because the dwc3 driver internally creates a xhci-hcd which then registers a HCD which then triggers our new PHY wrapper. USB controller drivers can opt out of this by setting "skip_phy_initialization" in struct usb_hcd to true. This is identical to how it works for a single USB PHY, so the "multiple PHY" handling is disabled for drivers that opted out of the management logic of a single PHY. Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Chunfeng Yun <chunfeng.yun@mediatek.com> Tested-by: Yixun Lan <yixun.lan@amlogic.com> Tested-by: Neil Armstrong <narmstrong@baylibre.con> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09usb: add a flag to skip PHY initialization to struct usb_hcdMartin Blumenstingl
The USB HCD core driver parses the device-tree node for "phys" and "usb-phys" properties. It also manages the power state of these PHYs automatically. However, drivers may opt-out of this behavior by setting "phy" or "usb_phy" in struct usb_hcd to a non-null value. An example where this is required is the "Qualcomm USB2 controller", implemented by the chipidea driver. The hardware requires that the PHY is only powered on after the "reset completed" event from the controller is received. A follow-up patch will allow the USB HCD core driver to manage more than one PHY. Add a new "skip_phy_initialization" bitflag to struct usb_hcd so drivers can opt-out of any PHY management provided by the USB HCD core driver. This also updates the existing drivers so they use the new flag if they want to opt out of the PHY management provided by the USB HCD core driver. This means that for these drivers the new "multiple PHY" handling (which will be added in a follow-up patch) will be disabled as well. Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Acked-by: Peter Chen <peter.chen@nxp.com> Tested-by: Neil Armstrong <narmstrong@baylibre.con> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09typec: tcpm: Add SDB header for Status message handlingAdam Thomson
This commit adds a header providing definitions for handling Status messages. Currently the header only focuses on handling incoming Status messages. Signed-off-by: Adam Thomson <Adam.Thomson.Opensource@diasemi.com> Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09typec: tcpm: Add ADO header for Alert message handlingAdam Thomson
This commit adds a header providing definitions for handling Alert messages. Currently the header only focuses on handling incoming alerts. Signed-off-by: Adam Thomson <Adam.Thomson.Opensource@diasemi.com> Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09typec: tcpm: Add PD Rev 3.0 definitions to PD headerAdam Thomson
This commit adds definitions for PD Rev 3.0 messages, including APDO PPS and extended message support for TCPM. Signed-off-by: Adam Thomson <Adam.Thomson.Opensource@diasemi.com> Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09vhost_net: examine pointer types during un-producingJason Wang
After commit fc72d1d54dd9 ("tuntap: XDP transmission"), we can actually queueing XDP pointers in the pointer ring, so we should examine the pointer type before freeing the pointer. Fixes: fc72d1d54dd9 ("tuntap: XDP transmission") Reported-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net: do not create fallback tunnels for non-default namespacesEric Dumazet
fallback tunnels (like tunl0, gre0, gretap0, erspan0, sit0, ip6tnl0, ip6gre0) are automatically created when the corresponding module is loaded. These tunnels are also automatically created when a new network namespace is created, at a great cost. In many cases, netns are used for isolation purposes, and these extra network devices are a waste of resources. We are using thousands of netns per host, and hit the netns creation/delete bottleneck a lot. (Many thanks to Kirill for recent work on this) Add a new sysctl so that we can opt-out from this automatic creation. Note that these tunnels are still created for the initial namespace, to be the least intrusive for typical setups. Tested: lpk43:~# cat add_del_unshare.sh for i in `seq 1 40` do (for j in `seq 1 100` ; do unshare -n /bin/true >/dev/null ; done) & done wait lpk43:~# echo 0 >/proc/sys/net/core/fb_tunnels_only_for_init_net lpk43:~# time ./add_del_unshare.sh real 0m37.521s user 0m0.886s sys 7m7.084s lpk43:~# echo 1 >/proc/sys/net/core/fb_tunnels_only_for_init_net lpk43:~# time ./add_del_unshare.sh real 0m4.761s user 0m0.851s sys 1m8.343s lpk43:~# Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>