Age | Commit message | Author
2016-07-07 | lightnvm: add media manager mark_blk helper | Javier González
Expose media manager mark_blk() to targets, as done for the rest of the media manager callback functions. Signed-off-by: Javier González <javier@cnexlabs.com> Updated description Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 | lightnvm: break the loop when rqd is not null | Wenwei Tao
Break the loop when rqd is not null to reduce an unnecessary schedule. Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 | nvmet: fix an error code | Dan Carpenter
We accidentally return zero here when ERR_PTR(-ENOMEM) is intended. Fixes: a07b4970f464 ('nvmet: add a generic NVMe target') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 | nvme-loop: add configfs dependency | Arnd Bergmann
CONFIG_NVME_TARGET has a correct CONFIG_CONFIGFS_FS dependency, but the newly added NVME_TARGET_LOOP is missing this, resulting in a link failure:

  drivers/nvme/built-in.o: In function `nvmet_init_configfs':
  loop.c:(.init.text+0x2a0): undefined reference to `config_group_init'
  loop.c:(.init.text+0x2c0): undefined reference to `config_group_init_type_name'
  loop.c:(.init.text+0x318): undefined reference to `configfs_register_subsystem'
  drivers/nvme/built-in.o: In function `nvmet_exit_configfs':
  loop.c:(.exit.text+0x9c): undefined reference to `configfs_unregister_subsystem'

This adds the same dependency here. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 3a85a5de29ea ("nvme-loop: add a NVMe loopback host driver") Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 | kbuild: Remove stale asm-generic wrappers | James Hogan
When a header file is removed from generic-y (often accompanied by the addition of an arch specific header), the generated wrapper file will persist, and in some cases may still take precedence over the new arch header. For example commit f1fe2d21f4e1 ("MIPS: Add definitions for extended context") removed ucontext.h from generic-y in arch/mips/include/asm/, and added an arch/mips/include/uapi/asm/ucontext.h. The continued use of the wrapper when reusing a dirty build tree resulted in build failures in arch/mips/kernel/signal.c:

  arch/mips/kernel/signal.c: In function ‘sc_to_extcontext’:
  arch/mips/kernel/signal.c:142:12: error: ‘struct ucontext’ has no member named ‘uc_extcontext’
    return &uc->uc_extcontext;
    ^

Fix by detecting and removing wrapper headers in generated header directories that do not correspond to a filename in generic-y, genhdr-y, or the newly introduced generated-y. Reported-by: Jacek Anaszewski <j.anaszewski@samsung.com> Reported-by: Hauke Mehrtens <hauke@hauke-m.de> Reported-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: James Hogan <james.hogan@imgtec.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Cc: linux-arch@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: Paul Burton <paul.burton@imgtec.com> Cc: linux-kbuild@vger.kernel.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Michal Marek <mmarek@suse.com> Link: http://lkml.kernel.org/r/1466808144-23209-3-git-send-email-james.hogan@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
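The stale-wrapper detection described above amounts to a set difference. The following is only a sketch of that idea in Python, not the kbuild Makefile logic; the function name and the sample file names are hypothetical.

```python
def find_stale_wrappers(present, generic_y, genhdr_y, generated_y):
    """Return headers found in a generated asm directory that no list accounts for."""
    expected = set(generic_y) | set(genhdr_y) | set(generated_y)
    return sorted(name for name in present if name not in expected)


# Hypothetical directory contents after ucontext.h was dropped from generic-y.
present = ["ucontext.h", "errno.h", "syscalls_32.h"]
stale = find_stale_wrappers(
    present,
    generic_y=["errno.h"],            # wrappers that should still exist
    genhdr_y=[],                      # no genhdr-y entries in this example
    generated_y=["syscalls_32.h"],    # separately generated, must be kept
)
print(stale)
```

Listing separately generated headers in generated-y is what keeps them out of the stale set in this model.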
2016-07-07 | kbuild, x86: Track generated headers with generated-y | James Hogan
Track generated header files which aren't already in genhdr-y, alongside generic-y wrappers in the */include/generated/[uapi/]asm/ directories. Currently only x86 generates extra headers in these directories, for the purposes of enumerating system calls for different ABIs, and xen hypercalls. This will allow the asm-generic wrapper handling code to remove stale wrappers when files are removed from generic-y, without also removing these headers which are generated separately. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: James Hogan <james.hogan@imgtec.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-kbuild@vger.kernel.org Cc: linux-doc@vger.kernel.org Cc: Michal Marek <mmarek@suse.com> Link: http://lkml.kernel.org/r/1466808144-23209-2-git-send-email-james.hogan@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-07-07 | Merge branch 'clockevents/4.8' of http://git.linaro.org/people/daniel.lezcano/linux into timers/core | Thomas Gleixner
Pull the clockevents/clocksource tree from Daniel Lezcano:
- Convert the clocksource-probe init functions to return a value in order to prepare the consolidation of the drivers using the DT. It is a big patchset but went through 01.org (kbuild bot), linux next and kernel-ci (continuous integration) (Daniel Lezcano)
- Fix a bad error handling by returning the right value for cadence_ttc (Christophe Jaillet)
- Fix typo in the Kconfig for the Samsung pwm (Alexandre Belloni)
- Change functions to static for armada-370-xp and digicolor (Ben Dooks)
- Add support for the rk3399 SoC timer by adding bindings and a slight change in the base address. Take the opportunity to add the DYNIRQ flag (Huang Tao)
- Fix endian accessors for the Samsung pwm timer (Matthew Leach)
- Add Oxford Semiconductor RPS Dual Timer driver (Neil Armstrong)
- Add a kernel parameter to switch on/off the event stream feature of the arch arm timer (Will Deacon)
2016-07-07 | x86: remove duplicate turbo ratio limit MSRs | Srinivas Pandruvada
Remove MSR_NHM_TURBO_RATIO_LIMIT and MSR_IVT_TURBO_RATIO_LIMIT as they are duplicates. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-07 | tools/power turbostat: Replace MSR_NHM_TURBO_RATIO_LIMIT | Srinivas Pandruvada
Replace MSR_NHM_TURBO_RATIO_LIMIT with MSR_TURBO_RATIO_LIMIT. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-07 | cpufreq: intel_pstate: Replace MSR_NHM_TURBO_RATIO_LIMIT | Srinivas Pandruvada
Replace MSR_NHM_TURBO_RATIO_LIMIT with MSR_TURBO_RATIO_LIMIT. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-07 | Merge branch 'pm-tools' into pm-cpu | Rafael J. Wysocki
2016-07-07 | Merge branch 'pm-cpufreq' into pm-cpu | Rafael J. Wysocki
2016-07-07 | doc-rst: auto-generate video.h.rst | Mauro Carvalho Chehab
This file comes from the uAPI definition header, and should be auto-generated, to be in sync with Kernel changes. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-07-07 | doc-rst: auto-generate net.h.rst | Mauro Carvalho Chehab
This file comes from the uAPI definition header, and should be auto-generated, to be in sync with Kernel changes. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-07-07 | doc-rst: auto-generate ca.h.rst | Mauro Carvalho Chehab
This file comes from the uAPI definition header, and should be auto-generated, to be in sync with Kernel changes. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-07-07 | doc-rst: auto-generate audio.h.rst | Mauro Carvalho Chehab
This is an auto-generated header. Remove the hardcoded one and do the right thing here. NOTE: this is a deprecated API. So, we won't make any effort to try identifying the meaning of this obscure API that is used only on a legacy driver. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-07-07 | doc-rst: auto-generate dmx.h.rst | Mauro Carvalho Chehab
This file should be auto-generated from the header files, and not hardcoded. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-07-07 | xenbus: don't BUG() on user mode induced condition | Jan Beulich
Inability to locate a user mode specified transaction ID should not lead to a kernel crash. For other than XS_TRANSACTION_START also don't issue anything to xenbus if the specified ID doesn't match that of any active transaction. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-07 | doc-rst: parse-headers: fix multiline typedef handler | Mauro Carvalho Chehab
The typedef handler should do two things to be generic:
1) parse typedef enums;
2) accept both possible syntaxes:
     typedef struct foo { .. } foo_t;
     typedef struct { .. } foo_t;
Unfortunately, this is needed to parse some legacy DVB files, like dvb/audio.h. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
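To illustrate the two typedef syntaxes (and typedef enums) that the handler has to accept, here is a small Python sketch. It is not the parse-headers.pl code, which is a Perl script; the regular expression and the helper name are assumptions made for this example, and nested braces are deliberately not handled.

```python
import re

# Hypothetical pattern: optional tag after struct/enum, body in braces,
# then the typedef name. DOTALL lets the body and the '{' span several lines.
TYPEDEF_RE = re.compile(
    r"typedef\s+(?:struct|enum)\s*(\w+)?\s*\{.*?\}\s*(\w+)\s*;",
    re.DOTALL,
)


def typedef_names(source):
    """Return the names introduced by struct/enum typedefs in `source`."""
    return [match.group(2) for match in TYPEDEF_RE.finditer(source)]


sample = """
typedef struct foo { int a; } foo_t;
typedef struct { int b; } bar_t;
typedef enum
{
    ONE,
    TWO
} baz_t;
"""

names = typedef_names(sample)
print(names)
```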
2016-07-07 | doc-rst: parse-headers: better handle typedefs | Mauro Carvalho Chehab
When typedef is used in its multiline format, we need to also parse enum and struct in the same line. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-07-07 | doc-rst: parse-headers: be more formal about the valid symbols | Mauro Carvalho Chehab
Be more formal about the valid symbols that are expected by the parser, to match what the C language expects. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-07-07 | spi: pxa2xx-pci: Support both chipselects on Braswell | Andy Shevchenko
The commit 30f3a6ab44d8 ("spi: pxa2xx: Add support for both chip selects on Intel Braswell") introduced support for both chipselects on the Intel Braswell SPI host controller, but it did not convert the PCI part of the driver. Do the conversion here, which enables both chipselects on Intel Braswell when enumerated via PCI. We don't care about the num_chipselect value since it is overridden inside the core driver. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-07-07 | doc-rst: fix parsing comments and '{' on a separate line | Mauro Carvalho Chehab
The dmx.h header has two things that cause the parser to break while handling enums: per-header enums, and a '{' that starts on a new line. Both make the parser detect lexical marks as if they were symbols. Fix it. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-07-07 | spi: pxa2xx: Clear all RFT bits in reset_sccr1() on Intel Quark | Andy Shevchenko
It seems the commit e5262d0568dc ("spi: spi-pxa2xx: SPI support for Intel Quark X1000") missed one place that needs to be adapted for Intel Quark, i.e. reset_sccr1(). Clear all RFT bits when calling reset_sccr1() on Intel Quark. Fixes: e5262d0568dc ("spi: spi-pxa2xx: SPI support for Intel Quark X1000") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org
2016-07-07 | regulator: pwm: Fix regulator ramp delay for continuous mode | Douglas Anderson
The original commit adding support for continuous voltage mode didn't handle the regulator ramp delay properly. It treated the delay as a fixed delay in uS despite the property being defined as uV / uS. Let's adjust it. Luckily there appear to be no users of this ramp delay for PWM regulators (as per grepping through device trees in linux-next). Note also that the upper bound of usleep_range probably shouldn't be a full 1 ms longer than the lower bound since I've seen plenty of hardware with a ramp rate of ~5000 uV / uS and for small jumps the total delays are in the tens of uS. 1000 is way too much. We'll try to be dynamic and use 10%. NOTE: This commit doesn't add support for regulator-enable-ramp-delay. That could be done in a future patch when someone has a user of that feature. Though this patch shows up as "fixing" a bug, there are no actual known users of continuous mode PWM regulator w/ ramp delay in mainline and so this likely won't have any effect on anyone unless they are working out-of-tree with private patches. For anyone in this state, it is highly encouraged to also pick Boris Brezillon's WIP patches to get yourself a reliable and glitch-free regulator. Fixes: 4773be185a0f ("regulator: pwm-regulator: Add support for continuous-voltage") Signed-off-by: Douglas Anderson <dianders@chromium.org> Acked-by: Laxman Dewangan <ldewangan@nvidia.com> Signed-off-by: Mark Brown <broonie@kernel.org>
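The arithmetic behind the fix can be sketched as follows: a ramp rate in uV/uS turns a voltage step into a delay, and the upper bound of the sleep range is about 10% above the lower bound. This Python model is only an illustration; the function name is hypothetical and rounding up in both divisions is an assumption, not something the message states.

```python
def ramp_sleep_range_us(old_uv, new_uv, ramp_uv_per_us):
    """Return (min_us, max_us) to wait after changing the output voltage."""
    delta_uv = abs(new_uv - old_uv)
    # Ceiling division: never wait less than the ramp needs (assumed rounding).
    min_us = -(-delta_uv // ramp_uv_per_us)
    # Upper bound is roughly 10% above the lower bound instead of a fixed 1000 us.
    max_us = min_us + -(-min_us // 10)
    return min_us, max_us


# A 100 mV step at a hypothetical ramp rate of 5000 uV/uS.
print(ramp_sleep_range_us(900_000, 1_000_000, 5000))
```

With these numbers the wait is in the tens of microseconds, which is why a fixed 1 ms of extra slack would dominate the delay.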
2016-07-07 | spi: add binding for clps711x SPI | Arnd Bergmann
This documents the binding used by Alexander Shiyan's DT support for the clps711x SPI controller. I've left the file name to match the ARM platform port name "clps711x" for consistency with the other bindings, even though the compatible string refers to the later ep7309 chip. Linux no longer supports the old clps711x and ep72xx product lines, but we still use the name. The entire family is now discontinued by the manufacturer. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-07-07 | spi: clps711x: Driver refactor | Alexander Shiyan
This is a complex patch for refactoring CLPS711X SPI driver. This change adds devicetree support and removes board support. Signed-off-by: Alexander Shiyan <shc_work@mail.ru> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-07-07 | doc-dst: parse-headers: highlight deprecated comments | Mauro Carvalho Chehab
When something is deprecated, highlight it, as we want it to be clearer to the reader. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-07-07 | doc-rst: parse-headers: improve delimiters to detect symbols | Mauro Carvalho Chehab
As we had to escape the symbols for the ReST markup to not do the wrong thing, the logic to discover the start/end of strings is not trivial. Improve the end delimiter detection, in order to highlight more occurrences of the strings. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-07-07 | doc-rst: auto-build the frontend.h.rst | Mauro Carvalho Chehab
This file is auto-generated with DocBook, from the uapi header. Do the same with Sphinx. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-07-07 | spi: s3c64xx: do not disable the clock while configuring the spi | Andi Shyti
When the clock is coming from the cmu it is not required to be disabled and then re-enabled in order to change the rate. Besides, some exynos chipsets (e.g. exynos5433) do not deliver any to the SFR if one from the pclk ("spi" in this case) or sclk ("busclk") is disabled. Remove the clock disabling/enabling to avoid falling into this situation. Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Andi Shyti <andi.shyti@samsung.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-07-07 | doc-rst: add parse-headers.pl script | Mauro Carvalho Chehab
This script parses a header file and converts it into a parsed-literal block, creating references for ioctls, defines, typedefs, enums and structs. It also allows an external file to modify the rules, in order to fix the expressions. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-07-07 | Merge branch 'timers/fast-wheel' into timers/core | Ingo Molnar
2016-07-07 | timers: Implement optimization for same expiry time in mod_timer() | Anna-Maria Gleixner
The existing optimization for same expiry time in mod_timer() checks whether the timer expiry time is the same as the new requested expiry time. In the old timer wheel implementation this does not take the slack batching into account, neither does the new implementation evaluate whether the new expiry time will requeue the timer to the same bucket. To optimize that, we can calculate the resulting bucket and check if the new expiry time is different from the current expiry time. This calculation happens outside the base lock held region. If the resulting bucket is the same we can avoid taking the base lock and requeueing the timer. If the timer needs to be requeued then we have to check under the base lock whether the base time has changed between the lockless calculation and taking the lock. If it has changed we need to recalculate under the lock. This optimization takes effect for timers which are enqueued into the less granular wheel levels (1 and above). With a simple test case the functionality has been verified:

             Before    After
  Match:       5.5%    86.6%
  Requeue:    94.5%    13.4%
  Recalc:              <0.01%

In the non optimized case the timer is requeued in 94.5% of the cases. With the index optimization in place the requeue rate drops to 13.4%. The case where the lockless index calculation has to be redone is less than 0.01%. With a real world test case (networking) we observed the following changes:

             Before    After
  Match:      97.8%    99.7%
  Requeue:     2.2%     0.3%
  Recalc:              <0.001%

That means two percent fewer lock/requeue/unlock operations done in one of the hot path use cases of timers.
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094342.778527749@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
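The same-bucket check can be modelled in a few lines. This is a simplified Python sketch, not the kernel code: it assumes 64 buckets per level and a factor of 8 between levels (as described in the non-cascading wheel commit), the level boundaries in `level_start()` are an assumption, and locking, the recalculation path and already expired timers are left out. All names are hypothetical.

```python
LVL_BITS = 6                 # 64 buckets per level
LVL_SIZE = 1 << LVL_BITS
LVL_CLK_SHIFT = 3            # each level is 8 times coarser than the previous
LVL_DEPTH = 8


def level_start(lvl):
    """Smallest delta in jiffies that is queued into level `lvl` (lvl >= 1); assumed."""
    return (LVL_SIZE - 1) << ((lvl - 1) * LVL_CLK_SHIFT)


def bucket_index(expires, clk):
    """Map an expiry time to a bucket index, given the wheel clock `clk`."""
    delta = expires - clk
    lvl = 0
    while lvl + 1 < LVL_DEPTH and delta >= level_start(lvl + 1):
        lvl += 1
    shift = lvl * LVL_CLK_SHIFT
    slot = ((expires + (1 << shift)) >> shift) & (LVL_SIZE - 1)
    return lvl * LVL_SIZE + slot


class Timer:
    def __init__(self, expires, clk):
        self.expires = expires
        self.idx = bucket_index(expires, clk)


def mod_timer(timer, expires, clk):
    """Return 'match' if the timer can stay in its bucket, else 'requeue'."""
    idx = bucket_index(expires, clk)
    timer.expires = expires
    if idx == timer.idx:
        return "match"       # same bucket: no lock, no requeue in the real code
    timer.idx = idx
    return "requeue"


t = Timer(expires=1000, clk=0)       # lands in level 2, 64-jiffy buckets
print(mod_timer(t, 1010, clk=0))     # small change, same bucket
print(mod_timer(t, 1100, clk=0))     # crosses a bucket boundary
```

In level 0 every jiffy has its own bucket, so a changed expiry always changes the bucket there; that matches the statement that the optimization only helps in level 1 and above.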
2016-07-07 | timers: Split out index calculation | Anna-Maria Gleixner
For further optimizations we need to separate index calculation from queueing. No functional change. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094342.691159619@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 | timers: Only wake softirq if necessary | Thomas Gleixner
With the wheel forwarding in place and with the HZ=1000 4ms folding we can avoid running the softirq at all. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094342.607650550@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 | timers: Forward the wheel clock whenever possible | Thomas Gleixner
The wheel clock is stale when a CPU goes into a long idle sleep. This has the side effect that timers which are queued end up in the outer wheel levels. That results in coarser granularity. To solve this, we keep track of the idle state and forward the wheel clock whenever possible. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094342.512039360@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 | timers/nohz: Remove pointless tick_nohz_kick_tick() function | Thomas Gleixner
This was a failed attempt to optimize the timer expiry in idle, which was disabled and never revisited. Remove the cruft. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094342.431073782@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 | timers: Optimize collect_expired_timers() for NOHZ | Anna-Maria Gleixner
After a NOHZ idle sleep the timer wheel must be forwarded to current jiffies. There might be expired timers so the current code loops and checks the expired buckets for timers. This can take quite some time for long NOHZ idle periods. The pending bitmask in the timer base allows us to do a quick search for the next expiring timer and therefore a fast forward of the base time which prevents pointless long lasting loops. For a 3 seconds idle sleep this reduces the catchup time from ~1ms to 5us. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094342.351296290@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
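The benefit of the pending bitmask is that the next occupied bucket can be found with bit operations instead of stepping through every bucket. The Python sketch below models a single 64-bucket level; it uses a rotate plus lowest-set-bit trick rather than the kernel's bitmap helpers, and the function name and the one-level simplification are assumptions for illustration.

```python
LVL_SIZE = 64
LVL_MASK = (1 << LVL_SIZE) - 1


def next_pending_bucket(pending, clk):
    """Distance in buckets from slot `clk` to the next pending bucket, or -1.

    `pending` is an int used as a 64-bit bitmap; `clk` is a slot in 0..63.
    """
    pending &= LVL_MASK
    if pending == 0:
        return -1
    # Rotate right so that slot `clk` becomes bit 0; wrapped-around slots
    # then appear above the slots that are still ahead in this turn.
    rotated = ((pending >> clk) | (pending << (LVL_SIZE - clk))) & LVL_MASK
    # Isolate the lowest set bit and convert it to a bit position.
    return (rotated & -rotated).bit_length() - 1


pending = (1 << 5) | (1 << 40)            # timers queued in buckets 5 and 40
print(next_pending_bucket(pending, 10))   # bucket 40 is 30 slots ahead
print(next_pending_bucket(pending, 41))   # wraps around to bucket 5
```

The cost of the lookup does not depend on how far the wheel clock has fallen behind, which is the property the message relies on for long idle periods.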
2016-07-07 | timers: Move __run_timers() function | Anna-Maria Gleixner
Move __run_timers() below __next_timer_interrupt() and next_pending_bucket() in preparation for __run_timers() NOHZ optimization. No functional change. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094342.271872665@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 | timers: Remove set_timer_slack() leftovers | Thomas Gleixner
We now have implicit batching in the timer wheel. The slack API is no longer used, so remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Andrew F. Davis <afd@ti.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: David S. Miller <davem@davemloft.net> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jaehoon Chung <jh80.chung@samsung.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: John Stultz <john.stultz@linaro.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathias Nyman <mathias.nyman@intel.com> Cc: Pali Rohár <pali.rohar@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Sebastian Reichel <sre@kernel.org> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: linux-block@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mmc@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: linux-usb@vger.kernel.org Cc: netdev@vger.kernel.org Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094342.189813118@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 | timers: Switch to a non-cascading wheel | Thomas Gleixner
The current timer wheel has some drawbacks: 1) Cascading: Cascading can be an unbound operation and is completely pointless in most cases because the vast majority of the timer wheel timers are canceled or rearmed before expiration. (They are used as timeout safeguards, not as real timers to measure time.) 2) No fast lookup of the next expiring timer: In NOHZ scenarios the first timer soft interrupt after a long NOHZ period must fast forward the base time to the current value of jiffies. As we have no way to find the next expiring timer fast, the code loops linearly and increments the base time one by one and checks for expired timers in each step. This causes unbound overhead spikes exactly in the moment when we should wake up as fast as possible. After a thorough analysis of real world data gathered on laptops, workstations, webservers and other machines (thanks Chris!) I came to the conclusion that the current 'classic' timer wheel implementation can be modified to address the above issues. The vast majority of timer wheel timers is canceled or rearmed before expiry. Most of them are timeouts for networking and other I/O tasks. The nature of timeouts is to catch the exception from normal operation (TCP ack timed out, disk does not respond, etc.). For these kinds of timeouts the accuracy of the timeout is not really a concern. Timeouts are very often approximate worst-case values and in case the timeout fires, we already waited for a long time and performance is down the drain already. The few timers which actually expire can be split into two categories: 1) Short expiry times which expect halfways accurate expiry 2) Long term expiry times are inaccurate today already due to the batching which is done for NOHZ automatically and also via the set_timer_slack() API. So for long term expiry timers we can avoid the cascading property and just leave them in the less granular outer wheels until expiry or cancelation. 
Timers which are armed with a timeout larger than the wheel capacity are no longer cascaded. We expire them with the longest possible timeout (6+ days). We have not observed such timeouts in our data collection, but at least we handle them, applying the rule of the least surprise. To avoid extending the wheel levels for HZ=1000 so we can accommodate the longest observed timeouts (5 days in the network conntrack code) we reduce the first level granularity on HZ=1000 to 4ms, which effectively is the same as the HZ=250 behaviour. From our data analysis there is nothing which relies on that 1ms granularity and as a side effect we get better batching and timer locality for the networking code as well. Contrary to the classic wheel the granularity of the next wheel is not the capacity of the first wheel. The granularities of the wheels are in the currently chosen setting 8 times the granularity of the previous wheel. So for HZ=250 we end up with the following granularity levels:

  Level  Offset  Granularity        Range
  0      0       4 ms               0 ms - 252 ms
  1      64      32 ms              256 ms - 2044 ms (256ms - ~2s)
  2      128     256 ms             2048 ms - 16380 ms (~2s - ~16s)
  3      192     2048 ms (~2s)      16384 ms - 131068 ms (~16s - ~2m)
  4      256     16384 ms (~16s)    131072 ms - 1048572 ms (~2m - ~17m)
  5      320     131072 ms (~2m)    1048576 ms - 8388604 ms (~17m - ~2h)
  6      384     1048576 ms (~17m)  8388608 ms - 67108863 ms (~2h - ~18h)
  7      448     8388608 ms (~2h)   67108864 ms - 536870911 ms (~18h - ~6d)

That's a worst case inaccuracy of 12.5% for the timers which are queued at the beginning of a level. So the new wheel concept addresses the old issues:
1) Cascading is avoided completely
2) By keeping the timers in the bucket until expiry/cancelation we can track the buckets which have timers enqueued in a bucket bitmap and therefore can look up the next expiring timer very fast and O(1).
A further benefit of the concept is that the slack calculation which is done on every timer start is no longer necessary because the granularity levels provide natural batching already. Our extensive testing with various loads did not show any performance degradation vs. the current wheel implementation. This patch does not address the 'fast lookup' issue as we wanted to make sure that there is no regression introduced by the wheel redesign. The optimizations are in follow up patches. This patch contains fixes from Anna-Maria Gleixner and Richard Cochran. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094342.108621834@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
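The granularity table for HZ=250 and the 12.5% figure follow from two numbers in the message: 64 buckets per level (the offsets advance by 64) and a factor of 8 between levels. The Python sketch below recomputes the level, offset, granularity and range start; the constant and function names are chosen for this example.

```python
HZ = 250
LVL_BITS = 6          # 64 buckets per level, matching the offsets 0, 64, 128, ...
LVL_CLK_SHIFT = 3     # each level is 8 times coarser than the previous one
LVL_DEPTH = 8


def wheel_levels(hz=HZ):
    """Return (level, offset, granularity_ms, range_start_ms) for each level."""
    ms_per_tick = 1000 // hz
    rows = []
    for lvl in range(LVL_DEPTH):
        gran = ms_per_tick << (lvl * LVL_CLK_SHIFT)
        # A level starts where the 64 buckets of the previous level end.
        start = 0 if lvl == 0 else (gran >> LVL_CLK_SHIFT) << LVL_BITS
        rows.append((lvl, lvl << LVL_BITS, gran, start))
    return rows


for lvl, offset, gran, start in wheel_levels():
    # Worst case: a timer queued right at the start of a level can be off by
    # one full granule of that level.
    worst = gran / start if start else 0.0
    print(lvl, offset, gran, start, f"{worst:.1%}")
```

Because every level starts at 64 granules of the previous level, which is 8 granules of its own, the worst-case error relative to the timeout is 1/8 for every level above 0.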
2016-07-07 | timers: Reduce the CPU index space to 256k | Thomas Gleixner
We want to store the array index in the flags space. 256k CPUs should be enough for a while. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094342.030144293@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
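Limiting the CPU number to 256k means it fits into 18 bits (2^18 = 262144), which leaves the upper bits of a 32-bit flags word free for the array index. The Python sketch below shows one possible packing; the shift of 22 and all names are assumptions for illustration, since the message does not give the exact bit layout.

```python
CPU_BITS = 18                          # 2**18 = 256k CPUs
CPU_MASK = (1 << CPU_BITS) - 1
ARRAY_SHIFT = 22                       # assumed: a few flag bits sit in between
WORD_MASK = 0xFFFFFFFF                 # flags modelled as a 32-bit word
ARRAY_MASK = WORD_MASK & ~((1 << ARRAY_SHIFT) - 1)


def set_idx(flags, idx):
    """Store a bucket index in the upper bits, keeping all lower bits."""
    return (flags & (WORD_MASK ^ ARRAY_MASK)) | ((idx << ARRAY_SHIFT) & ARRAY_MASK)


def get_idx(flags):
    return (flags & ARRAY_MASK) >> ARRAY_SHIFT


def get_cpu(flags):
    return flags & CPU_MASK


flags = set_idx(5, 300)                # CPU 5, bucket index 300
print(get_cpu(flags), get_idx(flags))
```

With this assumed layout 10 bits remain for the index, enough for the 512 buckets of an 8-level wheel with 64 buckets per level.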
2016-07-07 | timers: Give a few structs and members proper names | Thomas Gleixner
Some of the names in the internal implementation of the timer code are no longer correct and others are simply too long to type. Clean it up before we switch the wheel implementation over to the new scheme. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094341.948752516@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 | hlist: Add hlist_is_singular_node() helper | Thomas Gleixner
Required to figure out whether the entry is the only one in the hlist. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094341.867631372@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07signals: Use hrtimer for sigtimedwait()Thomas Gleixner
We've converted most timeout related syscalls to hrtimers, but sigtimedwait() did not get this treatment. Convert it so we get a reasonable accuracy and remove the user space exposure to the timer wheel properties. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Cyril Hrubis <chrubis@suse.cz> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094341.787164909@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
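For context, this is the kind of userspace call whose timeout accuracy the change affects. The sketch below uses only the POSIX sigtimedwait() interface; the helper name wait_for_usr1() and the choice of SIGUSR1 are hypothetical, picked for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <time.h>

/*
 * Hypothetical helper: wait up to timeout_ms milliseconds for SIGUSR1.
 * Returns the signal number on success, 0 on timeout, -1 on other errors.
 */
static int wait_for_usr1(long timeout_ms)
{
	sigset_t set;
	struct timespec ts = {
		.tv_sec = timeout_ms / 1000,
		.tv_nsec = (timeout_ms % 1000) * 1000000L,
	};
	int sig;

	sigemptyset(&set);
	sigaddset(&set, SIGUSR1);

	/* Block the signal so it stays pending instead of running a handler. */
	if (sigprocmask(SIG_BLOCK, &set, NULL) < 0)
		return -1;

	/* Retry if interrupted by an unrelated signal (restarts the full timeout). */
	do {
		sig = sigtimedwait(&set, NULL, &ts);
	} while (sig < 0 && errno == EINTR);

	if (sig < 0 && errno == EAGAIN)
		return 0;	/* timeout expired with no signal pending */
	return sig;
}
```

A caller that passes a short timeout, say 10 ms, is relying on the kernel to wake it close to that deadline, which is the accuracy the commit message refers to.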
2016-07-07timers: Remove the deprecated mod_timer_pinned() APIThomas Gleixner
We switched all users to initialize the timers as pinned and call mod_timer(). Remove the now unused timer API function. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094341.706205231@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07timers, net/ipv4/inet: Initialize connection request timers as pinnedThomas Gleixner
Pinned timers must carry the pinned attribute in the timer structure itself, so convert the code to the new API. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094341.617891430@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07timers, drivers/tty/mips_ejtag: Initialize the poll timer as pinnedThomas Gleixner
Pinned timers must carry the pinned attribute in the timer structure itself, so convert the code to the new API. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094341.537448301@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07timers, drivers/tty/metag_da: Initialize the poll timer as pinnedThomas Gleixner
Pinned timers must carry the pinned attribute in the timer structure itself, so convert the code to the new API. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094341.456452642@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>