summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2017-07-03Merge branch 'timers-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "A rather large update for timers/timekeeping: - compat syscall consolidation (Al Viro) - Posix timer consolidation (Christoph Helwig / Thomas Gleixner) - Cleanup of the device tree based initialization for clockevents and clocksources (Daniel Lezcano) - Consolidation of the FTTMR010 clocksource/event driver (Linus Walleij) - The usual set of small fixes and updates all over the place" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (93 commits) timers: Make the cpu base lock raw clocksource/drivers/mips-gic-timer: Fix an error code in 'gic_clocksource_of_init()' clocksource/drivers/fsl_ftm_timer: Unmap region obtained by of_iomap clocksource/drivers/tcb_clksrc: Make IO endian agnostic clocksource/drivers/sun4i: Switch to the timer-of common init clocksource/drivers/timer-of: Fix invalid iomap check Revert "ktime: Simplify ktime_compare implementation" clocksource/drivers: Fix uninitialized variable use in timer_of_init kselftests: timers: Add test for frequency step kselftests: timers: Fix inconsistency-check to not ignore first timestamp time: Add warning about imminent deprecation of CONFIG_GENERIC_TIME_VSYSCALL_OLD time: Clean up CLOCK_MONOTONIC_RAW time handling posix-cpu-timers: Make timespec to nsec conversion safe itimer: Make timeval to nsec conversion range limited timers: Fix parameter description of try_to_del_timer_sync() ktime: Simplify ktime_compare implementation clocksource/drivers/fttmr010: Factor out clock read code clocksource/drivers/fttmr010: Implement delay timer clocksource/drivers: Add timer-of common init routine clocksource/drivers/tcb_clksrc: Save timer context on suspend/resume ...
2017-07-03Merge tag 'm68k-for-v4.13-tag1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k updates from Geert Uytterhoeven: - NuBus improvements and cleanups - defconfig updates - Fix debugger syscall restart interactions, leading to the global removal of ptrace_signal_deliver() * tag 'm68k-for-v4.13-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k: Remove ptrace_signal_deliver m68k/defconfig: Update defconfigs for v4.12-rc1 nubus: Fix pointer validation nubus: Remove slot zero probe
2017-07-03Merge branch 'timers-nohz-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull nohz updates from Ingo Molnar: "The main changes in this cycle relate to fixing another bad (but sporadic and hard to detect) interaction between the dynticks scheduler tick and hrtimers, plus related improvements to better detection and handling of similar problems - by Frédéric Weisbecker" * 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: nohz: Fix spurious warning when hrtimer and clockevent get out of sync nohz: Fix buggy tick delay on IRQ storms nohz: Reset next_tick cache even when the timer has no regs nohz: Fix collision between tick and other hrtimers, again nohz: Add hrtimer sanity check
2017-07-03Merge branch 'sched-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main changes in this cycle were: - Add the SYSTEM_SCHEDULING bootup state to move various scheduler debug checks earlier into the bootup. This turns silent and sporadically deadly bugs into nice, deterministic splats. Fix some of the splats that triggered. (Thomas Gleixner) - A round of restructuring and refactoring of the load-balancing and topology code (Peter Zijlstra) - Another round of consolidating ~20 of incremental scheduler code history: this time in terms of wait-queue nomenclature. (I didn't get much feedback on these renaming patches, and we can still easily change any names I might have misplaced, so if anyone hates a new name, please holler and I'll fix it.) (Ingo Molnar) - sched/numa improvements, fixes and updates (Rik van Riel) - Another round of x86/tsc scheduler clock code improvements, in hope of making it more robust (Peter Zijlstra) - Improve NOHZ behavior (Frederic Weisbecker) - Deadline scheduler improvements and fixes (Luca Abeni, Daniel Bristot de Oliveira) - Simplify and optimize the topology setup code (Lauro Ramos Venancio) - Debloat and decouple scheduler code some more (Nicolas Pitre) - Simplify code by making better use of llist primitives (Byungchul Park) - ... plus other fixes and improvements" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (103 commits) sched/cputime: Refactor the cputime_adjust() code sched/debug: Expose the number of RT/DL tasks that can migrate sched/numa: Hide numa_wake_affine() from UP build sched/fair: Remove effective_load() sched/numa: Implement NUMA node level wake_affine() sched/fair: Simplify wake_affine() for the single socket case sched/numa: Override part of migrate_degrades_locality() when idle balancing sched/rt: Move RT related code from sched/core.c to sched/rt.c sched/deadline: Move DL related code from sched/core.c to sched/deadline.c sched/cpuset: Only offer CONFIG_CPUSETS if SMP is enabled sched/fair: Spare idle load balancing on nohz_full CPUs nohz: Move idle balancer registration to the idle path sched/loadavg: Generalize "_idle" naming to "_nohz" sched/core: Drop the unused try_get_task_struct() helper function sched/fair: WARN() and refuse to set buddy when !se->on_rq sched/debug: Fix SCHED_WARN_ON() to return a value on !CONFIG_SCHED_DEBUG as well sched/wait: Disambiguate wq_entry->task_list and wq_head->task_list naming sched/wait: Move bit_wait_table[] and related functionality from sched/core.c to sched/wait_bit.c sched/wait: Split out the wait_bit*() APIs from <linux/wait.h> into <linux/wait_bit.h> sched/wait: Re-adjust macro line continuation backslashes in <linux/wait.h> ...
2017-07-03Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Most of the changes are for tooling, the main changes in this cycle were: - Improve Intel-PT hardware tracing support, both on the kernel and on the tooling side: PTWRITE instruction support, power events for C-state tracing, etc. (Adrian Hunter) - Add support to measure SMI cost to the x86 architecture, with tooling support in 'perf stat' (Kan Liang) - Support function filtering in 'perf ftrace', plus related improvements (Namhyung Kim) - Allow adding and removing fields to the default 'perf script' columns, using + or - as field prefixes to do so (Andi Kleen) - Allow resolving the DSO name with 'perf script -F brstack{sym,off},dso' (Mark Santaniello) - Add perf tooling unwind support for PowerPC (Paolo Bonzini) - ... and various other improvements as well" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (84 commits) perf auxtrace: Add CPU filter support perf intel-pt: Do not use TSC packets for calculating CPU cycles to TSC perf intel-pt: Update documentation to include new ptwrite and power events perf intel-pt: Add example script for power events and PTWRITE perf intel-pt: Synthesize new power and "ptwrite" events perf intel-pt: Move code in intel_pt_synth_events() to simplify attr setting perf intel-pt: Factor out intel_pt_set_event_name() perf intel-pt: Tidy messages into called function intel_pt_synth_event() perf intel-pt: Tidy Intel PT evsel lookup into separate function perf intel-pt: Join needlessly wrapped lines perf intel-pt: Remove unused instructions_sample_period perf intel-pt: Factor out common code synthesizing event samples perf script: Add synthesized Intel PT power and ptwrite events perf/x86/intel: Constify the 'lbr_desc[]' array and make a function static perf script: Add 'synth' field for synthesized event payloads perf auxtrace: Add itrace option to output power events perf auxtrace: Add itrace option to output ptwrite events tools include: Add byte-swapping macros to kernel.h perf script: Add 'synth' event type for synthesized events x86/insn: perf tools: Add new ptwrite instruction ...
2017-07-03Merge branch 'locking-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in this cycle were: - Add CONFIG_REFCOUNT_FULL=y to allow the disabling of the 'full' (robustness checked) refcount_t implementation with slightly lower runtime overhead. (Kees Cook) The lighter weight variant is the default. The two variants use the same API. Having this variant was a precondition by some maintainers to merge refcount_t cleanups. - Add lockdep support for rtmutexes (Peter Zijlstra) - liblockdep fixes and improvements (Sasha Levin, Ben Hutchings) - ... misc fixes and improvements" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits) locking/refcount: Remove the half-implemented refcount_sub() API locking/refcount: Create unchecked atomic_t implementation locking/rtmutex: Don't initialize lockdep when not required locking/selftest: Add RT-mutex support locking/selftest: Remove the bad unlock ordering test rt_mutex: Add lockdep annotations MAINTAINERS: Claim atomic*_t maintainership locking/x86: Remove the unused atomic_inc_short() methd tools/lib/lockdep: Remove private kernel headers tools/lib/lockdep: Hide liblockdep output from test results tools/lib/lockdep: Add dummy current_gfp_context() tools/include: Add IS_ERR_OR_NULL to err.h tools/lib/lockdep: Add empty __is_[module,kernel]_percpu_address tools/lib/lockdep: Include err.h tools/include: Add (mostly) empty include/linux/sched/mm.h tools/lib/lockdep: Use LDFLAGS tools/lib/lockdep: Remove double-quotes from soname tools/lib/lockdep: Fix object file paths used in an out-of-tree build tools/lib/lockdep: Fix compilation for 4.11 tools/lib/lockdep: Don't mix fd-based and stream IO ...
2017-07-03Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The sole purpose of these changes is to shrink and simplify the RCU code base, which has suffered from creeping bloat over the past couple of years. The end result is a net removal of ~2700 lines of code: 79 files changed, 1496 insertions(+), 4211 deletions(-) Plus there's a marked reduction in the Kconfig space complexity as well, here's the number of matches on 'grep RCU' in the .config: before after x86-defconfig 17 15 x86-allmodconfig 33 20" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (86 commits) rcu: Remove RCU CPU stall warnings from Tiny RCU rcu: Remove event tracing from Tiny RCU rcu: Move RCU debug Kconfig options to kernel/rcu rcu: Move RCU non-debug Kconfig options to kernel/rcu rcu: Eliminate NOCBs CPU-state Kconfig options rcu: Remove debugfs tracing srcu: Remove Classic SRCU srcu: Fix rcutorture-statistics typo rcu: Remove SPARSE_RCU_POINTER Kconfig option rcu: Remove the now-obsolete PROVE_RCU_REPEATEDLY Kconfig option rcu: Remove typecheck() from RCU locking wrapper functions rcu: Remove #ifdef moving rcu_end_inkernel_boot from rcupdate.h rcu: Remove nohz_full full-system-idle state machine rcu: Remove the RCU_KTHREAD_PRIO Kconfig option rcu: Remove *_SLOW_* Kconfig options srcu: Use rnp->lock wrappers to replace explicit memory barriers rcu: Move rnp->lock wrappers for SRCU use rcu: Convert rnp->lock wrappers to macros for SRCU use rcu: Refactor #includes from include/linux/rcupdate.h bcm47xx: Fix build regression ...
2017-07-03Merge branch 'core-objtool-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Ingo Molnar: "This is an extensive rewrite of the objdump tool to track all stack pointer modifications through the machine instructions of disassembled functions found in kernel .o files. This re-design removes the prior dependency on CONFIG_FRAME_POINTERS, with the goal to prepare the tool to generate kernel debuginfo data in the future. There's also an increase in checking/tracking robustness as a side effect as well. No (intended) changes to existing functionality" * 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Silence warnings for functions which use IRET objtool: Implement stack validation 2.0 objtool, x86: Add several functions and files to the objtool whitelist objtool: Move checking code to check.c
2017-07-03Merge branch 'for-4.13/block' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull core block/IO updates from Jens Axboe: "This is the main pull request for the block layer for 4.13. Not a huge round in terms of features, but there's a lot of churn related to some core cleanups. Note this depends on the UUID tree pull request, that Christoph already sent out. This pull request contains: - A series from Christoph, unifying the error/stats codes in the block layer. We now use blk_status_t everywhere, instead of using different schemes for different places. - Also from Christoph, some cleanups around request allocation and IO scheduler interactions in blk-mq. - And yet another series from Christoph, cleaning up how we handle and do bounce buffering in the block layer. - A blk-mq debugfs series from Bart, further improving on the support we have for exporting internal information to aid debugging IO hangs or stalls. - Also from Bart, a series that cleans up the request initialization differences across types of devices. - A series from Goldwyn Rodrigues, allowing the block layer to return failure if we will block and the user asked for non-blocking. - Patch from Hannes for supporting setting loop devices block size to that of the underlying device. - Two series of patches from Javier, fixing various issues with lightnvm, particular around pblk. - A series from me, adding support for write hints. This comes with NVMe support as well, so applications can help guide data placement on flash to improve performance, latencies, and write amplification. - A series from Ming, improving and hardening blk-mq support for stopping/starting and quiescing hardware queues. - Two pull requests for NVMe updates. Nothing major on the feature side, but lots of cleanups and bug fixes. From the usual crew. - A series from Neil Brown, greatly improving the bio rescue set support. Most notably, this kills the bio rescue work queues, if we don't really need them. - Lots of other little bug fixes that are all over the place" * 'for-4.13/block' of git://git.kernel.dk/linux-block: (217 commits) lightnvm: pblk: set line bitmap check under debug lightnvm: pblk: verify that cache read is still valid lightnvm: pblk: add initialization check lightnvm: pblk: remove target using async. I/Os lightnvm: pblk: use vmalloc for GC data buffer lightnvm: pblk: use right metadata buffer for recovery lightnvm: pblk: schedule if data is not ready lightnvm: pblk: remove unused return variable lightnvm: pblk: fix double-free on pblk init lightnvm: pblk: fix bad le64 assignations nvme: Makefile: remove dead build rule blk-mq: map all HWQ also in hyperthreaded system nvmet-rdma: register ib_client to not deadlock in device removal nvme_fc: fix error recovery on link down. nvmet_fc: fix crashes on bad opcodes nvme_fc: Fix crash when nvme controller connection fails. nvme_fc: replace ioabort msleep loop with completion nvme_fc: fix double calls to nvme_cleanup_cmd() nvme-fabrics: verify that a controller returns the correct NQN nvme: simplify nvme_dev_attrs_are_visible ...
2017-07-03Merge tag 'uuid-for-4.13' of git://git.infradead.org/users/hch/uuidLinus Torvalds
Pull uuid subsystem from Christoph Hellwig: "This is the new uuid subsystem, in which Amir, Andy and I have started consolidating our uuid/guid helpers and improving the types used for them. Note that various other subsystems have pulled in this tree, so I'd like it to go in early. UUID/GUID summary: - introduce the new uuid_t/guid_t types that are going to replace the somewhat confusing uuid_be/uuid_le types and make the terminology fit the various specs, as well as the userspace libuuid library. (me, based on a previous version from Amir) - consolidated generic uuid/guid helper functions lifted from XFS and libnvdimm (Amir and me) - conversions to the new types and helpers (Amir, Andy and me)" * tag 'uuid-for-4.13' of git://git.infradead.org/users/hch/uuid: (34 commits) ACPI: hns_dsaf_acpi_dsm_guid can be static mmc: sdhci-pci: make guid intel_dsm_guid static uuid: Take const on input of uuid_is_null() and guid_is_null() thermal: int340x_thermal: fix compile after the UUID API switch thermal: int340x_thermal: Switch to use new generic UUID API acpi: always include uuid.h ACPI: Switch to use generic guid_t in acpi_evaluate_dsm() ACPI / extlog: Switch to use new generic UUID API ACPI / bus: Switch to use new generic UUID API ACPI / APEI: Switch to use new generic UUID API acpi, nfit: Switch to use new generic UUID API MAINTAINERS: add uuid entry tmpfs: generate random sb->s_uuid scsi_debug: switch to uuid_t nvme: switch to uuid_t sysctl: switch to use uuid_t partitions/ldm: switch to use uuid_t overlayfs: use uuid_t instead of uuid_be fs: switch ->s_uuid to uuid_t ima/policy: switch to use uuid_t ...
2017-06-30Merge tag 'trace-v4.12-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull last-minute tracing fixes from Steven Rostedt: "Two fixes: One is for a crash when using the :mod: trace probe command into stack_trace_filter. This bug was introduced during the last merge window. The other was there forever. It's a small bug that makes it impossible to name a module function for kprobes when the module starts with a digit" * tag 'trace-v4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing/kprobes: Allow to create probe with a module name starting with a digit ftrace: Fix regression with module command in stack_trace_filter
2017-06-30objtool, x86: Add several functions and files to the objtool whitelistJosh Poimboeuf
In preparation for an objtool rewrite which will have broader checks, whitelist functions and files which cause problems because they do unusual things with the stack. These whitelists serve as a TODO list for which functions and files don't yet have undwarf unwinder coverage. Eventually most of the whitelists can be removed in favor of manual CFI hint annotations or objtool improvements. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/7f934a5d707a574bda33ea282e9478e627fb1829.1498659915.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-30sched/cputime: Refactor the cputime_adjust() codeGustavo A. R. Silva
Address a Coverity false positive, which is caused by overly convoluted code: Value assigned to variable 'utime' at line 619:utime = rtime; is overwritten at line 642:utime = rtime - stime; before it can be used. This makes such variable assignment useless. Remove this variable assignment and refactor the code related. Addresses-Coverity-ID: 1371643 Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Cc: Frans Klaver <fransklaver@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Link: http://lkml.kernel.org/r/20170629184128.GA5271@embeddedgus Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-30sched/debug: Expose the number of RT/DL tasks that can migrateDaniel Bristot de Oliveira
Add the value of the rt_rq.rt_nr_migratory and dl_rq.dl_nr_migratory to the sched_debug output, for instance: rt_rq[0]: .rt_nr_running : 2 .rt_nr_migratory : 1 <--- Like this .rt_throttled : 0 .rt_time : 828.645877 .rt_runtime : 1000.000000 This is useful to debug problems related to the RT/DL schedulers. This also fixes the format of some variables, that were unsigned, rather than signed. Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com> Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-rt-users <linux-rt-users@vger.kernel.org> Link: http://lkml.kernel.org/r/7896f71cada54ee7dd8507bb666063a2e051c3d4.1498482127.git.bristot@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-29tracing/kprobes: Allow to create probe with a module name starting with a digitSabrina Dubroca
Always try to parse an address, since kstrtoul() will safely fail when given a symbol as input. If that fails (which will be the case for a symbol), try to parse a symbol instead. This allows creating a probe such as: p:probe/vlan_gro_receive 8021q:vlan_gro_receive+0 Which is necessary for this command to work: perf probe -m 8021q -a vlan_gro_receive Link: http://lkml.kernel.org/r/fd72d666f45b114e2c5b9cf7e27b91de1ec966f1.1498122881.git.sd@queasysnail.net Cc: stable@vger.kernel.org Fixes: 413d37d1e ("tracing: Add kprobe-based event tracer") Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Need to access netdev->num_rx_queues behind an accessor in netvsc driver otherwise the build breaks with some configs, from Arnd Bergmann. 2) Add dummy xfrm_dev_event() so that build doesn't fail when CONFIG_XFRM_OFFLOAD is not set. From Hangbin Liu. 3) Don't OOPS when pfkey_msg2xfrm_state() signals an erros, from Dan Carpenter. 4) Fix MCDI command size for filter operations in sfc driver, from Martin Habets. 5) Fix UFO segmenting so that we don't calculate incorrect checksums, from Michal Kubecek. 6) When ipv6 datagram connects fail, reset destination address and port. From Wei Wang. 7) TCP disconnect must reset the cached receive DST, from WANG Cong. 8) Fix sign extension bug on 32-bit in dev_get_stats(), from Eric Dumazet. 9) fman driver has to depend on HAS_DMA, from Madalin Bucur. 10) Fix bpf pointer leak with xadd in verifier, from Daniel Borkmann. 11) Fix negative page counts with GFO, from Michal Kubecek. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits) sfc: fix attempt to translate invalid filter ID net: handle NAPI_GRO_FREE_STOLEN_HEAD case also in napi_frags_finish() bpf: prevent leaking pointer via xadd on unpriviledged arcnet: com20020-pci: add missing pdev setup in netdev structure arcnet: com20020-pci: fix dev_id calculation arcnet: com20020: remove needless base_addr assignment Trivial fix to spelling mistake in arc_printk message arcnet: change irq handler to lock irqsave rocker: move dereference before free mlxsw: spectrum_router: Fix NULL pointer dereference net: sched: Fix one possible panic when no destroy callback virtio-net: serialize tx routine during reset net: usb: asix88179_178a: Add support for the Belkin B2B128 fsl/fman: add dependency on HAS_DMA net: prevent sign extension in dev_get_stats() tcp: reset sk_rx_dst in tcp_disconnect() net: ipv6: reset daddr and dport in sk if connect() fails bnx2x: Don't log mc removal needlessly bnxt_en: Fix netpoll handling. bnxt_en: Add missing logic to handle TPA end error conditions. ...
2017-06-29bpf: prevent leaking pointer via xadd on unpriviledgedDaniel Borkmann
Leaking kernel addresses on unpriviledged is generally disallowed, for example, verifier rejects the following: 0: (b7) r0 = 0 1: (18) r2 = 0xffff897e82304400 3: (7b) *(u64 *)(r1 +48) = r2 R2 leaks addr into ctx Doing pointer arithmetic on them is also forbidden, so that they don't turn into unknown value and then get leaked out. However, there's xadd as a special case, where we don't check the src reg for being a pointer register, e.g. the following will pass: 0: (b7) r0 = 0 1: (7b) *(u64 *)(r1 +48) = r0 2: (18) r2 = 0xffff897e82304400 ; map 4: (db) lock *(u64 *)(r1 +48) += r2 5: (95) exit We could store the pointer into skb->cb, loose the type context, and then read it out from there again to leak it eventually out of a map value. Or more easily in a different variant, too: 0: (bf) r6 = r1 1: (7a) *(u64 *)(r10 -8) = 0 2: (bf) r2 = r10 3: (07) r2 += -8 4: (18) r1 = 0x0 6: (85) call bpf_map_lookup_elem#1 7: (15) if r0 == 0x0 goto pc+3 R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R6=ctx R10=fp 8: (b7) r3 = 0 9: (7b) *(u64 *)(r0 +0) = r3 10: (db) lock *(u64 *)(r0 +0) += r6 11: (b7) r0 = 0 12: (95) exit from 7 to 11: R0=inv,min_value=0,max_value=0 R6=ctx R10=fp 11: (b7) r0 = 0 12: (95) exit Prevent this by checking xadd src reg for pointer types. Also add a couple of test cases related to this. Fixes: 1be7f75d1668 ("bpf: enable non-root eBPF programs") Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29ftrace: Fix regression with module command in stack_trace_filterSteven Rostedt (VMware)
When doing the following command: # echo ":mod:kvm_intel" > /sys/kernel/tracing/stack_trace_filter it triggered a crash. This happened with the clean up of probes. It required all callers to the regex function (doing ftrace filtering) to have ops->private be a pointer to a trace_array. But for the stack tracer, that is not the case. Allow for the ops->private to be NULL, and change the function command callbacks to handle the trace_array pointer being NULL as well. Fixes: d2afd57a4b96 ("tracing/ftrace: Allow instances to have their own function probes") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-29sched/numa: Hide numa_wake_affine() from UP buildThomas Gleixner
Stephen reported the following build warning in UP: kernel/sched/fair.c:2657:9: warning: 'struct sched_domain' declared inside parameter list ^ /home/sfr/next/next/kernel/sched/fair.c:2657:9: warning: its scope is only this definition or declaration, which is probably not what you want Hide the numa_wake_affine() inline stub on UP builds to get rid of it. Fixes: 3fed382b46ba ("sched/numa: Implement NUMA node level wake_affine()") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Rik van Riel <riel@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org>
2017-06-29timers: Make the cpu base lock rawSebastian Andrzej Siewior
The timers cpu base lock could not be converted to a raw spinlock becaue the lock held time was non-deterministic due to cascading and long lasting timer wheel traversals. The rework of the timer wheel to the new non-cascading model removed also the wheel traversals and the lock held times are deterministic now. This allows to make the lock raw and thereby unbreaks NOHz* on preempt-RT. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: http://lkml.kernel.org/r/20170627161538.30257-1-bigeasy@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-06-25Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "A few fixes for timekeeping and timers: - Plug a subtle race due to a missing READ_ONCE() in the timekeeping code where reloading of a pointer results in an inconsistent callback argument being supplied to the clocksource->read function. - Correct the CLOCK_MONOTONIC_RAW sub-nanosecond accounting in the time keeping core code, to prevent a possible discontuity. - Apply a similar fix to the arm64 vdso clock_gettime() implementation - Add missing includes to clocksource drivers, which relied on indirect includes which fails in certain configs. - Use the proper iomem pointer for read/iounmap in a probe function" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: arm64/vdso: Fix nsec handling for CLOCK_MONOTONIC_RAW time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting time: Fix clock->read(clock) race around clocksource changes clocksource: Explicitly include linux/clocksource.h when needed clocksource/drivers/arm_arch_timer: Fix read and iounmap of incorrect variable
2017-06-25Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "Three fixlets for perf: - Return the proper error code if aux buffers for a event are not supported. - Calculate the probe offset for inlined functions correctly - Update the Skylake DTLB load/store miss event so it can count 1G TLB entries as well" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf probe: Fix probe definition for inlined functions perf/x86/intel: Add 1G DTLB load/store miss support for SKL perf/aux: Correct return code of rb_alloc_aux() if !has_aux(ev)
2017-06-24Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull timer fix from Eric Biederman: "This fixes an issue of confusing injected signals with the signals from posix timers that has existed since posix timers have been in the kernel. This patch is slightly simpler than my earlier version of this patch as I discovered in testing that I had misspelled "#ifdef CONFIG_POSIX_TIMERS". So I deleted that unnecessary test and made setting of resched_timer uncondtional. I have tested this and verified that without this patch there is a nasty hang that is easy to trigger, and with this patch everything works properly" Thomas Gleixner dixit: "It fixes the problem at hand and covers the ptrace case as well, which I missed. Reviewed-and-tested-by: Thomas Gleixner <tglx@linutronix.de>" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: signal: Only reschedule timers on signals timers have sent
2017-06-24sched/fair: Remove effective_load()Rik van Riel
The effective_load() function was only used by the NUMA balancing code, and not by the regular load balancing code. Now that the NUMA balancing code no longer uses it either, get rid of it. Signed-off-by: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: jhladky@redhat.com Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/20170623165530.22514-5-riel@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-24sched/numa: Implement NUMA node level wake_affine()Rik van Riel
Since select_idle_sibling() can place a task anywhere on a socket, comparing loads between individual CPU cores makes no real sense for deciding whether to do an affine wakeup across sockets, either. Instead, compare the load between the sockets in a similar way the load balancer and the numa balancing code do. Signed-off-by: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: jhladky@redhat.com Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/20170623165530.22514-4-riel@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-24sched/fair: Simplify wake_affine() for the single socket caseRik van Riel
Then 'this_cpu' and 'prev_cpu' are in the same socket, select_idle_sibling() will do its thing regardless of the return value of wake_affine(). Just return true and don't look at all the other things. Signed-off-by: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: jhladky@redhat.com Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/20170623165530.22514-3-riel@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-24sched/numa: Override part of migrate_degrades_locality() when idle balancingRik van Riel
Several tests in the NAS benchmark seem to run a lot slower with NUMA balancing enabled, than with NUMA balancing disabled. The slower run time corresponds with increased idle time. Overriding the final test of migrate_degrades_locality (but still doing the other NUMA tests first) seems to improve performance of those benchmarks. Reported-by: Jirka Hladky <jhladky@redhat.com> Signed-off-by: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/20170623165530.22514-2-riel@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-24Merge branch 'linus' into sched/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-23sched/rt: Move RT related code from sched/core.c to sched/rt.cNicolas Pitre
This helps making sched/core.c smaller and hopefully easier to understand and maintain. Signed-off-by: Nicolas Pitre <nico@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170621182203.30626-3-nicolas.pitre@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-23sched/deadline: Move DL related code from sched/core.c to sched/deadline.cNicolas Pitre
This helps making sched/core.c smaller and hopefully easier to understand and maintain. Signed-off-by: Nicolas Pitre <nico@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170621182203.30626-2-nicolas.pitre@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-23sched/cpuset: Only offer CONFIG_CPUSETS if SMP is enabledNicolas Pitre
Make CONFIG_CPUSETS=y depend on SMP as this feature makes no sense on UP. This allows for configuring out cpuset_cpumask_can_shrink() and task_can_attach() entirely, which shrinks the kernel a bit. Signed-off-by: Nicolas Pitre <nico@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170614171926.8345-2-nicolas.pitre@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-22sched/fair: Spare idle load balancing on nohz_full CPUsFrederic Weisbecker
Although idle load balancing obviously only concerns idle CPUs, it can be a disturbance on a busy nohz_full CPU. Indeed a CPU can only get rid of an idle load balancing duty once a tick fires while it runs a task and this can take a while on a nohz_full CPU. We could fix that and escape the idle load balancing duty from the very idle exit path but that would bring unecessary overhead. Lets just not bother and leave that job to housekeeping CPUs (those outside nohz_full range). The nohz_full CPUs simply don't want any disturbance. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1497838322-10913-4-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-22nohz: Move idle balancer registration to the idle pathFrederic Weisbecker
The idle load balancing registration path assumes that we only stop the tick when the CPU is idle, ignoring the nohz full case. As a result, a nohz full CPU that is running a task may be chosen to perform idle load balancing. Lets make sure that only CPUs in dynticks idle mode can be picked as idle load balancers. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1497838322-10913-3-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-22sched/loadavg: Generalize "_idle" naming to "_nohz"Frederic Weisbecker
The loadavg naming code still assumes that nohz == idle whereas its code is actually handling well both nohz idle and nohz full. So lets fix the naming according to what the code actually does, to unconfuse the reader. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1497838322-10913-2-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-22Merge branch 'linus' into locking/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-21Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching Pull livepatching fix from Jiri Kosina: "Fix the way how livepatches are being stacked with respect to RCU, from Petr Mladek" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching: livepatch: Fix stacking of patches with respect to RCU
2017-06-21perf/aux: Correct return code of rb_alloc_aux() if !has_aux(ev)Hendrik Brueckner
If the event for which an AUX area is about to be allocated, does not support setting up an AUX area, rb_alloc_aux() return -ENOTSUPP. This error condition is being returned unfiltered to the user space, and, for example, the perf tools fails with: failed to mmap with 524 (INTERNAL ERROR: strerror_r(524, 0x3fff497a1c8, 512)=22) This error can be easily seen with "perf record -m 128,256 -e cpu-clock". The 524 error code maps to -ENOTSUPP (in rb_alloc_aux()). The -ENOTSUPP error code shall be only used within the kernel. So the correct error code would then be -EOPNOTSUPP. With this commit, the perf tool then reports: failed to mmap with 95 (Operation not supported) which is more clear. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pu Hou <bjhoupu@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com> Cc: acme@kernel.org Cc: linux-s390@vger.kernel.org Link: http://lkml.kernel.org/r/1497954399-6355-1-git-send-email-brueckner@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-21Merge branch 'fortglx/4.13/time' of ↵Thomas Gleixner
https://git.linaro.org/people/john.stultz/linux into timers/core Merge time(keeping) updates from John Stultz: "Just a small set of changes, the biggest changes being the MONOTONIC_RAW handling cleanup, and a new kselftest from Miroslav. Also a a clear warning deprecating CONFIG_GENERIC_TIME_VSYSCALL_OLD, which affects ppc and ia64."
2017-06-21Merge branch 'timers/urgent' into timers/coreThomas Gleixner
Pick up dependent changes.
2017-06-20time: Add warning about imminent deprecation of CONFIG_GENERIC_TIME_VSYSCALL_OLDJohn Stultz
CONFIG_GENERIC_TIME_VSYSCALL_OLD was introduced five years ago to allow a transition from the old vsyscall implementations to the new method (which simplified internal accounting and made timekeeping more precise). However, PPC and IA64 have yet to make the transition, despite in some cases me sending test patches to try to help it along. http://patches.linaro.org/patch/30501/ http://patches.linaro.org/patch/35412/ If its helpful, my last pass at the patches can be found here: https://git.linaro.org/people/john.stultz/linux.git dev/oldvsyscall-cleanup So I think its time to set a deadline and make it clear this is going away. So this patch adds warnings about this functionality being dropped. Likely to be in v4.15. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Miroslav Lichvar <mlichvar@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-06-20time: Clean up CLOCK_MONOTONIC_RAW time handlingJohn Stultz
Now that we fixed the sub-ns handling for CLOCK_MONOTONIC_RAW, remove the duplicitive tk->raw_time.tv_nsec, which can be stored in tk->tkr_raw.xtime_nsec (similarly to how its handled for monotonic time). Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Miroslav Lichvar <mlichvar@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Stephen Boyd <stephen.boyd@linaro.org> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Daniel Mentz <danielmentz@google.com> Tested-by: Daniel Mentz <danielmentz@google.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-06-20posix-cpu-timers: Make timespec to nsec conversion safeThomas Gleixner
The expiry time of a posix cpu timer is supplied through sys_timer_set() via a struct timespec. The timespec is validated for correctness. In the actual set timer implementation the timespec is converted to a scalar nanoseconds value. If the tv_sec part of the time spec is large enough the conversion to nanoseconds (sec * NSEC_PER_SEC) overflows 64bit. Mitigate that by using the timespec_to_ktime() conversion function, which checks the tv_sec part for a potential mult overflow and clamps the result to KTIME_MAX, which is about 292 years. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xishi Qiu <qiuxishi@huawei.com> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170620154113.588276707@linutronix.de
2017-06-20itimer: Make timeval to nsec conversion range limitedThomas Gleixner
The expiry time of a itimer is supplied through sys_setitimer() via a struct timeval. The timeval is validated for correctness. In the actual set timer implementation the timeval is converted to a scalar nanoseconds value. If the tv_sec part of the time spec is large enough the conversion to nanoseconds (sec * NSEC_PER_SEC) overflows 64bit. Mitigate that by using the timeval_to_ktime() conversion function, which checks the tv_sec part for a potential mult overflow and clamps the result to KTIME_MAX, which is about 292 years. Reported-by: Xishi Qiu <qiuxishi@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170620154113.505981643@linutronix.de
2017-06-20timers: Fix parameter description of try_to_del_timer_sync()Peter Meerwald-Stadler
Signed-off-by: Peter Meerwald-Stadler <pmeerw@pmeerw.net> Link: http://lkml.kernel.org/r/20170530194103.7454-1-pmeerw@pmeerw.net Cc: John Stultz <john.stultz@linaro.org> Cc: trivial@rustcorp.com.au Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-06-20sched/core: Drop the unused try_get_task_struct() helper functionDavidlohr Bueso
This function was introduced by: 150593bf8693 ("sched/api: Introduce task_rcu_dereference() and try_get_task_struct()") ... to allow easier usage of task_rcu_dereference(), however no users were ever added. Drop the helper. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dave@stgolabs.net Link: http://lkml.kernel.org/r/20170615023730.22827-1-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20Merge branch 'WIP.sched/core' into sched/coreIngo Molnar
Conflicts: kernel/sched/Makefile Pick up the waitqueue related renames - it didn't get much feedback, so it appears to be uncontroversial. Famous last words? ;-) Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20sched/fair: WARN() and refuse to set buddy when !se->on_rqDaniel Axtens
If we set a next or last buddy for a se that is not on_rq, we will end up taking a NULL pointer dereference in wakeup_preempt_entity via pick_next_task_fair. Detect when we would be about to do that, throw a warning and then refuse to actually set it. This has been suggested at least twice: https://marc.info/?l=linux-kernel&m=146651668921468&w=2 https://lkml.org/lkml/2016/6/16/663 I recently had to debug a problem with these (we hadn't backported Konstantin's patches in this area) and this would have saved a lot of time/pain. Just do it. Signed-off-by: Daniel Axtens <dja@axtens.net> Cc: Ben Segall <bsegall@google.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170510201139.16236-1-dja@axtens.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20sched/debug: Fix SCHED_WARN_ON() to return a value on !CONFIG_SCHED_DEBUG as ↵Ingo Molnar
well This definition of SCHED_WARN_ON(): #define SCHED_WARN_ON(x) ((void)(x)) is not fully compatible with the 'real' WARN_ON_ONCE() primitive, as it has no return value, so it cannot be used in conditionals. Fix it. Cc: Daniel Axtens <dja@axtens.net> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20sched/wait: Disambiguate wq_entry->task_list and wq_head->task_list namingIngo Molnar
So I've noticed a number of instances where it was not obvious from the code whether ->task_list was for a wait-queue head or a wait-queue entry. Furthermore, there's a number of wait-queue users where the lists are not for 'tasks' but other entities (poll tables, etc.), in which case the 'task_list' name is actively confusing. To clear this all up, name the wait-queue head and entry list structure fields unambiguously: struct wait_queue_head::task_list => ::head struct wait_queue_entry::task_list => ::entry For example, this code: rqw->wait.task_list.next != &wait->task_list ... is was pretty unclear (to me) what it's doing, while now it's written this way: rqw->wait.head.next != &wait->entry ... which makes it pretty clear that we are iterating a list until we see the head. Other examples are: list_for_each_entry_safe(pos, next, &x->task_list, task_list) { list_for_each_entry(wq, &fence->wait.task_list, task_list) { ... where it's unclear (to me) what we are iterating, and during review it's hard to tell whether it's trying to walk a wait-queue entry (which would be a bug), while now it's written as: list_for_each_entry_safe(pos, next, &x->head, entry) { list_for_each_entry(wq, &fence->wait.head, entry) { Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20sched/wait: Move bit_wait_table[] and related functionality from ↵Ingo Molnar
sched/core.c to sched/wait_bit.c The key hashed waitqueue data structures and their initialization was done in the main scheduler file for no good reason, move them to sched/wait_bit.c instead. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>