summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2018-01-22Merge branch 'for-4.16-console-waiter-logic' into for-4.16Petr Mladek
2018-01-22Merge branch 'for-4.16-print-symbol' into for-4.16Petr Mladek
2018-01-22Merge branch 'for-4.16-deprecate-printk-pf' into for-4.16Petr Mladek
2018-01-22printk: drop redundant devkmsg_log_str memsetsSergey Senozhatsky
We copy in null terminated strings "on" and "off", no need to zero out devkmsg_log_str in control_devkmsg(). Link: http://lkml.kernel.org/r/20180119043901.1728-1-sergey.senozhatsky@gmail.com Cc: linux-kernel@vger.kernel.org Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-01-21Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Thomas Gleixner: "A single fix for the new matrix allocator to prevent vector exhaustion by certain network drivers which allocate gazillions of unused vectors which cannot be put into reservation mode due to MSI and the lack of MSI entry masking. The fix/workaround is to spread the vectors across CPUs by searching the supplied target CPU mask for the CPU with the smallest number of allocated vectors" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irq/matrix: Spread interrupts on allocation
2018-01-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Alexei Starovoitov says: ==================== pull-request: bpf-next 2018-01-19 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) bpf array map HW offload, from Jakub. 2) support for bpf_get_next_key() for LPM map, from Yonghong. 3) test_verifier now runs loaded programs, from Alexei. 4) xdp cpumap monitoring, from Jesper. 5) variety of tests, cleanups and small x64 JIT optimization, from Daniel. 6) user space can now retrieve HW JITed program, from Jiong. Note there is a minor conflict between Russell's arm32 JIT fixes and removal of bpf_jit_enable variable by Daniel which should be resolved by keeping Russell's comment and removing that variable. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
The BPF verifier conflict was some minor contextual issue. The TUN conflict was less trivial. Cong Wang fixed a memory leak of tfile->tx_array in 'net'. This is an skb_array. But meanwhile in net-next tun changed tfile->tx_arry into tfile->tx_ring which is a ptr_ring. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19bpf: add upper complexity limit to verifier logDaniel Borkmann
Given the limit could potentially get further adjustments in the future, add it to the log so it becomes obvious what the current limit is w/o having to check the source first. This may also be helpful for debugging complexity related issues on kernels that backport from upstream. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-19bpf: get rid of pure_initcall dependency to enable jitsDaniel Borkmann
Having a pure_initcall() callback just to permanently enable BPF JITs under CONFIG_BPF_JIT_ALWAYS_ON is unnecessary and could leave a small race window in future where JIT is still disabled on boot. Since we know about the setting at compilation time anyway, just initialize it properly there. Also consolidate all the individual bpf_jit_enable variables into a single one and move them under one location. Moreover, don't allow for setting unspecified garbage values on them. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-19bpf, verifier: detect misconfigured mem, size argument pairDaniel Borkmann
I've seen two patch proposals now for helper additions that used ARG_PTR_TO_MEM or similar in reg_X but no corresponding ARG_CONST_SIZE in reg_X+1. Verifier won't complain in such case, but it will omit verifying the memory passed to the helper thus ending up badly. Detect such buggy helper function signature and bail out during verification rather than finding them through review. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-19mm: Fix devm_memremap_pages() collision handlingJan H. Schönherr
If devm_memremap_pages() detects a collision while adding entries to the radix-tree, we call pgmap_radix_release(). Unfortunately, the function removes *all* entries for the range -- including the entries that caused the collision in the first place. Modify pgmap_radix_release() to take an additional argument to indicate where to stop, so that only newly added entries are removed from the tree. Cc: <stable@vger.kernel.org> Fixes: 9476df7d80df ("mm: introduce find_dev_pagemap()") Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-01-19mm: Fix memory size alignment in devm_memremap_pages_release()Jan H. Schönherr
The functions devm_memremap_pages() and devm_memremap_pages_release() use different ways to calculate the section-aligned amount of memory. The latter function may use an incorrect size if the memory region is small but straddles a section border. Use the same code for both. Cc: <stable@vger.kernel.org> Fixes: 5f29a77cd957 ("mm: fix mixed zone detection in devm_memremap_pages") Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-01-19bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE mapYonghong Song
Current LPM_TRIE map type does not implement MAP_GET_NEXT_KEY command. This command is handy when users want to enumerate keys. Otherwise, a different map which supports key enumeration may be required to store the keys. If the map data is sparse and all map data are to be deleted without closing file descriptor, using MAP_GET_NEXT_KEY to find all keys is much faster than enumerating all key space. This patch implements MAP_GET_NEXT_KEY command for LPM_TRIE map. If user provided key pointer is NULL or the key does not have an exact match in the trie, the first key will be returned. Otherwise, the next key will be returned. In this implemenation, key enumeration follows a postorder traversal of internal trie. More specific keys will be returned first than less specific ones, given a sequence of MAP_GET_NEXT_KEY syscalls. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-19Merge tag 'trace-v4.15-rc4-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Two more small fixes - The conversion of enums into their actual numbers to display in the event format file had an off-by-one bug, that could cause an enum not to be converted, and break user space parsing tools. - A fix to a previous fix to bring back the context recursion checks. The interrupt case checks for NMI, IRQ and softirq, but the softirq returned the same number regardless if it was set or not, although the logic would force it to be set if it were hit" * tag 'trace-v4.15-rc4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix converting enum's from the map in trace_event_eval_update() ring-buffer: Fix duplicate results in mapping context to bits in recursive lock
2018-01-19Merge branch 'for-4.15-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fix from Tejun Heo: "cgroup.threads should be delegatable (ie. a container should be able to write to it from inside) but was missing the flag. The change is very low risk" * 'for-4.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: make cgroup.threads delegatable
2018-01-19Merge branch 'for-4.15-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue fixlet from Tejun Heo: "One patch to add touch_nmi_watchdog() while dumping workqueue debug messages to avoid triggering the lockup detector spuriously. The change is very low risk" * 'for-4.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: avoid hard lockups in show_workqueue_state()
2018-01-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix BPF divides by zero, from Eric Dumazet and Alexei Starovoitov. 2) Reject stores into bpf context via st and xadd, from Daniel Borkmann. 3) Fix a memory leak in TUN, from Cong Wang. 4) Disable RX aggregation on a specific troublesome configuration of r8152 in a Dell TB16b dock. 5) Fix sw_ctx leak in tls, from Sabrina Dubroca. 6) Fix program replacement in cls_bpf, from Daniel Borkmann. 7) Fix uninitialized station_info structures in cfg80211, from Johannes Berg. 8) Fix miscalculation of transport header offset field in flow dissector, from Eric Dumazet. 9) Fix LPM tree leak on failure in mlxsw driver, from Ido Schimmel. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (29 commits) ibmvnic: Fix IPv6 packet descriptors ibmvnic: Fix IP offload control buffer ipv6: don't let tb6_root node share routes with other node ip6_gre: init dev->mtu and dev->hard_header_len correctly mlxsw: spectrum_router: Free LPM tree upon failure flow_dissector: properly cap thoff field fm10k: mark PM functions as __maybe_unused cfg80211: fix station info handling bugs netlink: reset extack earlier in netlink_rcv_skb can: af_can: canfd_rcv(): replace WARN_ONCE by pr_warn_once can: af_can: can_rcv(): replace WARN_ONCE by pr_warn_once bpf: mark dst unknown on inconsistent {s, u}bounds adjustments bpf: fix cls_bpf on filter replace Net: ethernet: ti: netcp: Fix inbound ping crash if MTU size is greater than 1500 tls: reset crypto_info when do_tls_setsockopt_tx fails tls: return -EBUSY if crypto_info is already set tls: fix sw_ctx leak net/tls: Only attach to sockets in ESTABLISHED state net: fs_enet: do not call phy_stop() in interrupts r8152: disable RX aggregation on Dell TB16 dock ...
2018-01-19string: drop __must_check from strscpy() and restore strscpy() usages in cgroupTejun Heo
e7fd37ba1217 ("cgroup: avoid copying strings longer than the buffers") converted possibly unsafe strncpy() usages in cgroup to strscpy(). However, although the callsites are completely fine with truncated copied, because strscpy() is marked __must_check, it led to the following warnings. kernel/cgroup/cgroup.c: In function ‘cgroup_file_name’: kernel/cgroup/cgroup.c:1400:10: warning: ignoring return value of ‘strscpy’, declared with attribute warn_unused_result [-Wunused-result] strscpy(buf, cft->name, CGROUP_FILE_NAME_MAX); ^ To avoid the warnings, 50034ed49645 ("cgroup: use strlcpy() instead of strscpy() to avoid spurious warning") switched them to strlcpy(). strlcpy() is worse than strlcpy() because it unconditionally runs strlen() on the source string, and the only reason we switched to strlcpy() here was because it was lacking __must_check, which doesn't reflect any material differences between the two function. It's just that someone added __must_check to strscpy() and not to strlcpy(). These basic string copy operations are used in variety of ways, and one of not-so-uncommon use cases is safely handling truncated copies, where the caller naturally doesn't care about the return value. The __must_check doesn't match the actual use cases and forces users to opt for inferior variants which lack __must_check by happenstance or spread ugly (void) casts. Remove __must_check from strscpy() and restore strscpy() usages in cgroup. Signed-off-by: Tejun Heo <tj@kernel.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Chris Metcalf <cmetcalf@ezchip.com>
2018-01-18bpf: offload: report device information about offloaded mapsJakub Kicinski
Tell user space about device on which the map was created. Unfortunate reality of user ABI makes sharing this code with program offload difficult but the information is the same. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-18bpf: offload: allow array map offloadJakub Kicinski
The special handling of different map types is left to the driver. Allow offload of array maps by simply adding it to accepted types. For nfp we have to make sure array elements are not deleted. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-18bpf: arraymap: use bpf_map_init_from_attr()Jakub Kicinski
Arraymap was not converted to use bpf_map_init_from_attr() to avoid merge conflicts with emergency fixes. Do it now. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-18bpf: arraymap: move checks out of alloc functionJakub Kicinski
Use the new callback to perform allocation checks for array maps. The fd maps don't need a special allocation callback, they only need a special check callback. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-18bpf: allow socket_filter programs to use bpf_prog_test_runAlexei Starovoitov
in order to improve test coverage allow socket_filter program type to be run via bpf_prog_test_run command. Since such programs can be loaded by non-root tighten permissions for bpf_prog_test_run to be root only to avoid surprises. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-18tracing: Fix converting enum's from the map in trace_event_eval_update()Steven Rostedt (VMware)
Since enums do not get converted by the TRACE_EVENT macro into their values, the event format displaces the enum name and not the value. This breaks tools like perf and trace-cmd that need to interpret the raw binary data. To solve this, an enum map was created to convert these enums into their actual numbers on boot up. This is done by TRACE_EVENTS() adding a TRACE_DEFINE_ENUM() macro. Some enums were not being converted. This was caused by an optization that had a bug in it. All calls get checked against this enum map to see if it should be converted or not, and it compares the call's system to the system that the enum map was created under. If they match, then they call is processed. To cut down on the number of iterations needed to find the maps with a matching system, since calls and maps are grouped by system, when a match is made, the index into the map array is saved, so that the next call, if it belongs to the same system as the previous call, could start right at that array index and not have to scan all the previous arrays. The problem was, the saved index was used as the variable to know if this is a call in a new system or not. If the index was zero, it was assumed that the call is in a new system and would keep incrementing the saved index until it found a matching system. The issue arises when the first matching system was at index zero. The next map, if it belonged to the same system, would then think it was the first match and increment the index to one. If the next call belong to the same system, it would begin its search of the maps off by one, and miss the first enum that should be converted. This left a single enum not converted properly. Also add a comment to describe exactly what that index was for. It took me a bit too long to figure out what I was thinking when debugging this issue. Link: http://lkml.kernel.org/r/717BE572-2070-4C1E-9902-9F2E0FEDA4F8@oracle.com Cc: stable@vger.kernel.org Fixes: 0c564a538aa93 ("tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values") Reported-by: Chuck Lever <chuck.lever@oracle.com> Teste-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-01-18ring-buffer: Fix duplicate results in mapping context to bits in recursive lockSteven Rostedt (VMware)
In bringing back the context checks, the code checks first if its normal (non-interrupt) context, and then for NMI then IRQ then softirq. The final check is redundant. Since the if branch is only hit if the context is one of NMI, IRQ, or SOFTIRQ, if it's not NMI or IRQ there's no reason to check if it is SOFTIRQ. The current code returns the same result even if its not a SOFTIRQ. Which is confusing. pc & SOFTIRQ_OFFSET ? 2 : RB_CTX_SOFTIRQ Is redundant as RB_CTX_SOFTIRQ *is* 2! Fixes: a0e3a18f4baf ("ring-buffer: Bring back context level recursive checks") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-01-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2018-01-18 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix a divide by zero due to wrong if (src_reg == 0) check in 64-bit mode. Properly handle this in interpreter and mask it also generically in verifier to guard against similar checks in JITs, from Eric and Alexei. 2) Fix a bug in arm64 JIT when tail calls are involved and progs have different stack sizes, from Daniel. 3) Reject stores into BPF context that are not expected BPF_STX | BPF_MEM variant, from Daniel. 4) Mark dst reg as unknown on {s,u}bounds adjustments when the src reg has derived bounds from dead branches, from Daniel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18lockdep: Make lockdep checking constantMatthew Wilcox
There are several places in the kernel which would like to pass a const pointer to lockdep_is_held(). Constify the entire path so nobody has to trick the compiler. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Link: https://lkml.kernel.org/r/20180117151414.23686-3-willy@infradead.org
2018-01-18lockdep: Assign lock keys on registrationMatthew Wilcox
Lockdep is assigning lock keys when a lock was looked up. This is unnecessary; if the lock has never been registered then it is known that it is not locked. It also complicates the calling convention. Switch to assigning the lock key in register_lock_class(). Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Link: https://lkml.kernel.org/r/20180117151414.23686-2-willy@infradead.org
2018-01-18irq/matrix: Spread interrupts on allocationThomas Gleixner
Keith reported an issue with vector space exhaustion on a server machine which is caused by the i40e driver allocating 168 MSI interrupts when the driver is initialized, even when most of these interrupts are not used at all. The x86 vector allocation code tries to avoid the immediate allocation with the reservation mode, but the card uses MSI and does not support MSI entry masking, which prevents reservation mode and requires immediate vector allocation. The matrix allocator is a bit naive and prefers the first CPU in the cpumask which describes the possible target CPUs for an allocation. That results in allocating all 168 vectors on CPU0 which later causes vector space exhaustion when the NVMe driver tries to allocate managed interrupts on each CPU for the per CPU queues. Avoid this by finding the CPU which has the lowest vector allocation count to spread out the non managed interrupt accross the possible target CPUs. Fixes: 2f75d9e1c905 ("genirq: Implement bitmap matrix allocator") Reported-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Keith Busch <keith.busch@intel.com> Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801171557330.1777@nanos
2018-01-18Merge branches 'acpi-pm' and 'pm-sleep'Rafael J. Wysocki
* acpi-pm: platform/x86: surfacepro3: Support for wakeup from suspend-to-idle ACPI / PM: Use Low Power S0 Idle on more systems ACPI / PM: Make it possible to ignore the system sleep blacklist * pm-sleep: PM / hibernate: Drop unused parameter of enough_swap block, scsi: Fix race between SPI domain validation and system suspend PM / sleep: Make lock/unlock_system_sleep() available to kernel modules PM: hibernate: Do not subtract NR_FILE_MAPPED in minimum_image_size()
2018-01-18Merge branches 'pm-domains', 'pm-kconfig', 'pm-cpuidle' and 'powercap'Rafael J. Wysocki
* pm-domains: PM / genpd: Stop/start devices without pm_runtime_force_suspend/resume() PM / domains: Don't skip driver's ->suspend|resume_noirq() callbacks PM / Domains: Remove obsolete "samsung,power-domain" check * pm-kconfig: bus: simple-pm-bus: convert bool SIMPLE_PM_BUS to tristate PM: Provide a config snippet for disabling PM * pm-cpuidle: cpuidle: Avoid NULL argument in cpuidle_switch_governor() * powercap: powercap: intel_rapl: Fix trailing semicolon powercap: add suspend and resume mechanism for SOC power limit powercap: Simplify powercap_init()
2018-01-18bpf: change fake_ip for bpf_trace_printk helperYonghong Song
Currently, for bpf_trace_printk helper, fake ip address 0x1 is used with comments saying that fake ip will not be printed. This is indeed true for 4.12 and earlier version, but for 4.13 and later version, the ip address will be printed if it cannot be resolved with kallsym. Running samples/bpf/tracex5 program and you will have the following in the debugfs trace_pipe output: ... <...>-1819 [003] .... 443.497877: 0x00000001: mmap <...>-1819 [003] .... 443.498289: 0x00000001: syscall=102 (one of get/set uid/pid/gid) ... The kernel commit changed this behavior is: commit feaf1283d11794b9d518fcfd54b6bf8bee1f0b4b Author: Steven Rostedt (VMware) <rostedt@goodmis.org> Date: Thu Jun 22 17:04:55 2017 -0400 tracing: Show address when function names are not found ... This patch changed the comment and also altered the fake ip address to 0x0 as users may think 0x1 has some special meaning while it doesn't. The new output: ... <...>-1799 [002] .... 25.953576: 0: mmap <...>-1799 [002] .... 25.953865: 0: read(fd=0, buf=00000000053936b5, size=512) ... Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-18bpf: add new jited info fields in bpf_dev_offload and bpf_prog_infoJiong Wang
For host JIT, there are "jited_len"/"bpf_func" fields in struct bpf_prog used by all host JIT targets to get jited image and it's length. While for offload, targets are likely to have different offload mechanisms that these info are kept in device private data fields. Therefore, BPF_OBJ_GET_INFO_BY_FD syscall needs an unified way to get JIT length and contents info for offload targets. One way is to introduce new callback to parse device private data then fill those fields in bpf_prog_info. This might be a little heavy, the other way is to add generic fields which will be initialized by all offload targets. This patch follow the second approach to introduce two new fields in struct bpf_dev_offload and teach bpf_prog_get_info_by_fd about them to fill correct jited_prog_len and jited_prog_insns in bpf_prog_info. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-17bpf: mark dst unknown on inconsistent {s, u}bounds adjustmentsDaniel Borkmann
syzkaller generated a BPF proglet and triggered a warning with the following: 0: (b7) r0 = 0 1: (d5) if r0 s<= 0x0 goto pc+0 R0=inv0 R1=ctx(id=0,off=0,imm=0) R10=fp0 2: (1f) r0 -= r1 R0=inv0 R1=ctx(id=0,off=0,imm=0) R10=fp0 verifier internal error: known but bad sbounds What happens is that in the first insn, r0's min/max value are both 0 due to the immediate assignment, later in the jsle test the bounds are updated for the min value in the false path, meaning, they yield smin_val = 1, smax_val = 0, and when ctx pointer is subtracted from r0, verifier bails out with the internal error and throwing a WARN since smin_val != smax_val for the known constant. For min_val > max_val scenario it means that reg_set_min_max() and reg_set_min_max_inv() (which both refine existing bounds) demonstrated that such branch cannot be taken at runtime. In above scenario for the case where it will be taken, the existing [0, 0] bounds are kept intact. Meaning, the rejection is not due to a verifier internal error, and therefore the WARN() is not necessary either. We could just reject such cases in adjust_{ptr,scalar}_min_max_vals() when either known scalars have smin_val != smax_val or umin_val != umax_val or any scalar reg with bounds smin_val > smax_val or umin_val > umax_val. However, there may be a small risk of breakage of buggy programs, so handle this more gracefully and in adjust_{ptr,scalar}_min_max_vals() just taint the dst reg as unknown scalar when we see ops with such kind of src reg. Reported-by: syzbot+6d362cadd45dc0a12ba4@syzkaller.appspotmail.com Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-17Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Ingo Molnar: "A delayacct statistics correctness fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: delayacct: Account blkio completion on the correct task
2018-01-17Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "Two futex fixes: a input parameters robustness fix, and futex race fixes" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Prevent overflow by strengthen input validation futex: Avoid violating the 10th rule of futex
2018-01-17Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "A one-liner fix which prevents deferrable timers becoming stale when the system does not switch into NOHZ mode" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timers: Unconditionally check deferrable base
2018-01-17Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-17PM / hibernate: Drop unused parameter of enough_swapKyungsik Lee
Parameter flags is no longer used, remove it. Signed-off-by: Kyungsik Lee <kyungsik.lee@lge.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-01-17Expand INIT_STRUCT_PID and removeDavid Howells
Expand INIT_STRUCT_PID in the single place that uses it and then remove it. There doesn't seem any point in the macro. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Tony Luck <tony.luck@intel.com> Tested-by: Will Deacon <will.deacon@arm.com> (arm64) Tested-by: Palmer Dabbelt <palmer@sifive.com> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2018-01-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Overlapping changes all over. The mini-qdisc bits were a little bit tricky, however. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17bpf: annotate bpf_insn_print_t with __printfJakub Kicinski
Functions of type bpf_insn_print_t take printf-like format string, mark the type accordingly. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-17bpf: offload: make bpf_offload_dev_match() reject host+host caseJakub Kicinski
Daniel suggests it would be more logical for bpf_offload_dev_match() to return false is either the program or the map are not offloaded, rather than treating the both not offloaded case as a "matching CPU/host device". This makes no functional difference today, since verifier only calls bpf_offload_dev_match() when one of the objects is offloaded. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-17bpf: cpumap: make some functions staticWei Yongjun
Fixes the following sparse warnings: kernel/bpf/cpumap.c:146:6: warning: symbol '__cpu_map_queue_destructor' was not declared. Should it be static? kernel/bpf/cpumap.c:225:16: warning: symbol 'cpu_map_build_skb' was not declared. Should it be static? kernel/bpf/cpumap.c:340:26: warning: symbol '__cpu_map_entry_alloc' was not declared. Should it be static? kernel/bpf/cpumap.c:398:6: warning: symbol '__cpu_map_entry_free' was not declared. Should it be static? kernel/bpf/cpumap.c:441:6: warning: symbol '__cpu_map_entry_replace' was not declared. Should it be static? kernel/bpf/cpumap.c:454:5: warning: symbol 'cpu_map_delete_elem' was not declared. Should it be static? kernel/bpf/cpumap.c:467:5: warning: symbol 'cpu_map_update_elem' was not declared. Should it be static? kernel/bpf/cpumap.c:505:6: warning: symbol 'cpu_map_free' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-16bpf: reject stores into ctx via st and xaddDaniel Borkmann
Alexei found that verifier does not reject stores into context via BPF_ST instead of BPF_STX. And while looking at it, we also should not allow XADD variant of BPF_STX. The context rewriter is only assuming either BPF_LDX_MEM- or BPF_STX_MEM-type operations, thus reject anything other than that so that assumptions in the rewriter properly hold. Add test cases as well for BPF selftests. Fixes: d691f9e8d440 ("bpf: allow programs to write to certain skb fields") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Two read past end of buffer fixes in AF_KEY, from Eric Biggers. 2) Memory leak in key_notify_policy(), from Steffen Klassert. 3) Fix overflow with bpf arrays, from Daniel Borkmann. 4) Fix RDMA regression with mlx5 due to mlx5 no longer using pci_irq_get_affinity(), from Saeed Mahameed. 5) Missing RCU read locking in nl80211_send_iface() when it calls ieee80211_bss_get_ie(), from Dominik Brodowski. 6) cfg80211 should check dev_set_name()'s return value, from Johannes Berg. 7) Missing module license tag in 9p protocol, from Stephen Hemminger. 8) Fix crash due to too small MTU in udp ipv6 sendmsg, from Mike Maloney. 9) Fix endless loop in netlink extack code, from David Ahern. 10) TLS socket layer sets inverted error codes, resulting in an endless loop. From Robert Hering. 11) Revert openvswitch erspan tunnel support, it's mis-designed and we need to kill it before it goes into a real release. From William Tu. 12) Fix lan78xx failures in full speed USB mode, from Yuiko Oshino. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (54 commits) net, sched: fix panic when updating miniq {b,q}stats qed: Fix potential use-after-free in qed_spq_post() nfp: use the correct index for link speed table lan78xx: Fix failure in USB Full Speed sctp: do not allow the v4 socket to bind a v4mapped v6 address sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf sctp: reinit stream if stream outcnt has been change by sinit in sendmsg ibmvnic: Fix pending MAC address changes netlink: extack: avoid parenthesized string constant warning ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY net: Allow neigh contructor functions ability to modify the primary_key sh_eth: fix dumping ARSTR Revert "openvswitch: Add erspan tunnel support." net/tls: Fix inverted error codes to avoid endless loop ipv6: ip6_make_skb() needs to clear cork.base.dst sctp: avoid compiler warning on implicit fallthru net: ipv4: Make "ip route get" match iif lo rules again. netlink: extack needs to be reset each time through loop tipc: fix a memory leak in tipc_nl_node_get_link() ipv6: fix udpv6 sendmsg crash caused by too small MTU ...
2018-01-16Merge tag 'trace-v4.15-rc4-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: - Bring back context level recursive protection in ring buffer. The simpler counter protection failed, due to a path when tracing with trace_clock_global() as it could not be reentrant and depended on the ring buffer recursive protection to keep that from happening. - Prevent branch profiling when FORTIFY_SOURCE is enabled. It causes 50 - 60 MB in warning messages. Branch profiling should never be run on production systems, so there's no reason that it needs to be enabled with FORTIFY_SOURCE. * tag 'trace-v4.15-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Prevent PROFILE_ALL_BRANCHES when FORTIFY_SOURCE=y ring-buffer: Bring back context level recursive checks
2018-01-16ptrace: Use copy_siginfo in setsiginfo and getsiginfoEric W. Biederman
Now that copy_siginfo copies all of the fields this is safe, safer (as all of the bits are guaranteed to be copied), clearer, and less error prone than using a structure copy. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-01-16printk: Never set console_may_schedule in console_trylock()Sergey Senozhatsky
This patch, basically, reverts commit 6b97a20d3a79 ("printk: set may_schedule for some of console_trylock() callers"). That commit was a mistake, it introduced a big dependency on the scheduler, by enabling preemption under console_sem in printk()->console_unlock() path, which is rather too critical. The patch did not significantly reduce the possibilities of printk() lockups, but made it possible to stall printk(), as has been reported by Tetsuo Handa [1]. Another issues is that preemption under console_sem also messes up with Steven Rostedt's hand off scheme, by making it possible to sleep with console_sem both in console_unlock() and in vprintk_emit(), after acquiring the console_sem ownership (anywhere between printk_safe_exit_irqrestore() in console_trylock_spinning() and printk_safe_enter_irqsave() in console_unlock()). This makes hand off less likely and, at the same time, may result in a significant amount of pending logbuf messages. Preempted console_sem owner makes it impossible for other CPUs to emit logbuf messages, but does not make it impossible for other CPUs to append new messages to the logbuf. Reinstate the old behavior and make printk() non-preemptible. Should any printk() lockup reports arrive they must be handled in a different way. [1] http://lkml.kernel.org/r/201603022101.CAH73907.OVOOMFHFFtQJSL%20()%20I-love%20!%20SAKURA%20!%20ne%20!%20jp Fixes: 6b97a20d3a79 ("printk: set may_schedule for some of console_trylock() callers") Link: http://lkml.kernel.org/r/20180116044716.GE6607@jagdpanzerIV To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: akpm@linux-foundation.org Cc: linux-mm@kvack.org Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jan Kara <jack@suse.cz> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: linux-kernel@vger.kernel.org Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-01-16printk: Hide console waiter logic into helpersPetr Mladek
The commit ("printk: Add console owner and waiter logic to load balance console writes") made vprintk_emit() and console_unlock() even more complicated. This patch extracts the new code into 3 helper functions. They should help to keep it rather self-contained. It will be easier to use and maintain. This patch just shuffles the existing code. It does not change the functionality. Link: http://lkml.kernel.org/r/20180112160837.GD24497@linux.suse Cc: akpm@linux-foundation.org Cc: linux-mm@kvack.org Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jan Kara <jack@suse.cz> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: rostedt@home.goodmis.org Cc: Byungchul Park <byungchul.park@lge.com> Cc: Tejun Heo <tj@kernel.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: linux-kernel@vger.kernel.org Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com>