summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-05-10x86/KASLR: Return earliest overlap when avoiding regionsKees Cook
In preparation for being able to detect where to split up contiguous memory regions that overlap with memory regions to avoid, we need to pass back what the earliest overlapping region was. This modifies the overlap checker to return that information. Based on a separate mem_min_overlap() implementation by Baoquan He. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Young <dyoung@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: kernel-hardening@lists.openwall.com Cc: lasse.collin@tukaani.org Link: http://lkml.kernel.org/r/1462825332-10505-5-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-10x86/KASLR: Add 'struct slot_area' to manage random_addr slotsBaoquan He
In order to support KASLR moving the kernel anywhere in physical memory (which could be up to 64TB), we need to handle counting the potential randomization locations in a more efficient manner. In the worst case with 64TB, there could be roughly 32 * 1024 * 1024 randomization slots if CONFIG_PHYSICAL_ALIGN is 0x1000000. Currently the starting address of candidate positions is stored into the slots[] array, one at a time. This method would cost too much memory and it's also very inefficient to get and save the slot information into the slot array one by one. This patch introduces 'struct slot_area' to manage each contiguous region of randomization slots. Each slot_area will contain the starting address and how many available slots are in this area. As with the original code, the slot_areas[] will avoid the mem_avoid[] regions. Since setup_data is a linked list, it could contain an unknown number of memory regions to be avoided, which could cause us to fragment the contiguous memory that the slot_area array is tracking. In normal operation this level of fragmentation will be extremely rare, but we choose a suitably large value (100) for the array. If setup_data forces the slot_area array to become highly fragmented and there are more slots available beyond the first 100 found, the rest will be ignored for KASLR selection. The function store_slot_info() is used to calculate the number of slots available in the passed-in memory region and stores it into slot_areas[] after adjusting for alignment and size requirements. Signed-off-by: Baoquan He <bhe@redhat.com> [ Rewrote changelog, squashed with new functions. ] Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Young <dyoung@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: kernel-hardening@lists.openwall.com Cc: lasse.collin@tukaani.org Link: http://lkml.kernel.org/r/1462825332-10505-4-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-10x86/boot: Add missing file header commentsKees Cook
There were some files with missing header comments. Since they are included from both compressed and regular kernels, make note of that. Also corrects a typo in the mem_avoid comments. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Young <dyoung@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: kernel-hardening@lists.openwall.com Cc: lasse.collin@tukaani.org Link: http://lkml.kernel.org/r/1462825332-10505-3-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-10x86/KASLR: Initialize mapping_info every timeKees Cook
As it turns out, mapping_info DOES need to be initialized every time, because pgt_data address could be changed during kernel relocation. So it can not be build time assigned. Without this, page tables were not being corrected updated, which could cause reboots when a physical address beyond 2G was chosen. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Young <dyoung@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: kernel-hardening@lists.openwall.com Cc: lasse.collin@tukaani.org Link: http://lkml.kernel.org/r/1462825332-10505-2-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-10x86/boot: Comment what finalize_identity_maps() doesBorislav Petkov
So it is not really obvious that finalize_identity_maps() doesn't do any finalization but it *actually* writes CR3 with the ident PGD. Comment that at the call site. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: akpm@linux-foundation.org Cc: bhe@redhat.com Cc: dyoung@redhat.com Cc: jkosina@suse.cz Cc: linux-tip-commits@vger.kernel.org Cc: luto@kernel.org Cc: vgoyal@redhat.com Cc: yinghai@kernel.org Link: http://lkml.kernel.org/r/20160507100541.GA24613@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-10sched/rt, sched/dl: Don't push if task's scheduling class was changedXunlei Pang
We got this warning: WARNING: CPU: 1 PID: 2468 at kernel/sched/core.c:1161 set_task_cpu+0x1af/0x1c0 [...] Call Trace: dump_stack+0x63/0x87 __warn+0xd1/0xf0 warn_slowpath_null+0x1d/0x20 set_task_cpu+0x1af/0x1c0 push_dl_task.part.34+0xea/0x180 push_dl_tasks+0x17/0x30 __balance_callback+0x45/0x5c __sched_setscheduler+0x906/0xb90 SyS_sched_setattr+0x150/0x190 do_syscall_64+0x62/0x110 entry_SYSCALL64_slow_path+0x25/0x25 This corresponds to: WARN_ON_ONCE(p->state == TASK_RUNNING && p->sched_class == &fair_sched_class && (p->on_rq && !task_on_rq_migrating(p))) It happens because in find_lock_later_rq(), the task whose scheduling class was changed to fair class is still pushed away as if it were a deadline task ... So, check in find_lock_later_rq() after double_lock_balance(), if the scheduling class of the deadline task was changed, break and retry. Apply the same logic to RT tasks. Signed-off-by: Xunlei Pang <xlpang@redhat.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Juri Lelli <juri.lelli@arm.com> Link: http://lkml.kernel.org/r/1462767091-1215-1-git-send-email-xlpang@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-10drm/i915: Bail out of pipe config compute loop on LPTDaniel Vetter
LPT is pch, so might run into the fdi bandwidth constraint (especially since it has only 2 lanes). But right now we just force pipe_bpp back to 24, resulting in a nice loop (which we bail out with a loud WARN_ON). Fix this. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> References: https://bugs.freedesktop.org/show_bug.cgi?id=93477 Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1462264381-7573-1-git-send-email-daniel.vetter@ffwll.ch (cherry picked from commit f58a1acc7e4a1f37d26124ce4c875c647fbcc61f) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-05-10x86/topology: Set x86_max_cores to 1 for CONFIG_SMP=nThomas Gleixner
Josef reported that the uncore driver trips over with CONFIG_SMP=n because x86_max_cores is 16 instead of 12. The reason is, that for SMP=n the extended topology detection is a NOOP and the cache leaf is used to determine the number of cores. That's wrong in two aspects: 1) The cache leaf enumerates the maximum addressable number of cores in the package, which is obviously not correct 2) UP has no business with topology bits at all. Make intel_num_cpu_cores() return 1 for CONFIG_SMP=n Reported-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-team <Kernel-team@fb.com> Cc: Kan Liang <kan.liang@intel.com> Link: http://lkml.kernel.org/r/761b4a2a-0332-7954-f030-c6639f949612@fb.com
2016-05-10ARM: dts: at91: sama5d2: use "atmel,sama5d3-nfc" compatible for nfcWenyou Yang
An error in documentation of the NAND Flash Controller (NFC) led to choose another compatibility string for sama5d2 with an impact on the NAND flash ready/busy information. It was producing the error message: atmel_nand 80000000.nand: Time out to wait for interrupt: 0x08000000 and had an impact on performance. So, switch back to the classical "atmel,sama5d3-nfc" compatibility string for this SoC which gives the proper ready/busy bit information. The NAND flash driver will be updated to remove the support for this different implementation. Signed-off-by: Wenyou Yang <wenyou.yang@atmel.com> Acked-by: Romain Izard <romain.izard.pro@gmail.com> [nicolas.ferre@atmel.com: change commit message] Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
2016-05-10export tc ife uapi headerJamal Hadi Salim
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-10netfilter: conntrack: remove uninitialized shadow variableArnd Bergmann
A recent commit introduced an unconditional use of an uninitialized variable, as reported in this gcc warning: net/netfilter/nf_conntrack_core.c: In function '__nf_conntrack_confirm': net/netfilter/nf_conntrack_core.c:632:33: error: 'ctinfo' may be used uninitialized in this function [-Werror=maybe-uninitialized] bytes = atomic64_read(&counter[CTINFO2DIR(ctinfo)].bytes); ^ net/netfilter/nf_conntrack_core.c:628:26: note: 'ctinfo' was declared here enum ip_conntrack_info ctinfo; The problem is that a local variable shadows the function parameter. This removes the local variable, which looks like what Pablo originally intended. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 71d8c47fc653 ("netfilter: conntrack: introduce clash resolution on insertion race") Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-10Merge tag 'wireless-drivers-for-davem-2016-05-09' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for 4.6 iwlwifi * fix P2P rates (and possibly other issues) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contain Netfilter simple fixes for your net tree, two one-liner and one two-liner: 1) Oneliner to fix missing spinlock definition that triggers 'BUG: spinlock bad magic on CPU#' when spinlock debugging is enabled, from Florian Westphal. 2) Fix missing workqueue cancelation on IDLETIMER removal, from Liping Zhang. 3) Fix insufficient validation of netlink of NFACCT_QUOTA in nfnetlink_acct, from Phil Turnbull. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-10Merge branch 'ip6-tunnel-tx-fixes'David S. Miller
Tom Herbert says: ==================== ip6: Transmit tunneling fixes Several fixes suggested by Alexander. Tested: Running netperf TCP_STREAM with gretap and keyid configured. Visually verified that MTU is correctly being set. Did not test HW offload (Alexander plese try) ==================== Tested-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-10ip6_gre: Use correct flags for reading TUNNEL_SEQTom Herbert
Fix two spots where o_flags in a tunnel are being compared to GRE_SEQ instead of TUNNEL_SEQ. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-10ip6: Don't set transport header in IPv6 tunnelingTom Herbert
We only need to reset network header here. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-10ip6_gre: Set inner protocol correctly in __gre6_xmitTom Herbert
Need to use adjusted protocol value for setting inner protocol. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-10gre6: Fix flag translationsTom Herbert
GRE for IPv6 does not properly translate for GRE flags to tunnel flags and vice versa. This patch fixes that. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-10ip6_gre: Fix MTU settingTom Herbert
In ip6gre_tnl_link_config set t->tun_len and t->hlen correctly for the configuration. For hard_header_len and mtu calculation include IPv6 header and encapsulation overhead. In ip6gre_tunnel_init_common set t->tun_len and t->hlen correctly for the configuration. Revert to setting hard_header_len instead of needed_headroom. Tested: ./ip link add name tun8 type ip6gretap remote \ 2401:db00:20:911a:face:0:27:0 local \ 2401:db00:20:911a:face:0:25:0 ttl 225 Gives MTU of 1434. That is equal to 1500 - 40 - 14 - 4 - 8. ./ip link add name tun8 type ip6gretap remote \ 2401:db00:20:911a:face:0:27:0 local \ 2401:db00:20:911a:face:0:25:0 ttl 225 okey 123 Gives MTU of 1430. That is equal to 1500 - 40 - 14 - 4 - 8 - 4. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-10net: thunderx: avoid exposing kernel stackxypron.glpk@gmx.de
Reserved fields should be set to zero to avoid exposing bits from the kernel stack. Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-09net: fix a kernel infoleak in x25 moduleKangjie Lu
Stack object "dte_facilities" is allocated in x25_rx_call_request(), which is supposed to be initialized in x25_negotiate_facilities. However, 5 fields (8 bytes in total) are not initialized. This object is then copied to userland via copy_to_user, thus infoleak occurs. Signed-off-by: Kangjie Lu <kjlu@gatech.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-09stmmac: dwmac-socfpga: make socfpga_dwmac_pm_ops staticJoachim Eastwood
Fix the following sparse warning: drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c:274:1: warning: symbol 'socfpga_dwmac_pm_ops' was not declared. Should it be static? Signed-off-by: Joachim Eastwood <manabian@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-09Merge branch 'l3mdev-send-enslaved'David S. Miller
David Ahern says: ==================== net: l3mdev: Allow send on enslaved interface First patch preps for the second. The second is required for several use cases such as ping on an interface and BFD that need to send packets on a specific interface, including ones enslaved to a VRF device. v2 - fixed brackets on both patches per comment from DaveM ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-09net: l3mdev: Allow send on enslaved interfaceDavid Ahern
Allow udp and raw sockets to send by oif that is an enslaved interface versus the l3mdev/VRF device. For example, this allows BFD to use ifindex from IP_PKTINFO on a receive to send a response without the need to convert to the VRF index. It also allows ping and ping6 to work when specifying an enslaved interface (e.g., ping -I swp1 <ip>) which is a natural use case. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-09net: l3mdev: Move get_saddr and rt6_dstDavid Ahern
Move l3mdev_rt6_dst_by_oif and l3mdev_get_saddr to l3mdev.c. Collapse l3mdev_get_rt6_dst into l3mdev_rt6_dst_by_oif since it is the only user and keep the l3mdev_get_rt6_dst name for consistency with other hooks. A follow-on patch adds more code to these functions making them long for inlined functions. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-09ravb: Add missing free_irq() call to ravb_close()Geert Uytterhoeven
When reopening the network device on ra7795/salvator-x, e.g. after a DHCP timeout: IP-Config: Reopening network devices... genirq: Flags mismatch irq 139. 00000000 (eth0:ch24:emac) vs. 00000000 (eth0:ch24:emac) ravb e6800000.ethernet eth0: cannot request IRQ eth0:ch24:emac IP-Config: Failed to open eth0 IP-Config: No network devices available The "mismatch" is due to requesting an IRQ that is already in use, while IRQF_PROBE_SHARED wasn't set. However, the real cause is that ravb_close() doesn't release the R-Car Gen3-specific secondary IRQ. Add the missing free_irq() call to fix this. Fixes: 22d4df8ff3a3cc72 ("ravb: Add support for r8a7795 SoC") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-09fjes: Fix unnecessary spinlock_irqsaveTaku Izumi
commit-bd5a256 introduces a deadlock bug in fjes_change_mtu(). This spin_lock_irqsave() is obviously unnecessary. This patch eliminates unnecessary spin_lock_irqsave() in fjes_change_mtu() Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-09mlx5: Fix merge errors.David S. Miller
I accidently let Arnd's VXLAN dependency changes slip into net-next, they are only appropriate for net. Also the flow steering structural changes to mlx5e_priv got scrambled during the merge resolution as well. Fix that all up. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-09uapi glibc compat: fix compile errors when glibc net/if.h included before ↵Mikko Rapeli
linux/if.h glibc's net/if.h contains copies of definitions from linux/if.h and these conflict and cause build failures if both files are included by application source code. Changes in uapi headers, which fixed header file dependencies to include linux/if.h when it was needed, e.g. commit 1ffad83d, made the net/if.h and linux/if.h incompatibilities visible as build failures for userspace applications like iproute2 and xtables-addons. This patch fixes compile errors when glibc net/if.h is included before linux/if.h: ./linux/if.h:99:21: error: redeclaration of enumerator ‘IFF_NOARP’ ./linux/if.h:98:23: error: redeclaration of enumerator ‘IFF_RUNNING’ ./linux/if.h:97:26: error: redeclaration of enumerator ‘IFF_NOTRAILERS’ ./linux/if.h:96:27: error: redeclaration of enumerator ‘IFF_POINTOPOINT’ ./linux/if.h:95:24: error: redeclaration of enumerator ‘IFF_LOOPBACK’ ./linux/if.h:94:21: error: redeclaration of enumerator ‘IFF_DEBUG’ ./linux/if.h:93:25: error: redeclaration of enumerator ‘IFF_BROADCAST’ ./linux/if.h:92:19: error: redeclaration of enumerator ‘IFF_UP’ ./linux/if.h:252:8: error: redefinition of ‘struct ifconf’ ./linux/if.h:203:8: error: redefinition of ‘struct ifreq’ ./linux/if.h:169:8: error: redefinition of ‘struct ifmap’ ./linux/if.h:107:23: error: redeclaration of enumerator ‘IFF_DYNAMIC’ ./linux/if.h:106:25: error: redeclaration of enumerator ‘IFF_AUTOMEDIA’ ./linux/if.h:105:23: error: redeclaration of enumerator ‘IFF_PORTSEL’ ./linux/if.h:104:25: error: redeclaration of enumerator ‘IFF_MULTICAST’ ./linux/if.h:103:21: error: redeclaration of enumerator ‘IFF_SLAVE’ ./linux/if.h:102:22: error: redeclaration of enumerator ‘IFF_MASTER’ ./linux/if.h:101:24: error: redeclaration of enumerator ‘IFF_ALLMULTI’ ./linux/if.h:100:23: error: redeclaration of enumerator ‘IFF_PROMISC’ The cases where linux/if.h is included before net/if.h need a similar fix in the glibc side, or the order of include files can be changed userspace code as a workaround. This change was tested in x86 userspace on Debian unstable with scripts/headers_compile_test.sh: $ make headers_install && \ cd usr/include && ../../scripts/headers_compile_test.sh -l -k ... cc -Wall -c -nostdinc -I /usr/lib/gcc/i586-linux-gnu/5/include -I /usr/lib/gcc/i586-linux-gnu/5/include-fixed -I . -I /home/mcfrisk/src/linux-2.6/usr/headers_compile_test_include.2uX2zH -I /home/mcfrisk/src/linux-2.6/usr/headers_compile_test_include.2uX2zH/i586-linux-gnu -o /dev/null ./linux/if.h_libc_before_kernel.h PASSED libc before kernel test: ./linux/if.h Reported-by: Jan Engelhardt <jengelh@inai.de> Reported-by: Josh Boyer <jwboyer@fedoraproject.org> Reported-by: Stephen Hemminger <shemming@brocade.com> Reported-by: Waldemar Brodkorb <mail@waldemar-brodkorb.de> Cc: Gabriel Laskar <gabriel@lse.epita.fr> Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-09Merge branch 'libnvdimm-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm build fix from Dan Williams: "A build fix for the usage of HPAGE_SIZE in the last libnvdimm pull request. I have taken note that the kbuild robot build success test does not include results for alpha_allmodconfig. Thanks to Guenter for the report. It's tagged for -stable since the original fix will land there and cause build problems" * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: libnvdimm, pfn: fix ARCH=alpha allmodconfig build failure
2016-05-09perf/core: Change the default paranoia level to 2Andy Lutomirski
Allowing unprivileged kernel profiling lets any user dump follow kernel control flow and dump kernel registers. This most likely allows trivial kASLR bypassing, and it may allow other mischief as well. (Off the top of my head, the PERF_SAMPLE_REGS_INTR output during /dev/urandom reads could be quite interesting.) Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-09Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge fixes from Andrew Morton: "2 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: zsmalloc: fix zs_can_compact() integer overflow Revert "proc/base: make prompt shell start from new line after executing "cat /proc/$pid/wchan""
2016-05-09zsmalloc: fix zs_can_compact() integer overflowSergey Senozhatsky
zs_can_compact() has two race conditions in its core calculation: unsigned long obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) - zs_stat_get(class, OBJ_USED); 1) classes are not locked, so the numbers of allocated and used objects can change by the concurrent ops happening on other CPUs 2) shrinker invokes it from preemptible context Depending on the circumstances, thus, OBJ_ALLOCATED can become less than OBJ_USED, which can result in either very high or negative `total_scan' value calculated later in do_shrink_slab(). do_shrink_slab() has some logic to prevent those cases: vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-64 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62 However, due to the way `total_scan' is calculated, not every shrinker->count_objects() overflow can be spotted and handled. To demonstrate the latter, I added some debugging code to do_shrink_slab() (x86_64) and the results were: vmscan: OVERFLOW: shrinker->count_objects() == -1 [18446744073709551615] vmscan: but total_scan > 0: 92679974445502 vmscan: resulting total_scan: 92679974445502 [..] vmscan: OVERFLOW: shrinker->count_objects() == -1 [18446744073709551615] vmscan: but total_scan > 0: 22634041808232578 vmscan: resulting total_scan: 22634041808232578 Even though shrinker->count_objects() has returned an overflowed value, the resulting `total_scan' is positive, and, what is more worrisome, it is insanely huge. This value is getting used later on in shrinker->scan_objects() loop: while (total_scan >= batch_size || total_scan >= freeable) { unsigned long ret; unsigned long nr_to_scan = min(batch_size, total_scan); shrinkctl->nr_to_scan = nr_to_scan; ret = shrinker->scan_objects(shrinker, shrinkctl); if (ret == SHRINK_STOP) break; freed += ret; count_vm_events(SLABS_SCANNED, nr_to_scan); total_scan -= nr_to_scan; cond_resched(); } `total_scan >= batch_size' is true for a very-very long time and 'total_scan >= freeable' is also true for quite some time, because `freeable < 0' and `total_scan' is large enough, for example, 22634041808232578. The only break condition, in the given scheme of things, is shrinker->scan_objects() == SHRINK_STOP test, which is a bit too weak to rely on, especially in heavy zsmalloc-usage scenarios. To fix the issue, take a pool stat snapshot and use it instead of racy zs_stat_get() calls. Link: http://lkml.kernel.org/r/20160509140052.3389-1-sergey.senozhatsky@gmail.com Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: <stable@vger.kernel.org> [4.3+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-09Revert "proc/base: make prompt shell start from new line after executing ↵Robin Humble
"cat /proc/$pid/wchan"" This reverts the 4.6-rc1 commit 7e2bc81da333 ("proc/base: make prompt shell start from new line after executing "cat /proc/$pid/wchan") because it breaks /proc/$PID/whcan formatting in ps and top. Revert also because the patch is inconsistent - it adds a newline at the end of only the '0' wchan, and does not add a newline when /proc/$PID/wchan contains a symbol name. eg. $ ps -eo pid,stat,wchan,comm PID STAT WCHAN COMMAND ... 1189 S - dbus-launch 1190 Ssl 0 dbus-daemon 1198 Sl 0 lightdm 1299 Ss ep_pol systemd 1301 S - (sd-pam) 1304 Ss wait sh Signed-off-by: Robin Humble <plaguedbypenguins@gmail.com> Cc: Minfei Huang <mnfhuang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-09Input: twl6040-vibra - fix DT node memory managementH. Nikolaus Schaller
commit e7ec014a47e4 ("Input: twl6040-vibra - update for device tree support") made the separate vibra DT node to a subnode of the twl6040. It now calls of_find_node_by_name() to locate the "vibra" subnode. This function has a side effect to call of_node_put on() for the twl6040 parent node passed in as a parameter. This causes trouble later on. Solution: we must call of_node_get() before of_find_node_by_name() Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2016-05-10NFC: pn533: handle interrupted commands in pn533_recv_frameMichael Thalmeier
When pn533_recv_frame is called from within abort_command context the current dev->cmd is not guaranteed to be set. Additionally on receiving an error status we can omit frame checking and simply schedule the workqueue. Signed-off-by: Michael Thalmeier <michael.thalmeier@hale.at> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-05-10NFC: pn533: reset poll modulation list before calling targets_foundMichael Thalmeier
We need to reset the poll modulation list before calling nfc_targets_found because otherwise userspace could run before the modulation list is cleared and then get a "Cannot activate target while polling" error upon calling activate_target. Signed-off-by: Michael Thalmeier <michael.thalmeier@hale.at> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-05-09NFC: pn533: i2c: do not call pn533_recv_frame with aborted commandsMichael Thalmeier
When a command gets aborted the pn533 core does not need any RX frames that may be received until a new frame is sent. Signed-off-by: Michael Thalmeier <michael.thalmeier@hale.at> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-05-09NFC: pn533: fix order of initializationMichael Thalmeier
Correctly call nfc_set_parent_dev before nfc_register_device. Otherwise the driver will OOPS when being removed. Signed-off-by: Michael Thalmeier <michael.thalmeier@hale.at> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-05-09NFC: pn533: i2c: free irq on driver removeMichael Thalmeier
The requested irq needs to be freed when removing the driver, otherwise a following driver load fails to request the irq. Signed-off-by: Michael Thalmeier <michael.thalmeier@hale.at> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-05-09perf symbols: Fix handling of zero-length symbols.Chris Phlipot
This change introduces a fix to symbols__find, so that it is able to find symbols of length zero (where start == end). The current code has the following problem: - The current implementation of symbols__find is unable to find any symbols of length zero. - The db-export framework explicitly creates zero length symbols at locations where no symbol currently exists. The combination of the two above behaviors results in behavior similar to the example below. 1. addr_location is created for a sample, but symbol is unable to be resolved. 2. db export creates an "unknown" symbol of length zero at that address and inserts it into the dso. 3. A new sample comes in at the same address, but symbol__find is unable to find the zero length symbol, so it is still unresolved. 4. db export sees the symbol is unresolved, and allocated a duplicate symbol, even though it already did this in step 2. This behavior continues every time an address without symbol information is seen, which causes a very large number of these symbols to be allocated. The effect of this fix can be observed by looking at the contents of an exported database before/after the fix (generated with scripts/python/export-to-postgresql.py) Ex. BEFORE THE CHANGE: example_db=# select count(*) from symbols; count -------- 900213 (1 row) example_db=# select count(*) from symbols where symbols.name='unknown'; count -------- 897355 (1 row) example_db=# select count(*) from symbols where symbols.name!='unknown'; count ------- 2858 (1 row) AFTER THE CHANGE: example_db=# select count(*) from symbols; count ------- 25217 (1 row) example_db=# select count(*) from symbols where name='unknown'; count ------- 22359 (1 row) example_db=# select count(*) from symbols where name!='unknown'; count ------- 2858 (1 row) Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1462612620-25008-1-git-send-email-cphlipot0@gmail.com [ Moved the test to later in the rb_tree tests, as this not the likely case ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-09perf evsel: Print state of perf_event_attr.write_backwardArnaldo Carvalho de Melo
Now we can see if it is set when using verbose mode in various tools, such as 'perf test': # perf test -vv back 45: Test backward reading from ring buffer : --- start --- <SNIP> ------------------------------------------------------------ perf_event_attr: type 2 size 112 config 0x98 { sample_period, sample_freq } 1 sample_type IP|TID|TIME|CPU|PERIOD|RAW disabled 1 mmap 1 comm 1 task 1 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 write_backward 1 ------------------------------------------------------------ sys_perf_event_open: pid 20911 cpu -1 group_fd -1 flags 0x8 <SNIP> ---- end ---- Test backward reading from ring buffer: Ok # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-kxv05kv9qwl5of7rzfeiiwbv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-09perf tests: Add test to check backward ring bufferWang Nan
This test checks reading from backward ring buffer. Test result: # ~/perf test 'ring buffer' 45: Test backward reading from ring buffer : Ok The test case is a while loop which calls prctl(PR_SET_NAME) multiple times. Each prctl should issue 2 events: one PERF_RECORD_SAMPLE, one PERF_RECORD_COMM. The first round creates a relative large ring buffer (256 pages). It can afford all events. Read from it and check the count of each type of events. The second round creates a small ring buffer (1 page) and makes it overwritable. Check the correctness of the buffer. Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1462758471-89706-3-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-09ACPI / GED: make evged.c explicitly non-modularPaul Gortmaker
The Makefile/Kconfig currently controlling compilation of this code is: Makefile:acpi-$(CONFIG_ACPI_REDUCED_HARDWARE_ONLY) += evged.o drivers/acpi/Kconfig:config ACPI_REDUCED_HARDWARE_ONLY drivers/acpi/Kconfig: bool "Hardware-reduced ACPI support only" if EXPERT ...meaning that it currently is not being built as a module by anyone. Lets remove the couple traces of modularity so that when reading the code there is no doubt it is builtin-only. Since module_platform_driver() uses the same init level priority as builtin_platform_driver() the init ordering remains unchanged with this commit. The file wasn't explicitly including the module.h file but it did already have init.h so, unlike similar changes, this one has no header changes at all. We also delete the MODULE_LICENSE tag since all that information is already contained at the top of the file in the comments. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-09intel_pstate: Clean up intel_pstate_get()Rafael J. Wysocki
intel_pstate_get() contains a local variable that's initialized but never used and it can be written in fewer lines of code, so clean it up. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2016-05-09perf tools: Support reading from backward ring bufferWang Nan
perf_evlist__mmap_read_backward() is introduced for reading backward ring buffer. Since direction for reading such ring buffer is different from the direction kernel writing to it, and since user need to fetch most recent record from it, a perf_evlist__mmap_read_catchup() is introduced to move the reading pointer to the end of the buffer. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1462758471-89706-2-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
In netdevice.h we removed the structure in net-next that is being changes in 'net'. In macsec.c and rtnetlink.c we have overlaps between fixes in 'net' and the u64 attribute changes in 'net-next'. The mlx5 conflicts have to do with vxlan support dependencies. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-09Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "This fixes the following issues: - bug in ahash SG list walking that may lead to crashes - resource leak in qat - missing RSA dependency that causes it to fail" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: rsa - select crypto mgr dependency crypto: hash - Fix page length clamping in hash walk crypto: qat - fix adf_ctl_drv.c:undefined reference to adf_init_pf_wq crypto: qat - fix invalid pf2vf_resp_wq logic
2016-05-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Check klogctl failure correctly, from Colin Ian King. 2) Prevent OOM when under memory pressure in flowcache, from Steffen Klassert. 3) Fix info leak in llc and rtnetlink ifmap code, from Kangjie Lu. 4) Memory barrier and multicast handling fixes in bnxt_en, from Michael Chan. 5) Endianness bug in mlx5, from Daniel Jurgens. 6) Fix disconnect handling in VSOCK, from Ian Campbell. 7) Fix locking of netdev list walking in get_bridge_ifindices(), from Nikolay Aleksandrov. 8) Bridge multicast MLD parser can look at wrong packet offsets, fix from Linus Lüssing. 9) Fix chip hang in qede driver, from Sudarsana Reddy Kalluru. 10) Fix missing setting of encapsulation before inner handling completes in udp_offload code, from Jarno Rajahalme. 11) Missing rollbacks during LAG join and flood configuration failures in mlxsw driver, from Ido Schimmel. 12) Fix error code checks in netxen driver, from Dan Carpenter. 13) Fix key size in new macsec driver, from Sabrina Dubroca. 14) Fix mlx5/VXLAN dependencies, from Arnd Bergmann. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (29 commits) net/mlx5e: make VXLAN support conditional Revert "net/mlx5: Kconfig: Fix MLX5_EN/VXLAN build issue" macsec: key identifier is 128 bits, not 64 Documentation/networking: more accurate LCO explanation macvtap: segmented packet is consumed tools: bpf_jit_disasm: check for klogctl failure qede: uninitialized variable in qede_start_xmit() netxen: netxen_rom_fast_read() doesn't return -1 netxen: reversed condition in netxen_nic_set_link_parameters() netxen: fix error handling in netxen_get_flash_block() mlxsw: spectrum: Add missing rollback in flood configuration mlxsw: spectrum: Fix rollback order in LAG join failure udp_offload: Set encapsulation before inner completes. udp_tunnel: Remove redundant udp_tunnel_gro_complete(). qede: prevent chip hang when increasing channels net: ipv6: tcp reset, icmp need to consider L3 domain bridge: fix igmp / mld query parsing net: bridge: fix old ioctl unlocked net device walk VSOCK: do not disconnect socket when peer has shutdown SEND only net/mlx4_en: Fix endianness bug in IPV6 csum calculation ...
2016-05-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following large patchset contains Netfilter updates for your net-next tree. My initial intention was to send you this in two goes but when I looked back twice I already had this burden on top of me. Several updates for IPVS from Marco Angaroni: 1) Allow SIP connections originating from real-servers to be load balanced by the SIP persistence engine as is already implemented in the other direction. 2) Release connections immediately for One-packet-scheduling (OPS) in IPVS, instead of making it via timer and rcu callback. 3) Skip deleting conntracks for each one packet in OPS, and don't call nf_conntrack_alter_reply() since no reply is expected. 4) Enable drop on exhaustion for OPS + SIP persistence. Miscelaneous conntrack updates from Florian Westphal, including fix for hash resize: 5) Move conntrack generation counter out of conntrack pernet structure since this is only used by the init_ns to allow hash resizing. 6) Use get_random_once() from packet path to collect hash random seed instead of our compound. 7) Don't disable BH from ____nf_conntrack_find() for statistics, use NF_CT_STAT_INC_ATOMIC() instead. 8) Fix lookup race during conntrack hash resizing. 9) Introduce clash resolution on conntrack insertion for connectionless protocol. Then, Florian's netns rework to get rid of per-netns conntrack table, thus we use one single table for them all. There was consensus on this change during the NFWS 2015 and, on top of that, it has recently been pointed as a source of multiple problems from unpriviledged netns: 11) Use a single conntrack hashtable for all namespaces. Include netns in object comparisons and make it part of the hash calculation. Adapt early_drop() to consider netns. 12) Use single expectation and NAT hashtable for all namespaces. 13) Use a single slab cache for all namespaces for conntrack objects. 14) Skip full table scanning from nf_ct_iterate_cleanup() if the pernet conntrack counter tells us the table is empty (ie. equals zero). Fixes for nf_tables interval set element handling, support to set conntrack connlabels and allow set names up to 32 bytes. 15) Parse element flags from element deletion path and pass it up to the backend set implementation. 16) Allow adjacent intervals in the rbtree set type for dynamic interval updates. 17) Add support to set connlabel from nf_tables, from Florian Westphal. 18) Allow set names up to 32 bytes in nf_tables. Several x_tables fixes and updates: 19) Fix incorrect use of IS_ERR_VALUE() in x_tables, original patch from Andrzej Hajda. And finally, miscelaneous netfilter updates such as: 20) Disable automatic helper assignment by default. Note this proc knob was introduced by a9006892643a ("netfilter: nf_ct_helper: allow to disable automatic helper assignment") 4 years ago to start moving towards explicit conntrack helper configuration via iptables CT target. 21) Get rid of obsolete and inconsistent debugging instrumentation in x_tables. 22) Remove unnecessary check for null after ip6_route_output(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>