summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-11-15pwm: stm32-lp: Remove pwm_is_enabled() check before calling pwm_disable()Axel Lin
The same checking is done by the implementation of pwm_disable(). Signed-off-by: Axel Lin <axel.lin@ingics.com> Acked-by: Fabrice Gasnier <fabrice.gasnier@st.com> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
2017-11-15pwm: mediatek: Add MT2712/MT7622 supportZhi Mao
Add support for MT2712 and MT7622. Due to register offset address of pwm7 for MT2712 is not fixed 0x40, add mtk_pwm_reg_offset array for PWM register offset. Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Zhi Mao <zhi.mao@mediatek.com> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
2017-11-15pwm: sunxi: Use of_device_get_match_data()Corentin Labbe
The usage of of_device_get_match_data reduce the code size a bit. Furthermore, it prevents an improbable dereference when of_match_device() returns NULL. Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
2017-11-15Merge branch 'for-4.15/callbacks' into for-linusJiri Kosina
This pulls in an infrastructure/API that allows livepatch writers to register pre-patch and post-patch callbacks that allow for running a glue code necessary for finalizing the patching if necessary. Conflicts: kernel/livepatch/core.c - trivial conflict by adding a callback call into module going notifier vs. moving that code block to klp_cleanup_module_patches_limited() Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-11-15pwm: atmel-tcb: Support backup modeRomain Izard
Save and restore registers for the PWM on suspend and resume, which makes hibernation and backup modes possible. Signed-off-by: Romain Izard <romain.izard.pro@gmail.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
2017-11-15dt-bindings: pwm: Add R-Car D3 device tree bindingsYoshihiro Shimoda
Add device tree bindings for the PWM controller found on R-Car D3 SoCs. Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
2017-11-15Merge branch 'for-4.15/shadow-variables' into for-linusJiri Kosina
Shadow variables allow callers to associate new shadow fields to existing data structures. This is intended to be used by livepatch modules seeking to emulate additions to data structure definitions.
2017-11-15pwm: img: Add runtime PMEd Blake
Add runtime PM to disable the clocks when the h/w is not in use. Signed-off-by: Ed Blake <ed.blake@sondrel.com> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
2017-11-15pwm: img: Add suspend / resume handlingEd Blake
The power may be disabled during suspend, so implement suspend and resume callbacks to save and restore register state. Signed-off-by: Ed Blake <ed.blake@sondrel.com> [thierry.reding@gmail.com: guard using PM_SLEEP instead of PM] Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
2017-11-15perf/core: Fix memory leak triggered by perf --namespaceVasily Averin
perf with --namespace key leaks various memory objects including namespaces 4.14.0+ pid_namespace 1 12 2568 12 8 user_namespace 1 39 824 39 8 net_namespace 1 5 6272 5 8 This happen because perf_fill_ns_link_info() struct patch ns_path: during initialization ns_path incremented counters on related mnt and dentry, but without lost path_put nobody decremented them back. Leaked dentry is name of related namespace, and its leak does not allow to free unused namespace. Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: commit e422267322cd ("perf: Add PERF_RECORD_NAMESPACES to include namespaces related info") Link: http://lkml.kernel.org/r/c510711b-3904-e5e1-d296-61273d21118d@virtuozzo.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-14dax: fix general protection fault in dax_alloc_inodeMikulas Patocka
Don't crash in case of allocation failure in dax_alloc_inode. syzkaller hit the following crash on e4880bc5dfb1 kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access [..] RIP: 0010:dax_alloc_inode+0x3b/0x70 drivers/dax/super.c:348 Call Trace: alloc_inode+0x65/0x180 fs/inode.c:208 new_inode_pseudo+0x69/0x190 fs/inode.c:890 new_inode+0x1c/0x40 fs/inode.c:919 mount_pseudo_xattr+0x288/0x560 fs/libfs.c:261 mount_pseudo include/linux/fs.h:2137 [inline] dax_mount+0x2e/0x40 drivers/dax/super.c:388 mount_fs+0x66/0x2d0 fs/super.c:1223 Cc: <stable@vger.kernel.org> Fixes: 7b6be8444e0f ("dax: refactor dax-fs into a generic provider...") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-11-14remoteproc: qcom: Fix error handling paths in order to avoid memory leaksChristophe JAILLET
In case of error returned by 'q6v5_xfer_mem_ownership', we must free some resources before returning. In 'q6v5_mpss_init_image()', add a new label to undo a previous 'dma_alloc_attrs()'. In 'q6v5_mpss_load()', re-use the already existing error handling code to undo a previous 'request_firmware()', as already done in the other error handling paths of the function. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2017-11-14rpmsg: glink: Add missing MODULE_LICENSEBjorn Andersson
The qcom_glink_native driver is missing a MODULE_LICENSE(), correct this. Fixes: 835764ddd9af ("rpmsg: glink: Move the common glink protocol implementation to glink_native.c") Cc: stable@vger.kernel.org Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2017-11-15Revert "xfrm: Fix stack-out-of-bounds read in xfrm_state_find."Steffen Klassert
This reverts commit c9f3f813d462c72dbe412cee6a5cbacf13c4ad5e. This commit breaks transport mode when the policy template has widlcard addresses configured, so revert it. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-11-15sparc64: Fix page table walk for PUD hugepagesNitin Gupta
For a PUD hugepage entry, we need to propagate bits [32:22] from virtual address to resolve at 4M granularity. However, the current code was incorrectly propagating bits [29:19]. This bug can cause incorrect data to be returned for pages backed with 16G hugepages. Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com> Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15sparc64: Convert timers to user timer_setup()Allen Pais
Switch to using the new timer_setup() and from_timer() in LDOM Virtual I/O handshake. Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Allen Pais <allen.pais@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15sparc64: convert mdesc_handle.refcnt from atomic_t to refcount_tElena Reshetova
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable mdesc_handle.refcnt is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Acked-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15sparc/led: Convert timers to use timer_setup()Kees Cook
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Adds a static variable to hold timeout value. Cc: "David S. Miller" <davem@davemloft.net> Cc: Geliang Tang <geliangtang@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: sparclinux@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15Merge branch 'sparc64-optimized-fls'David S. Miller
Vijay Kumar says: ==================== sparc64: Optimize fls and __fls SPARC provides lzcnt instruction (with VIS3) which can be used to optimize fls, __fls and fls64 functions. For the systems that supports lzcnt instruction, we now do boot time patching to use sparc optimized fls, __fls and fls64 functions. v3->v4: - Fixed a typo. v2->v3: - Using ENTRY(), ENDPROC() for assembler functions. - Removed BITS_PER_LONG from __fls. - Using generic fls64(). - Replaced lzcnt instruction with .word directive. v1->v2: - Fixed delay slot issue. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15sparc64: Use sparc optimized fls and __fls for T4 and aboveVijay Kumar
For T4 and above, patch fls and __fls functions at the boot time to use lzcnt instruction. Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15sparc64: SPARC optimized __fls functionVijay Kumar
Defined SPARC optimized __fls using lzcnt opcode. Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15sparc64: SPARC optimized fls functionVijay Kumar
Defined SPARC optimized fls using lzcnt opcode. Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15sparc64: Define SPARC default __fls functionVijay Kumar
__fls will now require a boot time patching on T4 and above. Redefining it under arch/sparc/lib. Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15sparc64: Define SPARC default fls functionVijay Kumar
fls will now require a boot time patching on T4 and above. Redefining it under arch/sparc/lib. Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15vDSO for sparcNagarathnam Muthusamy
Following patch is based on work done by Nick Alcock on 64-bit vDSO for sparc in Oracle linux. I have extended it to include support for 32-bit vDSO for sparc on 64-bit kernel. vDSO for sparc is based on the X86 implementation. This patch provides vDSO support for both 64-bit and 32-bit programs on 64-bit kernel. vDSO will be disabled on 32-bit linux kernel on sparc. *) vclock_gettime.c contains all the vdso functions. Since data page is mapped before the vdso code page, the pointer to data page is got by subracting offset from an address in the vdso code page. The return address stored in %i7 is used for this purpose. *) During compilation, both 32-bit and 64-bit vdso images are compiled and are converted into raw bytes by vdso2c program to be ready for mapping into the process. 32-bit images are compiled only if CONFIG_COMPAT is enabled. vdso2c generates two files vdso-image-64.c and vdso-image-32.c which contains the respective vDSO image in C structure. *) During vdso initialization, required number of vdso pages are allocated and raw bytes are copied into the pages. *) During every exec, these pages are mapped into the process through arch_setup_additional_pages and the location of mapping is passed on to the process through aux vector AT_SYSINFO_EHDR which is used by glibc. *) A new update_vsyscall routine for sparc is added to keep the data page in vdso updated. *) As vDSO cannot contain dynamically relocatable references, a new version of cpu_relax is added for the use of vDSO. This change also requires a putback to glibc to use vDSO. For testing, programs planning to try vDSO can be compiled against the generated vdso(64/32).so in the source. Testing: ======== [root@localhost ~]# cat vdso_test.c int main() { struct timespec tv_start, tv_end; struct timeval tv_tmp; int i; int count = 1 * 1000 * 10000; long long diff; clock_gettime(0, &tv_start); for (i = 0; i < count; i++) gettimeofday(&tv_tmp, NULL); clock_gettime(0, &tv_end); diff = (long long)(tv_end.tv_sec - tv_start.tv_sec)*(1*1000*1000*1000); diff += (tv_end.tv_nsec - tv_start.tv_nsec); printf("Start sec: %d\n", tv_start.tv_sec); printf("End sec : %d\n", tv_end.tv_sec); printf("%d cycles in %lld ns = %f ns/cycle\n", count, diff, (double)diff / (double)count); return 0; } [root@localhost ~]# cc vdso_test.c -o t32_without_fix -m32 -lrt [root@localhost ~]# ./t32_without_fix Start sec: 1502396130 End sec : 1502396140 10000000 cycles in 9565148528 ns = 956.514853 ns/cycle [root@localhost ~]# cc vdso_test.c -o t32_with_fix -m32 ./vdso32.so.dbg [root@localhost ~]# ./t32_with_fix Start sec: 1502396168 End sec : 1502396169 10000000 cycles in 798141262 ns = 79.814126 ns/cycle [root@localhost ~]# cc vdso_test.c -o t64_without_fix -m64 -lrt [root@localhost ~]# ./t64_without_fix Start sec: 1502396208 End sec : 1502396218 10000000 cycles in 9846091800 ns = 984.609180 ns/cycle [root@localhost ~]# cc vdso_test.c -o t64_with_fix -m64 ./vdso64.so.dbg [root@localhost ~]# ./t64_with_fix Start sec: 1502396257 End sec : 1502396257 10000000 cycles in 380984048 ns = 38.098405 ns/cycle V1 to V2 Changes: ================= Added hot patching code to switch the read stick instruction to read tick instruction based on the hardware. V2 to V3 Changes: ================= Merged latest changes from sparc-next and moved the initialization of clocksource_tick.archdata.vclock_mode to time_init_early. Disabled queued spinlock and rwlock configuration when simulating 32-bit config to compile 32-bit VDSO. V3 to V4 Changes: ================= Hardcoded the page size as 8192 in linker script for both 64-bit and 32-bit binaries. Removed unused variables in vdso2c.h. Added -mv8plus flag to Makefile to prevent the generation of relocation entries for __lshrdi3 in 32-bit vdso binary. Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Signed-off-by: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15openvswitch: meter: fix NULL pointer dereference in ovs_meter_cmd_reply_startGustavo A. R. Silva
It seems that the intention of the code is to null check the value returned by function genlmsg_put. But the current code is null checking the address of the pointer that holds the value returned by genlmsg_put. Fix this by properly null checking the value returned by function genlmsg_put in order to avoid a pontential null pointer dereference. Addresses-Coverity-ID: 1461561 ("Dereference before null check") Addresses-Coverity-ID: 1461562 ("Dereference null return value") Fixes: 96fbc13d7e77 ("openvswitch: Add meter infrastructure") Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15Merge branch 'netem-fix-compilation-on-32-bit'David S. Miller
Stephen Hemminger says: ==================== netem: fix compilation on 32 bit A couple of places where 64 bit CPU was being assumed incorrectly. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15netem: remove unnecessary 64 bit modulusStephen Hemminger
Fix compilation on 32 bit platforms (where doing modulus operation with 64 bit requires extra glibc functions) by truncation. The jitter for table distribution is limited to a 32 bit value because random numbers are scaled as 32 bit value. Also fix some whitespace. Fixes: 99803171ef04 ("netem: add uapi to express delay and jitter in nanoseconds") Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15netem: use 64 bit divide by rateStephen Hemminger
Since times are now expressed in nanosecond, need to now do true 64 bit divide. Old code would truncate rate at 32 bits. Rename function to better express current usage. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15tcp: Namespace-ify sysctl_tcp_default_congestion_controlStephen Hemminger
Make default TCP default congestion control to a per namespace value. This changes default congestion control to a pointer to congestion ops (rather than implicit as first element of available lsit). The congestion control setting of new namespaces is inherited from the current setting of the root namespace. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15net: Protect iterations over net::fib_notifier_ops in fib_seq_sum()Kirill Tkhai
There is at least unlocked deletion of net->ipv4.fib_notifier_ops from net::fib_notifier_ops: ip_fib_net_exit() rtnl_unlock() fib4_notifier_exit() fib_notifier_ops_unregister(net->ipv4.notifier_ops) list_del_rcu(&ops->list) So fib_seq_sum() can't use rtnl_lock() only for protection. The possible solution could be to use rtnl_lock() in fib_notifier_ops_unregister(), but this adds a possible delay during net namespace creation, so we better use rcu_read_lock() till someone really needs the mutex (if that happens). Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15ipv6: set all.accept_dad to 0 by defaultNicolas Dichtel
With commits 35e015e1f577 and a2d3f3e33853, the global 'accept_dad' flag is also taken into account (default value is 1). If either global or per-interface flag is non-zero, DAD will be enabled on a given interface. This is not backward compatible: before those patches, the user could disable DAD just by setting the per-interface flag to 0. Now, the user instead needs to set both flags to 0 to actually disable DAD. Restore the previous behaviour by setting the default for the global 'accept_dad' flag to 0. This way, DAD is still enabled by default, as per-interface flags are set to 1 on device creation, but setting them to 0 is enough to disable DAD on a given interface. - Before 35e015e1f57a7 and a2d3f3e33853: global per-interface DAD enabled [default] 1 1 yes X 0 no X 1 yes - After 35e015e1f577 and a2d3f3e33853: global per-interface DAD enabled [default] 1 1 yes 0 0 no 0 1 yes 1 0 yes - After this fix: global per-interface DAD enabled 1 1 yes 0 0 no [default] 0 1 yes 1 0 yes Fixes: 35e015e1f577 ("ipv6: fix net.ipv6.conf.all interface DAD handlers") Fixes: a2d3f3e33853 ("ipv6: fix net.ipv6.conf.all.accept_dad behaviour for real") CC: Stefano Brivio <sbrivio@redhat.com> CC: Matteo Croce <mcroce@redhat.com> CC: Erik Kline <ek@google.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15uapi: fix linux/tls.h userspace compilation errorDmitry V. Levin
Move inclusion of a private kernel header <net/tcp.h> from uapi/linux/tls.h to its only user - net/tls.h, to fix the following linux/tls.h userspace compilation error: /usr/include/linux/tls.h:41:21: fatal error: net/tcp.h: No such file or directory As to this point uapi/linux/tls.h was totaly unusuable for userspace, cleanup this header file further by moving other redundant includes to net/tls.h. Fixes: 3c4d7559159b ("tls: kernel TLS support") Cc: <stable@vger.kernel.org> # v4.13+ Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15usbnet: ipheth: prevent TX queue timeouts when device not readyAlexander Kappner
iOS devices require the host to be "trusted" before servicing network packets. Establishing trust requires the user to confirm a dialog on the iOS device.Until trust is established, the iOS device will silently discard network packets from the host. Currently, the ipheth driver does not detect whether an iOS device has established trust with the host, and immediately sets up the transmit queues. This causes the following problems: - Kernel taint due to WARN() in netdev watchdog. - Dmesg spam ("TX timeout"). - Disruption of user space networking activity (dhcpd, etc...) when new interface comes up but cannot be used. - Unnecessary host and device wakeups and USB traffic Example dmesg output: [ 1101.319778] NETDEV WATCHDOG: eth1 (ipheth): transmit queue 0 timed out [ 1101.319817] ------------[ cut here ]------------ [ 1101.319828] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:316 dev_watchdog+0x20f/0x220 [ 1101.319831] Modules linked in: ipheth usbmon nvidia_drm(PO) nvidia_modeset(PO) nvidia(PO) iwlmvm mac80211 iwlwifi btusb btrtl btbcm btintel qmi_wwan bluetooth cfg80211 ecdh_generic thinkpad_acpi rfkill [last unloaded: ipheth] [ 1101.319861] CPU: 0 PID: 0 Comm: swapper/0 Tainted: P O 4.13.12.1 #1 [ 1101.319864] Hardware name: LENOVO 20ENCTO1WW/20ENCTO1WW, BIOS N1EET62W (1.35 ) 11/10/2016 [ 1101.319867] task: ffffffff81e11500 task.stack: ffffffff81e00000 [ 1101.319873] RIP: 0010:dev_watchdog+0x20f/0x220 [ 1101.319876] RSP: 0018:ffff8810a3c03e98 EFLAGS: 00010292 [ 1101.319880] RAX: 000000000000003a RBX: 0000000000000000 RCX: 0000000000000000 [ 1101.319883] RDX: ffff8810a3c15c48 RSI: ffffffff81ccbfc2 RDI: 00000000ffffffff [ 1101.319886] RBP: ffff880c04ebc41c R08: 0000000000000000 R09: 0000000000000379 [ 1101.319889] R10: 00000100696589d0 R11: 0000000000000378 R12: ffff880c04ebc000 [ 1101.319892] R13: 0000000000000000 R14: 0000000000000001 R15: ffff880c2865fc80 [ 1101.319896] FS: 0000000000000000(0000) GS:ffff8810a3c00000(0000) knlGS:0000000000000000 [ 1101.319899] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1101.319902] CR2: 00007f3ff24ac000 CR3: 0000000001e0a000 CR4: 00000000003406f0 [ 1101.319905] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1101.319908] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1101.319910] Call Trace: [ 1101.319914] <IRQ> [ 1101.319921] ? dev_graft_qdisc+0x70/0x70 [ 1101.319928] ? dev_graft_qdisc+0x70/0x70 [ 1101.319934] ? call_timer_fn+0x2e/0x170 [ 1101.319939] ? dev_graft_qdisc+0x70/0x70 [ 1101.319944] ? run_timer_softirq+0x1ea/0x440 [ 1101.319951] ? timerqueue_add+0x54/0x80 [ 1101.319956] ? enqueue_hrtimer+0x38/0xa0 [ 1101.319963] ? __do_softirq+0xed/0x2e7 [ 1101.319970] ? irq_exit+0xb4/0xc0 [ 1101.319976] ? smp_apic_timer_interrupt+0x39/0x50 [ 1101.319981] ? apic_timer_interrupt+0x8c/0xa0 [ 1101.319983] </IRQ> [ 1101.319992] ? cpuidle_enter_state+0xfa/0x2a0 [ 1101.319999] ? do_idle+0x1a3/0x1f0 [ 1101.320004] ? cpu_startup_entry+0x5f/0x70 [ 1101.320011] ? start_kernel+0x444/0x44c [ 1101.320017] ? early_idt_handler_array+0x120/0x120 [ 1101.320023] ? x86_64_start_kernel+0x145/0x154 [ 1101.320028] ? secondary_startup_64+0x9f/0x9f [ 1101.320033] Code: 20 04 00 00 eb 9f 4c 89 e7 c6 05 59 44 71 00 01 e8 a7 df fd ff 89 d9 4c 89 e6 48 c7 c7 70 b7 cd 81 48 89 c2 31 c0 e8 97 64 90 ff <0f> ff eb bf 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 [ 1101.320103] ---[ end trace 0cc4d251e2b57080 ]--- [ 1101.320110] ipheth 1-5:4.2: ipheth_tx_timeout: TX timeout The last message "TX timeout" is repeated every 5 seconds until trust is established or the device is disconnected, filling up dmesg. The proposed patch eliminates the problem by, upon connection, keeping the TX queue and carrier disabled until a packet is first received from the iOS device. This is reflected by the confirmed_pairing variable in the device structure. Only after at least one packet has been received from the iOS device, the transmit queue and carrier are brought up during the periodic device poll in ipheth_carrier_set. Because the iOS device will always send a packet immediately upon trust being established, this should not delay the interface becoming useable. To prevent failed UBRs in ipheth_rcvbulk_callback from perpetually re-enabling the queue if it was disabled, a new check is added so only successful transfers re-enable the queue, whereas failed transfers only trigger an immediate poll. This has the added benefit of removing the periodic control requests to the iOS device until trust has been established and thus should reduce wakeup events on both the host and the iOS device. Signed-off-by: Alexander Kappner <agk@godking.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15vhost_net: conditionally enable tx pollingJason Wang
We always poll tx for socket, this is sub optimal since this will slightly increase the waitqueue traversing time and more important, vhost could not benefit from commit 9e641bdcfa4e ("net-tun: restructure tun_do_read for better sleep/wakeup efficiency") even if we've stopped rx polling during handle_rx(), tx poll were still left in the waitqueue. Pktgen from a remote host to VM over mlx4 on two 2.00GHz Xeon E5-2650 shows 11.7% improvements on rx PPS. (from 1.28Mpps to 1.44Mpps) Cc: Wei Xu <wexu@redhat.com> Cc: Matthew Rosato <mjrosato@linux.vnet.ibm.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15uapi: fix linux/rxrpc.h userspace compilation errorsDmitry V. Levin
Consistently use types provided by <linux/types.h> to fix the following linux/rxrpc.h userspace compilation errors: /usr/include/linux/rxrpc.h:24:2: error: unknown type name 'u16' u16 srx_service; /* service desired */ /usr/include/linux/rxrpc.h:25:2: error: unknown type name 'u16' u16 transport_type; /* type of transport socket (SOCK_DGRAM) */ /usr/include/linux/rxrpc.h:26:2: error: unknown type name 'u16' u16 transport_len; /* length of transport address */ Use __kernel_sa_family_t instead of sa_family_t the same way as uapi/linux/in.h does, to fix the following linux/rxrpc.h userspace compilation errors: /usr/include/linux/rxrpc.h:23:2: error: unknown type name 'sa_family_t' sa_family_t srx_family; /* address family */ /usr/include/linux/rxrpc.h:28:3: error: unknown type name 'sa_family_t' sa_family_t family; /* transport address family */ Fixes: 727f8914477e ("rxrpc: Expose UAPI definitions to userspace") Cc: <stable@vger.kernel.org> # v4.14 Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-14dax: fix PMD faults on zero-length filesJeff Moyer
PMD faults on a zero length file on a file system mounted with -o dax will not generate SIGBUS as expected. fd = open(...O_TRUNC); addr = mmap(NULL, 2*1024*1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); *addr = 'a'; <expect SIGBUS> The problem is this code in dax_iomap_pmd_fault: max_pgoff = (i_size_read(inode) - 1) >> PAGE_SHIFT; If the inode size is zero, we end up with a max_pgoff that is way larger than 0. :) Fix it by using DIV_ROUND_UP, as is done elsewhere in the kernel. I tested this with some simple test code that ensured that SIGBUS was received where expected. Cc: <stable@vger.kernel.org> Fixes: 642261ac995e ("dax: add struct iomap based DAX PMD support") Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-11-15powerpc/64s: Fix Power9 DD2.0 workarounds by adding DD2.1 featureMichael Ellerman
Recently we added a CPU feature for Power9 DD2.0, to capture the fact that some workarounds are required only on Power9 DD1 and DD2.0 but not DD2.1 or later. Then in commit 9d2f510a66ec ("powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 ERAT workaround on DD2.1") and commit e3646330cf66 "powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 PMU workaround on DD2.1") we changed CPU_FTR_SECTIONs to check for DD1 or DD20, eg: BEGIN_FTR_SECTION PPC_INVALIDATE_ERAT END_FTR_SECTION_IFSET(CPU_FTR_POWER9_DD1 | CPU_FTR_POWER9_DD20) Unfortunately although this reads as "if set DD1 or DD2.0", the or is a bitwise or and actually generates a mask of both bits. The code that does the feature patching then checks that the value of the CPU features masked with that mask are equal to the mask. So the end result is we're checking for DD1 and DD20 being set, which never happens. Yes the API is terrible. Removing the ERAT workaround on DD2.0 results in random SEGVs, the system tends to boot, but things randomly die including sometimes dhclient, udev etc. To fix the problem and hopefully avoid it in future, we remove the DD2.0 CPU feature and instead add a DD2.1 (or later) feature. This allows us to easily express that the workarounds are required if DD2.1 is not set. At some point we will drop the DD1 workarounds entirely and some of this can be cleaned up. Fixes: 9d2f510a66ec ("powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 ERAT workaround on DD2.1") Fixes: e3646330cf66 ("powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 PMU workaround on DD2.1") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-14block, bfq: move debug blkio stats behind CONFIG_DEBUG_BLK_CGROUPLuca Miccio
BFQ currently creates, and updates, its own instance of the whole set of blkio statistics that cfq creates. Yet, from the comments of Tejun Heo in [1], it turned out that most of these statistics are meant/useful only for debugging. This commit makes BFQ create the latter, debugging statistics only if the option CONFIG_DEBUG_BLK_CGROUP is set. By doing so, this commit also enables BFQ to enjoy a high perfomance boost. The reason is that, if CONFIG_DEBUG_BLK_CGROUP is not set, then BFQ has to update far fewer statistics, and, in particular, not the heaviest to update. To give an idea of the benefits, if CONFIG_DEBUG_BLK_CGROUP is not set, then, on an Intel i7-4850HQ, and with 8 threads doing random I/O in parallel on null_blk (configured with 0 latency), the throughput of BFQ grows from 310 to 400 KIOPS (+30%). We have measured similar or even much higher boosts with other CPUs: e.g., +45% with an ARM CortexTM-A53 Octa-core. Our results have been obtained and can be reproduced very easily with the script in [1]. [1] https://www.spinics.net/lists/linux-block/msg18943.html Suggested-by: Tejun Heo <tj@kernel.org> Suggested-by: Ulf Hansson <ulf.hansson@linaro.org> Tested-by: Lee Tibbert <lee.tibbert@gmail.com> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: Luca Miccio <lucmiccio@gmail.com> Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-14block, bfq: update blkio stats outside the scheduler lockPaolo Valente
bfq invokes various blkg_*stats_* functions to update the statistics contained in the special files blkio.bfq.* in the blkio controller groups, i.e., the I/O accounting related to the proportional-share policy provided by bfq. The execution of these functions takes a considerable percentage, about 40%, of the total per-request execution time of bfq (i.e., of the sum of the execution time of all the bfq functions that have to be executed to process an I/O request from its creation to its destruction). This reduces the request-processing rate sustainable by bfq noticeably, even on a multicore CPU. In fact, the bfq functions that invoke blkg_*stats_* functions cannot be executed in parallel with the rest of the code of bfq, because both are executed under the same same per-device scheduler lock. To reduce this slowdown, this commit moves, wherever possible, the invocation of these functions (more precisely, of the bfq functions that invoke blkg_*stats_* functions) outside the critical sections protected by the scheduler lock. With this change, and with all blkio.bfq.* statistics enabled, the throughput grows, e.g., from 250 to 310 KIOPS (+25%) on an Intel i7-4850HQ, in case of 8 threads doing random I/O in parallel on null_blk, with the latter configured with 0 latency. We obtained the same or higher throughput boosts, up to +30%, with other processors (some figures are reported in the documentation). For our tests, we used the script [1], with which our results can be easily reproduced. NOTE. This commit still protects the invocation of blkg_*stats_* functions with the request_queue lock, because the group these functions are invoked on may otherwise disappear before or while these functions are executed. Fortunately, tests without even this lock show, by difference, that the serialization caused by this lock has a little impact (at most ~5% of throughput reduction). [1] https://github.com/Algodev-github/IOSpeed Tested-by: Lee Tibbert <lee.tibbert@gmail.com> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Luca Miccio <lucmiccio@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-14block, bfq: add missing invocations of bfqg_stats_update_io_add/removeLuca Miccio
bfqg_stats_update_io_add and bfqg_stats_update_io_remove are to be invoked, respectively, when an I/O request enters and when an I/O request exits the scheduler. Unfortunately, bfq does not fully comply with this scheme, because it does not invoke these functions for requests that are inserted into or extracted from its priority dispatch list. This commit fixes this mistake. Tested-by: Lee Tibbert <lee.tibbert@gmail.com> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Luca Miccio <lucmiccio@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-14doc, block, bfq: update max IOPS sustainable with BFQPaolo Valente
We have investigated more deeply the performance of BFQ, in terms of number of IOPS that can be processed by the CPU when BFQ is used as I/O scheduler. In more detail, using the script [1], we have measured the number of IOPS reached on top of a null block device configured with zero latency, as a function of the workload (sequential read, sequential write, random read, random write) and of the system (we considered desktops, laptops and embedded systems). Basing on the resulting figures, with this commit we update the current, conservative IOPS range reported in BFQ documentation. In particular, the documentation now reports, for each of three different systems, the lowest number of IOPS obtained for that system with the above test (namely, the value obtained with the workload leading to the lowest IOPS). [1] https://github.com/Algodev-github/IOSpeed Reviewed-by: Lee Tibbert <lee.tibbert@gmail.com> Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Luca Miccio <lucmiccio@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-14ide: Make ide_cdrom_prep_fs() initialize the sense buffer pointerBart Van Assche
The changes introduced through commit 82ed4db499b8 assume that the sense buffer pointer in struct scsi_request is initialized for all requests - passthrough and filesystem requests. Hence make sure that that pointer is initialized for filesystem requests. Remove the memset() call that clears .cmd because the scsi_req_init() call in ide_initialize_rq() already initializes the .cmd. Fixes: commit 82ed4db499b8 ("block: split scsi_request out of struct request") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hongxu Jia <hongxu.jia@windriver.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-14md: Convert timers to use timer_setup()Kees Cook
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Shaohua Li <shli@kernel.org> Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: dm-devel@redhat.com Cc: linux-bcache@vger.kernel.org Cc: linux-raid@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Michael Lyle <mlyle@lyle.org> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-14block: swim3: Convert timers to use timer_setup()Kees Cook
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Jens Axboe <axboe@kernel.dk> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-14block/aoe: Convert timers to use timer_setup()Kees Cook
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Jens Axboe <axboe@kernel.dk> Cc: "Ed L. Cashin" <ed.cashin@acm.org> Cc: linux-block@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-14amifloppy: Convert timers to use timer_setup()Kees Cook
This converts the amifloppy driver to pass the timer pointer to the callback instead of the drive number (and flags). It eliminates the decusagecounter flag, as it was unused, and drops the ininterrupt flag which appeared to be a needless optimization. The drive can then be calculated from the offset of the timer in the drive timer array. Additionally moves to a static data variable instead of the soon-to-be-gone timer->data field. Cc: Jens Axboe <axboe@kernel.dk> Cc: Krzysztof Halasa <khc@pm.waw.pl> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-14block/floppy: Convert callback to pass timer_listKees Cook
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to passing in the timer pointer explicitly. Calculate the drive from the offset of the timer in the timer list. Cc: Jiri Kosina <jikos@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Ming Lei <tom.leiming@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Geliang Tang <geliangtang@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-14Merge tag 'devicetree-for-4.15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull DeviceTree updates from Rob Herring: "A bigger diffstat than usual with the kbuild changes and a tree wide fix in the binding documentation. Summary: - kbuild cleanups and improvements for dtbs - Code clean-up of overlay code and fixing for some long standing memory leak and race condition in applying overlays - Improvements to DT memory usage making sysfs/kobjects optional and skipping unflattening of disabled nodes. This is part of kernel tinification efforts. - Final piece of removing storing the full path for every DT node. The prerequisite conversion of printk's to use device_node format specifier happened in 4.14. - Sync with current upstream dtc. This brings additional checks to dtb compiling. - Binding doc tree wide removal of leading 0s from examples - RTC binding documentation adding missing devices and some consolidation of duplicated bindings - Vendor prefix documentation for nutsboard, Silicon Storage Technology, shimafuji, Tecon Microprocessor Technologies, DH electronics GmbH, Opal Kelly, and Next Thing" * tag 'devicetree-for-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (55 commits) dt-bindings: usb: add #phy-cells to usb-nop-xceiv dt-bindings: Remove leading zeros from bindings notation kbuild: handle dtb-y and CONFIG_OF_ALL_DTBS natively in Makefile.lib MIPS: dts: remove bogus bcm96358nb4ser.dtb from dtb-y entry kbuild: clean up *.dtb and *.dtb.S patterns from top-level Makefile .gitignore: move *.dtb and *.dtb.S patterns to the top-level .gitignore .gitignore: sort normal pattern rules alphabetically dt-bindings: add vendor prefix for Next Thing Co. scripts/dtc: Update to upstream version v1.4.5-6-gc1e55a5513e9 of: dynamic: fix memory leak related to properties of __of_node_dup of: overlay: make pr_err() string unique of: overlay: pr_err from return NOTIFY_OK to overlay apply/remove of: overlay: remove unneeded check for NULL kbasename() of: overlay: remove a dependency on device node full_name of: overlay: simplify applying symbols from an overlay of: overlay: avoid race condition between applying multiple overlays of: overlay: loosen overly strict phandle clash check of: overlay: expand check of whether overlay changeset can be removed of: overlay: detect cases where device tree may become corrupt of: overlay: minor restructuring ...
2017-11-14Merge tag 'leds_for_4.15rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds Pull LED updates from Jacek Anaszewski: "New LED class driver: - add a driver for PC Engines APU/APU2 LEDs New LED trigger: - add a system activity LED trigger LED core improvements: - replace flags bit shift with BIT() macros Convert timers to use timer_setup() in: - led-core - ledtrig-activity - ledtrig-heartbeat - ledtrig-transient LED class drivers fixes: - lp55xx: fix spelling mistake: 'cound' -> 'could' - tca6507: Remove unnecessary reg check - pca955x: Don't invert requested value in pca955x_gpio_set_value() LED documentation improvements: - update 00-INDEX file" * tag 'leds_for_4.15rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds: leds: Add driver for PC Engines APU/APU2 LEDs leds: lp55xx: fix spelling mistake: 'cound' -> 'could' leds: Convert timers to use timer_setup() Documentation: leds: Update 00-INDEX file leds: tca6507: Remove unnecessary reg check leds: ledtrig-heartbeat: Convert timers to use timer_setup() leds: Replace flags bit shift with BIT() macros leds: pca955x: Don't invert requested value in pca955x_gpio_set_value() leds: ledtrig-activity: Add a system activity LED trigger