linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2014-03-06	cpufreq: Initialize governor for a new policy under policy->rwsem	Viresh Kumar
	policy->rwsem is used to lock access to all parts of code modifying struct cpufreq_policy, but it's not used on a new policy created by __cpufreq_add_dev(). Because of that, if cpufreq_update_policy() is called in a tight loop on one CPU in parallel with offline/online of another CPU, then the following crash can be triggered: Unable to handle kernel NULL pointer dereference at virtual address 00000020 pgd = c0003000 [00000020] pgd=80000000004003, pmd=00000000 Internal error: Oops: 206 [#1] PREEMPT SMP ARM PC is at __cpufreq_governor+0x10/0x1ac LR is at cpufreq_update_policy+0x114/0x150 ---[ end trace f23a8defea6cd706 ]--- Kernel panic - not syncing: Fatal exception CPU0: stopping CPU: 0 PID: 7136 Comm: mpdecision Tainted: G D W 3.10.0-gd727407-00074-g979ede8 #396 [<c0afe180>] (notifier_call_chain+0x40/0x68) from [<c02a23ac>] (__blocking_notifier_call_chain+0x40/0x58) [<c02a23ac>] (__blocking_notifier_call_chain+0x40/0x58) from [<c02a23d8>] (blocking_notifier_call_chain+0x14/0x1c) [<c02a23d8>] (blocking_notifier_call_chain+0x14/0x1c) from [<c0803c68>] (cpufreq_set_policy+0xd4/0x2b8) [<c0803c68>] (cpufreq_set_policy+0xd4/0x2b8) from [<c0803e7c>] (cpufreq_init_policy+0x30/0x98) [<c0803e7c>] (cpufreq_init_policy+0x30/0x98) from [<c0805a18>] (__cpufreq_add_dev.isra.17+0x4dc/0x7a4) [<c0805a18>] (__cpufreq_add_dev.isra.17+0x4dc/0x7a4) from [<c0805d38>] (cpufreq_cpu_callback+0x58/0x84) [<c0805d38>] (cpufreq_cpu_callback+0x58/0x84) from [<c0afe180>] (notifier_call_chain+0x40/0x68) [<c0afe180>] (notifier_call_chain+0x40/0x68) from [<c02812dc>] (__cpu_notify+0x28/0x44) [<c02812dc>] (__cpu_notify+0x28/0x44) from [<c0aeed90>] (_cpu_up+0xf4/0x1dc) [<c0aeed90>] (_cpu_up+0xf4/0x1dc) from [<c0aeeed4>] (cpu_up+0x5c/0x78) [<c0aeeed4>] (cpu_up+0x5c/0x78) from [<c0aec808>] (store_online+0x44/0x74) [<c0aec808>] (store_online+0x44/0x74) from [<c03a40f4>] (sysfs_write_file+0x108/0x14c) [<c03a40f4>] (sysfs_write_file+0x108/0x14c) from [<c03517d4>] (vfs_write+0xd0/0x180) [<c03517d4>] (vfs_write+0xd0/0x180) from [<c0351ca8>] (SyS_write+0x38/0x68) [<c0351ca8>] (SyS_write+0x38/0x68) from [<c0205de0>] (ret_fast_syscall+0x0/0x30) Fix that by taking locks at appropriate places in __cpufreq_add_dev() as well. Reported-by: Saravana Kannan <skannan@codeaurora.org> Suggested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [rjw: Changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-03-06	cpufreq: Initialize policy before making it available for others to use	Viresh Kumar
	Policy must be fully initialized before it is being made available for use by others. Otherwise cpufreq_cpu_get() would be able to grab a half initialized policy structure that might not have affected_cpus (for example) populated. Then, anybody accessing those fields will get a wrong value and that will lead to unpredictable results. In order to fix this, do all the necessary initialization before we make the policy structure available via cpufreq_cpu_get(). That will guarantee that any code accessing fields of the policy will get correct data from them. Reported-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [rjw: Changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-03-06	cpufreq: use cpufreq_cpu_get() to avoid cpufreq_get() race conditions	Aaron Plattner
	If a module calls cpufreq_get while cpufreq is initializing, it's possible for it to be called after cpufreq_driver is set but before cpufreq_cpu_data is written during subsys_interface_register. This happens because cpufreq_get doesn't take the cpufreq_driver_lock around its use of cpufreq_cpu_data. Fix this by using cpufreq_cpu_get(cpu) to look up the policy rather than reading it out of cpufreq_cpu_data directly. cpufreq_cpu_get() takes the appropriate locks to prevent this race from happening. Since it's possible for policy to be NULL if the caller passes in an invalid CPU number or calls the function before cpufreq is initialized, delete the BUG_ON(!policy) and simply return 0. Don't try to return -ENOENT because that's negative and the function returns an unsigned integer. References: https://bbs.archlinux.org/viewtopic.php?id=177934 Signed-off-by: Aaron Plattner <aplattner@nvidia.com> Cc: 3.13+ <stable@vger.kernel.org> # 3.13+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-03-06	drm/i915: Fix PSR programming	Ben Widawsky
	\| has a higher precedence than ?. Therefore, the calculation doesn't do at all what you would expect. Thanks to Ken for convincing me that this was indeed the issue. Send me back to C programmer school, please. I'm sort of surprised PSR was continuing to work for people. It should be broken IMO (and it was broken for me, but I had assumed it never worked). Regression from: commit ed8546ac1f99b850879f07b1e9b06b42fb0a36d9 Author: Ben Widawsky <benjamin.widawsky@intel.com> Date: Mon Nov 4 22:45:05 2013 -0800 drm/i915/bdw: Support eDP PSR Cc: Rodrigo Vivi <rodrigo.vivi@gmail.com> Cc: Kenneth Graunke <kenneth.w.graunke@intel.com> Cc: Art Runyan <arthur.j.runyan@intel.com> Reported-by: "Kumar, Kiran S" <kiran.s.kumar@intel.com> Cc: stable@vger.kernel.org [v3.13+] Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2014-03-06	clocksource: vf_pit_timer: use complement for sched_clock reading	Stefan Agner
	Vybrids PIT register is monitonic decreasing. However, sched_clock reading needs to be monitonic increasing. Use bitwise not to get the complement of the clock register. This fixes the clock going backward. Also, the clock now starts at 0 since we load the register with the maximum value at start. Signed-off-by: Stefan Agner <stefan@agner.ch> Acked-by: Shawn Guo <shawn.guo@linaro.org> Cc: daniel.lezcano@linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux@arm.linux.org.uk Link: http://lkml.kernel.org/r/d25af915993aec1b486be653eb86f748ddef54fe.1394057313.git.stefan@agner.ch Cc: stable@vger.kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-03-06	ARM: KVM: fix non-VGIC compilation	Marc Zyngier
	Add a stub for kvm_vgic_addr when compiling without CONFIG_KVM_ARM_VGIC. The usefulness of this configurarion is extremely doubtful, but let's fix it anyway (until we decide that we'll always support a VGIC). Reported-by: Michele Paolino <m.paolino@virtualopensystems.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-03-05	clk: shmobile: rcar-gen2: Use kick bit to allow Z clock frequency change	Benoit Cousson
	The Z clock frequency change is effective only after setting the kick bit located in the FRQCRB register. Without that, the CA15 CPUs clock rate will never change. Fix that by checking if the kick bit is cleared and enable it to make the clock rate change effective. The bit is cleared automatically upon completion. Signed-off-by: Benoit Cousson <bcousson+renesas@baylibre.com> Acked-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Mike Turquette <mturquette@linaro.org>
2014-03-05	hyperv: Move state setting for link query	Haiyang Zhang
	It moves the state setting for query into rndis_filter_receive_response(). All callbacks including query-complete and status-callback are synchronized by channel->inbound_lock. This prevents pentential race between them. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-05	net: macb: DMA-unmap full rx-buffer	Soren Brinkmann
	When allocating RX buffers a fixed size is used, while freeing is based on actually received bytes, resulting in the following kernel warning when CONFIG_DMA_API_DEBUG is enabled: WARNING: CPU: 0 PID: 0 at lib/dma-debug.c:1051 check_unmap+0x258/0x894() macb e000b000.ethernet: DMA-API: device driver frees DMA memory with different size [device address=0x000000002d170040] [map size=1536 bytes] [unmap size=60 bytes] Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.0-rc3-xilinx-00220-g49f84081ce4f #65 [<c001516c>] (unwind_backtrace) from [<c0011df8>] (show_stack+0x10/0x14) [<c0011df8>] (show_stack) from [<c03c775c>] (dump_stack+0x7c/0xc8) [<c03c775c>] (dump_stack) from [<c00245cc>] (warn_slowpath_common+0x60/0x84) [<c00245cc>] (warn_slowpath_common) from [<c0024670>] (warn_slowpath_fmt+0x2c/0x3c) [<c0024670>] (warn_slowpath_fmt) from [<c0227d44>] (check_unmap+0x258/0x894) [<c0227d44>] (check_unmap) from [<c0228588>] (debug_dma_unmap_page+0x64/0x70) [<c0228588>] (debug_dma_unmap_page) from [<c02ab78c>] (gem_rx+0x118/0x170) [<c02ab78c>] (gem_rx) from [<c02ac4d4>] (macb_poll+0x24/0x94) [<c02ac4d4>] (macb_poll) from [<c031222c>] (net_rx_action+0x6c/0x188) [<c031222c>] (net_rx_action) from [<c0028a28>] (__do_softirq+0x108/0x280) [<c0028a28>] (__do_softirq) from [<c0028e8c>] (irq_exit+0x84/0xf8) [<c0028e8c>] (irq_exit) from [<c000f360>] (handle_IRQ+0x68/0x8c) [<c000f360>] (handle_IRQ) from [<c0008528>] (gic_handle_irq+0x3c/0x60) [<c0008528>] (gic_handle_irq) from [<c0012904>] (__irq_svc+0x44/0x78) Exception stack(0xc056df20 to 0xc056df68) df20: 00000001 c0577430 00000000 c0577430 04ce8e0d 00000002 edfce238 00000000 df40: 04e20f78 00000002 c05981f4 00000000 00000008 c056df68 c0064008 c02d7658 df60: 20000013 ffffffff [<c0012904>] (__irq_svc) from [<c02d7658>] (cpuidle_enter_state+0x54/0xf8) [<c02d7658>] (cpuidle_enter_state) from [<c02d77dc>] (cpuidle_idle_call+0xe0/0x138) [<c02d77dc>] (cpuidle_idle_call) from [<c000f660>] (arch_cpu_idle+0x8/0x3c) [<c000f660>] (arch_cpu_idle) from [<c006bec4>] (cpu_startup_entry+0xbc/0x124) [<c006bec4>] (cpu_startup_entry) from [<c053daec>] (start_kernel+0x350/0x3b0) ---[ end trace d5fdc38641bd3a11 ]--- Mapped at: [<c0227184>] debug_dma_map_page+0x48/0x11c [<c02ab32c>] gem_rx_refill+0x154/0x1f8 [<c02ac7b4>] macb_open+0x270/0x3e0 [<c03152e0>] __dev_open+0x7c/0xfc [<c031554c>] __dev_change_flags+0x8c/0x140 Fixing this by passing the same size which is passed during mapping the memory to the unmap function as well. Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com> Reviewed-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-05	net: macb: Check DMA mappings for error	Soren Brinkmann
	With CONFIG_DMA_API_DEBUG enabled the following warning is printed: WARNING: CPU: 0 PID: 619 at lib/dma-debug.c:1101 check_unmap+0x758/0x894() macb e000b000.ethernet: DMA-API: device driver failed to check map error[device address=0x000000002d171c02] [size=322 bytes] [mapped as single] Modules linked in: CPU: 0 PID: 619 Comm: udhcpc Not tainted 3.14.0-rc3-xilinx-00219-gd158fc7f36a2 #63 [<c001516c>] (unwind_backtrace) from [<c0011df8>] (show_stack+0x10/0x14) [<c0011df8>] (show_stack) from [<c03c7714>] (dump_stack+0x7c/0xc8) [<c03c7714>] (dump_stack) from [<c00245cc>] (warn_slowpath_common+0x60/0x84) [<c00245cc>] (warn_slowpath_common) from [<c0024670>] (warn_slowpath_fmt+0x2c/0x3c) [<c0024670>] (warn_slowpath_fmt) from [<c0228244>] (check_unmap+0x758/0x894) [<c0228244>] (check_unmap) from [<c0228588>] (debug_dma_unmap_page+0x64/0x70) [<c0228588>] (debug_dma_unmap_page) from [<c02aba64>] (macb_interrupt+0x1f8/0x2dc) [<c02aba64>] (macb_interrupt) from [<c006c6e4>] (handle_irq_event_percpu+0x2c/0x178) [<c006c6e4>] (handle_irq_event_percpu) from [<c006c86c>] (handle_irq_event+0x3c/0x5c) [<c006c86c>] (handle_irq_event) from [<c006f548>] (handle_fasteoi_irq+0xb8/0x100) [<c006f548>] (handle_fasteoi_irq) from [<c006c148>] (generic_handle_irq+0x20/0x30) [<c006c148>] (generic_handle_irq) from [<c000f35c>] (handle_IRQ+0x64/0x8c) [<c000f35c>] (handle_IRQ) from [<c0008528>] (gic_handle_irq+0x3c/0x60) [<c0008528>] (gic_handle_irq) from [<c0012904>] (__irq_svc+0x44/0x78) Exception stack(0xed197f60 to 0xed197fa8) 7f60: 00000134 60000013 bd94362e bd94362e be96b37c 00000014 fffffd72 00000122 7f80: c000ebe4 ed196000 00000000 00000011 c032c0d8 ed197fa8 c0064008 c000ea20 7fa0: 60000013 ffffffff [<c0012904>] (__irq_svc) from [<c000ea20>] (ret_fast_syscall+0x0/0x48) ---[ end trace 478f921d0d542d1e ]--- Mapped at: [<c0227184>] debug_dma_map_page+0x48/0x11c [<c02aaca0>] macb_start_xmit+0x184/0x2a8 [<c03143c0>] dev_hard_start_xmit+0x334/0x470 [<c032c09c>] sch_direct_xmit+0x78/0x2f8 [<c0314814>] __dev_queue_xmit+0x318/0x708 due to missing checks of the dma mapping. Add the appropriate checks to fix this. Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com> Reviewed-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-05	net: sctp: fix skb leakage in COOKIE ECHO path of chunk->auth_chunk	Daniel Borkmann
	While working on ec0223ec48a9 ("net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable"), we noticed that there's a skb memory leakage in the error path. Running the same reproducer as in ec0223ec48a9 and by unconditionally jumping to the error label (to simulate an error condition) in sctp_sf_do_5_1D_ce() receive path lets kmemleak detector bark about the unfreed chunk->auth_chunk skb clone: Unreferenced object 0xffff8800b8f3a000 (size 256): comm "softirq", pid 0, jiffies 4294769856 (age 110.757s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 89 ab 75 5e d4 01 58 13 00 00 00 00 00 00 00 00 ..u^..X......... backtrace: [<ffffffff816660be>] kmemleak_alloc+0x4e/0xb0 [<ffffffff8119f328>] kmem_cache_alloc+0xc8/0x210 [<ffffffff81566929>] skb_clone+0x49/0xb0 [<ffffffffa0467459>] sctp_endpoint_bh_rcv+0x1d9/0x230 [sctp] [<ffffffffa046fdbc>] sctp_inq_push+0x4c/0x70 [sctp] [<ffffffffa047e8de>] sctp_rcv+0x82e/0x9a0 [sctp] [<ffffffff815abd38>] ip_local_deliver_finish+0xa8/0x210 [<ffffffff815a64af>] nf_reinject+0xbf/0x180 [<ffffffffa04b4762>] nfqnl_recv_verdict+0x1d2/0x2b0 [nfnetlink_queue] [<ffffffffa04aa40b>] nfnetlink_rcv_msg+0x14b/0x250 [nfnetlink] [<ffffffff815a3269>] netlink_rcv_skb+0xa9/0xc0 [<ffffffffa04aa7cf>] nfnetlink_rcv+0x23f/0x408 [nfnetlink] [<ffffffff815a2bd8>] netlink_unicast+0x168/0x250 [<ffffffff815a2fa1>] netlink_sendmsg+0x2e1/0x3f0 [<ffffffff8155cc6b>] sock_sendmsg+0x8b/0xc0 [<ffffffff8155d449>] ___sys_sendmsg+0x369/0x380 What happens is that commit bbd0d59809f9 clones the skb containing the AUTH chunk in sctp_endpoint_bh_rcv() when having the edge case that an endpoint requires COOKIE-ECHO chunks to be authenticated: ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ----------> <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] --------- ------------------ AUTH; COOKIE-ECHO ----------------> <-------------------- COOKIE-ACK --------------------- When we enter sctp_sf_do_5_1D_ce() and before we actually get to the point where we process (and subsequently free) a non-NULL chunk->auth_chunk, we could hit the "goto nomem_init" path from an error condition and thus leave the cloned skb around w/o freeing it. The fix is to centrally free such clones in sctp_chunk_destroy() handler that is invoked from sctp_chunk_free() after all refs have dropped; and also move both kfree_skb(chunk->auth_chunk) there, so that chunk->auth_chunk is either NULL (since sctp_chunkify() allocs new chunks through kmem_cache_zalloc()) or non-NULL with a valid skb pointer. chunk->skb and chunk->auth_chunk are the only skbs in the sctp_chunk structure that need to be handeled. While at it, we should use consume_skb() for both. It is the same as dev_kfree_skb() but more appropriately named as we are not a device but a protocol. Also, this effectively replaces the kfree_skb() from both invocations into consume_skb(). Functions are the same only that kfree_skb() assumes that the frame was being dropped after a failure (e.g. for tools like drop monitor), usage of consume_skb() seems more appropriate in function sctp_chunk_destroy() though. Fixes: bbd0d59809f9 ("[SCTP]: Implement the receive and verification of AUTH chunk") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Vlad Yasevich <yasevich@gmail.com> Cc: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-05	r8152: disable the ECM mode	hayeswang
	There are known issues for switching the drivers between ECM mode and vendor mode. The interrup transfer may become abnormal. The hardware may have the opportunity to die if you change the configuration without unloading the current driver first, because all the control transfers of the current driver would fail after the command of switching the configuration. Although to use the ecm driver and vendor driver independently is fine, it may have problems to change the driver from one to the other by switching the configuration. Additionally, now the vendor mode driver is more powerful than the ECM driver. Thus, disable the ECM mode driver, and let r8152 to set the configuration to vendor mode and reset the device automatically. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-05	net/mlx4: Support shutdown() interface	Gavin Shan
	In kexec scenario, we failed to load the mlx4 driver in the second kernel because the ownership bit was hold by the first kernel without release correctly. The patch adds shutdown() interface so that the ownership can be released correctly in the first kernel. It also helps avoiding EEH error happened during boot stage of the second kernel because of undesired traffic, which can't be handled by hardware during that stage on Power platform. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Tested-by: Wei Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-05	bridge: multicast: add sanity check for query source addresses	Linus Lüssing
	MLD queries are supposed to have an IPv6 link-local source address according to RFC2710, section 4 and RFC3810, section 5.1.14. This patch adds a sanity check to ignore such broken MLD queries. Without this check, such malformed MLD queries can result in a denial of service: The queries are ignored by any MLD listener therefore they will not respond with an MLD report. However, without this patch these malformed MLD queries would enable the snooping part in the bridge code, potentially shutting down the according ports towards these hosts for multicast traffic as the bridge did not learn about these listeners. Reported-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Linus Lüssing <linus.luessing@web.de> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-05	net: fix for a race condition in the inet frag code	Nikolay Aleksandrov
	I stumbled upon this very serious bug while hunting for another one, it's a very subtle race condition between inet_frag_evictor, inet_frag_intern and the IPv4/6 frag_queue and expire functions (basically the users of inet_frag_kill/inet_frag_put). What happens is that after a fragment has been added to the hash chain but before it's been added to the lru_list (inet_frag_lru_add) in inet_frag_intern, it may get deleted (either by an expired timer if the system load is high or the timer sufficiently low, or by the fraq_queue function for different reasons) before it's added to the lru_list, then after it gets added it's a matter of time for the evictor to get to a piece of memory which has been freed leading to a number of different bugs depending on what's left there. I've been able to trigger this on both IPv4 and IPv6 (which is normal as the frag code is the same), but it's been much more difficult to trigger on IPv4 due to the protocol differences about how fragments are treated. The setup I used to reproduce this is: 2 machines with 4 x 10G bonded in a RR bond, so the same flow can be seen on multiple cards at the same time. Then I used multiple instances of ping/ping6 to generate fragmented packets and flood the machines with them while running other processes to load the attacked machine. *It is very important to have the _same flow_ coming in on multiple CPUs concurrently. Usually the attacked machine would die in less than 30 minutes, if configured properly to have many evictor calls and timeouts it could happen in 10 minutes or so. An important point to make is that any caller (frag_queue or timer) of inet_frag_kill will remove both the timer refcount and the original/guarding refcount thus removing everything that's keeping the frag from being freed at the next inet_frag_put. All of this could happen before the frag was ever added to the LRU list, then it gets added and the evictor uses a freed fragment. An example for IPv6 would be if a fragment is being added and is at the stage of being inserted in the hash after the hash lock is released, but before inet_frag_lru_add executes (or is able to obtain the lru lock) another overlapping fragment for the same flow arrives at a different CPU which finds it in the hash, but since it's overlapping it drops it invoking inet_frag_kill and thus removing all guarding refcounts, and afterwards freeing it by invoking inet_frag_put which removes the last refcount added previously by inet_frag_find, then inet_frag_lru_add gets executed by inet_frag_intern and we have a freed fragment in the lru_list. The fix is simple, just move the lru_add under the hash chain locked region so when a removing function is called it'll have to wait for the fragment to be added to the lru_list, and then it'll remove it (it works because the hash chain removal is done before the lru_list one and there's no window between the two list adds when the frag can get dropped). With this fix applied I couldn't kill the same machine in 24 hours with the same setup. Fixes: 3ef0eb0db4bf ("net: frag, move LRU list maintenance outside of rwlock") CC: Florian Westphal <fw@strlen.de> CC: Jesper Dangaard Brouer <brouer@redhat.com> CC: David S. Miller <davem@davemloft.net> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-05	drm/i915: Unify CHICKEN_PIPESL_1 register definitions	Ville Syrjälä
	We have two names for the same register CHICKEN_PIPESL_1 and HSW_PIPE_SLICE_CHICKEN_1. Unify it to just one. Also rename the FBCQ disable bit to resemble the name we've given to a similar bit on earlier platforms. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Use RMW to update chicken bits in gen7_enable_fbc()	Ville Syrjälä
	gen7_enable_fbc() may write to some registers which we've already touched, so use RMW so that we don't undo any previous updates. Also note that we implemnt WaFbcAsynchFlipDisableFbcQueue:bdw. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Don't clobber CHICKEN_PIPESL_1 on BDW	Ville Syrjälä
	Misplaced parens cause us to totally clobber the CHICKEN_PIPESL_1 registers with 0xffffffff. Move the parens to the correct place to avoid this. In particular this caused bit 30 of said registers to be set, which caused the sprite CSC to produce incorrect results. Cc: stable@vger.kernel.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72220 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: reverse dp link param selection, prefer fast over wide again	Daniel Vetter
	... it's this time of the year again. Originally we've frobbed this to fix up some regressions, but maybe our DP code improved sufficiently now that we can dare to do again what the spec recommends. This reverts commit 2514bc510d0c3aadcc5204056bb440fa36845147 Author: Jesse Barnes <jbarnes@virtuousgeek.org> Date: Thu Jun 21 15:13:50 2012 -0700 drm/i915: prefer wide & slow to fast & narrow in DP configs I'm pretty sure I'll regret this patch, but otoh I expect we won't make progress here without poking the devil occasionally. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73694 Cc: peter@colberg.org Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Tested-by: Itai BEN YAACOV <candeb@free.fr> Tested-by: David En <d.engraf@arcor.de> Reported-and-Tested-by: Marcus Bergner <marcusbergner@gmail.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: No need to put forcewake after a reset	Mika Kuoppala
	As we now have intel_uncore_forcewake_reset() no need to do explicit put after reset. v2: rebase Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Fix i915_switch_context() argument name in kerneldoc	Damien Lespiau
	While reading some code, out of boredom, stumbled on a tiny tiny fix. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Remove unused to_gem_object() macro	Damien Lespiau
	That macro was only ever used to convert ring->private into a gem object (hence the forceful cast). ring->private doesn't even exist anymore as it was transmogrified by Chris in: commit 0d1aacac36530fce058d7a0db3da7befd5765417 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Aug 26 20:58:11 2013 +0100 drm/i915: Embed the ring->private within the struct intel_ring_buffer Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Make i915_gem_retire_requests_ring() static	Damien Lespiau
	Its last usage outside of i915_gem.c was removed in: commit 1f70999f9052f5a1b0ce1a55aff3808f2ec9fe42 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Jan 27 22:43:07 2014 +0000 drm/i915: Prevent recursion by retiring requests when the ring is full Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Don't just say it, actually force edp vdd	Patrik Jakobsson
	This patch fixes the blank screen bug introduced in 3.14-rc1 on the MacBook Air 6,2. The comments state that we need to force edp vdd so lets put it back. The regression was introduced by the following commit: commit dff392dbd258381a6c3164f38420593f2d291e3b Author: Paulo Zanoni <paulo.r.zanoni@intel.com> Date: Fri Dec 6 17:32:41 2013 -0200 drm/i915: don't touch the VDD when disabling the panel v2: Wrap intel_disable_dp() with _vdd_on and _vdd_off Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74628 Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Make num_sprites a per-pipe value	Damien Lespiau
	In the future, we need to be able to specify per-pipe number of planes/sprites. Let's start today! Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Add a for_each_sprite() macro	Damien Lespiau
	This macro is similar to for_each_pipe() we already have. Convert the two call sites we have at the same time. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Replace a few for_each_pipe(i) by for_each_pipe(pipe)	Damien Lespiau
	Consistency throughout the code base is good and remove some room for mistakes (as explained in the "drm/i915: Use a pipe variable to cycle through the pipes" commit) So, let's replace the for_each_pipe(i) occurences by for_each_pipe(pipe) when it's reasonable and practical to do so (eg. when there isn't another pipe variable already). Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Don't declare unnecessary shadowing variable	Damien Lespiau
	'i' is already defined in the function scope and used elsewhere. Let's use it instead. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Use a pipe variable to cycle through the pipes	Damien Lespiau
	I recently fumbled a patch because I wrote twice num_sprites[i], and it was the right thing to do in only 50% of the cases. This patch ensures I need to write num_sprites[pipe], ie it should be self-documented that it's per-pipe number of sprites without having to look at what is 'i' this time around. It's all a lame excuse, but it does make it harder to redo the same mistake. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: We implement WaDisableAsyncFlipPerfMode:bdw	Ville Syrjälä
	Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Implement WaDisableSDEUnitClockGating:bdw	Ville Syrjälä
	Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Disable semaphore wait event idle message on BDW	Ville Syrjälä
	According to BSpec we need to always set this magic bit in ring buffer mode. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Use DIV_ROUND_UP() when calculating number of required FDI lanes	Ville Syrjälä
	If we need precisely N lanes to satisfy the FDI bandwidth requirement, the code would still claim that we need N+1 lanes. Use DIV_ROUND_UP() to get a more accurate answer. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Fix DDI port_clock for VGA output	Ville Syrjälä
	On DDI there's no PLL as such to generate the pixel clock for VGA. Instead we derive the pixel clock from the FDI link frequency. So to make .compute_config match what .get_config does, we need to set the port_clock based on the FDI link frequency. Note that we don't even check the port_clock when selecting the PLL for VGA output. We just assume SPLL at 1.35GHz is what we want, and that does match with the asumption of FDI frequency of 2.7Ghz we have in intel_fdi_link_freq(). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74955 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Don't access fifodbg registers on gen8	Mika Kuoppala
	as they don't exists. v2: rename gen6__mt_ to gen7__mt_ as they never get called with gen6 (Chris) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Do forcewake reset on gen8	Mika Kuoppala
	When we get control from BIOS there might be mt forcewake bits already set. This causes us to do double mt get without proper clear/ack sequence. Fix this by clearing mt forcewake register on init, like we do with older gens. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: don't flood the logs about bdw semaphores	Jani Nikula
	BDW is no longer flagged as preliminary hw, but without i915.preliminary_hw_support module param set the logs are filled with WARNs about it. Just make semaphores off the BDW per-chip default for now. CC: Ben Widawsky <ben@bwidawsk.net> Reported-by: Sebastien Dufour <sebastien.dufour@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Add thread stall DOP clock gating workaround on Broadwell.	Kenneth Graunke
	Ben and I believe this will be necessary on production hardware. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> [danvet: Shuffle lines to group all ROW_CHICKEN writes and add a cautious comment that this might not be needed on production hw.] Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Add a partial instruction shootdown workaround on Broadwell.	Kenneth Graunke
	I believe this will be necessary on production hardware. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Fix whitespace fail spotted by checkpatch. Also add missing :bdw w/a tag that Ville spotted.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Add suspend count to error state	Mika Kuoppala
	For example if we get bug reports with similar error states and suspend count is always 1, that might lead the Sherlocks to right general direction. Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Add reset count to error state	Mika Kuoppala
	By default we keep only the error state from first hang. However some sneaky user might have cleared the first error state and we assume mistakenly that it is from first hang. As sometimes this matters, it is better to explicitly store the reset count. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Add reason for capture in error state	Mika Kuoppala
	We capture error state not only when the GPU hangs but also on other situations as in interrupt errors and in situations where we can kick things forward without GPU reset. There will be log entry on most of these cases. But as error state capture might be only thing we have, if dmesg was not captured. Or as in GEN4 case, interrupt error can trigger error state capture without log entry, the exact reason why capture was made is hard to decipher. v2: Split out the the error code stuff to separate patch (Ben) References: https://bugs.freedesktop.org/show_bug.cgi?id=74193 Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Add error code into error state	Mika Kuoppala
	commit 011cf577b2531dfbd2254bd9ec147ad71471abaf Author: Ben Widawsky <benjamin.widawsky@intel.com> Date: Tue Feb 4 12:18:55 2014 +0000 drm/i915: Generate a hang error code added error code debug into dmesg. Store this also with error state to make matching dmesg logs and error states easier. As we need to have full ring state for error code generation, do full capture always, print hang message into log and then decide if we need to keep the error state. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Record pid/comm of hanging task	Chris Wilson
	After finding the guilty batch and request, we can use it to find the process that submitted the batch and then add the culprit into the error state. This is a slightly different approach from Ben's in that instead of adding the extra information into the struct i915_hw_context, we use the information already captured in struct drm_file which is then referenced from the request. v2: Also capture the workaround buffer for gen2, so that we can compare its contents against the intended batch for the active request. v3: Rebase (Mika) v4: Check for null context (Chris) checkpatch warnings fixed Link: http://lists.freedesktop.org/archives/intel-gfx/2013-August/032280.html Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v2) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> (v4) Acked-by: Ben Widawsky <ben@bwidawsk.net> Cc: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Rely on accurate request tracking for finding hung batches	Chris Wilson
	In the past, it was possible to have multiple batches per request due to a stray signal or ENOMEM. As a result we had to scan each active object (filtered by those having the COMMAND domain) for the one that contained the ACTHD pointer. This was then made more complicated by the introduction of ppgtt, whereby ACTHD then pointed into the address space of the context and so also needed to be taken into account. This is a fairly robust approach (though the implementation is a little fragile and depends upon the per-generation setup, registers and parameters). However, due to the requirements for hangstats, we needed a robust method for associating batches with a particular request and having that we can rely upon it for finding the associated batch object for error capture. If the batch buffer tracking is not robust enough, that should become apparent quite quickly through an erroneous error capture. That should also help to make sure that the runtime reporting to userspace is robust. It also means that we then report the oldest incomplete batch on each ring, which can be useful for determining the state of userspace at the time of a hang. v2: Use i915_gem_find_active_request (Mika) v3: remove check for ring->get_seqno, split long lines (Ben) v4: check that context is available (Chris) checkpatch warnings fixed Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> (v3) Cc: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v3) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Reset vma->mm_list after unbinding	Chris Wilson
	In place of true activity counting, we walk the list of vma associated with an object managing each on the vm's active/inactive list everytime we call move-to-inactive. This depends upon the vma->mm_list being cleared after unbinding, or else we run into difficulty when tracking the object in multiple vm's - we see a use-after free and corruption of the mm_list. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Streamline VLV forcewake handling	Ville Syrjälä
	It occured to me that when we're trying to wake up both render and media wells on VLV, we might end up calling the low level force_wake_get/put two times even though one call would be enough. Make that happen by figuring out which wells really need to be woken up based on the forcewake counts. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by:Deepak S <deepak.s@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Drop the forcewake count inc/dec around register read on VLV	Ville Syrjälä
	VLV is the only platform where we increment/decrement the forcewake count around register access. Drop the inc/dec on VLV to make the forcewake code a bit more unified. The inc/dec are not necessary since we hold the uncore lock around the whole operation. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Deepak S <deepak.s@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Fix VLV forcewake after reset	Ville Syrjälä
	Use the render/media specific forcewake counts to properly restore the forcewake status after a GPU reset on VLV. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Deepak S <deepak.s@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05	drm/i915: Perform pageflip using mmio if the GPU is terminally wedged	Chris Wilson
	After a hang and failed reset, we cannot use the GPU to execute the page flip instructions. Instead we can force a synchronous mmio flip. (Later, we can reduce the synchronicity of the mmio flip by moving some of the delays off to a worker, like the current page flip code; see vblank tasks.) References: https://bugs.freedesktop.org/show_bug.cgi?id=72631 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>