linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2018-09-28	rxrpc: Fix transport sockopts to get IPv4 errors on an IPv6 socket	David Howells
	It seems that enabling IPV6_RECVERR on an IPv6 socket doesn't also turn on IP_RECVERR, so neither local errors nor ICMP-transported remote errors from IPv4 peer addresses are returned to the AF_RXRPC protocol. Make the sockopt setting code in rxrpc_open_socket() fall through from the AF_INET6 case to the AF_INET case to turn on all the AF_INET options too in the AF_INET6 case. Fixes: f2aeed3a591f ("rxrpc: Fix error reception on AF_INET6 sockets") Signed-off-by: David Howells <dhowells@redhat.com>
2018-09-28	rxrpc: Make service call handling more robust	David Howells
	Make the following changes to improve the robustness of the code that sets up a new service call: (1) Cache the rxrpc_sock struct obtained in rxrpc_data_ready() to do a service ID check and pass that along to rxrpc_new_incoming_call(). This means that I can remove the check from rxrpc_new_incoming_call() without the need to worry about the socket attached to the local endpoint getting replaced - which would invalidate the check. (2) Cache the rxrpc_peer struct, thereby allowing the peer search to be done once. The peer is passed to rxrpc_new_incoming_call(), thereby saving the need to repeat the search. This also reduces the possibility of rxrpc_publish_service_conn() BUG()'ing due to the detection of a duplicate connection, despite the initial search done by rxrpc_find_connection_rcu() having turned up nothing. This BUG() shouldn't ever get hit since rxrpc_data_ready() should be non-reentrant and the result of the initial search should still hold true, but it has proven possible to hit. I think this may be due to __rxrpc_lookup_peer_rcu() cutting short the iteration over the hash table if it finds a matching peer with a zero usage count, but I don't know for sure since it's only ever been hit once that I know of. Another possibility is that a bug in rxrpc_data_ready() that checked the wrong byte in the header for the RXRPC_CLIENT_INITIATED flag might've let through a packet that caused a spurious and invalid call to be set up. That is addressed in another patch. (3) Fix __rxrpc_lookup_peer_rcu() to skip peer records that have a zero usage count rather than stopping and returning not found, just in case there's another peer record behind it in the bucket. (4) Don't search the peer records in rxrpc_alloc_incoming_call(), but rather either use the peer cached in (2) or, if one wasn't found, preemptively install a new one. Fixes: 8496af50eb38 ("rxrpc: Use RCU to access a peer's service connection tree") Signed-off-by: David Howells <dhowells@redhat.com>
2018-09-28	rxrpc: Improve up-front incoming packet checking	David Howells
	Do more up-front checking on incoming packets to weed out invalid ones and also ones aimed at services that we don't support. Whilst we're at it, replace the clearing of call and skew if we don't find a connection with just initialising the variables to zero at the top of the function. Signed-off-by: David Howells <dhowells@redhat.com>
2018-09-28	rxrpc: Emit BUSY packets when supposed to rather than ABORTs	David Howells
	In the input path, a received sk_buff can be marked for rejection by setting RXRPC_SKB_MARK_* in skb->mark and, if needed, some auxiliary data (such as an abort code) in skb->priority. The rejection is handled by queueing the sk_buff up for dealing with in process context. The output code reads the mark and priority and, theoretically, generates an appropriate response packet. However, if RXRPC_SKB_MARK_BUSY is set, this isn't noticed and an ABORT message with a random abort code is generated (since skb->priority wasn't set to anything). Fix this by outputting the appropriate sort of packet. Also, whilst we're at it, most of the marks are no longer used, so remove them and rename the remaining two to something more obvious. Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: David Howells <dhowells@redhat.com>
2018-09-28	rxrpc: Fix RTT gathering	David Howells
	Fix RTT information gathering in AF_RXRPC by the following means: (1) Enable Rx timestamping on the transport socket with SO_TIMESTAMPNS. (2) If the sk_buff doesn't have a timestamp set when rxrpc_data_ready() collects it, set it at that point. (3) Allow ACKs to be requested on the last packet of a client call, but not a service call. We need to be careful lest we undo: bf7d620abf22c321208a4da4f435e7af52551a21 Author: David Howells <dhowells@redhat.com> Date: Thu Oct 6 08:11:51 2016 +0100 rxrpc: Don't request an ACK on the last DATA packet of a call's Tx phase but that only really applies to service calls that we're handling, since the client side gets to send the final ACK (or not). (4) When about to transmit an ACK or DATA packet, record the Tx timestamp before only; don't update the timestamp afterwards. (5) Switch the ordering between recording the serial and recording the timestamp to always set the serial number first. The serial number shouldn't be seen referenced by an ACK packet until we've transmitted the packet bearing it - so in the Rx path, we don't need the timestamp until we've checked the serial number. Fixes: cf1a6474f807 ("rxrpc: Add per-peer RTT tracker") Signed-off-by: David Howells <dhowells@redhat.com>
2018-09-28	rxrpc: Fix checks as to whether we should set up a new call	David Howells
	There's a check in rxrpc_data_ready() that's checking the CLIENT_INITIATED flag in the packet type field rather than in the packet flags field. Fix this by creating a pair of helper functions to check whether the packet is going to the client or to the server and use them generally. Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: David Howells <dhowells@redhat.com>
2018-09-28	selftests/powerpc: Fix Makefiles for headers_install change	Michael Ellerman
	Commit b2d35fa5fc80 ("selftests: add headers_install to lib.mk") introduced a requirement that Makefiles more than one level below the selftests directory need to define top_srcdir, but it didn't update any of the powerpc Makefiles. This broke building all the powerpc selftests with eg: make[1]: Entering directory '/src/linux/tools/testing/selftests/powerpc' BUILD_TARGET=/src/linux/tools/testing/selftests/powerpc/alignment; mkdir -p $BUILD_TARGET; make OUTPUT=$BUILD_TARGET -k -C alignment all make[2]: Entering directory '/src/linux/tools/testing/selftests/powerpc/alignment' ../../lib.mk:20: ../../../../scripts/subarch.include: No such file or directory make[2]: *** No rule to make target '../../../../scripts/subarch.include'. make[2]: Failed to remake makefile '../../../../scripts/subarch.include'. Makefile:38: recipe for target 'alignment' failed Fix it by setting top_srcdir in the affected Makefiles. Fixes: b2d35fa5fc80 ("selftests: add headers_install to lib.mk") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-09-28	crypto: qat - Fix KASAN stack-out-of-bounds bug in adf_probe()	Waiman Long
	The following KASAN warning was printed when booting a 64-bit kernel on some systems with Intel CPUs: [ 44.512826] ================================================================== [ 44.520165] BUG: KASAN: stack-out-of-bounds in find_first_bit+0xb0/0xc0 [ 44.526786] Read of size 8 at addr ffff88041e02fc50 by task kworker/0:2/124 [ 44.535253] CPU: 0 PID: 124 Comm: kworker/0:2 Tainted: G X --------- --- 4.18.0-12.el8.x86_64+debug #1 [ 44.545858] Hardware name: Intel Corporation PURLEY/PURLEY, BIOS BKVDTRL1.86B.0005.D08.1712070559 12/07/2017 [ 44.555682] Workqueue: events work_for_cpu_fn [ 44.560043] Call Trace: [ 44.562502] dump_stack+0x9a/0xe9 [ 44.565832] print_address_description+0x65/0x22e [ 44.570683] ? find_first_bit+0xb0/0xc0 [ 44.570689] kasan_report.cold.6+0x92/0x19f [ 44.578726] find_first_bit+0xb0/0xc0 [ 44.578737] adf_probe+0x9eb/0x19a0 [qat_c62x] [ 44.578751] ? adf_remove+0x110/0x110 [qat_c62x] [ 44.591490] ? mark_held_locks+0xc8/0x140 [ 44.591498] ? _raw_spin_unlock+0x30/0x30 [ 44.591505] ? trace_hardirqs_on_caller+0x381/0x570 [ 44.604418] ? adf_remove+0x110/0x110 [qat_c62x] [ 44.604427] local_pci_probe+0xd4/0x180 [ 44.604432] ? pci_device_shutdown+0x110/0x110 [ 44.617386] work_for_cpu_fn+0x51/0xa0 [ 44.621145] process_one_work+0x8fe/0x16e0 [ 44.625263] ? pwq_dec_nr_in_flight+0x2d0/0x2d0 [ 44.629799] ? lock_acquire+0x14c/0x400 [ 44.633645] ? move_linked_works+0x12e/0x2a0 [ 44.637928] worker_thread+0x536/0xb50 [ 44.641690] ? __kthread_parkme+0xb6/0x180 [ 44.645796] ? process_one_work+0x16e0/0x16e0 [ 44.650160] kthread+0x30c/0x3d0 [ 44.653400] ? kthread_create_worker_on_cpu+0xc0/0xc0 [ 44.658457] ret_from_fork+0x3a/0x50 [ 44.663557] The buggy address belongs to the page: [ 44.668350] page:ffffea0010780bc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0 [ 44.676356] flags: 0x17ffffc0000000() [ 44.680023] raw: 0017ffffc0000000 ffffea0010780bc8 ffffea0010780bc8 0000000000000000 [ 44.687769] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 [ 44.695510] page dumped because: kasan: bad access detected [ 44.702578] Memory state around the buggy address: [ 44.707372] ffff88041e02fb00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 44.714593] ffff88041e02fb80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 44.721810] >ffff88041e02fc00: 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2 f2 f2 [ 44.729028] ^ [ 44.734864] ffff88041e02fc80: f2 f2 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 [ 44.742082] ffff88041e02fd00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 44.749299] ================================================================== Looking into the code: int ret, bar_mask; : for_each_set_bit(bar_nr, (const unsigned long *)&bar_mask, It is casting a 32-bit integer pointer to a 64-bit unsigned long pointer. There are two problems here. First, the 32-bit pointer address may not be 64-bit aligned. Secondly, it is accessing an extra 4 bytes. This is fixed by changing the bar_mask type to unsigned long. Cc: <stable@vger.kernel.org> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-09-28	crypto: mxs-dcp - Fix wait logic on chan threads	Leonard Crestez
	When compiling with CONFIG_DEBUG_ATOMIC_SLEEP=y the mxs-dcp driver prints warnings such as: WARNING: CPU: 0 PID: 120 at kernel/sched/core.c:7736 __might_sleep+0x98/0x9c do not call blocking ops when !TASK_RUNNING; state=1 set at [<8081978c>] dcp_chan_thread_sha+0x3c/0x2ec The problem is that blocking ops will manipulate current->state themselves so it is not allowed to call them between set_current_state(TASK_INTERRUPTIBLE) and schedule(). Fix this by converting the per-chan mutex to a spinlock (it only protects tiny list ops anyway) and rearranging the wait logic so that callbacks are called current->state as TASK_RUNNING. Those callbacks will indeed call blocking ops themselves so this is required. Cc: <stable@vger.kernel.org> Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-09-28	crypto: chelsio - Fix memory corruption in DMA Mapped buffers.	Harsh Jain
	Update PCI Id in "cpl_rx_phys_dsgl" header. In case pci_chan_id and tx_chan_id are not derived from same queue, H/W can send request completion indication before completing DMA Transfer. Herbert, It would be good if fix can be merge to stable tree. For 4.14 kernel, It requires some update to avoid mege conficts. Cc: <stable@vger.kernel.org> Signed-off-by: Harsh Jain <harsh@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-09-28	Merge branch 'drm-fixes-4.19' of git://people.freedesktop.org/~agd5f/linux ↵	Dave Airlie
	into drm-fixes Just a few fixes for 4.19: - Couple of suspend/resume fixes - Fix EDID emulation with DC Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180927155418.2813-1-alexander.deucher@amd.com
2018-09-28	Merge tag 'drm-misc-fixes-2018-09-27-1' of ↵	Dave Airlie
	git://anongit.freedesktop.org/drm/drm-misc into drm-fixes - Revert adding device-link to panels - Don't leak fences in drm/syncobj Signed-off-by: Dave Airlie <airlied@redhat.com> From: Sean Paul <sean@poorly.run> Link: https://patchwork.freedesktop.org/patch/msgid/20180927152712.GA53076@art_vandelay
2018-09-27	PCI: Reprogram bridge prefetch registers on resume	Daniel Drake
	On 38+ Intel-based ASUS products, the NVIDIA GPU becomes unusable after S3 suspend/resume. The affected products include multiple generations of NVIDIA GPUs and Intel SoCs. After resume, nouveau logs many errors such as: fifo: fault 00 [READ] at 0000005555555000 engine 00 [GR] client 04 [HUB/FE] reason 4a [] on channel -1 [007fa91000 unknown] DRM: failed to idle channel 0 [DRM] Similarly, the NVIDIA proprietary driver also fails after resume (black screen, 100% CPU usage in Xorg process). We shipped a sample to NVIDIA for diagnosis, and their response indicated that it's a problem with the parent PCI bridge (on the Intel SoC), not the GPU. Runtime suspend/resume works fine, only S3 suspend is affected. We found a workaround: on resume, rewrite the Intel PCI bridge 'Prefetchable Base Upper 32 Bits' register (PCI_PREF_BASE_UPPER32). In the cases that I checked, this register has value 0 and we just have to rewrite that value. Linux already saves and restores PCI config space during suspend/resume, but this register was being skipped because upon resume, it already has value 0 (the correct, pre-suspend value). Intel appear to have previously acknowledged this behaviour and the requirement to rewrite this register: https://bugzilla.kernel.org/show_bug.cgi?id=116851#c23 Based on that, rewrite the prefetch register values even when that appears unnecessary. We have confirmed this solution on all the affected models we have in-hands (X542UQ, UX533FD, X530UN, V272UN). Additionally, this solves an issue where r8169 MSI-X interrupts were broken after S3 suspend/resume on ASUS X441UAR. This issue was recently worked around in commit 7bb05b85bc2d ("r8169: don't use MSI-X on RTL8106e"). It also fixes the same issue on RTL6186evl/8111evl on an Aimfor-tech laptop that we had not yet patched. I suspect it will also fix the issue that was worked around in commit 7c53a722459c ("r8169: don't use MSI-X on RTL8168g"). Thomas Martitz reports that this change also solves an issue where the AMD Radeon Polaris 10 GPU on the HP Zbook 14u G5 is unresponsive after S3 suspend/resume. Link: https://bugzilla.kernel.org/show_bug.cgi?id=201069 Signed-off-by: Daniel Drake <drake@endlessm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-By: Peter Wu <peter@lekensteyn.nl> CC: stable@vger.kernel.org
2018-09-27	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma	Greg Kroah-Hartman
	Jason writes: "Second RDMA rc pull request - Fix a long standing race bug when destroying comp_event file descriptors - srp, hfi1, bnxt_re: Various driver crashes from missing validation and other cases - Fixes for regressions in patches merged this window in the gid cache, devx, ucma and uapi." * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/core: Set right entry state before releasing reference IB/mlx5: Destroy the DEVX object upon error flow IB/uverbs: Free uapi on destroy RDMA/bnxt_re: Fix system crash during RDMA resource initialization IB/hfi1: Fix destroy_qp hang after a link down IB/hfi1: Fix context recovery when PBC has an UnsupportedVL IB/hfi1: Invalid user input can result in crash IB/hfi1: Fix SL array bounds check RDMA/uverbs: Fix validity check for modify QP IB/srp: Avoid that sg_reset -d ${srp_device} triggers an infinite loop ucma: fix a use-after-free in ucma_resolve_ip() RDMA/uverbs: Atomically flush and mark closed the comp event queue cxgb4: fix abort_req_rss6 struct
2018-09-27	Merge tag 'for_v4.19-rc6' of ↵	Greg Kroah-Hartman
	git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Jan writes: "an ext2 patch fixing fsync(2) for DAX mounts." * tag 'for_v4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: ext2, dax: set ext2_dax_aops for dax files
2018-09-27	blk-mq: I/O and timer unplugs are inverted in blktrace	Ilya Dryomov
	trace_block_unplug() takes true for explicit unplugs and false for implicit unplugs. schedule() unplugs are implicit and should be reported as timer unplugs. While correct in the legacy code, this has been inverted in blk-mq since 4.11. Cc: stable@vger.kernel.org Fixes: bd166ef183c2 ("blk-mq-sched: add framework for MQ capable IO schedulers") Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-27	rseq/selftests: fix parametrized test with -fpie	Mathieu Desnoyers
	On x86-64, the parametrized selftest code for rseq crashes with a segmentation fault when compiled with -fpie. This happens when the param_test binary is loaded at an address beyond 32-bit on x86-64. The issue is caused by use of a 32-bit register to hold the address of the loop counter variable. Fix this by using a 64-bit register to calculate the address of the loop counter variables as an offset from rip. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> # v4.18 Cc: Shuah Khan <shuah@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Joel Fernandes <joelaf@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Watson <davejwatson@fb.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: linux-kselftest@vger.kernel.org Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Chris Lameter <cl@linux.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Paul Turner <pjt@google.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ben Maurer <bmaurer@fb.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
2018-09-27	iwlwifi: 1000: set the TFD queue size	Pavel Machek
	.max_tfd_queue_size was ommited for 1000 card serries leading to oops in swiotlb. Fixes: 7b3e42ea2ead ("iwlwifi: support multiple tfd queue max sizes for different devices") Tested-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-09-27	dax: Fix deadlock in dax_lock_mapping_entry()	Jan Kara
	When dax_lock_mapping_entry() has to sleep to obtain entry lock, it will fail to unlock mapping->i_pages spinlock and thus immediately deadlock against itself when retrying to grab the entry lock again. Fix the problem by unlocking mapping->i_pages before retrying. Fixes: c2a7d2a11552 ("filesystem-dax: Introduce dax_lock_mapping_entry()") Reported-by: Barret Rhoden <brho@google.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-09-27	x86/boot: Fix kexec booting failure in the SEV bit detection code	Kairui Song
	Commit 1958b5fc4010 ("x86/boot: Add early boot support when running with SEV active") can occasionally cause system resets when kexec-ing a second kernel even if SEV is not active. That's because get_sev_encryption_bit() uses 32-bit rIP-relative addressing to read the value of enc_bit - a variable which caches a previously detected encryption bit position - but kexec may allocate the early boot code to a higher location, beyond the 32-bit addressing limit. In this case, garbage will be read and get_sev_encryption_bit() will return the wrong value, leading to accessing memory with the wrong encryption setting. Therefore, remove enc_bit, and thus get rid of the need to do 32-bit rIP-relative addressing in the first place. [ bp: massage commit message heavily. ] Fixes: 1958b5fc4010 ("x86/boot: Add early boot support when running with SEV active") Suggested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Kairui Song <kasong@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: linux-kernel@vger.kernel.org Cc: tglx@linutronix.de Cc: mingo@redhat.com Cc: hpa@zytor.com Cc: brijesh.singh@amd.com Cc: kexec@lists.infradead.org Cc: dyoung@redhat.com Cc: bhe@redhat.com Cc: ghook@redhat.com Link: https://lkml.kernel.org/r/20180927123845.32052-1-kasong@redhat.com
2018-09-27	bcache: add separate workqueue for journal_write to avoid deadlock	Guoju Fang
	After write SSD completed, bcache schedules journal_write work to system_wq, which is a public workqueue in system, without WQ_MEM_RECLAIM flag. system_wq is also a bound wq, and there may be no idle kworker on current processor. Creating a new kworker may unfortunately need to reclaim memory first, by shrinking cache and slab used by vfs, which depends on bcache device. That's a deadlock. This patch create a new workqueue for journal_write with WQ_MEM_RECLAIM flag. It's rescuer thread will work to avoid the deadlock. Signed-off-by: Guoju Fang <fangguoju@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-27	ieee802154: mcr20a: Replace magic number with constants	Xue Liu
	The combination of defined constants are used to present the state of IRQ so the magic numbers has been replaced. This is a simple coding style change which should have no impact on runtime code execution. Signed-off-by: Xue Liu <liuxuenetmail@gmail.com> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2018-09-27	drm/amd/display: Fix Edid emulation for linux	Bhawanpreet Lakha
	[Why] EDID emulation didn't work properly for linux, as we stop programming if nothing is connected physically. [How] We get a flag from DRM when we want to do edid emulation. We check if this flag is true and nothing is connected physically, if so we only program the front end using VIRTUAL_SIGNAL. Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Harry Wentland <Harry.Wentland@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-27	drm/amd/display: Fix Vega10 lightup on S3 resume	Roman Li
	[Why] There have been a few reports of Vega10 display remaining blank after S3 resume. The regression is caused by workaround for mode change on Vega10 - skip set_bandwidth if stream count is 0. As a result we skipped dispclk reset on suspend, thus on resume we may skip the clock update assuming it hasn't been changed. On some systems it causes display blank or 'out of range'. [How] Revert "drm/amd/display: Fix Vega10 black screen after mode change" Verified that it hadn't cause mode change regression. Signed-off-by: Roman Li <Roman.Li@amd.com> Reviewed-by: Sun peng Li <Sunpeng.Li@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-27	drm/amdgpu: Fix vce work queue was not cancelled when suspend	Rex Zhu
	The vce cancel_delayed_work_sync never be called. driver call the function in error path. This caused the A+A suspend hang when runtime pm enebled. As we will visit the smu in the idle queue. this will cause smu hang because the dgpu has been suspend, and the dgpu also will be waked up. As the smu has been hang, so the dgpu resume will failed. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2018-09-27	Revert "drm/panel: Add device_link from panel device to DRM device"	Linus Walleij
	This reverts commit 0c08754b59da5557532d946599854e6df28edc22. commit 0c08754b59da ("drm/panel: Add device_link from panel device to DRM device") creates a circular dependency under these circumstances: 1. The panel depends on dsi-host because it is MIPI-DSI child device. 2. dsi-host depends on the drm parent device (connector->dev->dev) this should be allowed. 3. drm parent dev (connector->dev->dev) depends on the panel after this patch. This makes the dependency circular and while it appears it does not affect any in-tree drivers (they do not seem to have dsi hosts depending on the same parent device) this does not seem right. As noted in a response from Andrzej Hajda, the intent is likely to make the panel dependent on the DRM device (connector->dev) not its parent. But we have no way of doing that since the DRM device doesn't contain any struct device on its own (arguably it should). Revert this until a proper approach is figured out. Cc: Jyri Sarha <jsarha@ti.com> Cc: Eric Anholt <eric@anholt.net> Cc: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sean Paul <seanpaul@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20180927124130.9102-1-linus.walleij@linaro.org
2018-09-27	Merge branch 'clockevents/4.19-fixes' of ↵	Thomas Gleixner
	https://git.linaro.org/people/daniel.lezcano/linux into timers/urgent Pull another fix from Daniel Lezcano, which felt through the cracks: - Fix a potential memory leak reported by smatch in the atmel timer driver
2018-09-27	xen/blkfront: When purging persistent grants, keep them in the buffer	Boris Ostrovsky
	Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants") added support for purging persistent grants when they are not in use. As part of the purge, the grants were removed from the grant buffer, This eventually causes the buffer to become empty, with BUG_ON triggered in get_free_grant(). This can be observed even on an idle system, within 20-30 minutes. We should keep the grants in the buffer when purging, and only free the grant ref. Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants") Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-27	rxrpc: Remove dup code from rxrpc_find_connection_rcu()	David Howells
	rxrpc_find_connection_rcu() initialises variable k twice with the same information. Remove one of the initialisations. Signed-off-by: David Howells <dhowells@redhat.com>
2018-09-27	ieee802154: ca8210: remove redundant condition check before debugfs_remove	zhong jiang
	debugfs_remove has taken the IS_ERR into account. Just remove the unnecessary condition. Signed-off-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2018-09-27	clocksource/drivers/timer-atmel-pit: Properly handle error cases	Alexandre Belloni
	The smatch utility reports a possible leak: smatch warnings: drivers/clocksource/timer-atmel-pit.c:183 at91sam926x_pit_dt_init() warn: possible memory leak of 'data' Ensure data is freed before exiting with an error. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: stable@vger.kernel.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2018-09-27	nl80211: Fix possible Spectre-v1 for CQM RSSI thresholds	Masashi Honma
	Use array_index_nospec() to sanitize i with respect to speculation. Note that the user doesn't control i directly, but can make it out of bounds by not finding a threshold in the array. Signed-off-by: Masashi Honma <masashi.honma@gmail.com> [add note about user control, as explained by Masashi] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-09-26	net-tcp: /proc/sys/net/ipv4/tcp_probe_interval is a u32 not int	Maciej Żenczykowski
	(fix documentation and sysctl access to treat it as such) Tested: # zcat /proc/config.gz \| egrep ^CONFIG_HZ CONFIG_HZ_1000=y CONFIG_HZ=1000 # echo $[(1<<32)/1000 + 1] \| tee /proc/sys/net/ipv4/tcp_probe_interval 4294968 tee: /proc/sys/net/ipv4/tcp_probe_interval: Invalid argument # echo $[(1<<32)/1000] \| tee /proc/sys/net/ipv4/tcp_probe_interval 4294967 # echo 0 \| tee /proc/sys/net/ipv4/tcp_probe_interval # echo -1 \| tee /proc/sys/net/ipv4/tcp_probe_interval -1 tee: /proc/sys/net/ipv4/tcp_probe_interval: Invalid argument Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26	bnxt_en: Fix TX timeout during netpoll.	Michael Chan
	The current netpoll implementation in the bnxt_en driver has problems that may miss TX completion events. bnxt_poll_work() in effect is only handling at most 1 TX packet before exiting. In addition, there may be in flight TX completions that ->poll() may miss even after we fix bnxt_poll_work() to handle all visible TX completions. netpoll may not call ->poll() again and HW may not generate IRQ because the driver does not ARM the IRQ when the budget (0 for netpoll) is reached. We fix it by handling all TX completions and to always ARM the IRQ when we exit ->poll() with 0 budget. Also, the logic to ACK the completion ring in case it is almost filled with TX completions need to be adjusted to take care of the 0 budget case, as discussed with Eric Dumazet <edumazet@google.com> Reported-by: Song Liu <songliubraving@fb.com> Reviewed-by: Song Liu <songliubraving@fb.com> Tested-by: Song Liu <songliubraving@fb.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26	vxlan: fill ttl inherit info	Hangbin Liu
	When add vxlan ttl inherit support, I forgot to fill it when dump vlxan info. Fix it now. Fixes: 72f6d71e491e6 ("vxlan: add ttl inherit support") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26	net: phy: sfp: Fix unregistering of HWMON SFP device	Andrew Lunn
	A HWMON device is only registered is the SFP module supports the diagnostic page and is complient to SFF8472. Don't unconditionally unregister the hwmon device when the SFP module is remove, otherwise we access data structures which don't exist. Reported-by: Florian Fainelli <f.fainelli@gmail.com> Fixes: 1323061a018a ("net: phy: sfp: Add HWMON support for module sensors") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26	qed: Avoid implicit enum conversion in qed_iwarp_parse_rx_pkt	Nathan Chancellor
	Clang warns when one enumerated type is implicitly converted to another. drivers/net/ethernet/qlogic/qed/qed_iwarp.c:1713:25: warning: implicit conversion from enumeration type 'enum tcp_ip_version' to different enumeration type 'enum qed_tcp_ip_version' [-Wenum-conversion] cm_info->ip_version = TCP_IPV4; ~ ^~~~~~~~ drivers/net/ethernet/qlogic/qed/qed_iwarp.c:1733:25: warning: implicit conversion from enumeration type 'enum tcp_ip_version' to different enumeration type 'enum qed_tcp_ip_version' [-Wenum-conversion] cm_info->ip_version = TCP_IPV6; ~ ^~~~~~~~ 2 warnings generated. Use the appropriate values from the expected type, qed_tcp_ip_version: TCP_IPV4 = QED_TCP_IPV4 = 0 TCP_IPV6 = QED_TCP_IPV6 = 1 Link: https://github.com/ClangBuiltLinux/linux/issues/125 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26	qed: Avoid constant logical operation warning in qed_vf_pf_acquire	Nathan Chancellor
	Clang warns when a constant is used in a boolean context as it thinks a bitwise operation may have been intended. drivers/net/ethernet/qlogic/qed/qed_vf.c:415:27: warning: use of logical '&&' with constant operand [-Wconstant-logical-operand] if (!p_iov->b_pre_fp_hsi && ^ drivers/net/ethernet/qlogic/qed/qed_vf.c:415:27: note: use '&' for a bitwise operation if (!p_iov->b_pre_fp_hsi && ^~ & drivers/net/ethernet/qlogic/qed/qed_vf.c:415:27: note: remove constant to silence this warning if (!p_iov->b_pre_fp_hsi && ~^~ 1 warning generated. This has been here since commit 1fe614d10f45 ("qed: Relax VF firmware requirements") and I am not entirely sure why since 0 isn't a special case. Just remove the statement causing Clang to warn since it isn't required. Link: https://github.com/ClangBuiltLinux/linux/issues/126 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26	bonding: avoid possible dead-lock	Mahesh Bandewar
	Syzkaller reported this on a slightly older kernel but it's still applicable to the current kernel - ====================================================== WARNING: possible circular locking dependency detected 4.18.0-next-20180823+ #46 Not tainted ------------------------------------------------------ syz-executor4/26841 is trying to acquire lock: 00000000dd41ef48 ((wq_completion)bond_dev->name){+.+.}, at: flush_workqueue+0x2db/0x1e10 kernel/workqueue.c:2652 but task is already holding lock: 00000000768ab431 (rtnl_mutex){+.+.}, at: rtnl_lock net/core/rtnetlink.c:77 [inline] 00000000768ab431 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x412/0xc30 net/core/rtnetlink.c:4708 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (rtnl_mutex){+.+.}: __mutex_lock_common kernel/locking/mutex.c:925 [inline] __mutex_lock+0x171/0x1700 kernel/locking/mutex.c:1073 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1088 rtnl_lock+0x17/0x20 net/core/rtnetlink.c:77 bond_netdev_notify drivers/net/bonding/bond_main.c:1310 [inline] bond_netdev_notify_work+0x44/0xd0 drivers/net/bonding/bond_main.c:1320 process_one_work+0xc73/0x1aa0 kernel/workqueue.c:2153 worker_thread+0x189/0x13c0 kernel/workqueue.c:2296 kthread+0x35a/0x420 kernel/kthread.c:246 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:415 -> #1 ((work_completion)(&(&nnw->work)->work)){+.+.}: process_one_work+0xc0b/0x1aa0 kernel/workqueue.c:2129 worker_thread+0x189/0x13c0 kernel/workqueue.c:2296 kthread+0x35a/0x420 kernel/kthread.c:246 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:415 -> #0 ((wq_completion)bond_dev->name){+.+.}: lock_acquire+0x1e4/0x4f0 kernel/locking/lockdep.c:3901 flush_workqueue+0x30a/0x1e10 kernel/workqueue.c:2655 drain_workqueue+0x2a9/0x640 kernel/workqueue.c:2820 destroy_workqueue+0xc6/0x9d0 kernel/workqueue.c:4155 __alloc_workqueue_key+0xef9/0x1190 kernel/workqueue.c:4138 bond_init+0x269/0x940 drivers/net/bonding/bond_main.c:4734 register_netdevice+0x337/0x1100 net/core/dev.c:8410 bond_newlink+0x49/0xa0 drivers/net/bonding/bond_netlink.c:453 rtnl_newlink+0xef4/0x1d50 net/core/rtnetlink.c:3099 rtnetlink_rcv_msg+0x46e/0xc30 net/core/rtnetlink.c:4711 netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4729 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline] netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1343 netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908 sock_sendmsg_nosec net/socket.c:622 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:632 ___sys_sendmsg+0x7fd/0x930 net/socket.c:2115 __sys_sendmsg+0x11d/0x290 net/socket.c:2153 __do_sys_sendmsg net/socket.c:2162 [inline] __se_sys_sendmsg net/socket.c:2160 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2160 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe other info that might help us debug this: Chain exists of: (wq_completion)bond_dev->name --> (work_completion)(&(&nnw->work)->work) --> rtnl_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(rtnl_mutex); lock((work_completion)(&(&nnw->work)->work)); lock(rtnl_mutex); lock((wq_completion)bond_dev->name); * DEADLOCK * 1 lock held by syz-executor4/26841: stack backtrace: CPU: 1 PID: 26841 Comm: syz-executor4 Not tainted 4.18.0-next-20180823+ #46 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113 print_circular_bug.isra.34.cold.55+0x1bd/0x27d kernel/locking/lockdep.c:1222 check_prev_add kernel/locking/lockdep.c:1862 [inline] check_prevs_add kernel/locking/lockdep.c:1975 [inline] validate_chain kernel/locking/lockdep.c:2416 [inline] __lock_acquire+0x3449/0x5020 kernel/locking/lockdep.c:3412 lock_acquire+0x1e4/0x4f0 kernel/locking/lockdep.c:3901 flush_workqueue+0x30a/0x1e10 kernel/workqueue.c:2655 drain_workqueue+0x2a9/0x640 kernel/workqueue.c:2820 destroy_workqueue+0xc6/0x9d0 kernel/workqueue.c:4155 __alloc_workqueue_key+0xef9/0x1190 kernel/workqueue.c:4138 bond_init+0x269/0x940 drivers/net/bonding/bond_main.c:4734 register_netdevice+0x337/0x1100 net/core/dev.c:8410 bond_newlink+0x49/0xa0 drivers/net/bonding/bond_netlink.c:453 rtnl_newlink+0xef4/0x1d50 net/core/rtnetlink.c:3099 rtnetlink_rcv_msg+0x46e/0xc30 net/core/rtnetlink.c:4711 netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4729 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline] netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1343 netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908 sock_sendmsg_nosec net/socket.c:622 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:632 ___sys_sendmsg+0x7fd/0x930 net/socket.c:2115 __sys_sendmsg+0x11d/0x290 net/socket.c:2153 __do_sys_sendmsg net/socket.c:2162 [inline] __se_sys_sendmsg net/socket.c:2160 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2160 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457089 Code: fd b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f2df20a5c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007f2df20a66d4 RCX: 0000000000457089 RDX: 0000000000000000 RSI: 0000000020000180 RDI: 0000000000000003 RBP: 0000000000930140 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 00000000004d40b8 R14: 00000000004c8ad8 R15: 0000000000000001 Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26	bonding: pass link-local packets to bonding master also.	Mahesh Bandewar
	Commit b89f04c61efe ("bonding: deliver link-local packets with skb->dev set to link that packets arrived on") changed the behavior of how link-local-multicast packets are processed. The change in the behavior broke some legacy use cases where these packets are expected to arrive on bonding master device also. This patch passes the packet to the stack with the link it arrived on as well as passes to the bonding-master device to preserve the legacy use case. Fixes: b89f04c61efe ("bonding: deliver link-local packets with skb->dev set to link that packets arrived on") Reported-by: Michal Soltys <soltys@ziu.info> Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26	qed: Avoid implicit enum conversion in qed_roce_mode_to_flavor	Nathan Chancellor
	Clang warns when one enumerated type is implicitly converted to another. drivers/net/ethernet/qlogic/qed/qed_roce.c:153:12: warning: implicit conversion from enumeration type 'enum roce_mode' to different enumeration type 'enum roce_flavor' [-Wenum-conversion] flavor = ROCE_V2_IPV6; ~ ^~~~~~~~~~~~ drivers/net/ethernet/qlogic/qed/qed_roce.c:156:12: warning: implicit conversion from enumeration type 'enum roce_mode' to different enumeration type 'enum roce_flavor' [-Wenum-conversion] flavor = MAX_ROCE_MODE; ~ ^~~~~~~~~~~~~ 2 warnings generated. Use the appropriate values from the expected type, roce_flavor: ROCE_V2_IPV6 = RROCE_IPV6 = 2 MAX_ROCE_MODE = MAX_ROCE_FLAVOR = 3 While we're add it, ditch the local variable flavor, we can just return the value directly from the switch statement. Link: https://github.com/ClangBuiltLinux/linux/issues/125 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26	qed: Fix mask parameter in qed_vf_prep_tunn_req_tlv	Nathan Chancellor
	Clang complains when one enumerated type is implicitly converted to another. drivers/net/ethernet/qlogic/qed/qed_vf.c:686:6: warning: implicit conversion from enumeration type 'enum qed_tunn_mode' to different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion] QED_MODE_L2GENEVE_TUNN, ^~~~~~~~~~~~~~~~~~~~~~ Update mask's parameter to expect qed_tunn_mode, which is what was intended. Link: https://github.com/ClangBuiltLinux/linux/issues/125 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26	qed: Avoid implicit enum conversion in qed_set_tunn_cls_info	Nathan Chancellor
	Clang warns when one enumerated type is implicitly converted to another. drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:163:25: warning: implicit conversion from enumeration type 'enum tunnel_clss' to different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion] p_tun->vxlan.tun_cls = type; ~ ^~~~ drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:165:26: warning: implicit conversion from enumeration type 'enum tunnel_clss' to different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion] p_tun->l2_gre.tun_cls = type; ~ ^~~~ drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:167:26: warning: implicit conversion from enumeration type 'enum tunnel_clss' to different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion] p_tun->ip_gre.tun_cls = type; ~ ^~~~ drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:169:29: warning: implicit conversion from enumeration type 'enum tunnel_clss' to different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion] p_tun->l2_geneve.tun_cls = type; ~ ^~~~ drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:171:29: warning: implicit conversion from enumeration type 'enum tunnel_clss' to different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion] p_tun->ip_geneve.tun_cls = type; ~ ^~~~ 5 warnings generated. Avoid this by changing type to an int. Link: https://github.com/ClangBuiltLinux/linux/issues/125 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26	wimax/i2400m: fix spelling mistake "not unitialized" -> "uninitialized"	Colin Ian King
	Trivial fix to spelling mistake in ms_to_errno array of error messages and remove confusing "not" from the error text since the error code refers to an uninitialized error code. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26	qed: fix spelling mistake "toogle" -> "toggle"	Colin Ian King
	Trivial fix to spelling mistake in DP_VERBOSE message Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26	Merge branch 'net-phy-fix-WoL-handling-when-suspending-the-PHY'	David S. Miller
	Heiner Kallweit says: ==================== net: phy: fix WoL handling when suspending the PHY phy_suspend doesn't always recognize that WoL is enabled and therefore suspends the PHY when it should not. First idea to address the issue was to reuse checks used in mdio_bus_phy_may_suspend which check whether relevant devices are wakeup-enabled. Florian raised some concerns because drivers may enable wakeup even if WoL isn't enabled (e.g. certain USB network drivers). The new approach focuses on reducing the risk to break existing stuff. We add a flag wol_enabled to struct net_device which is set in ethtool_set_wol(). Then this flag is checked in phy_suspend(). This doesn't cover 100% of the cases yet (e.g. if WoL is enabled w/o explicit configuration), but it covers the most relevant cases with very little risk of regressions. v2: - Fix a typo ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26	net: phy: fix WoL handling when suspending the PHY	Heiner Kallweit
	Core of the problem is that phy_suspend() suspends the PHY when it should not because of WoL. phy_suspend() checks for WoL already, but this works only if the PHY driver handles WoL (what is rarely the case). Typically WoL is handled by the MAC driver. This patch uses new member wol_enabled of struct net_device as additional criteria in the check when not to suspend the PHY because of WoL. Last but not least change phy_detach() to call phy_suspend() before attached_dev is set to NULL. phy_suspend() accesses attached_dev when checking whether the MAC driver activated WoL. Fixes: f1e911d5d0df ("r8169: add basic phylib support") Fixes: e8cfd9d6c772 ("net: phy: call state machine synchronously in phy_stop") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26	net: core: add member wol_enabled to struct net_device	Heiner Kallweit
	Add flag wol_enabled to struct net_device indicating whether Wake-on-LAN is enabled. As first user phy_suspend() will use it to decide whether PHY can be suspended or not. Fixes: f1e911d5d0df ("r8169: add basic phylib support") Fixes: e8cfd9d6c772 ("net: phy: call state machine synchronously in phy_stop") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26	Revert "net: phy: fix WoL handling when suspending the PHY"	David S. Miller
	This reverts commit e0511f6c1ccdd153cf063764e93ac177a8553c5d. I commited the wrong version of these changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26	net: phy: fix WoL handling when suspending the PHY	Heiner Kallweit
	Actually there's nothing wrong with the two changes marked as "Fixes", they just revealed a problem which has been existing before. After having switched r8169 to phylib it was reported that WoL from shutdown doesn't work any longer (WoL from suspend isn't affected). Reason is that during shutdown phy_disconnect()->phy_detach()-> phy_suspend() is called. A similar issue occurs when the phylib state machine calls phy_suspend() when handling state PHY_HALTED. Core of the problem is that phy_suspend() suspends the PHY when it should not due to WoL. phy_suspend() checks for WoL already, but this works only if the PHY driver handles WoL (what is rarely the case). Typically WoL is handled by the MAC driver. phylib knows about this and handles it in mdio_bus_phy_may_suspend(), but that's used only when suspending the system, not in other cases like shutdown. Therefore factor out the relevant check from mdio_bus_phy_may_suspend() to a new function phy_may_suspend() and use it in phy_suspend(). Last but not least change phy_detach() to call phy_suspend() before attached_dev is set to NULL. phy_suspend() accesses attached_dev when checking whether the MAC driver activated WoL. Fixes: f1e911d5d0df ("r8169: add basic phylib support") Fixes: e8cfd9d6c772 ("net: phy: call state machine synchronously in phy_stop") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>