linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2018-08-02	random: Make crng state queryable	Jason A. Donenfeld
	It is very useful to be able to know whether or not get_random_bytes_wait / wait_for_random_bytes is going to block or not, or whether plain get_random_bytes is going to return good randomness or bad randomness. The particular use case is for mitigating certain attacks in WireGuard. A handshake packet arrives and is queued up. Elsewhere a worker thread takes items from the queue and processes them. In replying to these items, it needs to use some random data, and it has to be good random data. If we simply block until we can have good randomness, then it's possible for an attacker to fill the queue up with packets waiting to be processed. Upon realizing the queue is full, WireGuard will detect that it's under a denial of service attack, and behave accordingly. A better approach is just to drop incoming handshake packets if the crng is not yet initialized. This patch, therefore, makes that information directly accessible. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-08-02	block: Switch struct packet_command to use struct scsi_sense_hdr	Kees Cook
	There is a lot of needless struct request_sense usage in the CDROM code. These can all be struct scsi_sense_hdr instead, to avoid any confusion over their respective structure sizes. This patch is a lot of noise changing "sense" to "sshdr", but the final code is more readable to distinguish between "sense" meaning "struct request_sense" and "sshdr" meaning "struct scsi_sense_hdr". Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-02	scsi: sysfs: Introduce sysfs_{un,}break_active_protection()	Bart Van Assche
	Introduce these two functions and export them such that the next patch can add calls to these functions from the SCSI core. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-08-02	libceph: implement CEPHX_V2 calculation mode	Ilya Dryomov
	Derive the signature from the entire buffer (both AES cipher blocks) instead of using just the first half of the first block, leaving out data_crc entirely. This addresses CVE-2018-1129. Link: http://tracker.ceph.com/issues/24837 Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2018-08-02	libceph: add authorizer challenge	Ilya Dryomov
	When a client authenticates with a service, an authorizer is sent with a nonce to the service (ceph_x_authorize_[ab]) and the service responds with a mutation of that nonce (ceph_x_authorize_reply). This lets the client verify the service is who it says it is but it doesn't protect against a replay: someone can trivially capture the exchange and reuse the same authorizer to authenticate themselves. Allow the service to reject an initial authorizer with a random challenge (ceph_x_authorize_challenge). The client then has to respond with an updated authorizer proving they are able to decrypt the service's challenge and that the new authorizer was produced for this specific connection instance. The accepting side requires this challenge and response unconditionally if the client side advertises they have CEPHX_V2 feature bit. This addresses CVE-2018-1128. Link: http://tracker.ceph.com/issues/24836 Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2018-08-02	libceph: store ceph_auth_handshake pointer in ceph_connection	Ilya Dryomov
	We already copy authorizer_reply_buf and authorizer_reply_buf_len into ceph_connection. Factoring out __prepare_write_connect() requires two more: authorizer_buf and authorizer_buf_len. Store the pointer to the handshake in con->auth rather than piling on. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2018-08-02	libceph: remove now unused ceph_{en,de}code_timespec()	Ilya Dryomov
	Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-08-02	libceph: use timespec64 for r_mtime	Arnd Bergmann
	The request mtime field is used all over ceph, and is currently represented as a 'timespec' structure in Linux. This changes it to timespec64 to allow times beyond 2038, modifying all users at the same time. [ Remove now redundant ts variable in writepage_nounlock(). ] Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-08-02	libceph: use timespec64 in for keepalive2 and ticket validity	Arnd Bergmann
	ceph_con_keepalive_expired() is the last user of timespec_add() and some of the last uses of ktime_get_real_ts(). Replacing this with timespec64 based interfaces lets us remove that deprecated API. I'm introducing new ceph_encode_timespec64()/ceph_decode_timespec64() here that take timespec64 structures and convert to/from ceph_timespec, which is defined to have an unsigned 32-bit tv_sec member. This extends the range of valid times to year 2106, avoiding the year 2038 overflow. The ceph file system portion still uses the old functions for inode timestamps, this will be done separately after the VFS layer is converted. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-08-02	libceph: change ceph_pagelist_encode_string() to take u32	Ilya Dryomov
	The wire format dictates that the length of string fits into 4 bytes. Take u32 instead of size_t to reflect that. We were already truncating len in ceph_pagelist_encode_32() -- this just pushes that truncation one level up. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-08-02	libceph: make ceph_osdc_notify{,_ack}() payload_len u32	Ilya Dryomov
	The wire format dictates that payload_len fits into 4 bytes. Take u32 instead of size_t to reflect that. All callers pass a small integer, so no changes required. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-08-02	Merge branch 'rppt' into docs-next	Jonathan Corbet
	This contains a set of early-boot memory-management docs from Mike Rapoport. It's been circulating on linux-mm for a long time; I finally picked it up even though it changes a lot of .c files under mm/ (comments only).
2018-08-02	docs/mm: memblock: add kernel-doc description for memblock types	Mike Rapoport
	Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2018-08-02	docs/mm: memblock: update kernel-doc comments	Mike Rapoport
	* make memblock_discard description kernel-doc compatible * add brief description for memblock_setclr_flag and describe its parameters * fixup return value descriptions Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2018-08-02	mm/memblock: add a name for memblock flags enumeration	Mike Rapoport
	Since kernel-doc does not like anonymous enums the name is required for adding documentation. While on it, I've also updated all the function declarations to use 'enum memblock_flags' instead of unsigned long. Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2018-08-02	docs/mm: bootmem: add kernel-doc description of 'struct bootmem_data'	Mike Rapoport
	Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2018-08-02	Merge tag 'pci-v4.18-fixes-5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - Fix integer overflow in new mobiveil driver (Dan Carpenter) - Fix race during NVMe removal/rescan (Hari Vyas) * tag 'pci-v4.18-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Fix is_added/is_busmaster race condition PCI: mobiveil: Avoid integer overflow in IB_WIN_SIZE
2018-08-02	Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net	David S. Miller
	The BTF conflicts were simple overlapping changes. The virtio_net conflict was an overlap of a fix of statistics counter, happening alongisde a move over to a bonafide statistics structure rather than counting value on the stack. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-02	rtc: remove struct rtc_task	Alexandre Belloni
	Include rtc_task members directly in rtc_timer member. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2018-08-02	spi: spi-mem: Extend the SPI mem interface to set a custom memory name	Frieder Schrempf
	When porting (Q)SPI controller drivers from the MTD layer to the SPI layer, the naming scheme for the memory devices changes. To be able to keep compatibility with the old drivers naming scheme, a name field is added to struct spi_mem and a hook is added to let controller drivers set a custom name for the memory device. Example for the FSL QSPI driver: Name with the old driver: 21e0000.qspi, or with multiple devices: 21e0000.qspi-0, 21e0000.qspi-1, ... Name with the new driver without spi_mem_get_name: spi4.0 Suggested-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Frieder Schrempf <frieder.schrempf@exceet.de> Reviewed-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-08-02	spi: spi-mem: Fix a typo in the documentation of struct spi_mem	Frieder Schrempf
	Fix a typo in the @drvpriv description. Signed-off-by: Frieder Schrempf <frieder.schrempf@exceet.de> Acked-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-08-02	goldfish: Use dedicated macros instead of manual bit shifting	Roman Kiryanov
	There are dedicated macros (lower_32_bits and upper_32_bits) available to extract the lower and upper 32 bits. They provide better readability and could prevent some compilation warnings. Signed-off-by: Roman Kiryanov <rkir@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-02	goldfish: Add missing includes to goldfish.h	Roman Kiryanov
	goldfish.h refers to external symbols such as dma_addr_t and writel. This causes compilation errors if this file is included before other header files. The mentioned symbols are defined in types.h (dma_addr_t) and io.h (writel). Signed-off-by: Roman Kiryanov <rkir@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-02	Drivers: hv: vmbus: Reset the channel callback in vmbus_onoffer_rescind()	Dexuan Cui
	Before setting channel->rescind in vmbus_rescind_cleanup(), we should make sure the channel callback won't run any more, otherwise a high-level driver like pci_hyperv, which may be infinitely waiting for the host VSP's response and notices the channel has been rescinded, can't safely give up: e.g., in hv_pci_protocol_negotiation() -> wait_for_response(), it's unsafe to exit from wait_for_response() and proceed with the on-stack variable "comp_pkt" popped. The issue was originally spotted by Michael Kelley <mikelley@microsoft.com>. In vmbus_close_internal(), the patch also minimizes the range protected by disabling/enabling channel->callback_event: we don't really need that for the whole function. Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Cc: stable@vger.kernel.org Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Michael Kelley <mikelley@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-02	serial: sh-sci: Remove SCIx_RZ_SCIFA_REGTYPE	Chris Brandt
	There is no more need for SCIx_RZ_SCIFA_REGTYPE now that SCIx_SH4_SCIF_REGTYPE can provide the same register/address definitions. Also, R7S9210 no longer needs a special compatible since the standard "renesas,scif" will work just fine. Signed-off-by: Chris Brandt <chris.brandt@renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-02	Merge branch 'perf/urgent' into perf/core, to pick up fixes	Ingo Molnar
	Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-08-01	kill d_instantiate_no_diralias()	Al Viro
	The only user is fuse_create_new_entry(), and there it's used to mitigate the same mkdir/open-by-handle race as in nfs_mkdir(). The same solution applies - unhash the mkdir argument, then call d_splice_alias() and if that returns a reference to preexisting alias, dput() and report success. ->mkdir() argument left unhashed negative with the preexisting alias moved in the right place is just fine from the ->mkdir() callers point of view. Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-08-01	ring-buffer: Make ring_buffer_record_is_set_on() return bool	Steven Rostedt (VMware)
	The value of ring_buffer_record_is_set_on() is either true or false, so have its return value be bool. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-01	ring-buffer: Make ring_buffer_record_is_on() return bool	Steven Rostedt (VMware)
	The value of ring_buffer_record_is_on() is either true or false, so have its return value be bool. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-01	Merge branch 'ib/4.17-bitmap' into next	Dmitry Torokhov
	Bring in bitmap API improvements.
2018-08-01	bitmap: Add bitmap_alloc(), bitmap_zalloc() and bitmap_free()	Andy Shevchenko
	A lot of code become ugly because of open coding allocations for bitmaps. Introduce three helpers to allow users be more clear of intention and keep their code neat. Note, due to multiple circular dependencies we may not provide the helpers as inliners. For now we keep them exported and, perhaps, at some point in the future we will sort out header inclusion and inheritance. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2018-08-01	mm: do not initialize TLB stack vma's with vma_init()	Linus Torvalds
	Commit 2c4541e24c55 ("mm: use vma_init() to initialize VMAs on stack and data segments") tried to initialize various left-over ad-hoc vma's "properly", but actually made things worse for the temporary vma's used for TLB flushing. vma_init() doesn't actually initialize all of the vma, just a few fields, so doing something like - struct vm_area_struct vma = { .vm_mm = tlb->mm, }; + struct vm_area_struct vma; + + vma_init(&vma, tlb->mm); was actually very bad: instead of having a nicely initialized vma with every field but "vm_mm" zeroed, you'd have an entirely uninitialized vma with only a couple of fields initialized. And they weren't even fields that the code in question mostly cared about. The flush_tlb_range() function takes a "struct vma" rather than a "struct mm_struct", because a few architectures actually care about what kind of range it is - being able to only do an ITLB flush if it's a range that doesn't have data accesses enabled, for example. And all the normal users already have the vma for doing the range invalidation. But a few people want to call flush_tlb_range() with a range they just made up, so they also end up using a made-up vma. x86 just has a special "flush_tlb_mm_range()" function for this, but other architectures (arm and ia64) do the "use fake vma" thing instead, and thus got caught up in the vma_init() changes. At the same time, the TLB flushing code really doesn't care about most other fields in the vma, so vma_init() is just unnecessary and pointless. This fixes things by having an explicit "this is just an initializer for the TLB flush" initializer macro, which is used by the arm/arm64/ia64 people who mis-use this interface with just a dummy vma. Fixes: 2c4541e24c55 ("mm: use vma_init() to initialize VMAs on stack and data segments") Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: John Stultz <john.stultz@linaro.org> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-01	tcp: add stat of data packet reordering events	Wei Wang
	Introduce a new TCP stats to record the number of reordering events seen and expose it in both tcp_info (TCP_INFO) and opt_stats (SOF_TIMESTAMPING_OPT_STATS). Application can use this stats to track the frequency of the reordering events in addition to the existing reordering stats which tracks the magnitude of the latest reordering event. Note: this new stats tracks reordering events triggered by ACKs, which could often be fewer than the actual number of packets being delivered out-of-order. Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-01	tcp: add dsack blocks received stats	Wei Wang
	Introduce a new TCP stat to record the number of DSACK blocks received (RFC4989 tcpEStatsStackDSACKDups) and expose it in both tcp_info (TCP_INFO) and opt_stats (SOF_TIMESTAMPING_OPT_STATS). Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-01	tcp: add data bytes retransmitted stats	Wei Wang
	Introduce a new TCP stat to record the number of bytes retransmitted (RFC4898 tcpEStatsPerfOctetsRetrans) and expose it in both tcp_info (TCP_INFO) and opt_stats (SOF_TIMESTAMPING_OPT_STATS). Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-01	tcp: add data bytes sent stats	Wei Wang
	Introduce a new TCP stat to record the number of bytes sent (RFC4898 tcpEStatsPerfHCDataOctetsOut) and expose it in both tcp_info (TCP_INFO) and opt_stats (SOF_TIMESTAMPING_OPT_STATS). Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-01	spi: spi-gpio: add SPI_3WIRE support	Lorenzo Bianconi
	Add SPI_3WIRE support to spi-gpio controller introducing set_line_direction function pointer in spi_bitbang data structure. Spi-gpio controller has been tested using hts221 temp/rh iio sensor running in 3wire mode and lsm6dsm running in 4wire mode Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-08-01	spi: add flags parameter to txrx_word function pointers	Lorenzo Bianconi
	Add the capability to specify the flag parameter used in bitbang_txrx_be_cpha{0,1} through the txrx_word function pointers of spi_bitbang data structure. That feature will be used to add spi-3wire support to the spi-gpio controller Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-08-01	Merge tag 'clk-core-duty-cycle-for-mark' of ↵	Mark Brown
	git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux into asoc-4.19 Duty cycle support for the clk api and drivers.
2018-08-01	mtd: rawnand: allocate dynamically ONFI parameters during detection	Miquel Raynal
	Now that it is possible to do dynamic allocations during the identification phase, convert the onfi_params structure (which is only needed with ONFI compliant chips) into a pointer that will be allocated only if needed. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-08-01	mtd: spi-nor: only apply reset hacks to broken hardware	Brian Norris
	Commit 59b356ffd0b0 ("mtd: m25p80: restore the status of SPI flash when exiting") is the latest from a long history of attempts to add reboot handling to handle stateful addressing modes on SPI flash. Some prior mostly-related discussions: http://lists.infradead.org/pipermail/linux-mtd/2013-March/046343.html [PATCH 1/3] mtd: m25p80: utilize dedicated 4-byte addressing commands http://lists.infradead.org/pipermail/barebox/2014-September/020682.html [RFC] MTD m25p80 3-byte addressing and boot problem http://lists.infradead.org/pipermail/linux-mtd/2015-February/057683.html [PATCH 2/2] m25p80: if supported put chip to deep power down if not used Previously, attempts to add reboot-time software reset handling were rejected, but the latest attempt was not. Quick summary of the problem: Some systems (e.g., boot ROM or bootloader) assume that they can read initial boot code from their SPI flash using 3-byte addressing. If the flash is left in 4-byte mode after reset, these systems won't boot. The above patch provided a shutdown/remove hook to attempt to reset the addressing mode before we reboot. Notably, this patch misses out on huge classes of unexpected reboots (e.g., crashes, watchdog resets). Unfortunately, it is essentially impossible to solve this problem 100%: if your system doesn't know how to reset the SPI flash to power-on defaults at initialization time, no amount of software can really rescue you -- there will always be a chance of some unexpected reset that leaves your flash in an addressing mode that your boot sequence didn't expect. While it is not directly harmful to perform hacks like the aforementioned commit on all 4-byte addressing flash, a properly-designed system should not need the hack -- and in fact, providing this hack may mask the fact that a given system is indeed broken. So this patch attempts to apply this unsound hack more narrowly, providing a strong suggestion to developers and system designers that this is truly a hack. With luck, system designers can catch their errors early on in their development cycle, rather than applying this hack long term. But apparently enough systems are out in the wild that we still have to provide this hack. Document a new device tree property to denote systems that do not have a proper hardware (or software) reset mechanism, and apply the hack (with a loud warning) only in this case. Signed-off-by: Brian Norris <computersforpeace@gmail.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-07-31	PCI: Unify PCI and normal DMA direction definitions	Shunyong Yang
	Current DMA direction definitions in pci-dma-compat.h and dma-direction.h are mirrored in value. Unify them to enhance readability and avoid possible inconsistency. Signed-off-by: Shunyong Yang <shunyong.yang@hxt-semitech.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Joey Zheng <yu.zheng@hxt-semitech.com>
2018-08-01	Merge tag 'drm-msm-next-2018-07-30' of ↵	Dave Airlie
	git://people.freedesktop.org/~robclark/linux into drm-next A bit larger this time around, due to introduction of "dpu1" support for the display controller in sdm845 and beyond. This has been on list and undergoing refactoring since Feb (going from ~110kloc to ~30kloc), and all my review complaints have been addressed, so I'd be happy to see this upstream so further feature work can procede on top of upstream. Also includes the gpu coredump support, which should be useful for debugging gpu crashes. And various other misc fixes and such. Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGv-8y3zguY0Mj1vh=o+vrv_bJ8AwZ96wBXYPvMeQT2XcA@mail.gmail.com
2018-07-31	dm kcopyd: return void from dm_kcopyd_copy()	Mike Snitzer
	dm_kcopyd_copy() only ever returns 0 so there is no need for callers to account for possible failure. Same goes for dm_kcopyd_zero(). Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-07-31	net: remove bogus RCU annotations on socket.wq	Christoph Hellwig
	We never use RCU protection for it, just a lot of cargo-cult rcu_deference_protects calls. Note that we do keep the kfree_rcu call for it, as the references through struct sock are RCU protected and thus might require a grace period before freeing. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-31	NFSv4 client live hangs after live data migration recovery	Bill Baker
	After a live data migration event at the NFS server, the client may send I/O requests to the wrong server, causing a live hang due to repeated recovery events. On the wire, this will appear as an I/O request failing with NFS4ERR_BADSESSION, followed by successful CREATE_SESSION, repeatedly. NFS4ERR_BADSSESSION is returned because the session ID being used was issued by the other server and is not valid at the old server. The failure is caused by async worker threads having cached the transport (xprt) in the rpc_task structure. After the migration recovery completes, the task is redispatched and the task resends the request to the wrong server based on the old value still present in tk_xprt. The solution is to recompute the tk_xprt field of the rpc_task structure so that the request goes to the correct server. Signed-off-by: Bill Baker <bill.baker@oracle.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Helen Chao <helen.chao@oracle.com> Fixes: fb43d17210ba ("SUNRPC: Use the multipath iterator to assign a ...") Cc: stable@vger.kernel.org # v4.9+ Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-07-31	sunrpc: Change rpc_print_iostats to rpc_clnt_show_stats and handle rpc_clnt ↵	Dave Wysochanski
	clones The existing rpc_print_iostats has a few shortcomings. First, the naming is not consistent with other functions in the kernel that display stats. Second, it is really displaying stats for an rpc_clnt structure as it displays both xprt stats and per-op stats. Third, it does not handle rpc_clnt clones, which is important for the one in-kernel tree caller of this function, the NFS client's nfs_show_stats function. Fix all of the above by renaming the rpc_print_iostats to rpc_clnt_show_stats and looping through any rpc_clnt clones via cl_parent. Once this interface is fixed, this addresses a problem with NFSv4. Before this patch, the /proc/self/mountstats always showed incorrect counts for NFSv4 lease and session related opcodes such as SEQUENCE, RENEW, SETCLIENTID, CREATE_SESSION, etc. These counts were always 0 even though many ops would go over the wire. The reason for this is there are multiple rpc_clnt structures allocated for any given NFSv4 mount, and inside nfs_show_stats() we callled into rpc_print_iostats() which only handled one of them, nfs_server->client. Fix these counts by calling sunrpc's new rpc_clnt_show_stats() function, which handles cloned rpc_clnt structs and prints the stats together. Note that one side-effect of the above is that multiple mounts from the same NFS server will show identical counts in the above ops due to the fact the one rpc_clnt (representing the NFSv4 client state) is shared across mounts. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-07-31	xsk: don't allow umem replace at stack level	Jakub Kicinski
	Currently drivers have to check if they already have a umem installed for a given queue and return an error if so. Make better use of XDP_QUERY_XSK_UMEM and move this functionality to the core. We need to keep rtnl across the calls now. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Björn Töpel <bjorn.topel@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-31	net: update real_num_rx_queues even when !CONFIG_SYSFS	Jakub Kicinski
	We used to depend on real_num_rx_queues as a upper bound for sanity checks. For AF_XDP socket validation it's useful if the check behaves the same regardless of CONFIG_SYSFS setting. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-31	PCI: Fix is_added/is_busmaster race condition	Hari Vyas
	When a PCI device is detected, pdev->is_added is set to 1 and proc and sysfs entries are created. When the device is removed, pdev->is_added is checked for one and then device is detached with clearing of proc and sys entries and at end, pdev->is_added is set to 0. is_added and is_busmaster are bit fields in pci_dev structure sharing same memory location. A strange issue was observed with multiple removal and rescan of a PCIe NVMe device using sysfs commands where is_added flag was observed as zero instead of one while removing device and proc,sys entries are not cleared. This causes issue in later device addition with warning message "proc_dir_entry" already registered. Debugging revealed a race condition between the PCI core setting the is_added bit in pci_bus_add_device() and the NVMe driver reset work-queue setting the is_busmaster bit in pci_set_master(). As these fields are not handled atomically, that clears the is_added bit. Move the is_added bit to a separate private flag variable and use atomic functions to set and retrieve the device addition state. This avoids the race because is_added no longer shares a memory location with is_busmaster. Link: https://bugzilla.kernel.org/show_bug.cgi?id=200283 Signed-off-by: Hari Vyas <hari.vyas@broadcom.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au>