linux.git - Linus' kernel tree

Age	Commit message	Author
2017-11-13	afs: Overhaul cell database management	David Howells
	Overhaul the way that the in-kernel AFS client keeps track of cells in the following manner: (1) Cells are now held in an rbtree to make walking them quicker and RCU managed (though this is probably overkill). (2) Cells now have a manager work item that: (A) Looks after fetching and refreshing the VL server list. (B) Manages cell record lifetime, including initialisation and destruction. (C) Manages cell record caching whereby records are kept around for a certain time after last use and then destroyed. (D) Manages the FS-Cache index cookie for a cell. It is not permitted for a cookie to be in use twice, so we have to be careful to not allow a new cell record to exist at the same time as an old record of the same name. (3) Each AFS network namespace is given a manager work item that manages the cells within it, maintaining a single timer to prod cells into updating their DNS records. This uses the timer_reduce() facility to make the timer expire at the soonest timed event that needs happening. (4) When a module is being unloaded, cells and cell managers are now counted out using dec_after_work() to make sure the module text is pinned until after the data structures have been cleaned up. (5) Each cell's VL server list is now protected by a seqlock rather than a semaphore. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13	afs: Overhaul permit caching	David Howells
	Overhaul permit caching in AFS by making it per-vnode and sharing permit lists where possible. When most of the fileserver operations are called, they return a status structure indicating the (revised) details of the vnode or vnodes involved in the operation. This includes the access mask derived from the ACL (named CallerAccess in the protocol definition file). This is cacheable and if the ACL changes, the server will tell us that it is breaking the callback promise, at which point we can discard the currently cached permits. With this patch, the afs_permits structure has, at the end, an array of { key, CallerAccess } elements, sorted by key pointer. This is then cached in a hash table so that it can be shared between vnodes with the same access permits. Permit lists can only be shared if they contain the exact same set of key->CallerAccess mappings. Note that the table is global rather than being per-net_ns. If the keys in a permit list cross net_ns boundaries, there is no problem sharing the cached permits, since the permits are just integer masks. Since permit lists pin keys, the permit cache also makes it easier for a future patch to find all occurrences of a key and remove them by means of setting the afs_permits::invalidated flag and then clearing the appropriate key pointer. In such an event, memory barriers will need adding. Lastly, the permit caching is skipped if the server has sent either a vnode-specific or an entire-server callback since the start of the operation. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13	afs: Overhaul the callback handling	David Howells
	Overhaul the AFS callback handling by the following means: (1) Don't give up callback promises on vnodes that we are no longer using, rather let them just expire on the server or let the server break them. This is actually more efficient for the server as the callback lookup is expensive if there are lots of extant callbacks. (2) Only give up the callback promises we have from a server when the server record is destroyed. Then we can just give up all the callback promises on it in one go. (3) Servers can end up being shared between cells if cells are aliased, so don't add all the vnodes being backed by a particular server into a big FID-indexed tree on that server as there may be duplicates. Instead have each volume instance (~= superblock) register an interest in a server as it starts to make use of it and use this to allow the processor for callbacks from the server to find the superblock and thence the inode corresponding to the FID being broken by means of ilookup_nowait(). (4) Rather than iterating over the entire callback list when a mass-break comes in from the server, maintain a counter of mass-breaks in afs_server (cb_seq) and make afs_validate() check it against the copy in afs_vnode. It would be nice not to have to take a read_lock whilst doing this, but that's tricky without using RCU. (5) Save a ref on the fileserver we're using for a call in the afs_call struct so that we can access its cb_s_break during call decoding. (6) Write-lock around callback and status storage in a vnode and read-lock around getattr so that we don't see the status mid-update. This has the following consequences: (1) Data invalidation isn't seen until someone calls afs_validate() on a vnode. Unfortunately, we need to use a key to query the server, but getting one from a background thread is tricky without caching loads of keys all over the place. (2) Mass invalidation isn't seen until someone calls afs_validate(). 
	(3) Callback breaking is going to hit the inode_hash_lock quite a bit. Could this be replaced with rcu_read_lock() since inodes are destroyed under RCU conditions? Signed-off-by: David Howells <dhowells@redhat.com>
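The mass-break counter described in point (4) of the callback overhaul can be sketched roughly as follows. This is a simplified illustration only: the structure and function names are hypothetical stand-ins rather than the real afs_server and afs_vnode definitions, and the locking mentioned in the commit message is omitted.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the server and vnode records. */
struct sketch_server {
	unsigned int cb_break_count;	/* bumped on every mass break */
};

struct sketch_vnode {
	unsigned int cb_break_seen;	/* server counter when the promise was recorded */
	bool cb_promised;		/* true while a callback promise is held */
};

/* The server announces a mass break: O(1), no walk over the vnodes. */
static void sketch_mass_break(struct sketch_server *server)
{
	server->cb_break_count++;
}

/* Record that a fresh callback promise was obtained for this vnode. */
static void sketch_note_promise(const struct sketch_server *server,
				struct sketch_vnode *vnode)
{
	vnode->cb_break_seen = server->cb_break_count;
	vnode->cb_promised = true;
}

/* Validation: the promise is only usable if no mass break has
 * happened since it was recorded. */
static bool sketch_promise_valid(const struct sketch_server *server,
				 const struct sketch_vnode *vnode)
{
	return vnode->cb_promised &&
	       vnode->cb_break_seen == server->cb_break_count;
}
```

The trade-off matches the consequences listed in the commit message: the break itself is cheap, but a vnode only notices it the next time it is validated.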
2017-11-13	afs: Rename struct afs_call server member to cm_server	David Howells
	Rename the server member of struct afs_call to cm_server as we're only going to be using it for incoming calls for the Cache Manager service. This makes it easier to differentiate from the pointer to the target server for the client, which will point to a different structure to allow for callback handling. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13	afs: Fix the afs_uuid struct to make the char-sized fields signed	David Howells
	In AFS's encoding of a UUID, the eight 'char' fields are all signed, so represent them with __s8 rather than __u8. This makes the compiler sign-extend them correctly when XDR-encoding them. Signed-off-by: David Howells <dhowells@redhat.com>
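Why the signedness matters can be seen in a small sketch. XDR widens each small field to a 32-bit word, so a signed 8-bit field is sign-extended while an unsigned one is zero-extended. The helper names below are hypothetical and are not taken from the AFS sources; byte-order conversion is left out.

```c
#include <stdint.h>

/* Hypothetical helpers: widen one 8-bit UUID field to the 32-bit
 * word that XDR puts on the wire (byte-order conversion omitted). */

/* Signed field: the value is sign-extended before conversion. */
static uint32_t xdr_widen_s8(int8_t field)
{
	return (uint32_t)(int32_t)field;
}

/* Unsigned field: the value is zero-extended. */
static uint32_t xdr_widen_u8(uint8_t field)
{
	return (uint32_t)field;
}
```

For a byte with the top bit set, such as 0xFF, the two helpers yield different words (0xFFFFFFFF versus 0x000000FF), which is the difference the patch is concerned with.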
2017-11-13	afs: Connect up the CB.ProbeUuid	David Howells
	The handler for the CB.ProbeUuid operation in the cache manager is implemented, but isn't listed in the switch-statement of operation selection, so won't be used. Fix this by adding it. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13	afs: Potentially return call->reply[0] from afs_make_call()	David Howells
	If call->ret_reply0 is set, return call->reply[0] on success. Change the return type of afs_make_call() to long so that this can be passed back without bit loss and then cast to a pointer if required. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13	afs: Condense afs_call's reply{,2,3,4} into an array	David Howells
	Condense struct afs_call's reply anchor members - reply{,2,3,4} - into an array. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13	afs: Consolidate abort_to_error translators	David Howells
	The AFS abort code space is shared across all services, so there's no need for separate abort_to_error translators for each service. Consolidate them into a single function and remove the function pointers for them. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13	afs: Allow IPv6 address specification of VL servers	David Howells
	Allow VL server specifications to be given IPv6 addresses as well as IPv4 addresses, for example as: echo add foo.org 1111:2222:3333:0:4444:5555:6666:7777 >/proc/fs/afs/cells Note that ':' is the expected separator for separating IPv4 addresses, but if a ',' is detected or no '.' is detected in the string, the delimiter is switched to ','. This also works with DNS AFSDB or SRV record strings fetched by upcall from userspace. Signed-off-by: David Howells <dhowells@redhat.com>
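The delimiter rule stated in this message can be sketched as a small helper. The function name is hypothetical and the real parser may be structured differently; the sketch only restates the rule from the text.

```c
#include <string.h>

/* Hypothetical helper mirroring the rule in the commit message:
 * ':' separates addresses by default, but the delimiter becomes ','
 * if the string contains a ',' or contains no '.' at all (in which
 * case any colons are assumed to belong to IPv6 addresses). */
static char vl_addr_delimiter(const char *text)
{
	if (strchr(text, ',') || !strchr(text, '.'))
		return ',';
	return ':';
}
```

Under this rule a plain IPv4 list such as "10.0.0.1:10.0.0.2" keeps ':' as its separator, while the IPv6 example from the message is treated as a single address in a ','-separated list.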
2017-11-13	afs: Keep and pass sockaddr_rxrpc addresses rather than in_addr	David Howells
	Keep and pass sockaddr_rxrpc addresses around rather than keeping and passing in_addr addresses to allow for the use of IPv6 and non-standard port numbers in future. This also allows the port and service_id fields to be removed from the afs_call struct. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13	afs: Update the cache index structure	David Howells
	Update the cache index structure in the following ways: (1) Don't use the volume name followed by the volume type as levels in the cache index. Volumes can be renamed. Use the volume ID instead. (2) Don't store the VLDB data for a volume in the tree. If the volume database should be cached locally, then it should be done in a separate tree. (3) Expand the volume ID stored in the cache to 64 bits. (4) Expand the file/vnode ID stored in the cache to 96 bits. (5) Increment the cache structure version number to 1. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13	afs: Add some protocol defs	David Howells
	Add some protocol definitions, including max field lengths, flag defs, an XDR-encoded UUID def, more VL operation IDs and more fileserver abort codes. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13	afs: Push the net ns pointer to more places	David Howells
	Push the network namespace pointer to more places in AFS, including the afs_server structure (which doesn't hold a ref on the netns). In particular, afs_put_cell() now requires a net ns parameter so that it can safely alter the netns after decrementing the cell usage count - the cell will be deallocated by a background thread after being cached for a period, which means that it's not safe to access it after reducing its usage count. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13	afs: Note the cell in the superblock info also	David Howells
	Keep a reference to the cell in the superblock info structure in addition to the volume and net pointers. This will make it easier to clean up in a future patch in which afs_put_volume() will need the cell pointer. Whilst we're at it, make the cell and volume getting functions return a pointer to the object got to make the call sites look neater. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13	afs: Fix server reaping	David Howells
	Fix server reaping and make sure it's all done before we start trying to purge cells, given that servers currently pin cells. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13	afs: Close the rxrpc socket only after purging the servers	David Howells
	Close the rxrpc socket only after we've purged the server records (and also cell and volume records which might refer to servers) so that we can give up the callbacks on each server. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13	afs: Lay the groundwork for supporting network namespaces	David Howells
	Lay the groundwork for supporting network namespaces (netns) to the AFS filesystem by moving various global features to a network-namespace struct (afs_net) and providing an instance of this as a temporary global variable that everything uses via accessor functions for the moment. The following changes have been made: (1) Store the netns in the superblock info. This will be obtained from the mounter's nsproxy on a manual mount and inherited from the parent superblock on an automount. (2) The cell list is made per-netns. It can be viewed through /proc/net/afs/cells and also be modified by writing commands to that file. (3) The local workstation cell is set per-ns in /proc/net/afs/rootcell. This is unset by default. (4) The 'rootcell' module parameter, which sets a cell and VL server list modifies the init net namespace, thereby allowing an AFS root fs to be theoretically used. (5) The volume location lists and the file lock manager are made per-netns. (6) The AF_RXRPC socket and associated I/O bits are made per-ns. The various workqueues remain global for the moment. Changes still to be made: (1) /proc/fs/afs/ should be moved to /proc/net/afs/ and a symlink emplaced from the old name. (2) A per-netns subsys needs to be registered for AFS into which it can store its per-netns data. (3) Rather than the AF_RXRPC socket being opened on module init, it needs to be opened on the creation of a superblock in that netns. (4) The socket needs to be closed when the last superblock using it is destroyed and all outstanding client calls on it have been completed. This prevents a reference loop on the namespace. (5) It is possible that several namespaces will want to use AFS, in which case each one will need its own UDP port. These can either be set through /proc/net/afs/cm_port or the kernel can pick one at random. The init_ns gets 7001 by default. Other issues that need resolving: (1) The DNS keyring needs net-namespacing. (2) Where do upcalls go (eg. DNS request-key upcall)? 
	(3) Need something like open_socket_in_file_ns() syscall so that AFS command line tools attempting to operate on an AFS file/volume have their RPC calls go to the right place. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13	Pass mode to wait_on_atomic_t() action funcs and provide default actions	David Howells
	Make wait_on_atomic_t() pass the TASK_* mode onto its action function as an extra argument and make it 'unsigned int' throughout. Also, consolidate a bunch of identical action functions into a default function that can do the appropriate thing for the mode. Also, change the argument name in the bit_wait*() function declarations to reflect the fact that it's the mode and not the bit number. [Peter Z gives this a grudging ACK, but thinks that the whole atomic_t wait should be done differently, though he's not immediately sure as to how] Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> cc: Ingo Molnar <mingo@kernel.org>
2017-11-13	Merge remote-tracking branch 'tip/timers/core' into afs-next	David Howells
	These AFS patches need the timer_reduce() patch from timers/core. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13	rbd: default to single-major device number scheme	Ilya Dryomov
	It's been 3.5 years, let's turn it on by default. Support in rbd(8) utility goes back to pre-firefly, "rbd map" has been loading the module with single_major=Y ever since. However, if the module is already loaded (whether by hand or at boot time), we end up with single_major=N. Also, some people don't install rbd(8) and use the sysfs interface directly. (With single-major=N, a major number is consumed for every mapping, imposing a limit of ~240 rbd images per host. single-major=Y allows mapping thousands of rbd images on a single machine.) Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2017-11-13	timer/debug: Change /proc/timer_list from 0444 to 0400	Ingo Molnar
	While it uses %pK, there's still few reasons to read this file as non-root. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-13	Merge tag 'asoc-v4.15' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus	Takashi Iwai
	ASoC: Updates for v4.15 The biggest thing this release has been the conversion of the AC97 bus to the driver model, that's been a long time coming so thanks to Robert Jarzmik for his dedication there. Due to there being some AC97 MFD there's a few fairly large changes in input and the MFD layer, mainly to the wm97xx driver. There's also some drivers/drm changes to support the new AMD Stoney platform, these are shared with the DRM subsystem and should be being merged via both. Within the subsystem the overwhelming bulk of the changes is in the Intel drivers which continue to need lots of cleanups and fixes, this release they've also gained support for their open source firmware. There's also some large changes in the core as Morimoto-san continues to mirror operations into the component level in preparation for conversion of drivers to that. - The AC97 bus has finally caught up with the driver model thanks to some dedicated and persistent work from Robert Jarzmik. - Continued work from Morimoto-san on moving us towards being able to use components for everything. - Lots of cleanups for the Intel platform code, including support for their open source audio firmware. - Support for scaling MCLK with sample rate in simple-card. - Support for AMD Stoney platform.
2017-11-13	Merge branch 'for-next' into for-linus	Takashi Iwai
	Pull 4.15 updates to take over the previous urgent fixes. Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-11-13	irqchip/gic-v3: pr_err() strings should end with newlines	Arvind Yadav
	pr_err() messages should end with a new-line to avoid other messages being concatenated. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-11-13	irqchip/s3c24xx: pr_err() strings should end with newlines	Arvind Yadav
	pr_err() messages should end with a new-line to avoid other messages being concatenated. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-11-13	sh: select KBUILD_DEFCONFIG depending on ARCH	Masahiro Yamada
	You cannot select KBUILD_DEFCONFIG depending on any CONFIG option because include/config/auto.conf is not included when building config targets. So, CONFIG_SUPERH32 is never set during the configuration, and cayman_defconfig is always chosen. This commit provides a sensible way to choose shx3/cayman_defconfig. arch/sh/Kconfig sets either SUPERH32 or SUPERH64 depending on the ARCH environment, as follows:
	config SUPERH32
	def_bool ARCH = "sh"
	...
	config SUPERH64
	def_bool ARCH = "sh64"
	It should make sense to choose the default defconfig by ARCH, like arch/sparc/Makefile. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-11-13	kbuild: fix linker feature test macros when cross compiling with Clang	Nick Desaulniers
	I was not seeing my linker flags getting added when using ld-option when cross compiling with Clang. Upon investigation, this seems to be due to a difference in how GCC vs Clang handle cross compilation. GCC is configured at build time to support one backend, that is implicit when compiling. Clang is explicit via the use of `-target <triple>` and ships with all supported backends by default. GNU Make feature test macros that compile then link will always fail when cross compiling with Clang unless Clang's triple is passed along to the compiler. For example:
	$ clang -x c /dev/null -c -o temp.o
	$ aarch64-linux-android/bin/ld -E temp.o
	aarch64-linux-android/bin/ld: unknown architecture of input file `temp.o' is incompatible with aarch64 output
	aarch64-linux-android/bin/ld: warning: cannot find entry symbol _start; defaulting to 0000000000400078
	$ echo $?
	1
	$ clang -target aarch64-linux-android- -x c /dev/null -c -o temp.o
	$ aarch64-linux-android/bin/ld -E temp.o
	aarch64-linux-android/bin/ld: warning: cannot find entry symbol _start; defaulting to 00000000004002e4
	$ echo $?
	0
	This causes conditional checks that invoke $(CC) without the target triple, then $(LD) on the result, to always fail. Suggested-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-11-13	kbuild: shrink .cache.mk when it exceeds 1000 lines	Masahiro Yamada
	The cache files are only cleaned away by "make clean". If you continue incremental builds, the cache files will grow little by little. It is not a big deal in general use cases because compiler flags do not change very often. However, if you build-test various architectures, compilers, and kernel configurations, you will soon end up with huge cache files. When the cache file exceeds 1000 lines, shrink it down to 500 by "tail". The Least Recently Added lines are cut (not the Least Recently Used). I hope it will work well enough. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Douglas Anderson <dianders@chromium.org>
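The trimming rule can be illustrated with a short sketch. The real change does this in the build system with "tail"; the C function below, with hypothetical names, only restates the arithmetic: a file of more than 1000 lines keeps its newest 500.

```c
#include <stddef.h>

#define CACHE_MAX_LINES  1000	/* threshold from the commit message */
#define CACHE_KEEP_LINES 500	/* lines kept after shrinking */

/* Hypothetical helper: given the current number of lines in the
 * cache file, return the zero-based index of the first line that
 * survives.  Older (earlier) lines are dropped, like "tail". */
static size_t cache_first_kept_line(size_t nr_lines)
{
	if (nr_lines <= CACHE_MAX_LINES)
		return 0;	/* small enough: keep everything */
	return nr_lines - CACHE_KEEP_LINES;
}
```

Because only position in the file is considered, an entry that is used on every build is still dropped once 500 newer entries have been appended, which is the Least Recently Added behaviour the message points out.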
2017-11-13	kbuild: do not call cc-option before KBUILD_CFLAGS initialization	Masahiro Yamada
	Some $(call cc-option,...) are invoked very early, even before KBUILD_CFLAGS, etc. are initialized. The returned string from $(call cc-option,...) depends on KBUILD_CPPFLAGS, KBUILD_CFLAGS, and GCC_PLUGINS_CFLAGS. Since they are exported, they are not empty when the top Makefile is recursively invoked. The recursion occurs in several places. For example, the top Makefile invokes itself for silentoldconfig. "make tinyconfig", "make rpm-pkg" are the cases, too. In those cases, the second call of cc-option from the same line runs a different shell command due to non-pristine KBUILD_CFLAGS. To get the same result all the time, KBUILD_* and GCC_PLUGINS_CFLAGS must be initialized before any call of cc-option. This avoids garbage data in the .cache.mk file. Move all calls of cc-option below the config targets because target compiler flags are unnecessary for Kconfig. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Douglas Anderson <dianders@chromium.org>
2017-11-13	kbuild: Cache a few more calls to the compiler	Douglas Anderson
	These are a few stragglers that I left out of the original patch to cache calls to the C compiler ("kbuild: Add a cache for generated variables") because they bleed out into the main Makefile and thus uglify things a little bit. The idea is the same here, though. Signed-off-by: Douglas Anderson <dianders@chromium.org> Tested-by: Ingo Molnar <mingo@kernel.org> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-11-13	kbuild: Add a cache for generated variables	Douglas Anderson
	While timing a "no-op" build of the kernel (incrementally building the kernel even though nothing changed) in the Chrome OS build system I found that it was much slower than I expected. Digging into things a bit, I found that quite a bit of the time was spent invoking the C compiler even though we weren't actually building anything. Currently in the Chrome OS build system the C compiler is called through a number of wrappers (one of which is written in python!) and can take upwards of 100 ms to invoke even if we're not doing anything difficult, so these invocations of the compiler were taking a lot of time. Worse the invocations couldn't seem to take advantage of the multiple cores on my system. Certainly it seems like we could make the compiler invocations in the Chrome OS build system faster, but only to a point. Inherently invoking a program as big as a C compiler is a fairly heavy operation. Thus even if we can speed the compiler calls it made sense to track down what was happening. It turned out that all the compiler invocations were coming from usages like this in the kernel's Makefile: KBUILD_CFLAGS += $(call cc-option,-fno-delete-null-pointer-checks,) Due to the way cc-option and similar statements work the above contains an implicit call to the C compiler. ...and due to the fact that we're storing the result in KBUILD_CFLAGS, a simply expanded variable, the call will happen every time the Makefile is parsed, even if there are no users of KBUILD_CFLAGS. Rather than redoing this computation every time, it makes a lot of sense to cache the result of all of the Makefile's compiler calls just like we do when we compile a ".c" file to a ".o" file. Conceptually this is quite a simple idea. ...and since the calls to invoke the compiler and similar tools are centrally located in the Kbuild.include file this doesn't even need to be super invasive. Implementing the cache in a simple-to-use and efficient way is not quite as simple as it first sounds, though. 
	To get maximum speed we really want the cache in a format that make can natively understand and make doesn't really have an ability to load/parse files. ...but make _can_ import other Makefiles, so the solution is to store the cache in Makefile format. This requires coming up with a valid/unique Makefile variable name for each value to be cached, but that's solvable with some cleverness. After this change, we'll automatically create a ".cache.mk" file that will contain our cached variables. We'll load this on each invocation of make and will avoid recomputing anything that's already in our cache. The cache is stored in such a format that it shouldn't need any invalidation since anything that might change should affect the "key" and any old cached value won't be used. Signed-off-by: Douglas Anderson <dianders@chromium.org> Tested-by: Ingo Molnar <mingo@kernel.org> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-11-13	kbuild: add forward declaration of default target to Makefile.asm-generic	Masahiro Yamada
	$(kbuild-file) and Kbuild.include are included before the default target "all". We will add a target into Kbuild.include. In advance, add a forward declaration of the default target. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Douglas Anderson <dianders@chromium.org>
2017-11-13	MIPS: pci: Make use of the BIT() macro inside the mt7620 driver	John Crispin
	There are a few defines that manually shift a bit. Change these to using the BIT() macro. Signed-off-by: John Crispin <john@phrozen.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/15322/ Signed-off-by: James Hogan <jhogan@kernel.org>
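The kind of change this describes looks roughly like the sketch below. The macro and register names are hypothetical; they are not the defines from the mt7620 driver, and the kernel supplies its own BIT() in its headers.

```c
/* Illustrative stand-in for the kernel's BIT() macro. */
#define SKETCH_BIT(n)		(1UL << (n))

/* Before: the bit is shifted by hand (register name hypothetical). */
#define SKETCH_CTRL_ENABLE_OLD	(1 << 7)

/* After: the same bit expressed through the macro. */
#define SKETCH_CTRL_ENABLE_NEW	SKETCH_BIT(7)
```

Both forms select the same bit; the macro form states the intent more directly and gives the constant an unsigned long type.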
2017-11-13	MIPS: pci: Remove KERN_WARN instance inside the mt7620 driver	John Crispin
	Switch the printk() call to the preferred pr_warn() API. Fixes: 7e5873d3755c ("MIPS: pci: Add MT7620a PCIE driver") Signed-off-by: John Crispin <john@phrozen.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 4.5+ Patchwork: https://patchwork.linux-mips.org/patch/15321/ Signed-off-by: James Hogan <jhogan@kernel.org>
2017-11-13	MIPS: pci: Remove duplicate define in mt7620 driver	John Crispin
	An invalid and duplicate define has gone unnoticed for some time. Let's remove it. The correct define is 3 lines below. Fixes: 7e5873d3755c ("MIPS: pci: Add MT7620a PCIE driver") Signed-off-by: John Crispin <john@phrozen.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/15320/ Signed-off-by: James Hogan <jhogan@kernel.org>
2017-11-13	platform/x86: Revert intel_pmc_ipc: Use MFD framework to create dependent devices	Andy Shevchenko
	Heikki discovered a runtime issue with this patch. Taking into consideration we have no time to test any fix right now, revert the commit 43aaf4f03f063b12bcba2f8b800fdec85e2acc75. Reported-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2017-11-13	powerpc/64s: mm_context.addr_limit is only used on hash	Nicholas Piggin
	Radix keeps no meaningful state in addr_limit, so remove it from radix code and rename to slb_addr_limit to make it clear it applies to hash only. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13	powerpc/64s/radix: Fix 128TB-512TB virtual address boundary case allocation	Nicholas Piggin
	Radix VA space allocations test addresses against mm->task_size which is 512TB, even in cases where the intention is to limit allocation to below 128TB. This results in mmap with a hint address below 128TB but address + length above 128TB succeeding when it should fail (as hash does after the previous patch). Set the high address limit to be considered up front, and base subsequent allocation checks on that consistently. Fixes: f4ea6dcb08ea ("powerpc/mm: Enable mappings above 128TB") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13	powerpc/64s/hash: Allow MAP_FIXED allocations to cross 128TB boundary	Nicholas Piggin
	While mapping hints with a length that cross 128TB are disallowed, MAP_FIXED allocations that cross 128TB are allowed. These are failing on hash (on radix they succeed). Add an additional case for fixed mappings to expand the addr_limit when crossing 128TB. Fixes: f4ea6dcb08ea ("powerpc/mm: Enable mappings above 128TB") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13	powerpc/64s/hash: Fix fork() with 512TB process address space	Nicholas Piggin
	Hash unconditionally resets the addr_limit to default (128TB) when the mm context is initialised. If a process has > 128TB mappings when it forks, the child will not get the 512TB addr_limit, so accesses to valid > 128TB mappings will fail in the child. Fix this by only resetting the addr_limit to default if it was 0. Non zero indicates it was duplicated from the parent (0 means exec()). Fixes: f4ea6dcb08ea ("powerpc/mm: Enable mappings above 128TB") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
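The fix amounts to a conditional default, sketched below. The names are hypothetical and the real code works on the mm context rather than on a bare integer; the 128TB default and the meaning of zero come from the message.

```c
#define SKETCH_TB		(1ULL << 40)
#define SKETCH_DEFAULT_LIMIT	(128 * SKETCH_TB)

/* Hypothetical helper: choose the address limit for a newly
 * initialised mm context.  A value of 0 means a fresh context
 * (exec); a non-zero value was duplicated from the parent (fork)
 * and must be preserved. */
static unsigned long long sketch_init_addr_limit(unsigned long long inherited)
{
	if (inherited == 0)
		return SKETCH_DEFAULT_LIMIT;
	return inherited;
}
```

With the old unconditional reset, a child forked from a parent with a 512TB limit would have come out with 128TB; here it keeps the parent's value.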
2017-11-13	powerpc/64s/hash: Fix 128TB-512TB virtual address boundary case allocation	Nicholas Piggin
	When allocating VA space with a hint that crosses 128TB, the SLB addr_limit variable is not expanded if addr is not > 128TB, but the slice allocation looks at task_size, which is 512TB. This results in slice_check_fit() incorrectly succeeding because the slice_count truncates off bit 128 of the requested mask, so the comparison to the available mask succeeds. Fix this by using mm->context.addr_limit instead of mm->task_size for testing allocation limits. This causes such allocations to fail. Fixes: f4ea6dcb08ea ("powerpc/mm: Enable mappings above 128TB") Cc: stable@vger.kernel.org # v4.12+ Reported-by: Florian Weimer <fweimer@redhat.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13	powerpc/64s/hash: Fix 512T hint detection to use >= 128T	Michael Ellerman
	Currently userspace is able to request mmap() search between 128T-512T by specifying a hint address that is greater than 128T. But that means a hint of 128T exactly will return an address below 128T, which is confusing and wrong. So fix the logic to check the hint is greater than or equal to 128T. Fixes: f4ea6dcb08ea ("powerpc/mm: Enable mappings above 128TB") Cc: stable@vger.kernel.org # v4.12+ Suggested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Suggested-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Split out of Nick's bigger patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
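The boundary condition being fixed can be shown with a one-line predicate. The helper name is hypothetical; only the comparison itself is taken from the message.

```c
#include <stdbool.h>

#define SKETCH_128T	(128ULL << 40)

/* Hypothetical helper: does this mmap() hint ask for a search in
 * the 128T-512T range?  The fix changes the comparison from '>'
 * to '>=' so that a hint of exactly 128T qualifies. */
static bool sketch_hint_wants_high_range(unsigned long long hint)
{
	return hint >= SKETCH_128T;	/* previously: hint > SKETCH_128T */
}
```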
2017-11-13	MIPS: ralink: Fix typo in mt7628 pinmux function	Mathias Kresin
	There is a typo inside the pinmux setup code. The function is called refclk and not reclk. Fixes: 53263a1c6852 ("MIPS: ralink: add mt7628an support") Signed-off-by: Mathias Kresin <dev@kresin.me> Acked-by: John Crispin <john@phrozen.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 3.19+ Patchwork: https://patchwork.linux-mips.org/patch/16047/ Signed-off-by: James Hogan <jhogan@kernel.org>
2017-11-13	MIPS: ralink: Fix MT7628 pinmux	Mathias Kresin
	According to the datasheet the REFCLK pin is shared with GPIO#37 and the PERST pin is shared with GPIO#36. Fixes: 53263a1c6852 ("MIPS: ralink: add mt7628an support") Signed-off-by: Mathias Kresin <dev@kresin.me> Acked-by: John Crispin <john@phrozen.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 3.19+ Patchwork: https://patchwork.linux-mips.org/patch/16046/ Signed-off-by: James Hogan <jhogan@kernel.org>
2017-11-13	powerpc: Fix DABR match on hash based systems	Benjamin Herrenschmidt
	Commit 398a719d34a1 ("powerpc/mm: Update bits used to skip hash_page") mistakenly dropped the DSISR_DABRMATCH bit from the mask of bits tested to skip trying to hash a page. As a result, the DABR matches would no longer be detected. This adds it back. We open code it in the 2 places where it matters rather than fold it into DSISR_BAD_FAULT_32S/64S because this isn't technically a bad fault and while we would never hit it with the current code, I prefer if page_fault_is_bad() didn't trigger on these. Fixes: 398a719d34a1 ("powerpc/mm: Update bits used to skip hash_page") Cc: stable@vger.kernel.org # v4.14 Tested-by: Pedro Miraglia Franco de Carvalho <pedromfc@br.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2017-11-13	libceph: don't WARN() if user tries to add invalid key	Eric Biggers
	The WARN_ON(!key->len) in set_secret() in net/ceph/crypto.c is hit if a user tries to add a key of type "ceph" with an invalid payload as follows (assuming CONFIG_CEPH_LIB=y): echo -e -n '\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' | keyctl padd ceph desc @s This can be hit by fuzzers. As this is merely bad input and not a kernel bug, replace the WARN_ON() with return -EINVAL. Fixes: 7af3ea189a9a ("libceph: stop allocating a new cipher on every crypto request") Cc: <stable@vger.kernel.org> # v4.10+ Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
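The shape of this fix, treating a zero-length key as a user error rather than as a kernel bug, can be sketched as below. The function name is hypothetical and the real set_secret() does considerably more; only the zero-length check is illustrated.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper: validate a user-supplied key length.
 * A zero-length key is merely bad input, so report it with
 * -EINVAL instead of emitting a kernel warning. */
static int sketch_check_key_len(size_t len)
{
	if (len == 0)
		return -EINVAL;
	return 0;
}
```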
2017-11-13	rbd: set discard_alignment to zero	David Disseldorp
	RBD devices are currently incorrectly initialised with the block queue discard_alignment set to the underlying RADOS object size. As per Documentation/ABI/testing/sysfs-block: The discard_alignment parameter indicates how many bytes the beginning of the device is offset from the internal allocation unit's natural alignment. Correcting the discard_alignment parameter from the RADOS object size to zero (the blk_set_default_limits() default) has no effect on how discard requests are propagated through the block layer - @alignment in __blkdev_issue_discard() remains zero. However, it does fix the UNMAP granularity alignment value advertised to SCSI initiators via the Block Limits VPD. Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13	ceph: silence sparse endianness warning in encode_caps_cb	Jeff Layton
	sparse warns: fs/ceph/mds_client.c:2887:34: warning: incorrect type in assignment (different base types) fs/ceph/mds_client.c:2887:34: expected restricted __le32 [assigned] [usertype] flock_len fs/ceph/mds_client.c:2887:34: got int At this point, it's just being used as a flag. It gets overwritten later if the rest of the encoding succeeds. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13	ceph: remove the bump of i_version	Jeff Layton
	Eventually, we'll want to wire cephfs up to use the change attribute that the cluster tracks instead, but for now this is unneeded. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>