linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2017-02-03	firmware: qcom: scm: Fix interrupted SCM calls	Andy Gross
	This patch adds a Qualcomm specific quirk to the arm_smccc_smc call. On Qualcomm ARM64 platforms, the SMC call can return before it has completed. If this occurs, the call can be restarted, but it requires using the returned session ID value from the interrupted SMC call. The quirk stores off the session ID from the interrupted call in the quirk structure so that it can be used by the caller. This patch folds in a fix given by Sricharan R: https://lkml.org/lkml/2016/9/28/272 Signed-off-by: Andy Gross <andy.gross@linaro.org> Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-02-03	arm: kernel: Add SMC structure parameter	Andy Gross
	This patch adds a quirk parameter to the arm_smccc_(smc/hvc) calls. The quirk structure allows for specialized SMC operations due to SoC specific requirements. The current arm_smccc_(smc/hvc) is renamed and macros are used instead to specify the standard arm_smccc_(smc/hvc) or the arm_smccc_(smc/hvc)_quirk function. This patch and partial implementation was suggested by Will Deacon. Signed-off-by: Andy Gross <andy.gross@linaro.org> Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-02-03	Merge branch 'modversions' (modversions fixes for powerpc from Ard)	Linus Torvalds
	Merge kcrctab entry fixes from Ard Biesheuvel: "This is a followup to [0] 'modversions: redefine kcrctab entries as relative CRC pointers', but since relative CRC pointers do not work in modules, and are actually only needed by powerpc with CONFIG_RELOCATABLE=y, I have made it a Kconfig selectable feature instead. First it introduces the MODULE_REL_CRCS Kconfig symbol, and adds the kbuild handling of it, i.e., modpost, genksyms and kallsyms. Then it switches all architectures to 32-bit CRC entries in kcrctab, where all architectures except powerpc with CONFIG_RELOCATABLE=y use absolute ELF symbol references as before" [0] http://marc.info/?l=linux-arch&m=148493613415294&w=2 * emailed patches from Ard Biesheuvel: module: unify absolute krctab definitions for 32-bit and 64-bit modversions: treat symbol CRCs as 32 bit quantities kbuild: modversions: add infrastructure for emitting relative CRCs
2017-02-03	log2: make order_base_2() behave correctly on const input value zero	Ard Biesheuvel
	The function order_base_2() is defined (according to the comment block) as returning zero on input zero, but subsequently passes the input into roundup_pow_of_two(), which is explicitly undefined for input zero. This has gone unnoticed until now, but optimization passes in GCC 7 may produce constant folded function instances where a constant value of zero is passed into order_base_2(), resulting in link errors against the deliberately undefined '____ilog2_NaN'. So update order_base_2() to adhere to its own documented interface. [ See http://marc.info/?l=linux-kernel&m=147672952517795&w=2 and follow-up discussion for more background. The gcc "optimization pass" is really just broken, but now the GCC trunk problem seems to have escaped out of just specially built daily images, so we need to work around it in mainline. - Linus ] Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-03	module: unify absolute krctab definitions for 32-bit and 64-bit	Ard Biesheuvel
	The previous patch introduced a separate inline asm version of the krcrctab declaration template for use with 64-bit architectures, which cannot refer to ELF symbols using 32-bit quantities. This declaration should be equivalent to the C one for 32-bit architectures, but just in case - unify them in a separate patch, which can simply be dropped if it turns out to break anything. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-03	modversions: treat symbol CRCs as 32 bit quantities	Ard Biesheuvel
	The modversion symbol CRCs are emitted as ELF symbols, which allows us to easily populate the kcrctab sections by relying on the linker to associate each kcrctab slot with the correct value. This has a couple of downsides: - Given that the CRCs are treated as memory addresses, we waste 4 bytes for each CRC on 64 bit architectures, - On architectures that support runtime relocation, a R_<arch>_RELATIVE relocation entry is emitted for each CRC value, which identifies it as a quantity that requires fixing up based on the actual runtime load offset of the kernel. This results in corrupted CRCs unless we explicitly undo the fixup (and this is currently being handled in the core module code) - Such runtime relocation entries take up 24 bytes of __init space each, resulting in a x8 overhead in [uncompressed] kernel size for CRCs. Switching to explicit 32 bit values on 64 bit architectures fixes most of these issues, given that 32 bit values are not treated as quantities that require fixing up based on the actual runtime load offset. Note that on some ELF64 architectures [such as PPC64], these 32-bit values are still emitted as [absolute] runtime relocatable quantities, even if the value resolves to a build time constant. Since relative relocations are always resolved at build time, this patch enables MODULE_REL_CRCS on powerpc when CONFIG_RELOCATABLE=y, which turns the absolute CRC references into relative references into .rodata where the actual CRC value is stored. So redefine all CRC fields and variables as u32, and redefine the __CRC_SYMBOL() macro for 64 bit builds to emit the CRC reference using inline assembler (which is necessary since 64-bit C code cannot use 32-bit types to hold memory addresses, even if they are ultimately resolved using values that do not exceed 0xffffffff). To avoid potential problems with legacy 32-bit architectures using legacy toolchains, the equivalent C definition of the kcrctab entry is retained for 32-bit architectures. Note that this mostly reverts commit d4703aefdbc8 ("module: handle ppc64 relocating kcrctabs when CONFIG_RELOCATABLE=y") Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-03	tcp: add tcp_mss_clamp() helper	Eric Dumazet
	Small cleanup factorizing code doing the TCP_MAXSEG clamping. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	ipv6: sr: remove cleanup flag and fix HMAC computation	David Lebrun
	In the latest version of the IPv6 Segment Routing IETF draft [1] the cleanup flag is removed and the flags field length is shrunk from 16 bits to 8 bits. As a consequence, the input of the HMAC computation is modified in a non-backward compatible way by covering the whole octet of flags instead of only the cleanup bit. As such, if an implementation compatible with the latest draft computes the HMAC of an SRH who has other flags set to 1, then the HMAC result would differ from the current implementation. This patch carries those modifications to prevent conflict with other implementations of IPv6 SR. [1] https://tools.ietf.org/html/draft-ietf-6man-segment-routing-header-05 Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	ACPI: Add support for ResourceSource/IRQ domain mapping	Agustin Vega-Frias
	ACPI extended IRQ resources may contain a ResourceSource to specify an alternate interrupt controller. Introduce acpi_irq_get and use it to implement ResourceSource/IRQ domain mapping. The new API is similar to of_irq_get and allows re-initialization of a platform resource from the ACPI extended IRQ resource, and provides proper behavior for probe deferral when the domain is not yet present when called. Acked-by: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-02-03	[media] v4l2-subdev.h: fix v4l2_subdev_pad_config documentation	Baruch Siach
	The fields of v4l2_subdev_pad_config are not pointers. Fixes: 21c29de1d09 ("[media] v4l2-subdev.h: Improve documentation") Cc: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-02-03	Merge remote-tracking branches 'regmap/topic/doc' and 'regmap/topic/rbtree' ↵	Mark Brown
	into regmap-next
2017-02-03	vmw_vmci: switch to pci_irq_alloc_vectors	Christoph Hellwig
	Cleans up the IRQ management code a lot, including removing a lot of state from the per-device structure. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-03	crypto: doc - Fix hash export state information	Rabin Vincent
	The documentation states that crypto_ahash_reqsize() provides the size of the state structure used by crypto_ahash_export(). But it's actually crypto_ahash_statesize() which provides this size. Signed-off-by: Rabin Vincent <rabinv@axis.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-02-03	drm: Improve drm_mm search (and fix topdown allocation) with rbtrees	Chris Wilson
	The drm_mm range manager claimed to support top-down insertion, but it was neither searching for the top-most hole that could fit the allocation request nor fitting the request to the hole correctly. In order to search the range efficiently, we create a secondary index for the holes using either their size or their address. This index allows us to find the smallest hole or the hole at the bottom or top of the range efficiently, whilst keeping the hole stack to rapidly service evictions. v2: Search for holes both high and low. Rename flags to mode. v3: Discover rb_entry_safe() and use it! v4: Kerneldoc for enum drm_mm_insert_mode. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Russell King <rmk+kernel@armlinux.org.uk> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Sean Paul <seanpaul@chromium.org> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Christian Gmeiner <christian.gmeiner@gmail.com> Cc: Rob Clark <robdclark@gmail.com> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: Alexandre Courbot <gnurou@gmail.com> Cc: Eric Anholt <eric@anholt.net> Cc: Sinclair Yeh <syeh@vmware.com> Cc: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> # vmwgfx Reviewed-by: Lucas Stach <l.stach@pengutronix.de> #etnaviv Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170202210438.28702-1-chris@chris-wilson.co.uk
2017-02-03	[media] media: Protect enable_source and disable_source handler code paths	Shuah Khan
	Drivers might try to access and run enable_source and disable_source handlers when the driver that implements these handlers is clearing the handlers during its unregister. Fix the following race condition: process 1 process 2 request video streaming unbind au0828 v4l2 checks if tuner is free ... ... au0828_unregister_media_device() ... ... (doesn't hold graph_mutex) mdev->enable_source = NULL; if (mdev && mdev->enable_source) mdev->disable_source = NULL; mdev->enable_source() (enable_source holds graph_mutex) As shown above enable_source check is done without holding the graph_mutex. If unbind happens to be in progress, au0828 could clear enable_source and disable_source handlers leading to null pointer de-reference. Fix it by protecting enable_source and disable_source set and clear and protecting enable_source and disable_source handler access and the call itself. process 1 process 2 request video streaming unbind au0828 v4l2 checks if tuner is free ... ... au0828_unregister_media_device() ... ... (hold graph_mutex while clearing) mdev->enable_source = NULL; if (mdev) mdev->disable_source = NULL; (hold graph_mutex to check and call enable_source) if (mdev->enable_source) mdev->enable_source() If graph_mutex is held to just heck for handler being null and needs to be released before calling the handler, there will be another window for the handlers to be cleared. Hence, enable_source and disable_source handlers no longer hold the graph_mutex and expect callers to hold it to avoid forcing them release the graph_mutex before calling the handlers. Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-02-03	serdev: add a tty port controller driver	Rob Herring
	Add a serdev controller driver for tty ports. The controller is registered with serdev when tty ports are registered with the TTY core. As the TTY core is built-in only, this has the side effect of making serdev built-in as well. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-By: Sebastian Reichel <sre@kernel.org> Tested-By: Sebastian Reichel <sre@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-03	serdev: Introduce new bus for serial attached devices	Rob Herring
	The serdev bus is designed for devices such as Bluetooth, WiFi, GPS and NFC connected to UARTs on host processors. Tradionally these have been handled with tty line disciplines, rfkill, and userspace glue such as hciattach. This approach has many drawbacks since it doesn't fit into the Linux driver model. Handling of sideband signals, power control and firmware loading are the main issues. This creates a serdev bus with controllers (i.e. host serial ports) and attached devices. Typically, these are point to point connections, but some devices have muxing protocols or a h/w mux is conceivable. Any muxing is not yet supported with the serdev bus. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-By: Sebastian Reichel <sre@kernel.org> Tested-By: Sebastian Reichel <sre@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-03	tty_port: Add port client functions	Rob Herring
	Introduce a client (upward direction) operations struct for tty_port clients. Initially supported operations are for receiving data and write wake-up. This will allow for having clients other than an ldisc. Convert the calls to the ldisc to use the client ops as the default operations. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-By: Sebastian Reichel <sre@kernel.org> Tested-By: Sebastian Reichel <sre@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-02	net: add LINUX_MIB_PFMEMALLOCDROP counter	Eric Dumazet
	Debugging issues caused by pfmemalloc is often tedious. Add a new SNMP counter to more easily diagnose these problems. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Josef Bacik <jbacik@fb.com> Acked-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	introduce the walk_process_tree() helper	Oleg Nesterov
	Add the new helper to walk the process tree, the next patch adds a user. Note that it visits the group leaders only, proc_visitor can do for_each_thread itself or we can trivially extend walk_process_tree() to do this. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2017-02-02	net: phy: marvell: Add support for 88e1545 PHY	Andrew Lunn
	The 88e1545 PHYs are discrete Marvell PHYs, found in a quad package on the zii-devel-b board. Add support for it to the Marvell PHY driver. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-02	unix: add ioctl to open a unix socket file with O_PATH	Andrey Vagin
	This ioctl opens a file to which a socket is bound and returns a file descriptor. The caller has to have CAP_NET_ADMIN in the socket network namespace. Currently it is impossible to get a path and a mount point for a socket file. socket_diag reports address, device ID and inode number for unix sockets. An address can contain a relative path or a file may be moved somewhere. And these properties say nothing about a mount namespace and a mount point of a socket file. With the introduced ioctl, we can get a path by reading /proc/self/fd/X and get mnt_id from /proc/self/fdinfo/X. In CRIU we are going to use this ioctl to dump and restore unix socket. Here is an example how it can be used: $ strace -e socket,bind,ioctl ./test /tmp/test_sock socket(AF_UNIX, SOCK_STREAM, 0) = 3 bind(3, {sa_family=AF_UNIX, sun_path="test_sock"}, 11) = 0 ioctl(3, SIOCUNIXFILE, 0) = 4 ^Z $ ss -a \| grep test_sock u_str LISTEN 0 1 test_sock 17798 * 0 $ ls -l /proc/760/fd/{3,4} lrwx------ 1 root root 64 Feb 1 09:41 3 -> 'socket:[17798]' l--------- 1 root root 64 Feb 1 09:41 4 -> /tmp/test_sock $ cat /proc/760/fdinfo/4 pos: 0 flags: 012000000 mnt_id: 40 $ cat /proc/self/mountinfo \| grep "^40\s" 40 19 0:37 / /tmp rw shared:23 - tmpfs tmpfs rw Signed-off-by: Andrei Vagin <avagin@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-02	net: phy: Marvell: Add mv88e6390 internal PHY	Andrew Lunn
	The mv88e6390 Ethernet switch has internal PHYs. These PHYs don't have an model ID in the ID2 register. So the MDIO driver in the switch intercepts reads to this register, and returns the switch family ID. Extend the Marvell PHY driver by including this ID, and treat the PHY as a 88E1540. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	Merge branch 'nsfs-discovery'	Eric W. Biederman
	Michael Kerrisk <<mtk.manpages@gmail.com> writes: I would like to write code that discovers the namespace setup on a live system. The NS_GET_PARENT and NS_GET_USERNS ioctl() operations added in Linux 4.9 provide much of what I want, but there are still a couple of small pieces missing. Those pieces are added with this patch series. Here's an example program that makes use of the new ioctl() operations. 8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x--- /* ns_capable.c (C) 2016 Michael Kerrisk, <mtk.manpages@gmail.com> Licensed under the GNU General Public License v2 or later. Test whether a process (identified by PID) might (subject to LSM checks) have capabilities in a namespace (identified by a /proc/PID/ns/xxx file). / } while (0) exit(EXIT_FAILURE); } while (0) / Display capabilities sets of process with specified PID / static void show_cap(pid_t pid) { cap_t caps; char cap_string; caps = cap_get_pid(pid); if (caps == NULL) errExit("cap_get_proc"); cap_string = cap_to_text(caps, NULL); if (cap_string == NULL) errExit("cap_to_text"); printf("Capabilities: %s\n", cap_string); } /* Obtain the effective UID pf the process 'pid' by scanning its /proc/PID/file / static uid_t get_euid_of_process(pid_t pid) { char path[PATH_MAX]; char line[1024]; int uid; snprintf(path, sizeof(path), "/proc/%ld/status", (long) pid); FILE fp; fp = fopen(path, "r"); if (fp == NULL) errExit("fopen-/proc/PID/status"); for (;;) { if (fgets(line, sizeof(line), fp) == NULL) { /* Should never happen... / fprintf(stderr, "Failure scanning %s\n", path); exit(EXIT_FAILURE); } if (strstr(line, "Uid:") == line) { sscanf(line, "Uid: %d %d %d %d", &uid); return uid; } } } int main(int argc, char argv[]) { int ns_fd, userns_fd, pid_userns_fd; int nstype; int next_fd; struct stat pid_stat; struct stat target_stat; char pid_str; pid_t pid; char path[PATH_MAX]; if (argc < 2) { fprintf(stderr, "Usage: %s PID [ns-file]\n", argv[0]); fprintf(stderr, "\t'ns-file' is a /proc/PID/ns/xxxx file; " "if omitted, use the namespace\n" "\treferred to by standard input " "(file descriptor 0)\n"); exit(EXIT_FAILURE); } pid_str = argv[1]; pid = atoi(pid_str); if (argc <= 2) { ns_fd = STDIN_FILENO; } else { ns_fd = open(argv[2], O_RDONLY); if (ns_fd == -1) errExit("open-ns-file"); } /* Get the relevant user namespace FD, which is 'ns_fd' if 'ns_fd' refers to a user namespace, otherwise the user namespace that owns 'ns_fd' / nstype = ioctl(ns_fd, NS_GET_NSTYPE); if (nstype == -1) errExit("ioctl-NS_GET_NSTYPE"); if (nstype == CLONE_NEWUSER) { userns_fd = ns_fd; } else { userns_fd = ioctl(ns_fd, NS_GET_USERNS); if (userns_fd == -1) errExit("ioctl-NS_GET_USERNS"); } / Obtain 'stat' info for the user namespace of the specified PID / snprintf(path, sizeof(path), "/proc/%s/ns/user", pid_str); pid_userns_fd = open(path, O_RDONLY); if (pid_userns_fd == -1) errExit("open-PID"); if (fstat(pid_userns_fd, &pid_stat) == -1) errExit("fstat-PID"); / Get 'stat' info for the target user namesapce / if (fstat(userns_fd, &target_stat) == -1) errExit("fstat-PID"); / If the PID is in the target user namespace, then it has whatever capabilities are in its sets. / if (pid_stat.st_dev == target_stat.st_dev && pid_stat.st_ino == target_stat.st_ino) { printf("PID is in target namespace\n"); printf("Subject to LSM checks, it has the following capabilities\n"); show_cap(pid); exit(EXIT_SUCCESS); } / Otherwise, we need to walk through the ancestors of the target user namespace to see if PID is in an ancestor namespace / for (;;) { int f; next_fd = ioctl(userns_fd, NS_GET_PARENT); if (next_fd == -1) { / The error here should be EPERM... / if (errno != EPERM) errExit("ioctl-NS_GET_PARENT"); printf("PID is not in an ancestor namespace\n"); printf("It has no capabilities in the target namespace\n"); exit(EXIT_SUCCESS); } if (fstat(next_fd, &target_stat) == -1) errExit("fstat-PID"); / If the 'stat' info for this user namespace matches the 'stat' * info for 'next_fd', then the PID is in an ancestor namespace / if (pid_stat.st_dev == target_stat.st_dev && pid_stat.st_ino == target_stat.st_ino) break; / Next time round, get the next parent / f = userns_fd; userns_fd = next_fd; close(f); } / At this point, we found that PID is in an ancestor of the target user namespace, and 'userns_fd' refers to the immediate descendant user namespace of PID in the chain of user namespaces from PID to the target user namespace. If the effective UID of PID matches the owner UID of descendant user namespace, then PID has all capabilities in the descendant namespace(s); otherwise, it just has the capabilities that are in its sets. */ uid_t owner_uid, uid; if (ioctl(userns_fd, NS_GET_OWNER_UID, &owner_uid) == -1) { perror("ioctl-NS_GET_OWNER_UID"); exit(EXIT_FAILURE); } uid = get_euid_of_process(pid); printf("PID is in an ancestor namespace\n"); if (owner_uid == uid) { printf("And its effective UID matches the owner " "of the namespace\n"); printf("Subject to LSM checks, PID has all capabilities in " "that namespace!\n"); } else { printf("But its effective UID does not match the owner " "of the namespace\n"); printf("Subject to LSM checks, it has the following capabilities\n"); show_cap(pid); } exit(EXIT_SUCCESS); } 8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x--- Michael Kerrisk (2): nsfs: Add an ioctl() to return the namespace type nsfs: Add an ioctl() to return owner UID of a userns fs/nsfs.c \| 13 +++++++++++++ include/uapi/linux/nsfs.h \| 9 +++++++-- 2 files changed, 20 insertions(+), 2 deletions(-)
2017-02-03	nsfs: Add an ioctl() to return owner UID of a userns	Michael Kerrisk (man-pages)
	I'd like to write code that discovers the user namespace hierarchy on a running system, and also shows who owns the various user namespaces. Currently, there is no way of getting the owner UID of a user namespace. Therefore, this patch adds a new NS_GET_CREATOR_UID ioctl() that fetches the UID (as seen in the user namespace of the caller) of the creator of the user namespace referred to by the specified file descriptor. If the supplied file descriptor does not refer to a user namespace, the operation fails with the error EINVAL. If the owner UID does not have a mapping in the caller's user namespace return the overflow UID as that appears easier to deal with in practice in user-space applications. -- EWB Changed the handling of unmapped UIDs from -EOVERFLOW back to the overflow uid. Per conversation with Michael Kerrisk after examining his test code. Acked-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Michael Kerrisk <mtk-manpages@gmail.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2017-02-02	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller
	All merge conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-02	Merge branch 'perf-urgent-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Five kernel fixes: - an mmap tracing ABI fix for certain mappings - a use-after-free fix, found via KASAN - three CPU hotplug related x86 PMU driver fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/uncore: Make package handling more robust perf/x86/intel/uncore: Clean up hotplug conversion fallout perf/x86/intel/rapl: Make package handling more robust perf/core: Fix PERF_RECORD_MMAP2 prot/flags for anonymous memory perf/core: Fix use-after-free bug
2017-02-02	workqueue: avoid clang warning	Arnd Bergmann
	Building with clang shows lots of warning like: drivers/amba/bus.c:447:8: warning: implicit conversion from 'long long' to 'int' changes value from 4294967248 to -48 [-Wconstant-conversion] static DECLARE_DELAYED_WORK(deferred_retry_work, amba_deferred_retry_func); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/workqueue.h:187:26: note: expanded from macro 'DECLARE_DELAYED_WORK' struct delayed_work n = __DELAYED_WORK_INITIALIZER(n, f, 0) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/workqueue.h:177:10: note: expanded from macro '__DELAYED_WORK_INITIALIZER' .work = __WORK_INITIALIZER((n).work, (f)), \ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/workqueue.h:170:10: note: expanded from macro '__WORK_INITIALIZER' .data = WORK_DATA_STATIC_INIT(), \ ^~~~~~~~~~~~~~~~~~~~~~~ include/linux/workqueue.h:111:39: note: expanded from macro 'WORK_DATA_STATIC_INIT' ATOMIC_LONG_INIT(WORK_STRUCT_NO_POOL \| WORK_STRUCT_STATIC) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~ include/asm-generic/atomic-long.h:32:41: note: expanded from macro 'ATOMIC_LONG_INIT' #define ATOMIC_LONG_INIT(i) ATOMIC_INIT(i) ~~~~~~~~~~~~^~ arch/arm/include/asm/atomic.h:21:27: note: expanded from macro 'ATOMIC_INIT' #define ATOMIC_INIT(i) { (i) } ~ ^ This makes the type cast explicit, which shuts up the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Tejun Heo <tj@kernel.org>
2017-02-02	drm: Fix build when FBDEV_EMULATION is disabled	Gabriel Krisman Bertazi
	Commit be7f735cd5ea ("drm: Rely on mode_config data for fb_helper initialization") broke the build when CONFIG_DRM_FBDEV_EMULATION is disabled because it didn't update the prototype for drm_fb_helper_init in that case. Fixes: be7f735cd5ea ("drm: Rely on mode_config data for fb_helper initialization") Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170202193900.22075-1-krisman@collabora.co.uk
2017-02-02	drm: Rely on mode_config data for fb_helper initialization	Gabriel Krisman Bertazi
	Instead of receiving the num_crts as a parameter, we can read it directly from the mode_config structure. I audited the drivers that invoke this helper and I believe all of them initialize the mode_config struct accordingly, prior to calling the fb_helper. I used the following coccinelle hack to make this transformation, except for the function headers and comment updates. The first and second rules are split because I couldn't find a way to remove the unused temporary variables at the same time I removed the parameter. // <smpl> @r@ expression A,B,D,E; identifier C; @@ ( - drm_fb_helper_init(A,B,C,D) + drm_fb_helper_init(A,B,D) \| - drm_fbdev_cma_init_with_funcs(A,B,C,D,E) + drm_fbdev_cma_init_with_funcs(A,B,D,E) \| - drm_fbdev_cma_init(A,B,C,D) + drm_fbdev_cma_init(A,B,D) ) @@ expression A,B,C,D,E; @@ ( - drm_fb_helper_init(A,B,C,D) + drm_fb_helper_init(A,B,D) \| - drm_fbdev_cma_init_with_funcs(A,B,C,D,E) + drm_fbdev_cma_init_with_funcs(A,B,D,E) \| - drm_fbdev_cma_init(A,B,C,D) + drm_fbdev_cma_init(A,B,D) ) @@ identifier r.C; type T; expression V; @@ - T C; <... when != C - C = V; ...> // </smpl> Changes since v1: - Rebased on top of the tip of drm-misc-next. - Remove mention to sti since a proper fix got merged. Suggested-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170202162640.27261-1-krisman@collabora.co.uk
2017-02-02	soc: samsung: pmu: Add register defines for pad retention control	Marek Szyprowski
	Add PMU defines related to pad retention control. Will be later used by the Exynos pin controller driver. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
2017-02-02	blktrace: make do_blk_trace_setup() static	Omar Sandoval
	This isn't used outside of blktrace.c anymore. Fixes: 62c2a7d969f3 ("block: push BKL into blktrace ioctls") Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-02	block: fix debugfs config conditional in struct request_queue	Omar Sandoval
	The debugfs dentries are only used for CONFIG_BLK_DEBUG_FS, so make them conditional on that instead of CONFIG_DEBUG_FS. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-02	debugfs: add debugfs_lookup()	Omar Sandoval
	We don't always have easy access to the dentry of a file or directory we created in debugfs. Add a helper which allows us to get a dentry we previously created. The motivation for this change is a problem with blktrace and the blk-mq debugfs entries introduced in 07e4fead45e6 ("blk-mq: create debugfs directory tree"). Namely, in some cases, the directory that blktrace needs to create may already exist, but in other cases, it may not. We _could_ rely on a bunch of implied knowledge to decide whether to create the directory or not, but it's much cleaner on our end to just look it up. Signed-off-by: Omar Sandoval <osandov@fb.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-02	Merge branch 'topic/intel-lpe-audio-dp' into for-next	Takashi Iwai
	Merged more patches for Intel LPE audio driver, now to support DP audio.
2017-02-02	Merge tag 'samsung-pinctrl-4.11' into next/dt64	Krzysztof Kozlowski
	Merge the pinctrl header before its usage. Defines for GPIO drive strengths on Exynos5433 and Exynos7, used in Device Tree sources.
2017-02-02	pinctrl: dt-bindings: samsung: Add Exynos7 specific pinctrl macro definitions	Pankaj Dubey
	Exynos7 SoC pinctrl configurations are similar to existing Exynos4/5 except for FSYS1 pinctrl drive strengths. So adding Exynos7 specific FSYS1 blocks pinctrl driver strength related macros which will be used in Exynos7 DTSi files. Signed-off-by: Pankaj Dubey <pankaj.dubey@samsung.com> Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Acked-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
2017-02-02	ext4: move halfmd4 into hash.c directly	Jason A. Donenfeld
	The "half md4" transform should not be used by any new code. And fortunately, it's only used now by ext4. Since ext4 supports several hashing methods, at some point it might be desirable to move to something like SipHash. As an intermediate step, remove half md4 from cryptohash.h and lib, and make it just a local function in ext4's hash.c. There's precedent for doing this; the other function ext can use for its hashes -- TEA -- is also implemented in the same place. Also, by being a local function, this might allow gcc to perform some additional optimizations. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-02-02	scsi, block: fix duplicate bdi name registration crashes	Dan Williams
	Warnings of the following form occur because scsi reuses a devt number while the block layer still has it referenced as the name of the bdi [1]: WARNING: CPU: 1 PID: 93 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x62/0x80 sysfs: cannot create duplicate filename '/devices/virtual/bdi/8:192' [..] Call Trace: dump_stack+0x86/0xc3 __warn+0xcb/0xf0 warn_slowpath_fmt+0x5f/0x80 ? kernfs_path_from_node+0x4f/0x60 sysfs_warn_dup+0x62/0x80 sysfs_create_dir_ns+0x77/0x90 kobject_add_internal+0xb2/0x350 kobject_add+0x75/0xd0 device_add+0x15a/0x650 device_create_groups_vargs+0xe0/0xf0 device_create_vargs+0x1c/0x20 bdi_register+0x90/0x240 ? lockdep_init_map+0x57/0x200 bdi_register_owner+0x36/0x60 device_add_disk+0x1bb/0x4e0 ? __pm_runtime_use_autosuspend+0x5c/0x70 sd_probe_async+0x10d/0x1c0 async_run_entry_fn+0x39/0x170 This is a brute-force fix to pass the devt release information from sd_probe() to the locations where we register the bdi, device_add_disk(), and unregister the bdi, blk_cleanup_queue(). Thanks to Omar for the quick reproducer script [2]. This patch survives where an unmodified kernel fails in a few seconds. [1]: https://marc.info/?l=linux-scsi&m=147116857810716&w=4 [2]: http://marc.info/?l=linux-block&m=148554717109098&w=2 Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Cc: Bart Van Assche <bart.vanassche@sandisk.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Jan Kara <jack@suse.cz> Reported-by: Omar Sandoval <osandov@osandov.com> Tested-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-02	block: Get rid of blk_get_backing_dev_info()	Jan Kara
	blk_get_backing_dev_info() is now a simple dereference. Remove that function and simplify some code around that. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-02	block: Make blk_get_backing_dev_info() safe without open bdev	Jan Kara
	Currenly blk_get_backing_dev_info() is not safe to be called when the block device is not open as bdev->bd_disk is NULL in that case. However inode_to_bdi() uses this function and may be call called from flusher worker or other writeback related functions without bdev being open which leads to crashes such as: [113031.075540] Unable to handle kernel paging request for data at address 0x00000000 [113031.075614] Faulting instruction address: 0xc0000000003692e0 0:mon> t [c0000000fb65f900] c00000000036cb6c writeback_sb_inodes+0x30c/0x590 [c0000000fb65fa10] c00000000036ced4 __writeback_inodes_wb+0xe4/0x150 [c0000000fb65fa70] c00000000036d33c wb_writeback+0x30c/0x450 [c0000000fb65fb40] c00000000036e198 wb_workfn+0x268/0x580 [c0000000fb65fc50] c0000000000f3470 process_one_work+0x1e0/0x590 [c0000000fb65fce0] c0000000000f38c8 worker_thread+0xa8/0x660 [c0000000fb65fd80] c0000000000fc4b0 kthread+0x110/0x130 [c0000000fb65fe30] c0000000000098f0 ret_from_kernel_thread+0x5c/0x6c Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-02	block: Dynamically allocate and refcount backing_dev_info	Jan Kara
	Instead of storing backing_dev_info inside struct request_queue, allocate it dynamically, reference count it, and free it when the last reference is dropped. Currently only request_queue holds the reference but in the following patch we add other users referencing backing_dev_info. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-02	block: Use pointer to backing_dev_info from request_queue	Jan Kara
	We will want to have struct backing_dev_info allocated separately from struct request_queue. As the first step add pointer to backing_dev_info to request_queue and convert all users touching it. No functional changes in this patch. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-02	block: Unhash block device inodes on gendisk destruction	Jan Kara
	Currently, block device inodes stay around after corresponding gendisk hash died until memory reclaim finds them and frees them. Since we will make block device inode pin the bdi, we want to free the block device inode as soon as the device goes away so that bdi does not stay around unnecessarily. Furthermore we need to avoid issues when new device with the same major,minor pair gets created since reusing the bdi structure would be rather difficult in this case. Unhashing block device inode on gendisk destruction nicely deals with these problems. Once last block device inode reference is dropped (which may be directly in del_gendisk()), the inode gets evicted. Furthermore if the major,minor pair gets reallocated, we are guaranteed to get new block device inode even if old block device inode is not yet evicted and thus we avoid issues with possible reuse of bdi. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-02	drm: Provide a driver hook for drm_dev_release()	Chris Wilson
	Some state is coupled into the device lifetime outside of the load/unload timeframe and requires teardown during final unreference from drm_dev_release(). For example, dmabufs hold both a device and module reference and may live longer than expected (i.e. the current pattern of the driver tearing down its state and then releasing a reference to the drm device) and yet touch driver private state when destroyed. v2: Export drm_dev_fini() and move the responsibility for finalizing the drm_device and freeing it to the release callback. (If no callback is provided, the core will call drm_dev_fini() and kfree(dev) as before.) v3: Remember to add drm_dev_fini() to drm_drv.h v4: Tidy language for kerneldoc v5: Cross reference from drm_dev_init() to note that driver->release() allows for arbitrary embedding. v6: Refer to driver data rather than driver state, as state is now becoming associated with the struct drm_atomic_state and friends. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> [danvet: Use the proper reference for struct members, which is &drm_driver.release.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170202093632.31017-1-chris@chris-wilson.co.uk
2017-02-02	netfilter: allow logging from non-init namespaces	Michal Kubeček
	Commit 69b34fb996b2 ("netfilter: xt_LOG: add net namespace support for xt_LOG") disabled logging packets using the LOG target from non-init namespaces. The motivation was to prevent containers from flooding kernel log of the host. The plan was to keep it that way until syslog namespace implementation allows containers to log in a safe way. However, the work on syslog namespace seems to have hit a dead end somewhere in 2013 and there are users who want to use xt_LOG in all network namespaces. This patch allows to do so by setting /proc/sys/net/netfilter/nf_log_all_netns to a nonzero value. This sysctl is only accessible from init_net so that one cannot switch the behaviour from inside a container. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-02	ipvs: free ip_vs_dest structs when refcnt=0	David Windsor
	Currently, the ip_vs_dest cache frees ip_vs_dest objects when their reference count becomes < 0. Aside from not being semantically sound, this is problematic for the new type refcount_t, which will be introduced shortly in a separate patch. refcount_t is the new kernel type for holding reference counts, and provides overflow protection and a constrained interface relative to atomic_t (the type currently being used for kernel reference counts). Per Julian Anastasov: "The problem is that dest_trash currently holds deleted dests (unlinked from RCU lists) with refcnt=0." Changing dest_trash to hold dest with refcnt=1 will allow us to free ip_vs_dest structs when their refcnt=0, in ip_vs_dest_put_and_free(). Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-02	netfilter: merge ctinfo into nfct pointer storage area	Florian Westphal
	After this change conntrack operations (lookup, creation, matching from ruleset) only access one instead of two sk_buff cache lines. This works for normal conntracks because those are allocated from a slab that guarantees hw cacheline or 8byte alignment (whatever is larger) so the 3 bits needed for ctinfo won't overlap with nf_conn addresses. Template allocation now does manual address alignment (see previous change) on arches that don't have sufficent kmalloc min alignment. Some spots intentionally use skb->_nfct instead of skb_nfct() helpers, this is to avoid undoing the skb_nfct() use when we remove untracked conntrack object in the future. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-02	netfilter: guarantee 8 byte minalign for template addresses	Florian Westphal
	The next change will merge skb->nfct pointer and skb->nfctinfo status bits into single skb->_nfct (unsigned long) area. For this to work nf_conn addresses must always be aligned at least on an 8 byte boundary since we will need the lower 3bits to store nfctinfo. Conntrack templates are allocated via kmalloc. kbuild test robot reported BUILD_BUG_ON failed: NFCT_INFOMASK >= ARCH_KMALLOC_MINALIGN on v1 of this patchset, so not all platforms meet this requirement. Do manual alignment if needed, the alignment offset is stored in the nf_conn entry protocol area. This works because templates are not handed off to L4 protocol trackers. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-02	netfilter: add and use nf_ct_set helper	Florian Westphal
	Add a helper to assign a nf_conn entry and the ctinfo bits to an sk_buff. This avoids changing code in followup patch that merges skb->nfct and skb->nfctinfo into skb->_nfct. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>