summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2019-03-23Merge tag 'ceph-for-5.1-rc2' of git://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph fixes from Ilya Dryomov: "A follow up for the new alloc_size logic and a blacklisting fix, marked for stable" * tag 'ceph-for-5.1-rc2' of git://github.com/ceph/ceph-client: rbd: drop wait_for_latest_osdmap() libceph: wait for latest osdmap in ceph_monc_blacklist_add() rbd: set io_min, io_opt and discard_granularity to alloc_size
2019-03-23x86/gart: Exclude GART aperture from kcoreKairui Song
On machines where the GART aperture is mapped over physical RAM, /proc/kcore contains the GART aperture range. Accessing the GART range via /proc/kcore results in a kernel crash. vmcore used to have the same issue, until it was fixed with commit 2a3e83c6f96c ("x86/gart: Exclude GART aperture from vmcore")', leveraging existing hook infrastructure in vmcore to let /proc/vmcore return zeroes when attempting to read the aperture region, and so it won't read from the actual memory. Apply the same workaround for kcore. First implement the same hook infrastructure for kcore, then reuse the hook functions introduced in the previous vmcore fix. Just with some minor adjustment, rename some functions for more general usage, and simplify the hook infrastructure a bit as there is no module usage yet. Suggested-by: Baoquan He <bhe@redhat.com> Signed-off-by: Kairui Song <kasong@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jiri Bohac <jbohac@suse.cz> Acked-by: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Omar Sandoval <osandov@fb.com> Cc: Dave Young <dyoung@redhat.com> Link: https://lkml.kernel.org/r/20190308030508.13548-1-kasong@redhat.com
2019-03-22net/mlx5e: Add VLAN ID rewrite fieldsEli Britstein
Add VLAN ID rewrite fields as a pre-step to support this rewrite. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-22sbitmap: trivial - update comment for sbitmap_deferred_clear_bitShenghui Wang
"sbitmap_batch_clear" should be "sbitmap_deferred_clear" Acked-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-03-22gpio: amd-fch: Fix bogus SPDX identifierThomas Gleixner
spdxcheck.py complains: include/linux/platform_data/gpio/gpio-amd-fch.h: 1:28 Invalid License ID: GPL+ which is correct because GPL+ is not a valid identifier. Of course this could have been caught by checkpatch.pl _before_ submitting or merging the patch. WARNING: 'SPDX-License-Identifier: GPL+ */' is not supported in LICENSES/... #271: FILE: include/linux/platform_data/gpio/gpio-amd-fch.h:1: +/* SPDX-License-Identifier: GPL+ */ Fix it under the assumption that the author meant GPL-2.0+, which makes sense as the corresponding C file is using that identifier. Fixes: e09d168f13f0 ("gpio: AMD G-Series PCH gpio driver") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2019-03-22genetlink: make policy common to familyJohannes Berg
Since maxattr is common, the policy can't really differ sanely, so make it common as well. The only user that did in fact manage to make a non-common policy is taskstats, which has to be really careful about it (since it's still using a common maxattr!). This is no longer supported, but we can fake it using pre_doit. This reduces the size of e.g. nl80211.o (which has lots of commands): text data bss dec hex filename 398745 14323 2240 415308 6564c net/wireless/nl80211.o (before) 397913 14331 2240 414484 65314 net/wireless/nl80211.o (after) -------------------------------- -832 +8 0 -824 Which is obviously just 8 bytes for each command, and an added 8 bytes for the new policy pointer. I'm not sure why the ops list is counted as .text though. Most of the code transformations were done using the following spatch: @ops@ identifier OPS; expression POLICY; @@ struct genl_ops OPS[] = { ..., { - .policy = POLICY, }, ... }; @@ identifier ops.OPS; expression ops.POLICY; identifier fam; expression M; @@ struct genl_family fam = { .ops = OPS, .maxattr = M, + .policy = POLICY, ... }; This also gets rid of devlink_nl_cmd_region_read_dumpit() accessing the cb->data as ops, which we want to change in a later genl patch. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-22softirq: Remove tasklet_hrtimerThomas Gleixner
There are no more users of this interface. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Link: https://lkml.kernel.org/r/20190301224821.29843-4-bigeasy@linutronix.de
2019-03-21bpf: allow helpers to return PTR_TO_SOCK_COMMONLorenz Bauer
It's currently not possible to access timewait or request sockets from eBPF, since there is no way to return a PTR_TO_SOCK_COMMON from a helper. Introduce RET_PTR_TO_SOCK_COMMON to enable this behaviour. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-03-21rhashtable: rename rht_for_each*continue as *from.NeilBrown
The pattern set by list.h is that for_each..continue() iterators start at the next entry after the given one, while for_each..from() iterators start at the given entry. The rht_for_each*continue() iterators are documented as though the start at the 'next' entry, but actually start at the given entry, and they are used expecting that behaviour. So fix the documentation and change the names to *from for consistency with list.h Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-21rhashtable: don't hold lock on first table throughout insertion.NeilBrown
rhashtable_try_insert() currently holds a lock on the bucket in the first table, while also locking buckets in subsequent tables. This is unnecessary and looks like a hold-over from some earlier version of the implementation. As insert and remove always lock a bucket in each table in turn, and as insert only inserts in the final table, there cannot be any races that are not covered by simply locking a bucket in each table in turn. When an insert call reaches that last table it can be sure that there is no matchinf entry in any other table as it has searched them all, and insertion never happens anywhere but in the last table. The fact that code tests for the existence of future_tbl while holding a lock on the relevant bucket ensures that two threads inserting the same key will make compatible decisions about which is the "last" table. This simplifies the code and allows the ->rehash field to be discarded. We still need a way to ensure that a dead bucket_table is never re-linked by rhashtable_walk_stop(). This can be achieved by calling call_rcu() inside the locked region, and checking with rcu_head_after_call_rcu() in rhashtable_walk_stop() to see if the bucket table is empty and dead. Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Reviewed-by: Paul E. McKenney <paulmck@linux.ibm.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-21PCI/MSI: Remove unused mask_msi_irq() and unmask_msi_irq()Bjorn Helgaas
Change pcie-xilinx-nwl.c to use pci_msi_mask_irq() and pci_msi_unmask_irq() like all other PCI host controller drivers. Remove the now-unused mask_msi_irq() and unmask_msi_irq(). Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Michal Simek <michal.simek@xilinx.com> CC: linux-arm-kernel@lists.infradead.org
2019-03-21PCI/MSI: Remove unused __write_msi_msg() and write_msi_msg()Bjorn Helgaas
Remove unused __write_msi_msg() and write_msi_msg(). These were added by 83a18912b0e8 ("PCI/MSI: Rename write_msi_msg() to pci_write_msi_msg()"), they served their purpose, and they're no longer needed. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Jiang Liu <jiang.liu@linux.intel.com> # 83a18912b0e8 author
2019-03-21mtd: spinand: Use the spi-mem dirmap APIBoris Brezillon
Make use of the spi-mem direct mapping API to let advanced controllers optimize read/write operations when they support direct mapping. Signed-off-by: Boris Brezillon <bbrezillon@kernel.org> Cc: Stefan Roese <sr@denx.de> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Tested-by: Stefan Roese <sr@denx.de>
2019-03-21regulator: add regulator_get_linear_step() stub helperArnd Bergmann
The regulator header has empty inline functions for most interfaces, but not regulator_get_linear_step(), which has just grown a user that does not depend on regulators otherwise: drivers/clk/tegra/clk-tegra124-dfll-fcpu.c: In function 'get_alignment_from_regulator': drivers/clk/tegra/clk-tegra124-dfll-fcpu.c:555:19: error: implicit declaration of function 'regulator_get_linear_step'; did you mean 'regulator_get_drvdata'? [-Werror=implicit-function-declaration] align->step_uv = regulator_get_linear_step(reg); ^~~~~~~~~~~~~~~~~~~~~~~~~ regulator_get_drvdata cc1: all warnings being treated as errors scripts/Makefile.build:278: recipe for target 'drivers/clk/tegra/clk-tegra124-dfll-fcpu.o' failed Add the missing stub along the others. Fixes: b3cf8d069505 ("clk: tegra: dfll: CVB calculation alignment with the regulator") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Mark Brown <broonie@kernel.org>
2019-03-21Merge tag 'irqchip-5.1-2' of ↵Thomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent Pull irqchip updates for 5.1 from Marc Zyngier: - irqsteer error handling fix - GICv3 range coalescing fix - stm32 coprocessor coexistence fixes - mbigen MSI teardown fix - non-DT secondary GIC infrastructure removed - various cleanups (brcmstb-l2, mmp) - new DT bindings (r8a774c0)
2019-03-21genirq: Fix typo in comment of IRQD_MOVE_PCNTXTPeter Xu
Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Dou Liyang <douliyangs@gmail.com> Cc: Julien Thierry <julien.thierry@arm.com> Link: https://lkml.kernel.org/r/20190318065123.11862-1-peterx@redhat.com
2019-03-20LSM: add new hook for kernfs node initializationOndrej Mosnacek
This patch introduces a new security hook that is intended for initializing the security data for newly created kernfs nodes, which provide a way of storing a non-default security context, but need to operate independently from mounts (and therefore may not have an associated inode at the moment of creation). The main motivation is to allow kernfs nodes to inherit the context of the parent under SELinux, similar to the behavior of security_inode_init_security(). Other LSMs may implement their own logic for handling the creation of new nodes. This patch also adds helper functions to <linux/kernfs.h> for getting/setting security xattrs of a kernfs node so that LSMs hooks are able to do their job. Other important attributes should be accessible direcly in the kernfs_node fields (in case there is need for more, then new helpers should be added to kernfs.h along with the patch that needs them). Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> [PM: more manual merge fixes] Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-03-20vfs: syscall: Add fspick() to select a superblock for reconfigurationDavid Howells
Provide an fspick() system call that can be used to pick an existing mountpoint into an fs_context which can thereafter be used to reconfigure a superblock (equivalent of the superblock side of -o remount). This looks like: int fd = fspick(AT_FDCWD, "/mnt", FSPICK_CLOEXEC | FSPICK_NO_AUTOMOUNT); fsconfig(fd, FSCONFIG_SET_FLAG, "intr", NULL, 0); fsconfig(fd, FSCONFIG_SET_FLAG, "noac", NULL, 0); fsconfig(fd, FSCONFIG_CMD_RECONFIGURE, NULL, NULL, 0); At the point of fspick being called, the file descriptor referring to the filesystem context is in exactly the same state as the one that was created by fsopen() after fsmount() has been successfully called. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-api@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-03-20vfs: syscall: Add fsmount() to create a mount for a superblockDavid Howells
Provide a system call by which a filesystem opened with fsopen() and configured by a series of fsconfig() calls can have a detached mount object created for it. This mount object can then be attached to the VFS mount hierarchy using move_mount() by passing the returned file descriptor as the from directory fd. The system call looks like: int mfd = fsmount(int fsfd, unsigned int flags, unsigned int attr_flags); where fsfd is the file descriptor returned by fsopen(). flags can be 0 or FSMOUNT_CLOEXEC. attr_flags is a bitwise-OR of the following flags: MOUNT_ATTR_RDONLY Mount read-only MOUNT_ATTR_NOSUID Ignore suid and sgid bits MOUNT_ATTR_NODEV Disallow access to device special files MOUNT_ATTR_NOEXEC Disallow program execution MOUNT_ATTR__ATIME Setting on how atime should be updated MOUNT_ATTR_RELATIME - Update atime relative to mtime/ctime MOUNT_ATTR_NOATIME - Do not update access times MOUNT_ATTR_STRICTATIME - Always perform atime updates MOUNT_ATTR_NODIRATIME Do not update directory access times In the event that fsmount() fails, it may be possible to get an error message by calling read() on fsfd. If no message is available, ENODATA will be reported. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-api@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-03-20vfs: syscall: Add fsconfig() for configuring and managing a contextDavid Howells
Add a syscall for configuring a filesystem creation context and triggering actions upon it, to be used in conjunction with fsopen, fspick and fsmount. long fsconfig(int fs_fd, unsigned int cmd, const char *key, const void *value, int aux); Where fs_fd indicates the context, cmd indicates the action to take, key indicates the parameter name for parameter-setting actions and, if needed, value points to a buffer containing the value and aux can give more information for the value. The following command IDs are proposed: (*) FSCONFIG_SET_FLAG: No value is specified. The parameter must be boolean in nature. The key may be prefixed with "no" to invert the setting. value must be NULL and aux must be 0. (*) FSCONFIG_SET_STRING: A string value is specified. The parameter can be expecting boolean, integer, string or take a path. A conversion to an appropriate type will be attempted (which may include looking up as a path). value points to a NUL-terminated string and aux must be 0. (*) FSCONFIG_SET_BINARY: A binary blob is specified. value points to the blob and aux indicates its size. The parameter must be expecting a blob. (*) FSCONFIG_SET_PATH: A non-empty path is specified. The parameter must be expecting a path object. value points to a NUL-terminated string that is the path and aux is a file descriptor at which to start a relative lookup or AT_FDCWD. (*) FSCONFIG_SET_PATH_EMPTY: As fsconfig_set_path, but with AT_EMPTY_PATH implied. (*) FSCONFIG_SET_FD: An open file descriptor is specified. value must be NULL and aux indicates the file descriptor. (*) FSCONFIG_CMD_CREATE: Trigger superblock creation. (*) FSCONFIG_CMD_RECONFIGURE: Trigger superblock reconfiguration. For the "set" command IDs, the idea is that the file_system_type will point to a list of parameters and the types of value that those parameters expect to take. The core code can then do the parse and argument conversion and then give the LSM and FS a cooked option or array of options to use. Source specification is also done the same way same way, using special keys "source", "source1", "source2", etc.. [!] Note that, for the moment, the key and value are just glued back together and handed to the filesystem. Every filesystem that uses options uses match_token() and co. to do this, and this will need to be changed - but not all at once. Example usage: fd = fsopen("ext4", FSOPEN_CLOEXEC); fsconfig(fd, fsconfig_set_path, "source", "/dev/sda1", AT_FDCWD); fsconfig(fd, fsconfig_set_path_empty, "journal_path", "", journal_fd); fsconfig(fd, fsconfig_set_fd, "journal_fd", "", journal_fd); fsconfig(fd, fsconfig_set_flag, "user_xattr", NULL, 0); fsconfig(fd, fsconfig_set_flag, "noacl", NULL, 0); fsconfig(fd, fsconfig_set_string, "sb", "1", 0); fsconfig(fd, fsconfig_set_string, "errors", "continue", 0); fsconfig(fd, fsconfig_set_string, "data", "journal", 0); fsconfig(fd, fsconfig_set_string, "context", "unconfined_u:...", 0); fsconfig(fd, fsconfig_cmd_create, NULL, NULL, 0); mfd = fsmount(fd, FSMOUNT_CLOEXEC, MS_NOEXEC); or: fd = fsopen("ext4", FSOPEN_CLOEXEC); fsconfig(fd, fsconfig_set_string, "source", "/dev/sda1", 0); fsconfig(fd, fsconfig_cmd_create, NULL, NULL, 0); mfd = fsmount(fd, FSMOUNT_CLOEXEC, MS_NOEXEC); or: fd = fsopen("afs", FSOPEN_CLOEXEC); fsconfig(fd, fsconfig_set_string, "source", "#grand.central.org:root.cell", 0); fsconfig(fd, fsconfig_cmd_create, NULL, NULL, 0); mfd = fsmount(fd, FSMOUNT_CLOEXEC, MS_NOEXEC); or: fd = fsopen("jffs2", FSOPEN_CLOEXEC); fsconfig(fd, fsconfig_set_string, "source", "mtd0", 0); fsconfig(fd, fsconfig_cmd_create, NULL, NULL, 0); mfd = fsmount(fd, FSMOUNT_CLOEXEC, MS_NOEXEC); Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-api@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-03-20vfs: Implement logging through fs_contextDavid Howells
Implement the ability for filesystems to log error, warning and informational messages through the fs_context. These can be extracted by userspace by reading from an fd created by fsopen(). Error messages are prefixed with "e ", warnings with "w " and informational messages with "i ". Inside the kernel, formatted messages are malloc'd but unformatted messages are not copied if they're either in the core .rodata section or in the .rodata section of the filesystem module pinned by fs_context::fs_type. The messages are only good till the fs_type is released. Note that the logging object is shared between duplicated fs_context structures. This is so that such as NFS which do a mount within a mount can get at least some of the errors from the inner mount. Five logging functions are provided for this: (1) void logfc(struct fs_context *fc, const char *fmt, ...); This logs a message into the context. If the buffer is full, the earliest message is discarded. (2) void errorf(fc, fmt, ...); This wraps logfc() to log an error. (3) void invalf(fc, fmt, ...); This wraps errorf() and returns -EINVAL for convenience. (4) void warnf(fc, fmt, ...); This wraps logfc() to log a warning. (5) void infof(fc, fmt, ...); This wraps logfc() to log an informational message. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-03-20vfs: syscall: Add fsopen() to prepare for superblock creationDavid Howells
Provide an fsopen() system call that starts the process of preparing to create a superblock that will then be mountable, using an fd as a context handle. fsopen() is given the name of the filesystem that will be used: int mfd = fsopen(const char *fsname, unsigned int flags); where flags can be 0 or FSOPEN_CLOEXEC. For example: sfd = fsopen("ext4", FSOPEN_CLOEXEC); fsconfig(sfd, FSCONFIG_SET_PATH, "source", "/dev/sda1", AT_FDCWD); fsconfig(sfd, FSCONFIG_SET_FLAG, "noatime", NULL, 0); fsconfig(sfd, FSCONFIG_SET_FLAG, "acl", NULL, 0); fsconfig(sfd, FSCONFIG_SET_FLAG, "user_xattr", NULL, 0); fsconfig(sfd, FSCONFIG_SET_STRING, "sb", "1", 0); fsconfig(sfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0); fsinfo(sfd, NULL, ...); // query new superblock attributes mfd = fsmount(sfd, FSMOUNT_CLOEXEC, MS_RELATIME); move_mount(mfd, "", sfd, AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH); sfd = fsopen("afs", -1); fsconfig(fd, FSCONFIG_SET_STRING, "source", "#grand.central.org:root.cell", 0); fsconfig(fd, FSCONFIG_CMD_CREATE, NULL, NULL, 0); mfd = fsmount(sfd, 0, MS_NODEV); move_mount(mfd, "", sfd, AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH); If an error is reported at any step, an error message may be available to be read() back (ENODATA will be reported if there isn't an error available) in the form: "e <subsys>:<problem>" "e SELinux:Mount on mountpoint not permitted" Once fsmount() has been called, further fsconfig() calls will incur EBUSY, even if the fsmount() fails. read() is still possible to retrieve error information. The fsopen() syscall creates a mount context and hangs it of the fd that it returns. Netlink is not used because it is optional and would make the core VFS dependent on the networking layer and also potentially add network namespace issues. Note that, for the moment, the caller must have SYS_CAP_ADMIN to use fsopen(). Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-api@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-03-20vfs: syscall: Add move_mount(2) to move mounts aroundDavid Howells
Add a move_mount() system call that will move a mount from one place to another and, in the next commit, allow to attach an unattached mount tree. The new system call looks like the following: int move_mount(int from_dfd, const char *from_path, int to_dfd, const char *to_path, unsigned int flags); Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-api@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-03-20vfs: syscall: Add open_tree(2) to reference or clone a mountAl Viro
open_tree(dfd, pathname, flags) Returns an O_PATH-opened file descriptor or an error. dfd and pathname specify the location to open, in usual fashion (see e.g. fstatat(2)). flags should be an OR of some of the following: * AT_PATH_EMPTY, AT_NO_AUTOMOUNT, AT_SYMLINK_NOFOLLOW - same meanings as usual * OPEN_TREE_CLOEXEC - make the resulting descriptor close-on-exec * OPEN_TREE_CLONE or OPEN_TREE_CLONE | AT_RECURSIVE - instead of opening the location in question, create a detached mount tree matching the subtree rooted at location specified by dfd/pathname. With AT_RECURSIVE the entire subtree is cloned, without it - only the part within in the mount containing the location in question. In other words, the same as mount --rbind or mount --bind would've taken. The detached tree will be dissolved on the final close of obtained file. Creation of such detached trees requires the same capabilities as doing mount --bind. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-api@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-03-20block: Unexport blk_mq_add_to_requeue_list()Bart Van Assche
This function is not used outside the block layer core. Hence unexport it. Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-03-20block: add BLK_MQ_POLL_CLASSIC for hybrid poll and return EINVAL for ↵Yufen Yu
unexpected value For q->poll_nsec == -1, means doing classic poll, not hybrid poll. We introduce a new flag BLK_MQ_POLL_CLASSIC to replace -1, which may make code much easier to read. Additionally, since val is an int obtained with kstrtoint(), val can be a negative value other than -1, so return -EINVAL for that case. Thanks to Damien Le Moal for some good suggestion. Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-03-20net: remove 'fallback' argument from dev->ndo_select_queue()Paolo Abeni
After the previous patch, all the callers of ndo_select_queue() provide as a 'fallback' argument netdev_pick_tx. The only exceptions are nested calls to ndo_select_queue(), which pass down the 'fallback' available in the current scope - still netdev_pick_tx. We can drop such argument and replace fallback() invocation with netdev_pick_tx(). This avoids an indirect call per xmit packet in some scenarios (TCP syn, UDP unconnected, XDP generic, pktgen) with device drivers implementing such ndo. It also clean the code a bit. Tested with ixgbe and CONFIG_FCOE=m With pktgen using queue xmit: threads vanilla patched (kpps) (kpps) 1 2334 2428 2 4166 4278 4 7895 8100 v1 -> v2: - rebased after helper's name change Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-20packet: rework packet_pick_tx_queue() to use common code selectionPaolo Abeni
Currently packet_pick_tx_queue() is the only caller of ndo_select_queue() using a fallback argument other than netdev_pick_tx. Leveraging rx queue, we can obtain a similar queue selection behavior using core helpers. After this change, ndo_select_queue() is always invoked with netdev_pick_tx() as fallback. We can change ndo_select_queue() signature in a followup patch, dropping an indirect call per transmitted packet in some scenarios (e.g. TCP syn and XDP generic xmit) This changes slightly how af packet queue selection happens when PACKET_QDISC_BYPASS is set. It's now more similar to plan dev_queue_xmit() tacking in account both XPS and TC mapping. v1 -> v2: - rebased after helper name change RFC -> v1: - initialize sender_cpu to the expected value Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-20net: dev: rename queue selection helpers.Paolo Abeni
With the following patches, we are going to use __netdev_pick_tx() in many modules. Rename it to netdev_pick_tx(), to make it clear is a public API. Also rename the existing netdev_pick_tx() to netdev_core_pick_tx(), to avoid name clashes. Suggested-by: Eric Dumazet <edumazet@google.com> Suggested-by: David Miller <davem@davemloft.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-20spi: pxa2xx: Introduce DMA burst size supportAndy Shevchenko
Some masters may have different DMA burst size than hard coded default. In such case respect the value given by DMA burst size provided via platform data. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Reviewed-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2019-03-20libceph: wait for latest osdmap in ceph_monc_blacklist_add()Ilya Dryomov
Because map updates are distributed lazily, an OSD may not know about the new blacklist for quite some time after "osd blacklist add" command is completed. This makes it possible for a blacklisted but still alive client to overwrite a post-blacklist update, resulting in data corruption. Waiting for latest osdmap in ceph_monc_blacklist_add() and thus using the post-blacklist epoch for all post-blacklist requests ensures that all such requests "wait" for the blacklist to come into force on their respective OSDs. Cc: stable@vger.kernel.org Fixes: 6305a3b41515 ("libceph: support for blacklisting clients") Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2019-03-20pwm: Fix deadlock warning when removing PWM devicePhong Hoang
This patch fixes deadlock warning if removing PWM device when CONFIG_PROVE_LOCKING is enabled. This issue can be reproceduced by the following steps on the R-Car H3 Salvator-X board if the backlight is disabled: # cd /sys/class/pwm/pwmchip0 # echo 0 > export # ls device export npwm power pwm0 subsystem uevent unexport # cd device/driver # ls bind e6e31000.pwm uevent unbind # echo e6e31000.pwm > unbind [ 87.659974] ====================================================== [ 87.666149] WARNING: possible circular locking dependency detected [ 87.672327] 5.0.0 #7 Not tainted [ 87.675549] ------------------------------------------------------ [ 87.681723] bash/2986 is trying to acquire lock: [ 87.686337] 000000005ea0e178 (kn->count#58){++++}, at: kernfs_remove_by_name_ns+0x50/0xa0 [ 87.694528] [ 87.694528] but task is already holding lock: [ 87.700353] 000000006313b17c (pwm_lock){+.+.}, at: pwmchip_remove+0x28/0x13c [ 87.707405] [ 87.707405] which lock already depends on the new lock. [ 87.707405] [ 87.715574] [ 87.715574] the existing dependency chain (in reverse order) is: [ 87.723048] [ 87.723048] -> #1 (pwm_lock){+.+.}: [ 87.728017] __mutex_lock+0x70/0x7e4 [ 87.732108] mutex_lock_nested+0x1c/0x24 [ 87.736547] pwm_request_from_chip.part.6+0x34/0x74 [ 87.741940] pwm_request_from_chip+0x20/0x40 [ 87.746725] export_store+0x6c/0x1f4 [ 87.750820] dev_attr_store+0x18/0x28 [ 87.754998] sysfs_kf_write+0x54/0x64 [ 87.759175] kernfs_fop_write+0xe4/0x1e8 [ 87.763615] __vfs_write+0x40/0x184 [ 87.767619] vfs_write+0xa8/0x19c [ 87.771448] ksys_write+0x58/0xbc [ 87.775278] __arm64_sys_write+0x18/0x20 [ 87.779721] el0_svc_common+0xd0/0x124 [ 87.783986] el0_svc_compat_handler+0x1c/0x24 [ 87.788858] el0_svc_compat+0x8/0x18 [ 87.792947] [ 87.792947] -> #0 (kn->count#58){++++}: [ 87.798260] lock_acquire+0xc4/0x22c [ 87.802353] __kernfs_remove+0x258/0x2c4 [ 87.806790] kernfs_remove_by_name_ns+0x50/0xa0 [ 87.811836] remove_files.isra.1+0x38/0x78 [ 87.816447] sysfs_remove_group+0x48/0x98 [ 87.820971] sysfs_remove_groups+0x34/0x4c [ 87.825583] device_remove_attrs+0x6c/0x7c [ 87.830197] device_del+0x11c/0x33c [ 87.834201] device_unregister+0x14/0x2c [ 87.838638] pwmchip_sysfs_unexport+0x40/0x4c [ 87.843509] pwmchip_remove+0xf4/0x13c [ 87.847773] rcar_pwm_remove+0x28/0x34 [ 87.852039] platform_drv_remove+0x24/0x64 [ 87.856651] device_release_driver_internal+0x18c/0x21c [ 87.862391] device_release_driver+0x14/0x1c [ 87.867175] unbind_store+0xe0/0x124 [ 87.871265] drv_attr_store+0x20/0x30 [ 87.875442] sysfs_kf_write+0x54/0x64 [ 87.879618] kernfs_fop_write+0xe4/0x1e8 [ 87.884055] __vfs_write+0x40/0x184 [ 87.888057] vfs_write+0xa8/0x19c [ 87.891887] ksys_write+0x58/0xbc [ 87.895716] __arm64_sys_write+0x18/0x20 [ 87.900154] el0_svc_common+0xd0/0x124 [ 87.904417] el0_svc_compat_handler+0x1c/0x24 [ 87.909289] el0_svc_compat+0x8/0x18 [ 87.913378] [ 87.913378] other info that might help us debug this: [ 87.913378] [ 87.921374] Possible unsafe locking scenario: [ 87.921374] [ 87.927286] CPU0 CPU1 [ 87.931808] ---- ---- [ 87.936331] lock(pwm_lock); [ 87.939293] lock(kn->count#58); [ 87.945120] lock(pwm_lock); [ 87.950599] lock(kn->count#58); [ 87.953908] [ 87.953908] *** DEADLOCK *** [ 87.953908] [ 87.959821] 4 locks held by bash/2986: [ 87.963563] #0: 00000000ace7bc30 (sb_writers#6){.+.+}, at: vfs_write+0x188/0x19c [ 87.971044] #1: 00000000287991b2 (&of->mutex){+.+.}, at: kernfs_fop_write+0xb4/0x1e8 [ 87.978872] #2: 00000000f739d016 (&dev->mutex){....}, at: device_release_driver_internal+0x40/0x21c [ 87.988001] #3: 000000006313b17c (pwm_lock){+.+.}, at: pwmchip_remove+0x28/0x13c [ 87.995481] [ 87.995481] stack backtrace: [ 87.999836] CPU: 0 PID: 2986 Comm: bash Not tainted 5.0.0 #7 [ 88.005489] Hardware name: Renesas Salvator-X board based on r8a7795 ES1.x (DT) [ 88.012791] Call trace: [ 88.015235] dump_backtrace+0x0/0x190 [ 88.018891] show_stack+0x14/0x1c [ 88.022204] dump_stack+0xb0/0xec [ 88.025514] print_circular_bug.isra.32+0x1d0/0x2e0 [ 88.030385] __lock_acquire+0x1318/0x1864 [ 88.034388] lock_acquire+0xc4/0x22c [ 88.037958] __kernfs_remove+0x258/0x2c4 [ 88.041874] kernfs_remove_by_name_ns+0x50/0xa0 [ 88.046398] remove_files.isra.1+0x38/0x78 [ 88.050487] sysfs_remove_group+0x48/0x98 [ 88.054490] sysfs_remove_groups+0x34/0x4c [ 88.058580] device_remove_attrs+0x6c/0x7c [ 88.062671] device_del+0x11c/0x33c [ 88.066154] device_unregister+0x14/0x2c [ 88.070070] pwmchip_sysfs_unexport+0x40/0x4c [ 88.074421] pwmchip_remove+0xf4/0x13c [ 88.078163] rcar_pwm_remove+0x28/0x34 [ 88.081906] platform_drv_remove+0x24/0x64 [ 88.085996] device_release_driver_internal+0x18c/0x21c [ 88.091215] device_release_driver+0x14/0x1c [ 88.095478] unbind_store+0xe0/0x124 [ 88.099048] drv_attr_store+0x20/0x30 [ 88.102704] sysfs_kf_write+0x54/0x64 [ 88.106359] kernfs_fop_write+0xe4/0x1e8 [ 88.110275] __vfs_write+0x40/0x184 [ 88.113757] vfs_write+0xa8/0x19c [ 88.117065] ksys_write+0x58/0xbc [ 88.120374] __arm64_sys_write+0x18/0x20 [ 88.124291] el0_svc_common+0xd0/0x124 [ 88.128034] el0_svc_compat_handler+0x1c/0x24 [ 88.132384] el0_svc_compat+0x8/0x18 The sysfs unexport in pwmchip_remove() is completely asymmetric to what we do in pwmchip_add_with_polarity() and commit 0733424c9ba9 ("pwm: Unexport children before chip removal") is a strong indication that this was wrong to begin with. We should just move pwmchip_sysfs_unexport() where it belongs, which is right after pwmchip_sysfs_unexport_children(). In that case, we do not need separate functions anymore either. We also really want to remove sysfs irrespective of whether or not the chip will be removed as a result of pwmchip_remove(). We can only assume that the driver will be gone after that, so we shouldn't leave any dangling sysfs files around. This warning disappears if we move pwmchip_sysfs_unexport() to the top of pwmchip_remove(), pwmchip_sysfs_unexport_children(). That way it is also outside of the pwm_lock section, which indeed doesn't seem to be needed. Moving the pwmchip_sysfs_export() call outside of that section also seems fine and it'd be perfectly symmetric with pwmchip_remove() again. So, this patch fixes them. Signed-off-by: Phong Hoang <phong.hoang.wz@renesas.com> [shimoda: revise the commit log and code] Fixes: 76abbdde2d95 ("pwm: Add sysfs interface") Fixes: 0733424c9ba9 ("pwm: Unexport children before chip removal") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Tested-by: Hoan Nguyen An <na-hoan@jinso.co.jp> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Simon Horman <horms+renesas@verge.net.au> Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
2019-03-20reset: Add acquire/release support for arraysThierry Reding
Add implementations that apply acquire and release operations to all reset controls part of a reset control array. Signed-off-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2019-03-20reset: Add acquired flag to of_reset_control_array_get()Thierry Reding
In order to be able to request an array of reset controls in acquired or released mode, add the acquired flag to of_reset_control_array_get() and pass the flag to subsequent calls of __of_reset_control_get(). Signed-off-by: Thierry Reding <treding@nvidia.com> Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2019-03-20reset: add acquired/released state for exclusive reset controlsPhilipp Zabel
There are cases where a driver needs explicit control over a reset line that is exclusively conneted to its device, but this control has to be temporarily handed over to the power domain controller to handle reset requirements during power transitions. Allow multiple exclusive reset controls to be requested in 'released' state for the same physical reset line, only one of which can be acquired at the same time. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-03-19blk-mq: remove unused 'nr_expired' from blk_mq_hw_ctxDongli Zhang
There is no usage of 'nr_expired'. The 'nr_expired' was introduced by commit 1d9bd5161ba3 ("blk-mq: replace timeout synchronization with a RCU and generation based scheme"). Its usage was removed since commit 12f5b9314545 ("blk-mq: Remove generation seqeunce"). Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-03-19USB: usb.h: tweak struct urb to remove wasted spaceGreg Kroah-Hartman
By moving one field around in 'struct urb' we reduce the size of the structure by 8 bytes. Before the patch on x86_64 the overall size of the structure as reported by pahole was: /* size: 192, cachelines: 3, members: 30 */ /* sum members: 184, holes: 2, sum holes: 8 */ After the patch we now have: /* size: 184, cachelines: 3, members: 30 */ /* last cacheline: 56 bytes */ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-19Merge tag 'v5.1-rc1' into spi-5.2Mark Brown
Linux 5.1-rc1
2019-03-19Merge tag 'v5.1-rc1' into regulator-5.2Mark Brown
Linux 5.1-rc1
2019-03-19HID: intel-ish-hid: Add interface function for PCI device pointerSrinivas Pandruvada
Instead of directly accessing PCI device poitner via struct ishtp_cl, create interface function for same. This is required for DMA transfer. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2019-03-19HID: intel-ish-hid: Move functions related to bus and deviceSrinivas Pandruvada
Move function idefinitions related to bus and device to common header file. Also create new function to get fw client id and move ish_hw_reset() from inline to exported function. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2019-03-19HID: intel-ish-hid: Add interface functions for struct ishtp_clSrinivas Pandruvada
Instead of directly accessing members of struct ishtp_cl, create interface functions to access them. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2019-03-19HID: intel-ish-hid: Move the common functions from client.hSrinivas Pandruvada
Move the interface functions in client.h to common include. These are already abstracted well to use as is. Also move any associated structures used by these functions. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2019-03-19HID: intel-ish-hid: Move driver registry functionsSrinivas Pandruvada
Move the driver registry with the ishtp bus to the common interface file, which clients can include. Also rename __ishtp_cl_driver_register() to ishtp_cl_driver_register() and removed define for ishtp_cl_driver_register. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2019-03-19HID: intel-ish-hid: Hide members of struct ishtp_cl_deviceSrinivas Pandruvada
ISH clients don't need to access any field of struct ishtp_cl_device. To avoid this create an interface functions instead where it is required. In the case of ishtp_cl_allocate(), modify the parameters so that the clients don't have to dereference. Clients can also use tracing, here a new interface is added to get the common trace function pointer, instead of direct call. The new interface functions defined in one external header file, named intel-ish-client-if.h. This is the only header files all ISHTP clients must include. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2019-03-18block: add BIO_NO_PAGE_REF flagJens Axboe
If bio_iov_iter_get_pages() is called on an iov_iter that is flagged with NO_REF, then we don't need to add a page reference for the pages that we add. Add BIO_NO_PAGE_REF to track this in the bio, so IO completion knows not to drop a reference to these pages. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-03-18iov_iter: add ITER_BVEC_FLAG_NO_REF flagJens Axboe
For ITER_BVEC, if we're holding on to kernel pages, the caller doesn't need to grab a reference to the bvec pages, and drop that same reference on IO completion. This is essentially safe for any ITER_BVEC, but some use cases end up reusing pages and uncondtionally dropping a page reference on completion. And example of that is sendfile(2), that ends up being a splice_in + splice_out on the pipe pages. Add a flag that tells us it's fine to not grab a page reference to the bvec pages, since that caller knows not to drop a reference when it's done with the pages. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-03-18drivers: Defer probe if firmware is not readyRajan Vaja
Driver needs ZynqMP firmware interface to call EEMI APIs. In case firmware is not ready, dependent drivers should wait until the firmware is ready. Signed-off-by: Rajan Vaja <rajan.vaja@xilinx.com> Signed-off-by: Jolly Shah <jollys@xilinx.com> Signed-off-by: Michal Simek <michal.simek@xilinx.com>
2019-03-18driver core: remove BUS_ATTR()Greg Kroah-Hartman
There are now no in-kernel users of BUS_ATTR() so drop it from device.h Everyone should use BUS_ATTR_RO/RW/WO() from now on. Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-17IB/mlx5: Use mlx5 core to create/destroy a DEVX DCTYishai Hadas
To prevent a hardware memory leak when a DEVX DCT object is destroyed without calling DRAIN DCT before, (e.g. under cleanup flow), need to manage its creation and destruction via mlx5 core. In that case the DRAIN DCT command will be called and only once that it will be completed the DESTROY DCT command will be called. Otherwise, the DESTROY DCT may fail and a hardware leak may occur. As of that change the DRAIN DCT command should not be exposed any more from DEVX, it's managed internally by the driver to work as expected by the device specification. Fixes: 7efce3691d33 ("IB/mlx5: Add obj create and destroy functionality") Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>