summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2020-05-30pstore/ram: Introduce max_reason and convert dump_oopsKees Cook
Now that pstore_register() can correctly pass max_reason to the kmesg dump facility, introduce a new "max_reason" module parameter and "max-reason" Device Tree field. The "dump_oops" module parameter and "dump-oops" Device Tree field are now considered deprecated, but are now automatically converted to their corresponding max_reason values when present, though the new max_reason setting has precedence. For struct ramoops_platform_data, the "dump_oops" member is entirely replaced by a new "max_reason" member, with the only existing user updated in place. Additionally remove the "reason" filter logic from ramoops_pstore_write(), as that is not specifically needed anymore, though technically this is a change in behavior for any ramoops users also setting the printk.always_kmsg_dump boot param, which will cause ramoops to behave as if max_reason was set to KMSG_DUMP_MAX. Co-developed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com> Link: https://lore.kernel.org/lkml/20200515184434.8470-6-keescook@chromium.org/ Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30pstore/platform: Pass max_reason to kmesg dumpPavel Tatashin
Add a new member to struct pstore_info for passing information about kmesg dump maximum reason. This allows a finer control of what kmesg dumps are sent to pstore storage backends. Those backends that do not explicitly set this field (keeping it equal to 0), get the default behavior: store only Oopses and Panics, or everything if the printk.always_kmsg_dump boot param is set. Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com> Link: https://lore.kernel.org/lkml/20200515184434.8470-5-keescook@chromium.org/ Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30printk: Introduce kmsg_dump_reason_str()Kees Cook
The pstore subsystem already had a private version of this function. With the coming addition of the pstore/zone driver, this needs to be shared. As it really should live with printk, move it there instead. Link: https://lore.kernel.org/lkml/20200515184434.8470-4-keescook@chromium.org/ Acked-by: Petr Mladek <pmladek@suse.com> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30printk: honor the max_reason field in kmsg_dumperPavel Tatashin
kmsg_dump() allows to dump kmesg buffer for various system events: oops, panic, reboot, etc. It provides an interface to register a callback call for clients, and in that callback interface there is a field "max_reason", but it was getting ignored when set to any "reason" higher than KMSG_DUMP_OOPS unless "always_kmsg_dump" was passed as kernel parameter. Allow clients to actually control their "max_reason", and keep the current behavior when "max_reason" is not set. Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com> Link: https://lore.kernel.org/lkml/20200515184434.8470-3-keescook@chromium.org/ Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30printk: Collapse shutdown types into a single dump reasonKees Cook
To turn the KMSG_DUMP_* reasons into a more ordered list, collapse the redundant KMSG_DUMP_(RESTART|HALT|POWEROFF) reasons into KMSG_DUMP_SHUTDOWN. The current users already don't meaningfully distinguish between them, so there's no need to, as discussed here: https://lore.kernel.org/lkml/CA+CK2bAPv5u1ih5y9t5FUnTyximtFCtDYXJCpuyjOyHNOkRdqw@mail.gmail.com/ Link: https://lore.kernel.org/lkml/20200515184434.8470-2-keescook@chromium.org/ Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30pstore/platform: Switch pstore_info::name to constKees Cook
In order to more cleanly pass around backend names, make the "name" member const. This means the module param needs to be dynamic (technically, it was before, so this actually cleans up a minor memory leak if a backend was specified and then gets unloaded.) Link: https://lore.kernel.org/lkml/20200510202436.63222-3-keescook@chromium.org/ Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30Merge tag 'irqchip-5.8' of ↵Thomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core Pull irqchip updates from Marc Zyngier: - A few new drivers for the Loongson MIPS platform (HTVEC, PIC, MSI) - A cleanup of the __irq_domain_add() API - A cleanup of the IRQ simulator to actually use some of the irq infrastructure - Some fixes for the Sifive PLIC when used in a multi-controller context - Fixes for the GICv3 ITS to spread interrupts according to the load of each CPU, and to honor managed interrupts - Numerous cleanups and documentation fixes
2020-05-29net/mlx5: IPSec: Fix incorrect type for spiSaeed Mahameed
spi is __be32, fix that. Fixes sparse warning: drivers/net/ethernet/mellanox/mlx5/core/accel/ipsec.c:74:64 warning: incorrect type Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-29net/mlx5: cmd: Fix memset with byte count warningSaeed Mahameed
Fix sparse warning: drivers/net/ethernet/mellanox/mlx5/core/cmd.c:1949:15: warning: memset with byte count of 271720 mlx5_cmd_stats array is too big to be held inline in mlx5_cmd. Allocate it separately. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-29net: Make mpls_entry_encode() available for generic usersEli Cohen
Move mpls_entry_encode() from net/mpls/internal.h to include/net/mpls.h and make it available for other users. Specifically, hardware driver that offload MPLS can benefit from that. Suggested-by: Jakub Kicinski <kuba@kernel.org> Suggested-by: David Ahern <dsahern@gmail.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-29exec: Compute file based creds only onceEric W. Biederman
Move the computation of creds from prepare_binfmt into begin_new_exec so that the creds need only be computed once. This is just code reorganization no semantic changes of any kind are made. Moving the computation is safe. I have looked through the kernel and verified none of the binfmts look at bprm->cred directly, and that there are no helpers that look at bprm->cred indirectly. Which means that it is not a problem to compute the bprm->cred later in the execution flow as it is not used until it becomes current->cred. A new function bprm_creds_from_file is added to contain the work that needs to be done. bprm_creds_from_file first computes which file bprm->executable or most likely bprm->file that the bprm->creds will be computed from. The funciton bprm_fill_uid is updated to receive the file instead of accessing bprm->file. The now unnecessary work needed to reset the bprm->cred->euid, and bprm->cred->egid is removed from brpm_fill_uid. A small comment to document that bprm_fill_uid now only deals with the work to handle suid and sgid files. The default case is already heandled by prepare_exec_creds. The function security_bprm_repopulate_creds is renamed security_bprm_creds_from_file and now is explicitly passed the file from which to compute the creds. The documentation of the bprm_creds_from_file security hook is updated to explain when the hook is called and what it needs to do. The file is passed from cap_bprm_creds_from_file into get_file_caps so that the caps are computed for the appropriate file. The now unnecessary work in cap_bprm_creds_from_file to reset the ambient capabilites has been removed. A small comment to document that the work of cap_bprm_creds_from_file is to read capabilities from the files secureity attribute and derive capabilities from the fact the user had uid 0 has been added. Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-29exec: Add a per bprm->file version of per_clearEric W. Biederman
There is a small bug in the code that recomputes parts of bprm->cred for every bprm->file. The code never recomputes the part of clear_dangerous_personality_flags it is responsible for. Which means that in practice if someone creates a sgid script the interpreter will not be able to use any of: READ_IMPLIES_EXEC ADDR_NO_RANDOMIZE ADDR_COMPAT_LAYOUT MMAP_PAGE_ZERO. This accentially clearing of personality flags probably does not matter in practice because no one has complained but it does make the code more difficult to understand. Further remaining bug compatible prevents the recomputation from being removed and replaced by simply computing bprm->cred once from the final bprm->file. Making this change removes the last behavior difference between computing bprm->creds from the final file and recomputing bprm->cred several times. Which allows this behavior change to be justified for it's own reasons, and for any but hunts looking into why the behavior changed to wind up here instead of in the code that will follow that computes bprm->cred from the final bprm->file. This small logic bug appears to have existed since the code started clearing dangerous personality bits. History Tree: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Fixes: 1bb0fa189c6a ("[PATCH] NX: clean up legacy binary support") Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-30ASoC: soc-card: add snd_soc_card_remove_dai_link()Kuninori Morimoto
Card related function should be implemented at soc-card now. This patch adds it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87mu5szv2h.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: add snd_soc_card_add_dai_link()Kuninori Morimoto
Card related function should be implemented at soc-card now. This patch adds it. This patch adds missing return when error case. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87o8q8zv2m.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: add snd_soc_card_set_bias_level_post()Kuninori Morimoto
Card related function should be implemented at soc-card now. This patch adds it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87pnaozv2s.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: add snd_soc_card_set_bias_level()Kuninori Morimoto
Card related function should be implemented at soc-card now. This patch adds it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87sgfkzv4g.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: add snd_soc_card_remove()Kuninori Morimoto
Card related function should be implemented at soc-card now. This patch adds it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87tv00zv4p.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: add snd_soc_card_late_probe()Kuninori Morimoto
Card related function should be implemented at soc-card now. This patch adds it. card has "card->probe" and "card->late_probe" callbacks, and "late_probe" callback is called after "probe". This means, we can set "card->probed" flag afer "late_probe" for all cases. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87v9kgzv4w.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: add snd_soc_card_probe()Kuninori Morimoto
Card related function should be implemented at soc-card now. This patch adds it. One note here is that card has "card->probe" and "card->late_probe" callbacks. Because it needs to care "late_probe", "card->probed" flag is set under if (card->probe) at snd_soc_card_probe(). Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87wo4wzv54.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: add probed bit field to snd_soc_cardKuninori Morimoto
We already have bit field to control snd_soc_card. Let's add "probed" field on it instead of local variable. One note here is that soc_cleanup_card_resources() will be called as (A) formal cleanup or as (B) error handling, thus, it needs to distinguish these. In (A) case, card will have "instantiated" flag if all probe callback functions were called without error. Thus, snd_soc_unbind_card() is using it to judging card was probed. But this this patch removes it, because it is no longer needed. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87r1v4zv36.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: add snd_soc_card_resume_post()Kuninori Morimoto
Card related function should be implemented at soc-card now. This patch adds it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87y2pczv5d.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: add snd_soc_card_resume_pre()Kuninori Morimoto
Card related function should be implemented at soc-card now. This patch adds it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87zh9szv5k.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: add snd_soc_card_suspend_post()Kuninori Morimoto
Card related function should be implemented at soc-card now. This patch adds it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/871rn425j3.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: add snd_soc_card_suspend_pre()Kuninori Morimoto
Card related function should be implemented at soc-card now. This patch adds it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87367k25jc.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: move snd_soc_card_subclass to soc-cardKuninori Morimoto
Card related function should be implemented at soc-card now. This patch moves it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/874ks025jn.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: move snd_soc_card_get_codec_dai() to soc-cardKuninori Morimoto
Card related function should be implemented at soc-card now. This patch moves it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/875zcg25jv.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: move snd_soc_card_set/get_drvdata() to soc-cardKuninori Morimoto
Card related function should be implemented at soc-card now. This patch moves it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/877dww25k4.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: move snd_soc_card_jack_new() to soc-cardKuninori Morimoto
Card related function should be implemented at soc-card now. This patch moves it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/878shc25kc.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: move snd_soc_card_get_kcontrol() to soc-cardKuninori Morimoto
Card related function should be implemented at soc-card now. This patch moves it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87a71s25kj.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: add soc-card.cKuninori Morimoto
Current ALSA SoC has some snd_soc_card_xxx() functions, and card->xxx() callbacks. But, it is implemented randomly at random place. To collect all card related functions into one place, this patch creats new soc-card.c. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87blm825kt.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc.h: convert bool to bit field for snd_soc_cardKuninori Morimoto
snd_soc_card has many bool, but it can be bit field. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87d06o25l2.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-29tcp: tcp_init_buffer_space can be staticFlorian Westphal
As of commit 98fa6271cfcb ("tcp: refactor setting the initial congestion window") this is called only from tcp_input.c, so it can be static. Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29Merge branch 'mlx5-next' of ↵Saeed Mahameed
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux net/mlx5: Add ability to read and write ECE options net/mlx5: Add support for RDMA TX FT headers modifying net/mlx5: Move iseg access helper routines close to mlx5_core driver net/mlx5: Cleanup mlx5_ifc_fte_match_set_misc2_bits net/mlx5: Add support in forward to namespace {IB/net}/mlx5: Simplify don't trap code net/mlx5: Replace zero-length array with flexible-array Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-29Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: "Nothing profound here, just a last set of long standing bug fixes: - Incorrect error unwind in qib and pvrdma - User triggerable NULL pointer crash in mlx5 with ODP prefetch - syzkaller RCU race in uverbs - Rare double free crash in ipoib" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: IB/ipoib: Fix double free of skb in case of multicast traffic in CM mode RDMA/core: Fix double destruction of uobject RDMA/pvrdma: Fix missing pci disable in pvrdma_pci_probe() RDMA/mlx5: Fix NULL pointer dereference in destroy_prefetch_work IB/qib: Call kobject_put() when kobject_init_and_add() fails
2020-05-29default csum_and_copy_to_user(): don't bother with access_ok()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-29take the dummy csum_and_copy_from_user() into net/checksum.hAl Viro
now that can be done conveniently - all non-trivial cases have _HAVE_ARCH_COPY_AND_CSUM_FROM_USER defined, so the fallback in net/checksum.h is used only for dummy (copy_from_user, then csum_partial) implementation. Allowing us to get rid of all dummy instances, both of csum_and_copy_from_user() and csum_partial_copy_from_user(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-29net: remove kernel_setsockoptChristoph Hellwig
No users left. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29net: add a new bind_add methodChristoph Hellwig
The SCTP protocol allows to bind multiple address to a socket. That feature is currently only exposed as a socket option. Add a bind_add method struct proto that allows to bind additional addresses, and switch the dlm code to use the method instead of going through the socket option from kernel space. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29sctp: add sctp_sock_set_nodelayChristoph Hellwig
Add a helper to directly set the SCTP_NODELAY sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2020-05-29 1) Several fixes for ESP gro/gso in transport and beet mode when IPv6 extension headers are present. From Xin Long. 2) Fix a wrong comment on XFRMA_OFFLOAD_DEV. From Antony Antony. 3) Fix sk_destruct callback handling on ESP in TCP encapsulation. From Sabrina Dubroca. 4) Fix a use after free in xfrm_output_gso when used with vxlan. From Xin Long. 5) Fix secpath handling of VTI when used wiuth IPCOMP. From Xin Long. 6) Fix an oops when deleting a x-netns xfrm interface. From Nicolas Dichtel. 7) Fix a possible warning on policy updates. We had a case where it was possible to add two policies with the same lookup keys. From Xin Long. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2020-05-29 1) Add IPv6 encapsulation support for ESP over UDP and TCP. From Sabrina Dubroca. 2) Remove unneeded reference when initializing xfrm interfaces. From Nicolas Dichtel. 3) Remove some indirect calls from the state_afinfo. From Florian Westphal. Please note that this pull request has two merge conflicts between commit: 0c922a4850eb ("xfrm: Always set XFRM_TRANSFORMED in xfrm{4,6}_output_finish") from Linus' tree and commit: 2ab6096db2f1 ("xfrm: remove output_finish indirection from xfrm_state_afinfo") from the ipsec-next tree. and between commit: 3986912f6a9a ("ipv6: move SIOCADDRT and SIOCDELRT handling into ->compat_ioctl") from the net-next tree and commit: 0146dca70b87 ("xfrm: add support for UDPv6 encapsulation of ESP") from the ipsec-next tree. Both conflicts can be resolved as done in linux-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29RDMA/core: Introduce shared CQ pool APIYamin Friedman
Allow a ULP to ask the core to provide a completion queue based on a least-used search on a per-device CQ pools. The device CQ pools grow in a lazy fashion when more CQs are requested. This feature reduces the amount of interrupts when using many QPs. Using shared CQs allows for more effcient completion handling. It also reduces the amount of overhead needed for CQ contexts. Test setup: Intel(R) Xeon(R) Platinum 8176M CPU @ 2.10GHz servers. Running NVMeoF 4KB read IOs over ConnectX-5EX across Spectrum switch. TX-depth = 32. The patch was applied in the nvme driver on both the target and initiator. Four controllers are accessed from each core. In the current test case we have exposed sixteen NVMe namespaces using four different subsystems (four namespaces per subsystem) from one NVM port. Each controller allocated X queues (RDMA QPs) and attached to Y CQs. Before this series we had X == Y, i.e for four controllers we've created total of 4X QPs and 4X CQs. In the shared case, we've created 4X QPs and only X CQs which means that we have four controllers that share a completion queue per core. Until fourteen cores there is no significant change in performance and the number of interrupts per second is less than a million in the current case. ================================================== |Cores|Current KIOPs |Shared KIOPs |improvement| |-----|---------------|--------------|-----------| |14 |2332 |2723 |16.7% | |-----|---------------|--------------|-----------| |20 |2086 |2712 |30% | |-----|---------------|--------------|-----------| |28 |1971 |2669 |35.4% | |================================================= |Cores|Current avg lat|Shared avg lat|improvement| |-----|---------------|--------------|-----------| |14 |767us |657us |14.3% | |-----|---------------|--------------|-----------| |20 |1225us |943us |23% | |-----|---------------|--------------|-----------| |28 |1816us |1341us |26.1% | ======================================================== |Cores|Current interrupts|Shared interrupts|improvement| |-----|------------------|-----------------|-----------| |14 |1.6M/sec |0.4M/sec |72% | |-----|------------------|-----------------|-----------| |20 |2.8M/sec |0.6M/sec |72.4% | |-----|------------------|-----------------|-----------| |28 |2.9M/sec |0.8M/sec |63.4% | ==================================================================== |Cores|Current 99.99th PCTL lat|Shared 99.99th PCTL lat|improvement| |-----|------------------------|-----------------------|-----------| |14 |67ms |6ms |90.9% | |-----|------------------------|-----------------------|-----------| |20 |5ms |6ms |-10% | |-----|------------------------|-----------------------|-----------| |28 |8.7ms |6ms |25.9% | |=================================================================== Performance improvement with sixteen disks (sixteen CQs per core) is comparable. Link: https://lore.kernel.org/r/1590568495-101621-3-git-send-email-yaminf@mellanox.com Signed-off-by: Yamin Friedman <yaminf@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-29RDMA/core: Add protection for shared CQs used by ULPsYamin Friedman
A pre-step for adding shared CQs. Add the infrastructure to prevent shared CQ users from altering the CQ configurations. For now all cqs are marked as private (non-shared). The core driver should use the new force functions to perform resize/destroy/moderation changes that are not allowed for users of shared CQs. Link: https://lore.kernel.org/r/1590568495-101621-2-git-send-email-yaminf@mellanox.com Signed-off-by: Yamin Friedman <yaminf@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-29RDMA/core: Use offsetofend() instead of open codingJason Gunthorpe
No reason to open code this. Link: https://lore.kernel.org/r/0-v1-0bc346e08476+585-drop_offsetofend_jgg@mellanox.com Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-29blk-mq: drain I/O when all CPUs in a hctx are offlineMing Lei
Most of blk-mq drivers depend on managed IRQ's auto-affinity to setup up queue mapping. Thomas mentioned the following point[1]: "That was the constraint of managed interrupts from the very beginning: The driver/subsystem has to quiesce the interrupt line and the associated queue _before_ it gets shutdown in CPU unplug and not fiddle with it until it's restarted by the core when the CPU is plugged in again." However, current blk-mq implementation doesn't quiesce hw queue before the last CPU in the hctx is shutdown. Even worse, CPUHP_BLK_MQ_DEAD is a cpuhp state handled after the CPU is down, so there isn't any chance to quiesce the hctx before shutting down the CPU. Add new CPUHP_AP_BLK_MQ_ONLINE state to stop allocating from blk-mq hctxs where the last CPU goes away, and wait for completion of in-flight requests. This guarantees that there is no inflight I/O before shutting down the managed IRQ. Add a BLK_MQ_F_STACKING and set it for dm-rq and loop, so we don't need to wait for completion of in-flight requests from these drivers to avoid a potential dead-lock. It is safe to do this for stacking drivers as those do not use interrupts at all and their I/O completions are triggered by underlying devices I/O completion. [1] https://lore.kernel.org/linux-block/alpine.DEB.2.21.1904051331270.1802@nanos.tec.linutronix.de/ [hch: different retry mechanism, merged two patches, minor cleanups] Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-29blk-mq: remove the bio argument to ->prepare_requestChristoph Hellwig
None of the I/O schedulers actually needs it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-29blk-mq: blk-mq: provide forced completion methodKeith Busch
Drivers may need to bypass error injection for error recovery. Rename __blk_mq_complete_request() to blk_mq_force_complete_rq() and export that function so drivers may skip potential fake timeouts after they've reclaimed lost requests. Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-29regulator: core: Add regulator bypass trace pointsCharles Keepax
Add new trace points for the start and end of enabling bypass on a regulator, to allow monitoring of when regulators are moved into bypass and how long that takes. Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20200529152216.9671-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-29PM: runtime: Replace pm_runtime_callbacks_present()Rafael J. Wysocki
The name of pm_runtime_callbacks_present() is confusing, because it suggests that the device has PM-runtime callbacks if 'true' is returned by that function, but in fact that may not be the case, so replace it with pm_runtime_has_no_callbacks() which is not ambiguous. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-05-29Merge tag 'v5.7-rc7' into x86/amdJoerg Roedel
Linux 5.7-rc7