summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2020-05-29net: Make mpls_entry_encode() available for generic usersEli Cohen
Move mpls_entry_encode() from net/mpls/internal.h to include/net/mpls.h and make it available for other users. Specifically, hardware driver that offload MPLS can benefit from that. Suggested-by: Jakub Kicinski <kuba@kernel.org> Suggested-by: David Ahern <dsahern@gmail.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-29exec: Compute file based creds only onceEric W. Biederman
Move the computation of creds from prepare_binfmt into begin_new_exec so that the creds need only be computed once. This is just code reorganization no semantic changes of any kind are made. Moving the computation is safe. I have looked through the kernel and verified none of the binfmts look at bprm->cred directly, and that there are no helpers that look at bprm->cred indirectly. Which means that it is not a problem to compute the bprm->cred later in the execution flow as it is not used until it becomes current->cred. A new function bprm_creds_from_file is added to contain the work that needs to be done. bprm_creds_from_file first computes which file bprm->executable or most likely bprm->file that the bprm->creds will be computed from. The funciton bprm_fill_uid is updated to receive the file instead of accessing bprm->file. The now unnecessary work needed to reset the bprm->cred->euid, and bprm->cred->egid is removed from brpm_fill_uid. A small comment to document that bprm_fill_uid now only deals with the work to handle suid and sgid files. The default case is already heandled by prepare_exec_creds. The function security_bprm_repopulate_creds is renamed security_bprm_creds_from_file and now is explicitly passed the file from which to compute the creds. The documentation of the bprm_creds_from_file security hook is updated to explain when the hook is called and what it needs to do. The file is passed from cap_bprm_creds_from_file into get_file_caps so that the caps are computed for the appropriate file. The now unnecessary work in cap_bprm_creds_from_file to reset the ambient capabilites has been removed. A small comment to document that the work of cap_bprm_creds_from_file is to read capabilities from the files secureity attribute and derive capabilities from the fact the user had uid 0 has been added. Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-29exec: Add a per bprm->file version of per_clearEric W. Biederman
There is a small bug in the code that recomputes parts of bprm->cred for every bprm->file. The code never recomputes the part of clear_dangerous_personality_flags it is responsible for. Which means that in practice if someone creates a sgid script the interpreter will not be able to use any of: READ_IMPLIES_EXEC ADDR_NO_RANDOMIZE ADDR_COMPAT_LAYOUT MMAP_PAGE_ZERO. This accentially clearing of personality flags probably does not matter in practice because no one has complained but it does make the code more difficult to understand. Further remaining bug compatible prevents the recomputation from being removed and replaced by simply computing bprm->cred once from the final bprm->file. Making this change removes the last behavior difference between computing bprm->creds from the final file and recomputing bprm->cred several times. Which allows this behavior change to be justified for it's own reasons, and for any but hunts looking into why the behavior changed to wind up here instead of in the code that will follow that computes bprm->cred from the final bprm->file. This small logic bug appears to have existed since the code started clearing dangerous personality bits. History Tree: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Fixes: 1bb0fa189c6a ("[PATCH] NX: clean up legacy binary support") Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-30ASoC: soc-card: add snd_soc_card_remove_dai_link()Kuninori Morimoto
Card related function should be implemented at soc-card now. This patch adds it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87mu5szv2h.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: add snd_soc_card_add_dai_link()Kuninori Morimoto
Card related function should be implemented at soc-card now. This patch adds it. This patch adds missing return when error case. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87o8q8zv2m.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: add snd_soc_card_set_bias_level_post()Kuninori Morimoto
Card related function should be implemented at soc-card now. This patch adds it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87pnaozv2s.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: add snd_soc_card_set_bias_level()Kuninori Morimoto
Card related function should be implemented at soc-card now. This patch adds it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87sgfkzv4g.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: add snd_soc_card_remove()Kuninori Morimoto
Card related function should be implemented at soc-card now. This patch adds it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87tv00zv4p.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: add snd_soc_card_late_probe()Kuninori Morimoto
Card related function should be implemented at soc-card now. This patch adds it. card has "card->probe" and "card->late_probe" callbacks, and "late_probe" callback is called after "probe". This means, we can set "card->probed" flag afer "late_probe" for all cases. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87v9kgzv4w.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: add snd_soc_card_probe()Kuninori Morimoto
Card related function should be implemented at soc-card now. This patch adds it. One note here is that card has "card->probe" and "card->late_probe" callbacks. Because it needs to care "late_probe", "card->probed" flag is set under if (card->probe) at snd_soc_card_probe(). Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87wo4wzv54.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: add probed bit field to snd_soc_cardKuninori Morimoto
We already have bit field to control snd_soc_card. Let's add "probed" field on it instead of local variable. One note here is that soc_cleanup_card_resources() will be called as (A) formal cleanup or as (B) error handling, thus, it needs to distinguish these. In (A) case, card will have "instantiated" flag if all probe callback functions were called without error. Thus, snd_soc_unbind_card() is using it to judging card was probed. But this this patch removes it, because it is no longer needed. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87r1v4zv36.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: add snd_soc_card_resume_post()Kuninori Morimoto
Card related function should be implemented at soc-card now. This patch adds it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87y2pczv5d.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: add snd_soc_card_resume_pre()Kuninori Morimoto
Card related function should be implemented at soc-card now. This patch adds it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87zh9szv5k.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: add snd_soc_card_suspend_post()Kuninori Morimoto
Card related function should be implemented at soc-card now. This patch adds it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/871rn425j3.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: add snd_soc_card_suspend_pre()Kuninori Morimoto
Card related function should be implemented at soc-card now. This patch adds it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87367k25jc.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: move snd_soc_card_subclass to soc-cardKuninori Morimoto
Card related function should be implemented at soc-card now. This patch moves it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/874ks025jn.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: move snd_soc_card_get_codec_dai() to soc-cardKuninori Morimoto
Card related function should be implemented at soc-card now. This patch moves it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/875zcg25jv.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: move snd_soc_card_set/get_drvdata() to soc-cardKuninori Morimoto
Card related function should be implemented at soc-card now. This patch moves it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/877dww25k4.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: move snd_soc_card_jack_new() to soc-cardKuninori Morimoto
Card related function should be implemented at soc-card now. This patch moves it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/878shc25kc.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc-card: move snd_soc_card_get_kcontrol() to soc-cardKuninori Morimoto
Card related function should be implemented at soc-card now. This patch moves it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87a71s25kj.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: add soc-card.cKuninori Morimoto
Current ALSA SoC has some snd_soc_card_xxx() functions, and card->xxx() callbacks. But, it is implemented randomly at random place. To collect all card related functions into one place, this patch creats new soc-card.c. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87blm825kt.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-30ASoC: soc.h: convert bool to bit field for snd_soc_cardKuninori Morimoto
snd_soc_card has many bool, but it can be bit field. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/87d06o25l2.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-29tcp: tcp_init_buffer_space can be staticFlorian Westphal
As of commit 98fa6271cfcb ("tcp: refactor setting the initial congestion window") this is called only from tcp_input.c, so it can be static. Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29Merge branch 'mlx5-next' of ↵Saeed Mahameed
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux net/mlx5: Add ability to read and write ECE options net/mlx5: Add support for RDMA TX FT headers modifying net/mlx5: Move iseg access helper routines close to mlx5_core driver net/mlx5: Cleanup mlx5_ifc_fte_match_set_misc2_bits net/mlx5: Add support in forward to namespace {IB/net}/mlx5: Simplify don't trap code net/mlx5: Replace zero-length array with flexible-array Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-29Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: "Nothing profound here, just a last set of long standing bug fixes: - Incorrect error unwind in qib and pvrdma - User triggerable NULL pointer crash in mlx5 with ODP prefetch - syzkaller RCU race in uverbs - Rare double free crash in ipoib" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: IB/ipoib: Fix double free of skb in case of multicast traffic in CM mode RDMA/core: Fix double destruction of uobject RDMA/pvrdma: Fix missing pci disable in pvrdma_pci_probe() RDMA/mlx5: Fix NULL pointer dereference in destroy_prefetch_work IB/qib: Call kobject_put() when kobject_init_and_add() fails
2020-05-29default csum_and_copy_to_user(): don't bother with access_ok()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-29take the dummy csum_and_copy_from_user() into net/checksum.hAl Viro
now that can be done conveniently - all non-trivial cases have _HAVE_ARCH_COPY_AND_CSUM_FROM_USER defined, so the fallback in net/checksum.h is used only for dummy (copy_from_user, then csum_partial) implementation. Allowing us to get rid of all dummy instances, both of csum_and_copy_from_user() and csum_partial_copy_from_user(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-29net: remove kernel_setsockoptChristoph Hellwig
No users left. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29net: add a new bind_add methodChristoph Hellwig
The SCTP protocol allows to bind multiple address to a socket. That feature is currently only exposed as a socket option. Add a bind_add method struct proto that allows to bind additional addresses, and switch the dlm code to use the method instead of going through the socket option from kernel space. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29sctp: add sctp_sock_set_nodelayChristoph Hellwig
Add a helper to directly set the SCTP_NODELAY sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2020-05-29 1) Several fixes for ESP gro/gso in transport and beet mode when IPv6 extension headers are present. From Xin Long. 2) Fix a wrong comment on XFRMA_OFFLOAD_DEV. From Antony Antony. 3) Fix sk_destruct callback handling on ESP in TCP encapsulation. From Sabrina Dubroca. 4) Fix a use after free in xfrm_output_gso when used with vxlan. From Xin Long. 5) Fix secpath handling of VTI when used wiuth IPCOMP. From Xin Long. 6) Fix an oops when deleting a x-netns xfrm interface. From Nicolas Dichtel. 7) Fix a possible warning on policy updates. We had a case where it was possible to add two policies with the same lookup keys. From Xin Long. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2020-05-29 1) Add IPv6 encapsulation support for ESP over UDP and TCP. From Sabrina Dubroca. 2) Remove unneeded reference when initializing xfrm interfaces. From Nicolas Dichtel. 3) Remove some indirect calls from the state_afinfo. From Florian Westphal. Please note that this pull request has two merge conflicts between commit: 0c922a4850eb ("xfrm: Always set XFRM_TRANSFORMED in xfrm{4,6}_output_finish") from Linus' tree and commit: 2ab6096db2f1 ("xfrm: remove output_finish indirection from xfrm_state_afinfo") from the ipsec-next tree. and between commit: 3986912f6a9a ("ipv6: move SIOCADDRT and SIOCDELRT handling into ->compat_ioctl") from the net-next tree and commit: 0146dca70b87 ("xfrm: add support for UDPv6 encapsulation of ESP") from the ipsec-next tree. Both conflicts can be resolved as done in linux-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29RDMA/core: Introduce shared CQ pool APIYamin Friedman
Allow a ULP to ask the core to provide a completion queue based on a least-used search on a per-device CQ pools. The device CQ pools grow in a lazy fashion when more CQs are requested. This feature reduces the amount of interrupts when using many QPs. Using shared CQs allows for more effcient completion handling. It also reduces the amount of overhead needed for CQ contexts. Test setup: Intel(R) Xeon(R) Platinum 8176M CPU @ 2.10GHz servers. Running NVMeoF 4KB read IOs over ConnectX-5EX across Spectrum switch. TX-depth = 32. The patch was applied in the nvme driver on both the target and initiator. Four controllers are accessed from each core. In the current test case we have exposed sixteen NVMe namespaces using four different subsystems (four namespaces per subsystem) from one NVM port. Each controller allocated X queues (RDMA QPs) and attached to Y CQs. Before this series we had X == Y, i.e for four controllers we've created total of 4X QPs and 4X CQs. In the shared case, we've created 4X QPs and only X CQs which means that we have four controllers that share a completion queue per core. Until fourteen cores there is no significant change in performance and the number of interrupts per second is less than a million in the current case. ================================================== |Cores|Current KIOPs |Shared KIOPs |improvement| |-----|---------------|--------------|-----------| |14 |2332 |2723 |16.7% | |-----|---------------|--------------|-----------| |20 |2086 |2712 |30% | |-----|---------------|--------------|-----------| |28 |1971 |2669 |35.4% | |================================================= |Cores|Current avg lat|Shared avg lat|improvement| |-----|---------------|--------------|-----------| |14 |767us |657us |14.3% | |-----|---------------|--------------|-----------| |20 |1225us |943us |23% | |-----|---------------|--------------|-----------| |28 |1816us |1341us |26.1% | ======================================================== |Cores|Current interrupts|Shared interrupts|improvement| |-----|------------------|-----------------|-----------| |14 |1.6M/sec |0.4M/sec |72% | |-----|------------------|-----------------|-----------| |20 |2.8M/sec |0.6M/sec |72.4% | |-----|------------------|-----------------|-----------| |28 |2.9M/sec |0.8M/sec |63.4% | ==================================================================== |Cores|Current 99.99th PCTL lat|Shared 99.99th PCTL lat|improvement| |-----|------------------------|-----------------------|-----------| |14 |67ms |6ms |90.9% | |-----|------------------------|-----------------------|-----------| |20 |5ms |6ms |-10% | |-----|------------------------|-----------------------|-----------| |28 |8.7ms |6ms |25.9% | |=================================================================== Performance improvement with sixteen disks (sixteen CQs per core) is comparable. Link: https://lore.kernel.org/r/1590568495-101621-3-git-send-email-yaminf@mellanox.com Signed-off-by: Yamin Friedman <yaminf@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-29RDMA/core: Add protection for shared CQs used by ULPsYamin Friedman
A pre-step for adding shared CQs. Add the infrastructure to prevent shared CQ users from altering the CQ configurations. For now all cqs are marked as private (non-shared). The core driver should use the new force functions to perform resize/destroy/moderation changes that are not allowed for users of shared CQs. Link: https://lore.kernel.org/r/1590568495-101621-2-git-send-email-yaminf@mellanox.com Signed-off-by: Yamin Friedman <yaminf@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-29RDMA/core: Use offsetofend() instead of open codingJason Gunthorpe
No reason to open code this. Link: https://lore.kernel.org/r/0-v1-0bc346e08476+585-drop_offsetofend_jgg@mellanox.com Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-29blk-mq: drain I/O when all CPUs in a hctx are offlineMing Lei
Most of blk-mq drivers depend on managed IRQ's auto-affinity to setup up queue mapping. Thomas mentioned the following point[1]: "That was the constraint of managed interrupts from the very beginning: The driver/subsystem has to quiesce the interrupt line and the associated queue _before_ it gets shutdown in CPU unplug and not fiddle with it until it's restarted by the core when the CPU is plugged in again." However, current blk-mq implementation doesn't quiesce hw queue before the last CPU in the hctx is shutdown. Even worse, CPUHP_BLK_MQ_DEAD is a cpuhp state handled after the CPU is down, so there isn't any chance to quiesce the hctx before shutting down the CPU. Add new CPUHP_AP_BLK_MQ_ONLINE state to stop allocating from blk-mq hctxs where the last CPU goes away, and wait for completion of in-flight requests. This guarantees that there is no inflight I/O before shutting down the managed IRQ. Add a BLK_MQ_F_STACKING and set it for dm-rq and loop, so we don't need to wait for completion of in-flight requests from these drivers to avoid a potential dead-lock. It is safe to do this for stacking drivers as those do not use interrupts at all and their I/O completions are triggered by underlying devices I/O completion. [1] https://lore.kernel.org/linux-block/alpine.DEB.2.21.1904051331270.1802@nanos.tec.linutronix.de/ [hch: different retry mechanism, merged two patches, minor cleanups] Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-29blk-mq: remove the bio argument to ->prepare_requestChristoph Hellwig
None of the I/O schedulers actually needs it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-29blk-mq: blk-mq: provide forced completion methodKeith Busch
Drivers may need to bypass error injection for error recovery. Rename __blk_mq_complete_request() to blk_mq_force_complete_rq() and export that function so drivers may skip potential fake timeouts after they've reclaimed lost requests. Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-29regulator: core: Add regulator bypass trace pointsCharles Keepax
Add new trace points for the start and end of enabling bypass on a regulator, to allow monitoring of when regulators are moved into bypass and how long that takes. Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20200529152216.9671-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-29PM: runtime: Replace pm_runtime_callbacks_present()Rafael J. Wysocki
The name of pm_runtime_callbacks_present() is confusing, because it suggests that the device has PM-runtime callbacks if 'true' is returned by that function, but in fact that may not be the case, so replace it with pm_runtime_has_no_callbacks() which is not ambiguous. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-05-29Merge tag 'v5.7-rc7' into x86/amdJoerg Roedel
Linux 5.7-rc7
2020-05-29Merge series "Fix regulators coupling for Exynos5800" from Marek Szyprowski ↵Mark Brown
<m.szyprowski@samsung.com>: Hi! This patchset is another attempt to fix the regulator coupling on Exynos5800/5422 SoCs. Here are links to the previous attempts: https://lore.kernel.org/linux-samsung-soc/20191008101709.qVNy8eijBi0LynOteWFMnTg4GUwKG599n6OyYoX1Abs@z/ https://lore.kernel.org/lkml/20191017102758.8104-1-m.szyprowski@samsung.com/ https://lore.kernel.org/linux-pm/cover.1589528491.git.viresh.kumar@linaro.org/ https://lore.kernel.org/linux-pm/20200528131130.17984-1-m.szyprowski@samsung.com/ The problem is with "vdd_int" regulator coupled with "vdd_arm" on Odroid XU3/XU4 boards family. "vdd_arm" is handled by CPUfreq. "vdd_int" is handled by devfreq. CPUfreq initialized quite early during boot and it starts changing OPPs and "vdd_arm" value. Sometimes CPU activity during boot goes down and some low-frequency OPPs are selected, what in turn causes lowering "vdd_arm". This happens before devfreq applies its requirements on "vdd_int". Regulator balancing code reduces "vdd_arm" voltage value, what in turn causes lowering "vdd_int" value to the lowest possible value. This is much below the operation point of the wcore bus, which still runs at the highest frequency. The issue was hard to notice because in the most cases the board managed to boot properly, even when the regulator was set to lowest value allowed by the regulator constraints. However, it caused some random issues, which can be observed as "Unhandled prefetch abort" or low USB stability. Adding more and more special cases to the generic code has been rejected, so the only way to ensure the desired behavior on Exynos5800-based SoCs is to make a custom regulator coupler driver. Best regards, Marek Szyprowski Patch summary: Marek Szyprowski (2): regulator: extract voltage balancing code to separate function soc: samsung: Add simple voltage coupler for Exynos5800 arch/arm/mach-exynos/Kconfig | 1 + drivers/regulator/core.c | 49 ++++++++------- drivers/soc/samsung/Kconfig | 3 + drivers/soc/samsung/Makefile | 1 + .../soc/samsung/exynos-regulator-coupler.c | 59 +++++++++++++++++++ include/linux/regulator/coupler.h | 8 +++ 6 files changed, 101 insertions(+), 20 deletions(-) create mode 100644 drivers/soc/samsung/exynos-regulator-coupler.c -- 2.17.1 base-commit: 8f3d9f354286745c751374f5f1fcafee6b3f3136
2020-05-29regulator: extract voltage balancing code to the separate functionMarek Szyprowski
Move the coupled regulators voltage balancing code to the separate function and allow to call it from the custom regulator couplers. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20200529124940.10675-2-m.szyprowski@samsung.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-29iommu/vt-d: Allocate domain info for real DMA sub-devicesJon Derrick
Sub-devices of a real DMA device might exist on a separate segment than the real DMA device and its IOMMU. These devices should still have a valid device_domain_info, but the current dma alias model won't allocate info for the subdevice. This patch adds a segment member to struct device_domain_info and uses the sub-device's BDF so that these sub-devices won't alias to other devices. Fixes: 2b0140c69637e ("iommu/vt-d: Use pci_real_dma_dev() for mapping") Cc: stable@vger.kernel.org # v5.6+ Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20200527165617.297470-3-jonathan.derrick@intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-05-29Merge remote-tracking branch 'regmap/for-5.8' into regmap-nextMark Brown
2020-05-29Merge series "regmap: provide simple bitops and use them in a driver" from ↵Mark Brown
Bartosz Golaszewski <brgl@bgdev.pl> Bartosz Golaszewski <bgolaszewski@baylibre.com>: From: Bartosz Golaszewski <bgolaszewski@baylibre.com> I noticed that oftentimes I use regmap_update_bits() for simple bit setting or clearing. In this case the fourth argument is superfluous as it's always 0 or equal to the mask argument. This series proposes to add simple bit operations for setting, clearing and testing specific bits with regmap. The second patch uses all three in a driver that got recently picked into the net-next tree. The patches obviously target different trees so - if you're ok with the change itself - I propose you pick the first one into your regmap tree for v5.8 and then I'll resend the second patch to add the first user for these macros for v5.9. v1 -> v2: - convert the new macros to static inline functions v2 -> v3: - drop unneeded ternary operator Bartosz Golaszewski (2): regmap: provide helpers for simple bit operations net: ethernet: mtk-star-emac: use regmap bitops drivers/base/regmap/regmap.c | 22 +++++ drivers/net/ethernet/mediatek/mtk_star_emac.c | 80 ++++++++----------- include/linux/regmap.h | 36 +++++++++ 3 files changed, 93 insertions(+), 45 deletions(-) base-commit: 8f3d9f354286745c751374f5f1fcafee6b3f3136 -- 2.26.1 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
2020-05-29Merge series "New DSA driver for VSC9953 Seville switch" from Vladimir ↵Mark Brown
Oltean <olteanv@gmail.com>: Looking at the Felix and Ocelot drivers, Maxim asked if it would be possible to use them as a base for a new driver for the switch inside NXP T1040. Turns out, it is! The result is a driver eerily similar to Felix. The biggest challenge seems to be getting register read/write API generic enough to cover such wild bitfield variations between hardware generations. There is a patch on the regmap core which I would like to get in through the networking subsystem, if possible (and if Mark is ok), since it's a trivial addition. Maxim Kochetkov (4): soc/mscc: ocelot: add MII registers description net: mscc: ocelot: convert SYS_PAUSE_CFG register access to regfield net: mscc: ocelot: extend watermark encoding function net: dsa: ocelot: introduce driver for Seville VSC9953 switch Vladimir Oltean (7): regmap: add helper for per-port regfield initialization net: mscc: ocelot: unexport ocelot_probe_port net: mscc: ocelot: convert port registers to regmap net: mscc: ocelot: convert QSYS_SWITCH_PORT_MODE and SYS_PORT_MODE to regfields net: dsa: ocelot: create a template for the DSA tags on xmit net: mscc: ocelot: split writes to pause frame enable bit and to thresholds net: mscc: ocelot: disable flow control on NPI interface drivers/net/dsa/ocelot/Kconfig | 12 + drivers/net/dsa/ocelot/Makefile | 6 + drivers/net/dsa/ocelot/felix.c | 49 +- drivers/net/dsa/ocelot/felix_vsc9959.c | 72 +- drivers/net/dsa/ocelot/seville.c | 742 +++++++++++++++ drivers/net/dsa/ocelot/seville.h | 50 + drivers/net/dsa/ocelot/seville_vsc9953.c | 1064 ++++++++++++++++++++++ drivers/net/ethernet/mscc/ocelot.c | 87 +- drivers/net/ethernet/mscc/ocelot.h | 9 +- drivers/net/ethernet/mscc/ocelot_board.c | 21 +- drivers/net/ethernet/mscc/ocelot_io.c | 18 +- drivers/net/ethernet/mscc/ocelot_regs.c | 57 ++ include/linux/regmap.h | 8 + include/soc/mscc/ocelot.h | 68 +- include/soc/mscc/ocelot_dev.h | 78 -- include/soc/mscc/ocelot_qsys.h | 13 - include/soc/mscc/ocelot_sys.h | 23 - net/dsa/tag_ocelot.c | 21 +- 18 files changed, 2196 insertions(+), 202 deletions(-) create mode 100644 drivers/net/dsa/ocelot/seville.c create mode 100644 drivers/net/dsa/ocelot/seville.h create mode 100644 drivers/net/dsa/ocelot/seville_vsc9953.c base-commit: 8f3d9f354286745c751374f5f1fcafee6b3f3136 -- 2.25.1
2020-05-29regmap: provide helpers for simple bit operationsBartosz Golaszewski
In many instances regmap_update_bits() is used for simple bit setting and clearing. In these cases the last argument is redundant and we can hide it with a static inline function. This adds three new helpers for simple bit operations: set_bits, clear_bits and test_bits (the last one defined as a regular function). Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Link: https://lore.kernel.org/r/20200528154503.26304-2-brgl@bgdev.pl Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-29iommu: Remove iommu_sva_ops::mm_exit()Jean-Philippe Brucker
After binding a device to an mm, device drivers currently need to register a mm_exit handler. This function is called when the mm exits, to gracefully stop DMA targeting the address space and flush page faults to the IOMMU. This is deemed too complex for the MMU release() notifier, which may be triggered by any mmput() invocation, from about 120 callsites [1]. The upcoming SVA module has an example of such complexity: the I/O Page Fault handler would need to call mmput_async() instead of mmput() after handling an IOPF, to avoid triggering the release() notifier which would in turn drain the IOPF queue and lock up. Another concern is the DMA stop function taking too long, up to several minutes [2]. For some mmput() callers this may disturb other users. For example, if the OOM killer picks the mm bound to a device as the victim and that mm's memory is locked, if the release() takes too long, it might choose additional innocent victims to kill. To simplify the MMU release notifier, don't forward the notification to device drivers. Since they don't stop DMA on mm exit anymore, the PASID lifetime is extended: (1) The device driver calls bind(). A PASID is allocated. Here any DMA fault is handled by mm, and on error we don't print anything to dmesg. Userspace can easily trigger errors by issuing DMA on unmapped buffers. (2) exit_mmap(), for example the process took a SIGKILL. This step doesn't happen during normal operations. Remove the pgd from the PASID table, since the page tables are about to be freed. Invalidate the IOTLBs. Here the device may still perform DMA on the address space. Incoming transactions are aborted but faults aren't printed out. ATS Translation Requests return Successful Translation Completions with R=W=0. PRI Page Requests return with Invalid Request. (3) The device driver stops DMA, possibly following release of a fd, and calls unbind(). PASID table is cleared, IOTLB invalidated if necessary. The page fault queues are drained, and the PASID is freed. If DMA for that PASID is still running here, something went seriously wrong and errors should be reported. For now remove iommu_sva_ops entirely. We might need to re-introduce them at some point, for example to notify device drivers of unhandled IOPF. [1] https://lore.kernel.org/linux-iommu/20200306174239.GM31668@ziepe.ca/ [2] https://lore.kernel.org/linux-iommu/4d68da96-0ad5-b412-5987-2f7a6aa796c3@amd.com/ Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20200423125329.782066-3-jean-philippe@linaro.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-05-29uacce: Remove mm_exit() opJean-Philippe Brucker
The mm_exit() op will be removed from the SVA API. When a process dies and its mm goes away, the IOMMU driver won't notify device drivers anymore. Drivers should expect to handle a lot more aborted DMA. On the upside, it does greatly simplify the queue management. The uacce_mm struct, that tracks all queues bound to an mm, was only used by the mm_exit() callback. Remove it. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Zhangfei Gao <zhangfei.gao@linaro.org> Link: https://lore.kernel.org/r/20200423125329.782066-2-jean-philippe@linaro.org Signed-off-by: Joerg Roedel <jroedel@suse.de>