summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2018-09-23netpoll: make ndo_poll_controller() optionalEric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. It seems that all networking drivers that do use NAPI for their TX completions, should not provide a ndo_poll_controller(). NAPI drivers have netpoll support already handled in core networking stack, since netpoll_poll_dev() uses poll_napi(dev) to iterate through registered NAPI contexts for a device. This patch allows netpoll_poll_dev() to process NAPI contexts even for drivers not providing ndo_poll_controller(), allowing for following patches in NAPI drivers. Also we export netpoll_poll_dev() so that it can be called by bonding/team drivers in following patches. Reported-by: Song Liu <songliubraving@fb.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Song Liu <songliubraving@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23soc: qcom: geni: Make version macros simplerStephen Boyd
This macro doesn't work, because it hides a local variable inside of the macro to hold the version and that variable name is called 'ver' and 'version' sometimes. Let's change this to be more explicit. Introduce three macros for the major, minor, and step of the version, and require callers to pass the version in to get the part of the version out. This way we don't hide local variables inside macros and things are less evil overall. Cc: Karthikeyan Ramasubramanian <kramasub@codeaurora.org> Cc: Sagar Dharia <sdharia@codeaurora.org> Cc: Girish Mahadevan <girishm@codeaurora.org> Signed-off-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Andy Gross <andy.gross@linaro.org>
2018-09-23Merge tag 'mfd-fixes-4.19' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd Lee writes: "MFD fixes for v4.19 - Fix Dialog DA9063 regulator constraints issue causing failure in probe - Fix OMAP Device Tree compatible strings to match DT" * tag 'mfd-fixes-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: mfd: omap-usb-host: Fix dts probe of children mfd: da9063: Fix DT probing with constraints
2018-09-23Merge tag 'for-linus-20180922' of git://git.kernel.dk/linux-blockGreg Kroah-Hartman
Jens writes: "Just a single fix in this pull request, fixing a regression in /proc/diskstats caused by the unification of timestamps." * tag 'for-linus-20180922' of git://git.kernel.dk/linux-block: block: use nanosecond resolution for iostat
2018-09-22EDAC, ghes: Use CPER module handles to locate DIMMsFan Wu
Use SMBIOS module handle type 17, on platforms which provide valid ones, to locate the corresponding DIMM and thus have per-DIMM error counter updates. Signed-off-by: Fan Wu <wufan@codeaurora.org> [ Massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tyler Baicar <baicar.tyler@gmail.com> Reviewed-by: James Morse <james.morse@arm.com> Tested-by: Toshi Kani <toshi.kani@hpe.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: baicar.tyler@gmail.com Cc: john.garry@huawei.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: shiju.jose@huawei.com Cc: tanxiaofei@huawei.com Cc: wanghuiqiang@huawei.com Link: http://lkml.kernel.org/r/1537322340-1860-1-git-send-email-wufan@codeaurora.org
2018-09-22Merge tag 'spi-cs-word' into togregJonathan Cameron
spi: Provide SPI_CS_WORD This provides a SPI operation mode which changes chip select after every word, used by some devices such as ADCs and DACs.
2018-09-21tcp: provide earliest departure time in skb->tstampEric Dumazet
Switch internal TCP skb->skb_mstamp to skb->skb_mstamp_ns, from usec units to nsec units. Do not clear skb->tstamp before entering IP stacks in TX, so that qdisc or devices can implement pacing based on the earliest departure time instead of socket sk->sk_pacing_rate Packets are fed with tcp_wstamp_ns, and following patch will update tcp_wstamp_ns when both TCP and sch_fq switch to the earliest departure time mechanism. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21tcp: add tcp_wstamp_ns socket fieldEric Dumazet
TCP will soon provide earliest departure time on TX skbs. It needs to track this in a new variable. tcp_mstamp_refresh() needs to update this variable, and became too big to stay an inline. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21tcp: introduce tcp_skb_timestamp_us() helperEric Dumazet
There are few places where TCP reads skb->skb_mstamp expecting a value in usec unit. skb->tstamp (aka skb->skb_mstamp) will soon store CLOCK_TAI nsec value. Add tcp_skb_timestamp_us() to provide proper conversion when needed. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21tcp: switch tcp_clock_ns() to CLOCK_TAI baseEric Dumazet
TCP pacing is either implemented in sch_fq or internally. We have the goal of being able to offload pacing on the NICS. TCP will soon provide per skb skb->tstamp as early departure time. Like ETF in commit 25db26a91364 ("net/sched: Introduce the ETF Qdisc") we chose CLOCK_T as the clock base, so that TCP and pacers can share a common clock, to get better RTT samples (without pacing artificially inflating these samples). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21blkcg: rename blkg_try_get to blkg_trygetDennis Zhou (Facebook)
blkg reference counting now uses percpu_ref rather than atomic_t. Let's make this consistent with css_tryget. This renames blkg_try_get to blkg_tryget and now returns a bool rather than the blkg or NULL. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-21blkcg: change blkg reference counting to use percpu_refDennis Zhou (Facebook)
Now that every bio is associated with a blkg, this puts the use of blkg_get, blkg_try_get, and blkg_put on the hot path. This switches over the refcnt in blkg to use percpu_ref. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-21blkcg: cleanup and make blk_get_rl use blkg_lookup_createDennis Zhou (Facebook)
blk_get_rl is responsible for identifying which request_list a request should be allocated to. Try get logic was added earlier, but semantically the logic was not changed. This patch makes better use of the bio already having a reference to the blkg in the hot path. The cold path uses a better fallback of blkg_lookup_create rather than just blkg_lookup and then falling back to the q->root_rl. If lookup_create fails with anything but -ENODEV, it falls back to q->root_rl. A clarifying comment is added to explain why q->root_rl is used rather than the root blkg's rl. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-21blkcg: remove additional reference to the cssDennis Zhou (Facebook)
The previous patch in this series removed carrying around a pointer to the css in blkg. However, the blkg association logic still relied on taking a reference on the css to ensure we wouldn't fail in getting a reference for the blkg. Here the implicit dependency on the css is removed. The association continues to rely on the tryget logic walking up the blkg tree. This streamlines the three ways that association can happen: normal, swap, and writeback. Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-21blkcg: remove bio->bi_css and instead use bio->bi_blkgDennis Zhou (Facebook)
Prior patches ensured that all bios are now associated with some blkg. This now makes bio->bi_css unnecessary as blkg maintains a reference to the blkcg already. This patch removes the field bi_css and transfers corresponding uses to access via bi_blkg. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-21blkcg: associate writeback bios with a blkgDennis Zhou (Facebook)
One of the goals of this series is to remove a separate reference to the css of the bio. This can and should be accessed via bio_blkcg. In this patch, the wbc_init_bio call is changed such that it must be called after a queue has been associated with the bio. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-21blkcg: associate a blkg for pages being evicted by swapDennis Zhou (Facebook)
A prior patch in this series added blkg association to bios issued by cgroups. There are two other paths that we want to attribute work back to the appropriate cgroup: swap and writeback. Here we modify the way swap tags bios to include the blkg. Writeback will be tackle in the next patch. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-21blkcg: consolidate bio_issue_init to be a part of coreDennis Zhou (Facebook)
bio_issue_init among other things initializes the timestamp for an IO. Rather than have this logic handled by policies, this consolidates it to be on the init paths (normal, clone, bounce clone). Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-21blkcg: always associate a bio with a blkgDennis Zhou (Facebook)
Previously, blkg's were only assigned as needed by blk-iolatency and blk-throttle. bio->css was also always being associated while blkg was being looked up and then thrown away in blkcg_bio_issue_check. This patch begins the cleanup of bio->css and bio->bi_blkg by always associating a blkg in blkcg_bio_issue_check. This tries to create the blkg, but if it is not possible, falls back to using the root_blkg of the request_queue. Therefore, a bio will always be associated with a blkg. The duplicate association logic is removed from blk-throttle and blk-iolatency. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-21blkcg: convert blkg_lookup_create to find closest blkgDennis Zhou (Facebook)
There are several scenarios where blkg_lookup_create can fail. Examples include the blkcg dying, request_queue is dying, or simply being OOM. At the end of the day, most handle this by simply falling back to the q->root_blkg and calling it a day. This patch implements the notion of closest blkg. During blkg_lookup_create, if it fails to create, return the closest blkg found or the q->root_blkg. blkg_try_get_closest is introduced and used during association so a bio is always attached to a blkg. Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-21blkcg: update blkg_lookup_create to do lockingDennis Zhou (Facebook)
To know when to create a blkg, the general pattern is to do a blkg_lookup and if that fails, lock and then do a lookup again and if that fails finally create. It doesn't make much sense for everyone who wants to do creation to write this themselves. This changes blkg_lookup_create to do locking and implement this pattern. The old blkg_lookup_create is renamed to __blkg_lookup_create. If a call site wants to do its own error handling or already owns the queue lock, they can use __blkg_lookup_create. This will be used in upcoming patches. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-21blkcg: fix ref count issue with bio_blkcg using task_cssDennis Zhou (Facebook)
The accessor function bio_blkcg either returns the blkcg associated with the bio or finds one in the current context. This can cause an issue when trying to associate a bio with a blkcg. Particularly, it's the third case that is problematic: return css_to_blkcg(task_css(current, io_cgrp_id)); As the above may race against task migration and the cgroup exiting, it is not always ok to take a reference on the blkcg returned from bio_blkcg. This patch adds association ahead of calling bio_blkcg rather than after. This makes association a required and explicit step along the code paths for calling bio_blkcg. blk_get_rl is modified as well to get a reference to the blkcg it may use and blk_put_rl will always put the reference back. Association is also moved above the bio_blkcg call to ensure it will not return NULL in blk-iolatency. BFQ and CFQ utilize this flaw, but due to the complexity, I do not want to address this in this series. I've created a private version of the function with notes not to use it describing the flaw. Hopefully soon, that code can be cleaned up. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-21block: use nanosecond resolution for iostatOmar Sandoval
Klaus Kusche reported that the I/O busy time in /proc/diskstats was not updating properly on 4.18. This is because we started using ktime to track elapsed time, and we convert nanoseconds to jiffies when we update the partition counter. However, this gets rounded down, so any I/Os that take less than a jiffy are not accounted for. Previously in this case, the value of jiffies would sometimes increment while we were doing I/O, so at least some I/Os were accounted for. Let's convert the stats to use nanoseconds internally. We still report milliseconds as before, now more accurately than ever. The value is still truncated to 32 bits for backwards compatibility. Fixes: 522a777566f5 ("block: consolidate struct request timestamp fields") Cc: stable@vger.kernel.org Reported-by: Klaus Kusche <klaus.kusche@computerix.info> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-21net: if_arp: use define instead of hard-coded valueHåkon Bugge
uapi/linux/if_arp.h includes linux/netdevice.h, which uses IFNAMSIZ. Hence, use it instead of hard-coded value. Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21net: if_arp: Fix incorrect indentsHåkon Bugge
Fixing incorrect indents and align comments. Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21net/tls: Add support for async encryption of records for performanceVakul Garg
In current implementation, tls records are encrypted & transmitted serially. Till the time the previously submitted user data is encrypted, the implementation waits and on finish starts transmitting the record. This approach of encrypt-one record at a time is inefficient when asynchronous crypto accelerators are used. For each record, there are overheads of interrupts, driver softIRQ scheduling etc. Also the crypto accelerator sits idle most of time while an encrypted record's pages are handed over to tcp stack for transmission. This patch enables encryption of multiple records in parallel when an async capable crypto accelerator is present in system. This is achieved by allowing the user space application to send more data using sendmsg() even while previously issued data is being processed by crypto accelerator. This requires returning the control back to user space application after submitting encryption request to accelerator. This also means that zero-copy mode of encryption cannot be used with async accelerator as we must be done with user space application buffer before returning from sendmsg(). There can be multiple records in flight to/from the accelerator. Each of the record is represented by 'struct tls_rec'. This is used to store the memory pages for the record. After the records are encrypted, they are added in a linked list called tx_ready_list which contains encrypted tls records sorted as per tls sequence number. The records from tx_ready_list are transmitted using a newly introduced function called tls_tx_records(). The tx_ready_list is polled for any record ready to be transmitted in sendmsg(), sendpage() after initiating encryption of new tls records. This achieves parallel encryption and transmission of records when async accelerator is present. There could be situation when crypto accelerator completes encryption later than polling of tx_ready_list by sendmsg()/sendpage(). Therefore we need a deferred work context to be able to transmit records from tx_ready_list. The deferred work context gets scheduled if applications are not sending much data through the socket. If the applications issue sendmsg()/sendpage() in quick succession, then the scheduling of tx_work_handler gets cancelled as the tx_ready_list would be polled from application's context itself. This saves scheduling overhead of deferred work. The patch also brings some side benefit. We are able to get rid of the concept of CLOSED record. This is because the records once closed are either encrypted and then placed into tx_ready_list or if encryption fails, the socket error is set. This simplifies the kernel tls sendpath. However since tls_device.c is still using macros, accessory functions for CLOSED records have been retained. Signed-off-by: Vakul Garg <vakul.garg@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21Merge branch 'mlx5-vport-loopback' into rdma.getDoug Ledford
For dependencies, branch based on 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git mlx5 mcast/ucast loopback control enhancements from Leon Romanovsky: ==================== This is short series from Mark which extends handling of loopback traffic. Originally mlx5 IB dynamically enabled/disabled both unicast and multicast based on number of users. However RAW ethernet QPs need more granular access. ==================== Fixed failed automerge in mlx5_ib.h (minor context conflict issue) mlx5-vport-loopback branch: RDMA/mlx5: Enable vport loopback when user context or QP mandate RDMA/mlx5: Allow creating RAW ethernet QP with loopback support RDMA/mlx5: Refactor transport domain bookkeeping logic net/mlx5: Rename incorrect naming in IFC file Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-21RDMA/mlx5: Allow creating RAW ethernet QP with loopback supportMark Bloch
Expose two new flags: MLX5_QP_FLAG_TIR_ALLOW_SELF_LB_UC MLX5_QP_FLAG_TIR_ALLOW_SELF_LB_MC Those flags can be used at creation time in order to allow a QP to be able to receive loopback traffic (unicast and multicast). We store the state in the QP to be used on the destroy path to indicate with which flags the QP was created with. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-22net/mlx5: Rename incorrect naming in IFC fileMark Bloch
Remove a trailing underscore from the multicast/unicast names. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-09-21ASoC: add for_each_component_dais() macroKuninori Morimoto
To be more readable code, this patch adds new for_each_component_dais() macro, and replace existing code to it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-09-21RDMA/uverbs: Get rid of ucontext->tgidJason Gunthorpe
Nothing uses this now, just delete it. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-21RDMA/umem: Avoid synchronize_srcu in the ODP MR destruction pathJason Gunthorpe
synchronize_rcu is slow enough that it should be avoided on the syscall path when user space is destroying MRs. After all the rework we can now trivially do this by having call_srcu kfree the per_mm. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-21RDMA/umem: Handle a half-complete start/end sequenceJason Gunthorpe
mmu_notifier_unregister() can race between a invalidate_start/end and cause the invalidate_end to be skipped. This causes an imbalance in the locking, which lockdep complains about. This is not actually a bug, as we immediately kfree the memory holding the lock, but it simple enough to fix. Mark when the notifier is being destroyed and abort the start callback. This can be done under the lock we already obtained, and can re-purpose the invalidate_range test we already have. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-21RDMA/umem: Get rid of per_mm->notifier_countJason Gunthorpe
This is intrinsically racy and the scheme is simply unnecessary. New MR registration can wait for any on going invalidation to fully complete. CPU0 CPU1 if (atomic_read()) if (atomic_dec_and_test() && !list_empty()) { /* not taken */ } list_add() Putting the new UMEM into some kind of purgatory until another invalidate rolls through.. Instead hold the read side of the umem_rwsem across the pair'd start/end and get rid of the racy 'deferred add' approach. Since all umem's in the rbt are always ready to go, also get rid of the mn_counters_active stuff. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-21RDMA/umem: Use umem->owning_mm inside ODPJason Gunthorpe
Since ODP had a single struct mmu_notifier located in the ucontext it could only handle a single MM at a time, and this prevented it from using the new owning_mm system. With the prior rework it is now simple to let ODP track multiple MMs per ucontext, finish the job so that the per_mm is allocated on a mm by mm basis, and freed when the last umem is dropped from the ucontext. As a side effect the new saner locking removes the lockdep splat about nesting the umem_rwsem between mmu_notifier_unregister and ib_umem_odp_release. It also makes ODP work with multiple processes, across, fork, etc. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-21RDMA/umem: Move all the ODP related stuff out of ucontext and into per_mmJason Gunthorpe
This is the first step to make ODP use the owning_mm that is now part of struct ib_umem. Each ODP umem is linked to a single per_mm structure, which in turn, is linked to a single mm, via the embedded mmu_notifier. This first patch introduces the structure and reworks eveything to use it. This also needs to introduce tgid into the ib_ucontext_per_mm, as get_user_pages_remote() requires the originating task for statistics tracking. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-21RDMA/umem: Get rid of struct ib_umem.odp_dataJason Gunthorpe
This no longer has any use, we can use container_of to get to the umem_odp, and a simple flag to indicate if this is an odp MR. Remove the few remaining references to it. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-21RDMA/umem: Make ib_umem_odp into a sub structure of ib_umemJason Gunthorpe
These two structures are linked together, use the container_of pattern instead of a double allocation to make the code simpler and easier to follow. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-21RDMA/umem: Use ib_umem_odp in all function signatures connected to ODPJason Gunthorpe
All of these functions already require the ODP version of the umem struct, make this very clear by having the signature require it. This paves the way to using the container_of() pattern to link umem_odp and umem together. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-21Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmGreg Kroah-Hartman
Paolo writes: "It's mostly small bugfixes and cleanups, mostly around x86 nested virtualization. One important change, not related to nested virtualization, is that the ability for the guest kernel to trap CPUID instructions (in Linux that's the ARCH_SET_CPUID arch_prctl) is now masked by default. This is because the feature is detected through an MSR; a very bad idea that Intel seems to like more and more. Some applications choke if the other fields of that MSR are not initialized as on real hardware, hence we have to disable the whole MSR by default, as was the case before Linux 4.12." * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (23 commits) KVM: nVMX: Fix bad cleanup on error of get/set nested state IOCTLs kvm: selftests: Add platform_info_test KVM: x86: Control guest reads of MSR_PLATFORM_INFO KVM: x86: Turbo bits in MSR_PLATFORM_INFO nVMX x86: Check VPID value on vmentry of L2 guests nVMX x86: check posted-interrupt descriptor addresss on vmentry of L2 KVM: nVMX: Wake blocked vCPU in guest-mode if pending interrupt in virtual APICv KVM: VMX: check nested state and CR4.VMXE against SMM kvm: x86: make kvm_{load|put}_guest_fpu() static x86/hyper-v: rename ipi_arg_{ex,non_ex} structures KVM: VMX: use preemption timer to force immediate VMExit KVM: VMX: modify preemption timer bit only when arming timer KVM: VMX: immediately mark preemption timer expired only for zero value KVM: SVM: Switch to bitmap_zalloc() KVM/MMU: Fix comment in walk_shadow_page_lockless_end() kvm: selftests: use -pthread instead of -lpthread KVM: x86: don't reset root in kvm_mmu_setup() kvm: mmu: Don't read PDPTEs when paging is not enabled x86/kvm/lapic: always disable MMIO interface in x2APIC mode KVM: s390: Make huge pages unavailable in ucontrol VMs ...
2018-09-21Merge tag 'drm-fixes-2018-09-21' of git://anongit.freedesktop.org/drm/drmGreg Kroah-Hartman
David writes: "drm fixes for 4.19-rc5: - core: fix debugfs for atomic, fix the check for atomic for non-modesetting drivers - amdgpu: adds a new PCI id, some kfd fixes and a sdma fix - i915: a bunch of GVT fixes. - vc4: scaling fix - vmwgfx: modesetting fixes and a old buffer eviction fix - udl: framebuffer destruction fix - sun4i: disable on R40 fix until next kernel - pl111: NULL termination on table fix" * tag 'drm-fixes-2018-09-21' of git://anongit.freedesktop.org/drm/drm: (21 commits) drm/amdkfd: Fix ATS capablity was not reported correctly on some APUs drm/amdkfd: Change the control stack MTYPE from UC to NC on GFX9 drm/amdgpu: Fix SDMA HQD destroy error on gfx_v7 drm/vmwgfx: Fix buffer object eviction drm/vmwgfx: Don't impose STDU limits on framebuffer size drm/vmwgfx: limit mode size for all display unit to texture_max drm/vmwgfx: limit screen size to stdu_max during check_modeset drm/vmwgfx: don't check for old_crtc_state enable status drm/amdgpu: add new polaris pci id drm: sun4i: drop second PLL from A64 HDMI PHY drm: fix drm_drv_uses_atomic_modeset on non modesetting drivers. drm/i915/gvt: clear ggtt entries when destroy vgpu drm/i915/gvt: request srcu_read_lock before checking if one gfn is valid drm/i915/gvt: Add GEN9_CLKGATE_DIS_4 to default BXT mmio handler drm/i915/gvt: Init PHY related registers for BXT drm/atomic: Use drm_drv_uses_atomic_modeset() for debugfs creation drm/fb-helper: Remove set but not used variable 'connector_funcs' drm: udl: Destroy framebuffer only if it was initialized drm/sun4i: Remove R40 display pipeline compatibles drm/pl111: Make sure of_device_id tables are NULL terminated ...
2018-09-21cpufeature: avoid warning when compiling with clangStefan Agner
The table id (second) argument to MODULE_DEVICE_TABLE is often referenced otherwise. This is not the case for CPU features. This leads to warnings when building the kernel with Clang: arch/arm/crypto/aes-ce-glue.c:450:1: warning: variable 'cpu_feature_match_AES' is not needed and will not be emitted [-Wunneeded-internal-declaration] module_cpu_feature_match(AES, aes_init); ^ Avoid warnings by using __maybe_unused, similar to commit 1f318a8bafcf ("modules: mark __inittest/__exittest as __maybe_unused"). Fixes: 67bad2fdb754 ("cpu: add generic support for CPU feature based module autoloading") Signed-off-by: Stefan Agner <stefan@agner.ch> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-09-21soc: fsl: dpio: add congestion notification supportHoria Geantă
Add support for Congestion State Change Notifications (CSCN), which allow DPIO users to be notified when a congestion group changes its state (due to hitting the entrance / exit threshold). Acked-by: Li Yang <leoyang.li@nxp.com> Signed-off-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-09-21soc: fsl: dpio: add frame list format supportHoria Geantă
Add support for dpaa2_fd_list format, i.e. dpaa2_fl_entry structure and accessors. Frame list entries (FLEs) are similar, but not identical to FDs: + "F" (final) bit - FMT[b'01] is reserved - DD, SC, DROPP bits (covered by "FD compatibility" field in FLE case) - FLC[5:0] not used for stashing Signed-off-by: Horia Geantă <horia.geanta@nxp.com> Acked-by: Li Yang <leoyang.li@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-09-21soc: fsl: dpio: add back some frame queue functionsHoria Geantă
This commit adds back functions removed in commit a211c8170b3c ("staging: fsl-mc/dpio: remove couple of unused functions") since dpseci object will make use of them. Acked-by: Li Yang <leoyang.li@nxp.com> Signed-off-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-09-21bus: fsl-mc: add support for dpseci device typeHoria Geantă
Signed-off-by: Horia Geantă <horia.geanta@nxp.com> Acked-by: Laurentiu Tudor <laurentiu.tudor@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-09-21crypto: chacha20 - Fix chacha20_block() keystream alignment (again)Eric Biggers
In commit 9f480faec58c ("crypto: chacha20 - Fix keystream alignment for chacha20_block()"), I had missed that chacha20_block() can be called directly on the buffer passed to get_random_bytes(), which can have any alignment. So, while my commit didn't break anything, it didn't fully solve the alignment problems. Revert my solution and just update chacha20_block() to use put_unaligned_le32(), so the output buffer need not be aligned. This is simpler, and on many CPUs it's the same speed. But, I kept the 'tmp' buffers in extract_crng_user() and _get_random_bytes() 4-byte aligned, since that alignment is actually needed for _crng_backtrack_protect() too. Reported-by: Stephan Müller <smueller@chronox.de> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-09-20net/ipv4: Move device validation to helperDavid Ahern
Move the device matching check in __fib_validate_source to a helper and export it for use by netfilter modules. Code move only; no functional change intended. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21Merge branch 'drm-next-4.20' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie
into drm-next This is a new pull for drm-next on top of last weeks with the following changes: - Fixed 64 bit divide - Fixed vram type on vega20 - Misc vega20 fixes - Misc DC fixes - Fix GDS/GWS/OA domain handling Previous changes from last week: amdgpu/kfd: - Picasso (new APU) support - Raven2 (new APU) support - Vega20 enablement - ACP powergating improvements - Add ABGR/XBGR display support - VCN JPEG engine support - Initial xGMI support - Use load balancing for engine scheduling - Lots of new documentation - Rework and clean up i2c and aux handling in DC - Add DP YCbCr 4:2:0 support in DC - Add DMCU firmware loading for Raven (used for ABM and PSR) - New debugfs features in DC - LVDS support in DC - Implement wave kill for gfx/compute (light weight reset for shaders) - Use AGP aperture to avoid gart mappings when possible - GPUVM performance improvements - Bulk moves for more efficient GPUVM LRU handling - Merge amdgpu and amdkfd into one module - Enable gfxoff and stutter mode on Raven - Misc cleanups Scheduler: - Load balancing support - Bug fixes ttm: - Bulk move functionality - Bug fixes radeon: - Misc cleanups Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180920150438.12693-1-alexander.deucher@amd.com
2018-09-20ARM: OMAP1: ams-delta-fiq: Use <linux/platform_data/gpio-omap.h>Janusz Krzysztofik
Instead of defining symbols already defined in linux/platform_data/gpio-omap.h, use that header file. Since we include the header into an assembler code, prevent C only bits from being read in. Signed-off-by: Janusz Krzysztofik <jmkrzyszt@gmail.com> Signed-off-by: Tony Lindgren <tony@atomide.com>