summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2020-10-01power: supply: ucs1002: fix some health status issuesLucas Stach
Some fault events like the over-current condition will get resolved by the hardware, by e.g. disabling the port. As the status in the interrupt status register is cleared on read when the fault is resolved, the sysfs health property will only contain the correct health status for the first time it is read after such an event, even if the actual fault condition (like a VBUS short) still persists. To reflect this properly in the property we cache the last health status and only update the cache when a actual change happens, i.e. the ERR bit in the status register flips, as this one properly reflects a continued fault condition. The ALERT pin however, is not driven by the ERR status, but by the actual fault status, so the pin will change back to it's default state when the hardware has automatically resolved the fault by cutting the power. Thus we never get an IRQ when the actual fault condition has been resolved and the ERR status bit has been cleared in auto-recovery mode. To get this information we need to poll the interrupt status register after some time to see if the fault is gone and update our cache in that case. To avoid any additional locking, we handle both paths (IRQ firing and delayed polling) through the same single-threaded delayed work. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2020-09-30ionic: prevent early watchdog checkShannon Nelson
In one corner case scenario, the driver device lif setup can get delayed such that the ionic_watchdog_cb() timer goes off before the ionic->lif is set, thus causing a NULL pointer panic. We catch the problem by checking for a NULL lif just a little earlier in the callback. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30ionic: stop watchdog timer earlier on removeShannon Nelson
We need to be better at making sure we don't have a link check watchdog go off while we're shutting things down, so let's stop the timer as soon as we start the remove. Meanwhile, since that was the only thing in ionic_dev_teardown(), simplify and remove that function. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30octeontx2-pf: Fix synchnorization issue in mboxHariprasad Kelam
Mbox implementation in octeontx2 driver has three states alloc, send and reset in mbox response. VF allocate and sends message to PF for processing, PF ACKs them back and reset the mbox memory. In some case we see synchronization issue where after msgs_acked is incremented and before mbox_reset API is called, if current execution is scheduled out and a different thread is scheduled in which checks for msgs_acked. Since the new thread sees msgs_acked == msgs_sent it will try to allocate a new message and to send a new mbox message to PF.Now if mbox_reset is scheduled in, PF will see '0' in msgs_send. This patch fixes the issue by calling mbox_reset before incrementing msgs_acked flag for last processing message and checks for valid message size. Fixes: d424b6c02 ("octeontx2-pf: Enable SRIOV and added VF mbox handling") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30octeontx2-pf: Fix the device state on errorHariprasad Kelam
Currently in otx2_open on failure of nix_lf_start transmit queues are not stopped which are already started in link_event. Since the tx queues are not stopped network stack still try's to send the packets leading to driver crash while access the device resources. Fixes: 50fe6c02e ("octeontx2-pf: Register and handle link notifications") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30octeontx2-pf: Fix TCP/UDP checksum offload for IPv6 framesGeetha sowjanya
For TCP/UDP checksum offload feature in Octeontx2 expects L3TYPE to be set irrespective of IP header checksum is being offloaded or not. Currently for IPv6 frames L3TYPE is not being set resulting in packet drop with checksum error. This patch fixes this issue. Fixes: 3ca6c4c88 ("octeontx2-pf: Add packet transmission support") Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30octeontx2-af: Fix enable/disable of default NPC entriesSubbaraya Sundeep
Packet replication feature present in Octeontx2 is a hardware linked list of PF and its VF interfaces so that broadcast packets are sent to all interfaces present in the list. It is driver job to add and delete a PF/VF interface to/from the list when the interface is brought up and down. This patch fixes the npc_enadis_default_entries function to handle broadcast replication properly if packet replication feature is present. Fixes: 40df309e4166 ("octeontx2-af: Support to enable/disable default MCAM entries") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30PCI/PM: Revert "PCI/PM: Apply D2 delay as milliseconds, not microseconds"Bjorn Helgaas
This reverts commit 7e24bc347e57992d532bc2ed700209b0fc0a4bf5. 7e24bc347e57 was based on PCIe r5.0, sec 5.9, which claims we need a 200 ms delay when transitioning to or from D2. However, sec 5.3.1.3 states the delay as 200 μs (microseconds), as does the table in PCIe r4.0, sec 5.9.1. This looks like a typo in the r5.0 spec, so revert back to a 200 μs delay instead of a 200 ms delay. Fixes: 7e24bc347e57 ("PCI/PM: Apply D2 delay as milliseconds, not microseconds") Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-30PCI/PM: Remove unused PCI_PM_BUS_WAITBjorn Helgaas
476e7faefc43 ("PCI PM: Do not wait for buses in B2 or B3 during resume") removed the last use of PCI_PM_BUS_WAIT. Remove the definition as well. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-30PCI/P2PDMA: Drop double zeroing for sg_init_table()Julia Lawall
sg_init_table() zeroes its first argument, so the allocation of that argument doesn't have to. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression x; @@ x = - kzalloc + kmalloc (...) ... sg_init_table(x,...) // </smpl> Link: https://lore.kernel.org/r/1600601186-7420-15-git-send-email-Julia.Lawall@inria.fr Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-09-30PCI: Simplify bool comparisonsKrzysztof Wilczyński
Take care about Coccinelle warnings: drivers/pci/pci.c:6008:6-12: WARNING: Comparison to bool drivers/pci/pci.c:6024:7-13: WARNING: Comparison to bool No change to functionality intended. Link: https://lore.kernel.org/r/20200925224555.1752460-1-kw@linux.com Signed-off-by: Krzysztof Wilczyński <kw@linux.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-09-30Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "Another batch of clk driver fixes: - Make sure DRAM and ChipID region doesn't get disabled on Exynos - Fix a SATA failure on Tegra - Fix the emac_ptp clk divider on stratix10" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: socfpga: stratix10: fix the divider for the emac_ptp_free_clk clk: samsung: exynos4: mark 'chipid' clock as CLK_IGNORE_UNUSED clk: tegra: Fix missing prototype for tegra210_clk_register_emc() clk: tegra: Always program PLL_E when enabled clk: tegra: Capitalization fixes clk: samsung: Keep top BPLL mux on Exynos542x enabled
2020-09-30net: macb: move pdata to private headerAlexandre Belloni
struct macb_platform_data is only used by macb_pci to register the platform device, move its definition to cadence/macb.h and remove platform_data/macb.h Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30HID: add vivaldi HID driverSean O'Brien
Add vivaldi HID driver. This driver allows us to read and report the top row layout of keyboards which provide a vendor-defined (Google) HID usage. Signed-off-by: Sean O'Brien <seobrien@chromium.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2020-09-30can: flexcan: disable runtime PM if register flexcandev failedJoakim Zhang
Disable runtime PM if register flexcandev failed, and balance reference of usage_count. Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Link: https://lore.kernel.org/r/20200929211557.14153-4-qiangqing.zhang@nxp.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-09-30can: flexcan: add flexcan driver for i.MX8MPJoakim Zhang
Add flexcan driver for i.MX8MP, which supports CAN FD and ECC. Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Link: https://lore.kernel.org/r/20200929211557.14153-3-qiangqing.zhang@nxp.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-09-30can: flexcan: initialize all flexcan memory for ECC functionJoakim Zhang
One issue was reported at a baremetal environment, which is used for FPGA verification. "The first transfer will fail for extended ID format(for both 2.0B and FD format), following frames can be transmitted and received successfully for extended format, and standard format don't have this issue. This issue occurred randomly with high possiblity, when it occurs, the transmitter will detect a BIT1 error, the receiver a CRC error. According to the spec, a non-correctable error may cause this transfer failure." With FLEXCAN_QUIRK_DISABLE_MECR quirk, it supports correctable errors, disable non-correctable errors interrupt and freeze mode. Platform has ECC hardware support, but select this quirk, this issue may not come to light. Initialize all FlexCAN memory before accessing them, at least it can avoid non-correctable errors detected due to memory uninitialized. The internal region can't be initialized when the hardware doesn't support ECC. According to IMX8MPRM, Rev.C, 04/2020. There is a NOTE at the section 11.8.3.13 Detection and correction of memory errors: "All FlexCAN memory must be initialized before starting its operation in order to have the parity bits in memory properly updated. CTRL2[WRMFRZ] grants write access to all memory positions that require initialization, ranging from 0x080 to 0xADF and from 0xF28 to 0xFFF when the CAN FD feature is enabled. The RXMGMASK, RX14MASK, RX15MASK, and RXFGMASK registers need to be initialized as well. MCR[RFEN] must not be set during memory initialization." Memory range from 0x080 to 0xADF, there are reserved memory (unimplemented by hardware, e.g. only configure 64 MBs), these memory can be initialized or not. In this patch, initialize all flexcan memory which includes reserved memory. In this patch, create FLEXCAN_QUIRK_SUPPORT_ECC for platforms which has ECC feature. If you have a ECC platform in your hand, please select this qurik to initialize all flexcan memory firstly, then you can select FLEXCAN_QUIRK_DISABLE_MECR to only enable correctable errors. Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Link: https://lore.kernel.org/r/20200929211557.14153-2-qiangqing.zhang@nxp.com [mkl: wrap long lines] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-09-30can: mcp251xfd: rename all remaining occurrence to mcp251xfdMarc Kleine-Budde
In [1] Geert noted that the autodetect compatible for the mcp25xxfd driver, which is "microchip,mcp25xxfd" might be too generic and overlap with upcoming, but incompatible chips. In the previous patch the autodetect DT compatbile has been renamed to "microchip,mcp251xfd", this patch changes all non user facing occurrence of "mcp25xxfd" to "mcp251xfd" and "MCP25XXFD" to "MCP251XFD". [1] http://lore.kernel.org/r/CAMuHMdVkwGjr6dJuMyhQNqFoJqbh6Ec5V2b5LenCshwpM2SDsQ@mail.gmail.com Link: https://lore.kernel.org/r/20200930091424.792165-10-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-09-30can: mcp251xfd: rename all user facing strings to mcp251xfdMarc Kleine-Budde
In [1] Geert noted that the autodetect compatible for the mcp25xxfd driver, which is "microchip,mcp25xxfd" might be too generic and overlap with upcoming, but incompatible chips. In the previous patch the autodetect DT compatbile has been renamed to "microchip,mcp251xfd", this patch changes all user facing strings from "mcp25xxfd" to "mcp251xfd" and "MCP25XXFD" to "MCP251XFD", including: - kconfig symbols - name of kernel module - DT and SPI compatible [1] http://lore.kernel.org/r/CAMuHMdVkwGjr6dJuMyhQNqFoJqbh6Ec5V2b5LenCshwpM2SDsQ@mail.gmail.com Link: https://lore.kernel.org/r/20200930091424.792165-9-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-09-30can: mcp251xfd: rename driver files and subdir to mcp251xfdMarc Kleine-Budde
In [1] Geert noted that the autodetect compatible for the mcp25xxfd driver, which is "microchip,mcp25xxfd" might be too generic and overlap with upcoming, but incompatible chips. In the previous patch the autodetect DT compatbile has been renamed to "microchip,mcp251xfd", this patch changes the name of the driver subdir and the individual files accordinly. [1] http://lore.kernel.org/r/CAMuHMdVkwGjr6dJuMyhQNqFoJqbh6Ec5V2b5LenCshwpM2SDsQ@mail.gmail.com Link: https://lore.kernel.org/r/20200930091424.792165-8-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-09-30dm: fix missing imposition of queue_limits from dm_wq_work() threadMike Snitzer
If a DM device was suspended when bios were issued to it, those bios would be deferred using queue_io(). Once the DM device was resumed dm_process_bio() could be called by dm_wq_work() for original bio that still needs splitting. dm_process_bio()'s check for current->bio_list (meaning call chain is within ->submit_bio) as a prerequisite for calling blk_queue_split() for "abnormal IO" would result in dm_process_bio() never imposing corresponding queue_limits (e.g. discard_granularity, discard_max_bytes, etc). Fix this by always having dm_wq_work() resubmit deferred bios using submit_bio_noacct(). Side-effect is blk_queue_split() is always called for "abnormal IO" from ->submit_bio, be it from application thread or dm_wq_work() workqueue, so proper bio splitting and depth-first bio submission is performed. For sake of clarity, remove current->bio_list check before call to blk_queue_split(). Also, remove dm_wq_work()'s use of dm_{get,put}_live_table() -- no longer needed since IO will be reissued in terms of ->submit_bio. And rename bio variable from 'c' to 'bio'. Fixes: cf9c37865557 ("dm: fix comment in dm_process_bio()") Reported-by: Jeffle Xu <jefflexu@linux.alibaba.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-09-30drm/amd/amdkfd: Surface files in Sysfs to allow users to get number ofRamesh Errabolu
compute units that are in use. [Why] Allow user to know how many compute units (CU) are in use at any given moment. [How] Surface files in Sysfs that allow user to determine the number of compute units that are in use for a given process. One Sysfs file is used per device. Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-By: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30drm/amd/amdgpu: Define and implement a function that collects number ofRamesh Errabolu
waves that are in flight. [Why] Allow user to know how many compute units (CU) are in use at any given moment. [How] Read registers of SQ that give number of waves that are in flight of various queues. Use this information to determine number of CU's in use. Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-By: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30RDMA/addr: Fix race with netevent_callback()/rdma_addr_cancel()Jason Gunthorpe
This three thread race can result in the work being run once the callback becomes NULL: CPU1 CPU2 CPU3 netevent_callback() process_one_req() rdma_addr_cancel() [..] spin_lock_bh() set_timeout() spin_unlock_bh() spin_lock_bh() list_del_init(&req->list); spin_unlock_bh() req->callback = NULL spin_lock_bh() if (!list_empty(&req->list)) // Skipped! // cancel_delayed_work(&req->work); spin_unlock_bh() process_one_req() // again req->callback() // BOOM cancel_delayed_work_sync() The solution is to always cancel the work once it is completed so any in between set_timeout() does not result in it running again. Cc: stable@vger.kernel.org Fixes: 44e75052bc2a ("RDMA/rdma_cm: Make rdma_addr_cancel into a fence") Link: https://lore.kernel.org/r/20200930072007.1009692-1-leon@kernel.org Reported-by: Dan Aloni <dan@kernelim.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-30RDMA/core: Remove ucontext->closingJason Gunthorpe
Nothing reads this any more, and the reason for its existence has passed due to the deferred fput() scheme. Fixes: 8ea1f989aa07 ("drivers/IB,usnic: reduce scope of mmap_sem") Link: https://lore.kernel.org/r/0-v1-df64ff042436+42-uctx_closing_jgg@nvidia.com Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-30drm/i915: Avoid mixing integer types during batch copiesChris Wilson
Be consistent and use unsigned long throughout the chunk copies to avoid the inherent clumsiness of mixing integer types of different widths and signs. Failing to take acount of a wider unsigned type when using min_t can lead to treating it as a negative, only for it flip back to a large unsigned value after passing a boundary check. Fixes: ed13033f0287 ("drm/i915/cmdparser: Only cache the dst vmap") Testcase: igt/gen9_exec_parse/bb-large Reported-by: "Candelaria, Jared" <jared.candelaria@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: "Candelaria, Jared" <jared.candelaria@intel.com> Cc: "Bloomfield, Jon" <jon.bloomfield@intel.com> Cc: <stable@vger.kernel.org> # v4.9+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200928215942.31917-1-chris@chris-wilson.co.uk (cherry picked from commit b7eeb2b4132ccf1a7d38f434cde7043913d1ed3c) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915/gem: Always test execution status on closing the contextChris Wilson
Verify that if a context is active at the time it is closed, that it is either persistent and preemptible (with hangcheck running) or it shall be removed from execution. Fixes: 9a40bddd47ca ("drm/i915/gt: Expose heartbeat interval via sysfs") Testcase: igt/gem_ctx_persistence/heartbeat-close Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.7+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-3-chris@chris-wilson.co.uk (cherry picked from commit d3bb2f9b5ee66d5e000293edd6b6575e59d11db9) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915/gt: Always send a pulse down the engine after disabling heartbeatChris Wilson
Currently, we check we can send a pulse prior to disabling the heartbeat to verify that we can change the heartbeat, but since we may re-evaluate execution upon changing the heartbeat interval we need another pulse afterwards to refresh execution. v2: Tvrtko asked if we could reduce the double pulse to a single, which opened up a discussion of how we should handle the pulse-error after attempting to change the property, and the desire to serialise adjustment of the property with its validating pulse, and unwind upon failure. Fixes: 9a40bddd47ca ("drm/i915/gt: Expose heartbeat interval via sysfs") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: <stable@vger.kernel.org> # v5.7+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-2-chris@chris-wilson.co.uk (cherry picked from commit 3dd66a94de59d7792e7917eb3075342e70f06f44) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915: Cancel outstanding work after disabling heartbeats on an engineChris Wilson
We only allow persistent requests to remain on the GPU past the closure of their containing context (and process) so long as they are continuously checked for hangs or allow other requests to preempt them, as we need to ensure forward progress of the system. If we allow persistent contexts to remain on the system after the the hangcheck mechanism is disabled, the system may grind to a halt. On disabling the mechanism, we sent a pulse along the engine to remove all executing contexts from the engine which would check for hung contexts -- but we did not prevent those contexts from being resubmitted if they survived the final hangcheck. Fixes: 9a40bddd47ca ("drm/i915/gt: Expose heartbeat interval via sysfs") Testcase: igt/gem_ctx_persistence/heartbeat-stop Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.7+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-1-chris@chris-wilson.co.uk (cherry picked from commit 7a991cd3e3da9a56d5616b62d425db000a3242f2) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915/gem: Hold request reference for canceling an active contextChris Wilson
We have to be very careful while walking the timeline->requests list under the RCU guard, as the requests (and so rq->link) use SLAB_TYPESAFE_BY_RCU and so the requests may be reallocated within an rcu grace period. As the requests are reallocated, they are removed from one list and placed on another, and if we are iterating over that request at that moment, the list iteration jumps from one list to the next and promptly gets confused. Verify we hold the request reference to ensure that the request is not added to a new list behind our backs. <4> [582.745252] general protection fault, probably for non-canonical address 0xcccccccccccccd5c: 0000 [#1] PREEMPT SMP PTI <4> [582.745297] CPU: 0 PID: 1475 Comm: gem_ctx_persist Not tainted 5.9.0-rc1-CI-CI_DRM_8908+ #1 <4> [582.745304] Hardware name: Intel Corporation NUC7CJYH/NUC7JYB, BIOS JYGLKCPX.86A.0027.2018.0125.1347 01/25/2018 <4> [582.745317] RIP: 0010:__lock_acquire+0x2c3/0x1f40 <4> [582.745323] Code: 00 65 8b 05 c7 8a ef 7e 85 c0 0f 85 b4 07 00 00 44 8b 9d c4 08 00 00 45 85 db 0f 84 0f 01 00 00 ba 05 00 00 00 e9 c8 06 00 00 <48> 81 3f c0 89 c7 82 b8 00 00 00 00 41 0f 45 c0 83 fe 01 41 89 c3 <4> [582.745334] RSP: 0018:ffffc9000461bc40 EFLAGS: 00010002 <4> [582.745340] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 <4> [582.745345] RDX: 0000000000000000 RSI: 0000000000000000 RDI: cccccccccccccd5c <4> [582.745350] RBP: ffff8881ec4a2880 R08: 0000000000000001 R09: 0000000000000001 <4> [582.745356] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 <4> [582.745361] R13: 0000000000000000 R14: 0000000000000000 R15: cccccccccccccd5c <4> [582.745367] FS: 00007fb44da78e40(0000) GS:ffff888278000000(0000) knlGS:0000000000000000 <4> [582.745373] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [582.745378] CR2: 00007fb44daad040 CR3: 0000000268428000 CR4: 0000000000350ef0 <4> [582.745383] Call Trace: <4> [582.745390] ? __lock_acquire+0x913/0x1f40 <4> [582.745397] lock_acquire+0xb5/0x3c0 <4> [582.745526] ? kill_engines+0x19a/0x4b0 [i915] <4> [582.745533] ? find_held_lock+0x2d/0x90 <4> [582.745541] _raw_spin_lock_irq+0x30/0x40 <4> [582.745635] ? kill_engines+0x19a/0x4b0 [i915] <4> [582.745727] kill_engines+0x19a/0x4b0 [i915] <4> [582.745820] context_close+0x195/0x410 [i915] <4> [582.745912] i915_gem_context_close+0x5b/0x160 [i915] <4> [582.745994] i915_driver_postclose+0x14/0x40 [i915] <4> [582.746003] drm_file_free.part.13+0x240/0x290 <4> [582.746009] drm_release_noglobal+0x16/0x50 <4> [582.746016] __fput+0xa5/0x250 <4> [582.746021] task_work_run+0x6e/0xb0 <4> [582.746028] exit_to_user_mode_prepare+0x178/0x180 <4> [582.746034] syscall_exit_to_user_mode+0x36/0x220 <4> [582.746040] entry_SYSCALL_64_after_hwframe+0x44/0xa9 <4> [582.746045] RIP: 0033:0x7fb44d1dc421 <4> [582.746050] Code: f7 d8 64 89 02 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 8b 05 ea cf 20 00 85 c0 75 16 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 3f f3 c3 0f 1f 44 00 00 53 89 fb 48 83 ec 10 <4> [582.746062] RSP: 002b:00007ffed2e83818 EFLAGS: 00000246 ORIG_RAX: 0000000000000003 <4> [582.746069] RAX: 0000000000000000 RBX: 0000556410bfe840 RCX: 00007fb44d1dc421 <4> [582.746075] RDX: 000000000000000a RSI: 00000000c0406469 RDI: 0000000000000008 <4> [582.746080] RBP: 0000000000000008 R08: 00007fb44d1c51cc R09: 00007fb44d1c5240 <4> [582.746086] R10: 0000000000000001 R11: 0000000000000246 R12: 00000000fffffffb <4> [582.746091] R13: 0000000000000006 R14: 0000000000000000 R15: 000000000000000a <4> [582.746099] Modules linked in: vgem mei_hdcp snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio btusb btrtl btbcm btintel x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul bluetooth ghash_clmulni_intel ecdh_generic ecc i915 r8169 realtek mei_me mei snd_hda_intel i2c_hid snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm pinctrl_geminilake pinctrl_intel prime_numbers [last unloaded: test_drm_mm] Fixes: 736e785f9b28 ("drm/i915/gem: Reduce context termination list iteration guard to RCU") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200925101107.27869-2-chris@chris-wilson.co.uk (cherry picked from commit badef44deff1fae8d21c5c1cfc4dde95fb5bf993) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915: Redo "Remove i915_request.lock requirement for execution callbacks"Chris Wilson
The reordering and rebasing of commit 2e4c6c1a9db5 ("drm/i915: Remove i915_request.lock requirement for execution callbacks") caused it to revert an earlier correction. Let us restore commit 99f0a640d464 ("drm/i915: Remove requirement for holding i915_request.lock for breadcrumbs") Fixes: 2e4c6c1a9db5 ("drm/i915: Remove i915_request.lock requirement for execution callbacks") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200925101107.27869-1-chris@chris-wilson.co.uk (cherry picked from commit 35faeb7de9ef83da510a048f2016061f1e31d5fc) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915/gem: Serialise debugfs i915_gem_objects with ctx->mutexChris Wilson
Since the debugfs may peek into the GEM contexts as the corresponding client/fd is being closed, we may try and follow a dangling pointer. However, the context closure itself is serialised with the ctx->mutex, so if we hold that mutex as we inspect the state coupled in the context, we know the pointers within the context are stable and will remain valid as we inspect their tables. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: CQ Tang <cq.tang@intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200723172119.17649-3-chris@chris-wilson.co.uk (cherry picked from commit 102f5aa491f262c818e607fc4fee08a724a76c69) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915: check i915_vm_alloc_pt_stash for errorsMatthew Auld
If we are really unlucky and encounter an error during i915_vm_alloc_pt_stash, we end up passing an empty pt/pd stash all the way down into the low-level ppgtt alloc code, leading to explosions, since it expects at least the required number of pt/pd for the va range. [ 211.981418] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 211.981421] #PF: supervisor read access in kernel mode [ 211.981422] #PF: error_code(0x0000) - not-present page [ 211.981424] PGD 80000008439cb067 P4D 80000008439cb067 PUD 84a37f067 PMD 0 [ 211.981427] Oops: 0000 [#1] SMP PTI [ 211.981428] CPU: 1 PID: 1301 Comm: i915_selftest Tainted: G U I 5.9.0-rc5+ #3 [ 211.981430] Hardware name: /NUC6i7KYB, BIOS KYSKLi70.86A.0050.2017.0831.1924 08/31/2017 [ 211.981521] RIP: 0010:__gen8_ppgtt_alloc+0x1ed/0x3c0 [i915] [ 211.981523] Code: c1 48 c7 c7 5d 5d fe c0 65 ff 0d ee 1d 03 3f e8 d9 91 1f e2 8b 55 c4 31 c0 48 8b 75 b8 85 d2 0f 95 c0 48 8b 1c c6 48 89 45 98 <48> 8b 03 48 8b 90 58 02 00 00 48 85 d2 0f 84 07 ea 15 00 48 81 fa [ 211.981526] RSP: 0018:ffffba2cc0eb3970 EFLAGS: 00010202 [ 211.981527] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000004 [ 211.981529] RDX: 0000000000000002 RSI: ffff9be998bdb8c0 RDI: ffff9be99c844300 [ 211.981530] RBP: ffffba2cc0eb39d8 R08: 0000000000000640 R09: ffff9be97cdfd000 [ 211.981531] R10: ffff9be97cdfd614 R11: 0000000000000000 R12: 0000000000000000 [ 211.981532] R13: ffff9be98607ba20 R14: ffff9be995a0b400 R15: ffffba2cc0eb39e8 [ 211.981534] FS: 00007f0f10b31000(0000) GS:ffff9be99fc40000(0000) knlGS:0000000000000000 [ 211.981536] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 211.981538] CR2: 0000000000000000 CR3: 000000084d74e006 CR4: 00000000003706e0 [ 211.981539] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 211.981541] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 211.981542] Call Trace: [ 211.981609] gen8_ppgtt_alloc+0x79/0x90 [i915] [ 211.981678] ppgtt_bind_vma+0x36/0x80 [i915] [ 211.981756] __vma_bind+0x39/0x40 [i915] [ 211.981818] fence_work+0x21/0x98 [i915] [ 211.981879] fence_notify+0x8d/0x128 [i915] [ 211.981939] __i915_sw_fence_complete+0x62/0x240 [i915] [ 211.982018] i915_vma_pin_ww+0x1ee/0x9c0 [i915] Fixes: cd0452aa2a0d ("drm/i915: Preallocate stashes for vma page-directories") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200921160844.73186-1-matthew.auld@intel.com (cherry picked from commit 1604cb2aa7fafd83e11f9257f765a5f5dd7c19d3) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915: Fix uninitialised variable in intel_context_create_request.Maarten Lankhorst
In case backoff fails with an error, we return an undefined rq, assign err to rq correctly. Fixes: 8a929c9eb1c2 ("drm/i915: Use ww pinning for intel_context_create_request()") Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200918111208.1392128-1-maarten.lankhorst@linux.intel.com Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 4316b19dee27cc5cd34a95fdbc0a3a5237507701) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915: Break up error capture compression loops with cond_resched()Chris Wilson
As the error capture will compress user buffers as directed to by the user, it can take an arbitrary amount of time and space. Break up the compression loops with a call to cond_resched(), that will allow other processes to schedule (avoiding the soft lockups) and also serve as a warning should we try to make this loop atomic in the future. Testcase: igt/gem_exec_capture/many-* Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: stable@vger.kernel.org Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200916090059.3189-2-chris@chris-wilson.co.uk (cherry picked from commit 293f43c80c0027ff9299036c24218ac705ce584e) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915: Fix an error code i915_gem_object_copy_blt()Dan Carpenter
This code should use "vma[1]" instead of "vma". The "vma" variable is a valid pointer. Fixes: 6b05030496f7 ("drm/i915: Convert i915_gem_object/client_blt.c to use ww locking as well, v2.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200911075243.GG12635@kadam (cherry picked from commit 68ba71e3ae6dd86a23486655e33c5f8c9bd90777) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915/gt: Clear the buffer pool age before useChris Wilson
If we create a new node, it is possible for the slab allocator to return us a recently freed node. If that node was just retired, it will retain the current jiffy as its node->age. There is then a miniscule window, where as that node is retired, it will appear on the free list with an incorrect age and be eligible for reuse by one thread, and then by a second thread as the correct node->age is written. Fixes: 06b73c2d0b65 ("drm/i915/gt: Delay taking the spinlock for grabbing from the buffer pool") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200915091417.4086-3-chris@chris-wilson.co.uk (cherry picked from commit 9bb34ff25c458a2a48fb61409df42f465ede37f8) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915/gem: Prevent using pgprot_writecombine() if PAT is not supportedChris Wilson
Let's not try and use PAT attributes for I915_MAP_WC if the CPU doesn't support PAT. Fixes: 6056e50033d9 ("drm/i915/gem: Support discontiguous lmem object maps") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v5.6+ Link: https://patchwork.freedesktop.org/patch/msgid/20200915091417.4086-2-chris@chris-wilson.co.uk (cherry picked from commit 121ba69ffddc60df11da56f6d5b29bdb45c8eb80) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915/gem: Avoid implicit vmap for highmem on x86-32Chris Wilson
On 32b, highmem using a finite set of indirect PTE (i.e. vmap) to provide virtual mappings of the high pages. As these are finite, map_new_virtual() must wait for some other kmap() to finish when it runs out. If we map a large number of objects, there is no method for it to tell us to release the mappings, and we deadlock. However, if we make an explicit vmap of the page, that uses a larger vmalloc arena, and also has the ability to tell us to release unwanted mappings. Most importantly, it will fail and propagate an error instead of waiting forever. Fixes: fb8621d3bee8 ("drm/i915: Avoid allocating a vmap arena for a single page") #x86-32 References: e87666b52f00 ("drm/i915/shrinker: Hook up vmap allocation failure notifier") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v4.7+ Link: https://patchwork.freedesktop.org/patch/msgid/20200915091417.4086-1-chris@chris-wilson.co.uk (cherry picked from commit 060bb115c2d664f04db9c7613a104dfaef3fdd98) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30RDMA/rtrs: Remove unused field of rtrs_iuGioh Kim
list field is not used anywhere Link: https://lore.kernel.org/r/20200930131407.6438-1-gi-oh.kim@clous.ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com> Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-30drm/amdgpu: disable gfxoff temporarily for navy_flounderJiansong Chen
gfxoff is temporarily disabled for navy_flounder, since at present the feature caused some tdr when performing display operations. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30drm/amdgpu: drop duplicated ecc check for vega10 (v5)Guchun Chen
The same ECC check has been executed in amdgpu_ras_init for vega10, prior to gmc_v9_0_late_init. v2: drop all atombios helper callings v3: use bit operation v4: correct inline comment, remove parity check statement v5: squash in build fix Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30drm/amd/display: add pipe reassignment prevention code to dcn3Dmytro Laktyushkin
Add code to gracefuly handle any pipe reassignment occuring on dcn3 hardware. This should only happen when new surfaces are used for an update rather than old ones updated. Fixes: 69fc1f4b976cea ("amd/drm/display: avoid dcn3 on flip opp change for slave pipes") Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Reviewed-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30drm/amdgpu: use function pointer for gfxhub functionsOak Zeng
gfxhub functions are now called from function pointers, instead of from asic-specific functions. Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30drm/amd/amdgpu: Prepare implementation to support reporting of CU usageRamesh Errabolu
[Why] Allow user to know number of compute units (CU) that are in use at any given moment. [How] Read registers of SQ that give number of waves that are in flight of various queues. Use this information to determine number of CU's in use. Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30drm/amd/amdgpu: Clean up header file of symbols that are defined to be staticRamesh Errabolu
[Why] Header file exports functions get_gpu_clock_counter(), get_cu_info() and select_se_sh() that are defined to be static Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30leds: ns2: do not guard OF match pointer with of_match_ptrMarek Behún
Do not match OF match pointer with of_match_ptr, so that even if CONFIG_OF is disabled, the driver can still be bound via another method. Move definition of of_ns2_leds_match just before ns2_led_driver definition, since it is not needed sooner. Signed-off-by: Marek Behún <kabel@kernel.org> Tested-by: Simon Guinot <simon.guinot@sequanux.org> Signed-off-by: Pavel Machek <pavel@ucw.cz>
2020-09-30leds: ns2: convert to fwnode APIMarek Behún
Convert from OF api to fwnode API, so that it is possible to bind this driver without device-tree. The fwnode API does not expose a function to read a specific element of an array. We therefore change the types of the ns2_led_modval structure so that we can read the whole modval array with one fwnode call. Signed-off-by: Marek Behún <kabel@kernel.org> Tested-by: Simon Guinot <simon.guinot@sequanux.org> Signed-off-by: Pavel Machek <pavel@ucw.cz>
2020-09-30leds: tlc591xx: fix leak of device node iteratorTobias Jordan
In one of the error paths of the for_each_child_of_node loop in tlc591xx_probe, add missing call to of_node_put. Fixes: 1ab4531ad132 ("leds: tlc591xx: simplify driver by using the managed led API") Signed-off-by: Tobias Jordan <kernel@cdqe.de> Reviewed-by: Marek Behún <kabel@kernel.org> Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Signed-off-by: Pavel Machek <pavel@ucw.cz>
2020-09-30leds: pca963x: use struct led_init_data when registeringMarek Behún
By using struct led_init_data when registering we do not need to parse `label` DT property. Moreover `label` is deprecated and if it is not present but `color` and `function` are, LED core will compose a name from these properties instead. Previously if the `label` DT property was not present, the code composed name for the LED in the form "pca963x:%d:%.2x:%u" For backwards compatibility we therefore set init_data->default_label to this value so that the LED will not get a different name if `label` property is not present, nor are `color` and `function`. Signed-off-by: Marek Behún <marek.behun@nic.cz> Cc: Peter Meerwald <p.meerwald@bct-electronic.com> Cc: Ricardo Ribalda <ribalda@kernel.org> Cc: Zahari Petkov <zahari@balena.io> Signed-off-by: Pavel Machek <pavel@ucw.cz>