summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2021-11-17usb: dwc2: hcd_queue: Fix use of floating point literalNathan Chancellor
A new commit in LLVM causes an error on the use of 'long double' when '-mno-x87' is used, which the kernel does through an alias, '-mno-80387' (see the LLVM commit below for more details around why it does this). drivers/usb/dwc2/hcd_queue.c:1744:25: error: expression requires 'long double' type support, but target 'x86_64-unknown-linux-gnu' does not support it delay = ktime_set(0, DWC2_RETRY_WAIT_DELAY); ^ drivers/usb/dwc2/hcd_queue.c:62:34: note: expanded from macro 'DWC2_RETRY_WAIT_DELAY' #define DWC2_RETRY_WAIT_DELAY (1 * 1E6L) ^ 1 error generated. This happens due to the use of a 'long double' literal. The 'E6' part of '1E6L' causes the literal to be a 'double' then the 'L' suffix promotes it to 'long double'. There is no visible reason for a floating point value in this driver, as the value is only used as a parameter to a function that expects an integer type. Use NSEC_PER_MSEC, which is the same integer value as '1E6L', to avoid changing functionality but fix the error. Link: https://github.com/ClangBuiltLinux/linux/issues/1497 Link: https://github.com/llvm/llvm-project/commit/a8083d42b1c346e21623a1d36d1f0cadd7801d83 Fixes: 6ed30a7d8ec2 ("usb: dwc2: host: use hrtimer for NAK retries") Cc: stable <stable@vger.kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: John Keeping <john@metanate.com> Acked-by: Minas Harutyunyan <Minas.Harutyunyan@synopsys.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20211105145802.2520658-1-nathan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-17usb: dwc3: gadget: Fix null pointer exceptionAlbert Wang
In the endpoint interrupt functions dwc3_gadget_endpoint_transfer_in_progress() and dwc3_gadget_endpoint_trbs_complete() will dereference the endpoint descriptor. But it could be cleared in __dwc3_gadget_ep_disable() when accessory disconnected. So we need to check whether it is null or not before dereferencing it. Fixes: f09ddcfcb8c5 ("usb: dwc3: gadget: Prevent EP queuing while stopping transfers") Cc: stable <stable@vger.kernel.org> Reviewed-by: Jack Pham <quic_jackp@quicinc.com> Signed-off-by: Albert Wang <albertccwang@google.com> Link: https://lore.kernel.org/r/20211109092642.3507692-1-albertccwang@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-17usb: gadget: udc-xilinx: Fix an error handling path in 'xudc_probe()'Christophe JAILLET
A successful 'clk_prepare_enable()' call should be balanced by a corresponding 'clk_disable_unprepare()' call in the error handling path of the probe, as already done in the remove function. Fixes: 24749229211c ("usb: gadget: udc-xilinx: Add clock support") Reviewed-by: Shubhrajyoti Datta <shubhraj@xilinx.com> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/ec61a89b83ce34b53a3bdaacfd1413a9869cc608.1636302246.git.christophe.jaillet@wanadoo.fr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-17usb: xhci: tegra: Check padctrl interrupt presence in device treeDmitry Osipenko
Older device-trees don't specify padctrl interrupt and xhci-tegra driver now fails to probe with -EINVAL using those device-trees. Check interrupt presence and keep runtime PM disabled if it's missing to fix the trouble. Fixes: 971ee247060d ("usb: xhci: tegra: Enable ELPG for runtime/system PM") Cc: <stable@vger.kernel.org> # 5.14+ Reported-by: Nicolas Chauvet <kwizart@gmail.com> Tested-by: Nicolas Chauvet <kwizart@gmail.com> # T124 TK1 Tested-by: Thomas Graichen <thomas.graichen@gmail.com> # T124 Nyan Big Tested-by: Thierry Reding <treding@nvidia.com> # Tegra CI Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Link: https://lore.kernel.org/r/20211107224455.10359-1-digetx@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-17usb: dwc2: gadget: Fix ISOC flow for elapsed framesMinas Harutyunyan
Added updating of request frame number for elapsed frames, otherwise frame number will remain as previous use of request. This will allow function driver to correctly track frames in case of Missed ISOC occurs. Added setting request actual length to 0 for elapsed frames. In Slave mode when pushing data to RxFIFO by dwords, request actual length incrementing accordingly. But before whole packet will be pushed into RxFIFO and send to host can occurs Missed ISOC and data will not send to host. So, in this case request actual length should be reset to 0. Fixes: 91bb163e1e4f ("usb: dwc2: gadget: Fix ISOC flow for BDMA and Slave") Cc: stable <stable@vger.kernel.org> Reviewed-by: John Keeping <john@metanate.com> Signed-off-by: Minas Harutyunyan <Minas.Harutyunyan@synopsys.com> Link: https://lore.kernel.org/r/c356baade6e9716d312d43df08d53ae557cb8037.1636011277.git.Minas.Harutyunyan@synopsys.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-17usb: dwc3: gadget: Check for L1/L2/U3 for Start TransferThinh Nguyen
The programming guide noted that the driver needs to verify if the link state is in U0 before executing the Start Transfer command. If it's not in U0, the driver needs to perform remote wakeup. This is not accurate. If the link state is in U1/U2, then the controller will not respond to link recovery request from DCTL.ULSTCHNGREQ. The Start Transfer command will trigger a link recovery if it is in U1/U2. A clarification will be added to the programming guide for all controller versions. The current implementation shouldn't cause any functional issue. It may occasionally report an invalid time out warning from failed link recovery request. The driver will still go ahead with the Start Transfer command if the remote wakeup fails. The new change only initiates remote wakeup where it is needed, which is when the link state is in L1/L2/U3. Fixes: c36d8e947a56 ("usb: dwc3: gadget: put link to U0 before Start Transfer") Cc: <stable@vger.kernel.org> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/05b4a5fbfbd0863fc9b1d7af934a366219e3d0b4.1635204761.git.Thinh.Nguyen@synopsys.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-17usb: dwc3: gadget: Ignore NoStream after End TransferThinh Nguyen
The End Transfer command from a stream endpoint will generate a NoStream event, and we should ignore it. Currently we set the flag DWC3_EP_IGNORE_NEXT_NOSTREAM to track this prior to sending the command, and it will be cleared on the next stream event. However, a stream event may be generated before the End Transfer command completion and prematurely clear the flag. Fix this by setting the flag on End Transfer completion instead. Fixes: 140ca4cfea8a ("usb: dwc3: gadget: Handle stream transfers") Cc: <stable@vger.kernel.org> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/cee1253af4c3600edb878d11c9c08b040817ae23.1635203975.git.Thinh.Nguyen@synopsys.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-17usb: dwc3: core: Revise GHWPARAMS9 offsetThinh Nguyen
During our predesign phase for DWC_usb32, the GHWPARAMS9 register offset was 0xc680. We revised our final design, and the GHWPARAMS9 offset is now moved to 0xc6e8 on release. Fixes: 16710380d3aa ("usb: dwc3: Capture new capability register GHWPARAMS9") Cc: <stable@vger.kernel.org> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/1541737108266a97208ff827805be1f32852590c.1635202893.git.Thinh.Nguyen@synopsys.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-17drm/i915/guc: fix NULL vs IS_ERR() checkingDan Carpenter
The intel_engine_create_virtual() function does not return NULL. It returns error pointers. Fixes: e5e32171a2cf ("drm/i915/guc: Connect UAPI to GuC multi-lrc interface") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211116114916.GB11936@kili (cherry picked from commit fc12b70d12d07598cde27cc17dbfafc2a2a33ff8) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-11-17drm/i915/dsi/xelpd: Fix the bit mask for wakeup GBVandita Kulkarni
v2: Fix the typo, move out the hardcoding from macro(Jani, Ville) Fixes: f87c46c43175 ("drm/i915/dsi/xelpd: Add WA to program LP to HS wakeup guardband") Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211019151435.20477-2-vandita.kulkarni@intel.com (cherry picked from commit 6f07707fa09e1dc58c431d57c25ef2e68b9bec47) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-11-17Revert "drm/i915/tgl/dsi: Gate the ddi clocks after pll mapping"Vandita Kulkarni
This reverts commit 991d9557b0c4 ("drm/i915/tgl/dsi: Gate the ddi clocks after pll mapping"). The Bspec was updated recently with the pll ungate sequence similar to that of icl dsi enable sequence. Hence reverting. Bspec: 49187 Fixes: 991d9557b0c4 ("drm/i915/tgl/dsi: Gate the ddi clocks after pll mapping") Cc: <stable@vger.kernel.org> # v5.4+ Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211109120428.15211-1-vandita.kulkarni@intel.com (cherry picked from commit 4579509ef181480f4e4510d436c691519167c5c2) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-11-17staging: rtl8192e: Fix use after free in _rtl92e_pci_disconnect()Dan Carpenter
The free_rtllib() function frees the "dev" pointer so there is use after free on the next line. Re-arrange things to avoid that. Fixes: 66898177e7e5 ("staging: rtl8192e: Fix unload/reload problem") Cc: stable <stable@vger.kernel.org> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20211117072016.GA5237@kili Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-17staging: greybus: Add missing rwsem around snd_ctl_remove() callsTakashi Iwai
snd_ctl_remove() has to be called with card->controls_rwsem held (when called after the card instantiation). This patch adds the missing rwsem calls around it. Fixes: 510e340efe0c ("staging: greybus: audio: Add helper APIs for dynamic audio modules") Cc: stable <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://lore.kernel.org/r/20211116072027.18466-1-tiwai@suse.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-17firmware: arm_scmi: Fix type error assignment in voltage protocolCristian Marussi
Fix incorrect type assignment error reported by sparse as: drivers/firmware/arm_scmi/voltage.c:159:42: warning: incorrect type in assignment (different base types) drivers/firmware/arm_scmi/voltage.c:159:42: expected restricted __le32 [usertype] level_index drivers/firmware/arm_scmi/voltage.c:159:42: got unsigned int [usertype] desc_index Link: https://lore.kernel.org/r/20211115154043.49284-1-cristian.marussi@arm.com Fixes: 2add5cacff353 ("firmware: arm_scmi: Add voltage domain management protocol support") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2021-11-17firmware: arm_scmi: Fix type error in sensor protocolCristian Marussi
Fix incorrect type error reported by sparse as: drivers/firmware/arm_scmi/sensors.c:640:28: warning: incorrect type in argument 1 (different base types) drivers/firmware/arm_scmi/sensors.c:640:28: expected unsigned int [usertype] val drivers/firmware/arm_scmi/sensors.c:640:28: got restricted __le32 [usertype] Link: https://lore.kernel.org/r/20211115154043.49284-2-cristian.marussi@arm.com Fixes: 7b83c5f410889 ("firmware: arm_scmi: Add SCMI v3.0 sensor configuration support") Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2021-11-17firmware: arm_scmi: pm: Propagate return value to callerPeng Fan
of_genpd_add_provider_onecell may return error, so let's propagate its return value to caller Link: https://lore.kernel.org/r/20211116064227.20571-1-peng.fan@oss.nxp.com Fixes: 898216c97ed2 ("firmware: arm_scmi: add device power domain support using genpd") Signed-off-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2021-11-17firmware: arm_scmi: Fix base agent discover responseVincent Guittot
According to scmi specification, the response of the discover agent request is made of: - int32 status - uint32 agent_id - uint8 name[16] but the current implementation doesn't take into account the agent_id field and only allocates a rx buffer of SCMI_MAX_STR_SIZE length Allocate the correct length for rx buffer and copy the name from the correct offset in the response. While no error were returned until v5.15, v5.16-rc1 fails with virtio_scmi transport channel: | arm-scmi firmware:scmi0: SCMI Notifications - Core Enabled. | arm-scmi firmware:scmi0: SCMI Protocol v2.0 'Linaro:PMWG' Firmware version 0x2090000 | scmi-virtio virtio0: tx:used len 28 is larger than in buflen 24 Link: https://lore.kernel.org/r/20211117081856.9932-1-vincent.guittot@linaro.org Fixes: b6f20ff8bd94 ("firmware: arm_scmi: add common infrastructure and support for base protocol") Tested-by: Cristian Marussi <cristian.marussi@arm.com> Reviewed-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2021-11-17Merge tag 'mlx5-fixes-2021-11-16' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-fixes-2021-11-16 Please pull this mlx5 fixes series, or let me know in case of any problem. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17parisc/sticon: fix reverse colorsSven Schnelle
sticon_build_attr() checked the reverse argument and flipped background and foreground color, but returned the non-reverse value afterwards. Fix this and also add two local variables for foreground and background color to make the code easier to read. Signed-off-by: Sven Schnelle <svens@stackframe.org> Cc: <stable@vger.kernel.org> Signed-off-by: Helge Deller <deller@gmx.de>
2021-11-17mmc: sdhci: Fix ADMA for PAGE_SIZE >= 64KiBAdrian Hunter
The block layer forces a minimum segment size of PAGE_SIZE, so a segment can be too big for the ADMA table, if PAGE_SIZE >= 64KiB. Fix by writing multiple descriptors, noting that the ADMA table is sized for 4KiB chunks anyway, so it will be big enough. Reported-and-tested-by: Bough Chen <haibo.chen@nxp.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211115082345.802238-1-adrian.hunter@intel.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-11-17fbdev: Prevent probing generic drivers if a FB is already registeredJavier Martinez Canillas
The efifb and simplefb drivers just render to a pre-allocated frame buffer and rely on the display hardware being initialized before the kernel boots. But if another driver already probed correctly and registered a fbdev, the generic drivers shouldn't be probed since an actual driver for the display hardware is already present. This is more likely to occur after commit d391c5827107 ("drivers/firmware: move x86 Generic System Framebuffers support") since the "efi-framebuffer" and "simple-framebuffer" platform devices are registered at a later time. Link: https://lore.kernel.org/r/20211110200253.rfudkt3edbd3nsyj@lahvuun/ Fixes: d391c5827107 ("drivers/firmware: move x86 Generic System Framebuffers support") Reported-by: Ilya Trukhanov <lahvuun@gmail.com> Cc: <stable@vger.kernel.org> # 5.15.x Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: Ilya Trukhanov <lahvuun@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211111115757.1351045-1-javierm@redhat.com
2021-11-17drm/scheduler: fix drm_sched_job_add_implicit_dependencies harderRob Clark
drm_sched_job_add_dependency() could drop the last ref, so we need to do the dma_fence_get() first. Cc: Christian König <christian.koenig@amd.com> Fixes: 9c2ba265352a ("drm/scheduler: use new iterator in drm_sched_job_add_implicit_dependencies v2") Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20211116155545.473311-1-robdclark@gmail.com Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Christian König <christian.koenig@amd.com> Tested-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Christian König <christian.koenig@amd.com>
2021-11-16net: stmmac: Fix signed/unsigned wreckageThomas Gleixner
The recent addition of timestamp correction to compensate the CDC error introduced a subtle signed/unsigned bug in stmmac_get_tx_hwtstamp() while it managed for some obscure reason to avoid that in stmmac_get_rx_hwtstamp(). The issue is: s64 adjust = 0; u64 ns; adjust += -(2 * (NSEC_PER_SEC / priv->plat->clk_ptp_rate)); ns += adjust; works by chance on 64bit, but falls apart on 32bit because the compiler knows that adjust fits into 32bit and then treats the addition as a u64 + u32 resulting in an off by ~2 seconds failure. The RX variant uses an u64 for adjust and does the adjustment via ns -= adjust; because consistency is obviously overrated. Get rid of the pointless zero initialized adjust variable and do: ns -= (2 * NSEC_PER_SEC) / priv->plat->clk_ptp_rate; which is obviously correct and spares the adjust obfuscation. Aside of that it yields a more accurate result because the multiplication takes place before the integer divide truncation and not afterwards. Stick the calculation into an inline so it can't be accidentally disimproved. Return an u32 from that inline as the result is guaranteed to fit which lets the compiler optimize the substraction. Cc: stable@vger.kernel.org Fixes: 3600be5f58c1 ("net: stmmac: add timestamp correction to rid CDC sync error") Reported-by: Benedikt Spranger <b.spranger@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Benedikt Spranger <b.spranger@linutronix.de> Tested-by: Kurt Kanzenbach <kurt@linutronix.de> # Intel EHL Link: https://lore.kernel.org/r/87mtm578cs.ffs@tglx Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16amt: cancel delayed_work synchronously in amt_fini()Taehee Yoo
When the amt module is being removed, it calls cancel_delayed_work() to cancel pending delayed_work. But this function doesn't wait for canceling delayed_work. So, workers can be still doing after module delete. In order to avoid this, cancel_delayed_work_sync() should be used instead. Suggested-by: Jakub Kicinski <kuba@kernel.org> Fixes: bc54e49c140b ("amt: add multicast(IGMP) report message handler") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://lore.kernel.org/r/20211116160923.25258-1-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16bnxt_en: Fix compile error regression when CONFIG_BNXT_SRIOV is not setMichael Chan
bp->sriov_cfg is not defined when CONFIG_BNXT_SRIOV is not set. Fix it by adding a helper function bnxt_sriov_cfg() to handle the logic with or without the config option. Fixes: 46d08f55d24e ("bnxt_en: extend RTNL to VF check in devlink driver_reinit") Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/1637090770-22835-1-git-send-email-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16net: mvmdio: fix compilation warningMarcin Wojtas
The kernel test robot reported a following issue: >> drivers/net/ethernet/marvell/mvmdio.c:426:36: warning: unused variable 'orion_mdio_acpi_match' [-Wunused-const-variable] static const struct acpi_device_id orion_mdio_acpi_match[] = { ^ 1 warning generated. Fix that by surrounding the variable by appropriate ifdef. Fixes: c54da4c1acb1 ("net: mvmdio: add ACPI support") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Marcin Wojtas <mw@semihalf.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20211115153024.209083-1-mw@semihalf.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16scsi: qla2xxx: Fix mailbox direction flags in qla2xxx_get_adapter_id()Ewan D. Milne
The SCM changes set the flags in mcp->out_mb instead of mcp->in_mb so the data was not actually being read into the mcp->mb[] array from the adapter. Link: https://lore.kernel.org/r/20211108183012.13895-1-emilne@redhat.com Fixes: 9f2475fe7406 ("scsi: qla2xxx: SAN congestion management implementation") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-16scsi: ufs: core: Fix another task management completion raceAdrian Hunter
hba->outstanding_tasks, which is read under host_lock spinlock, tells the interrupt handler what task management tags are in use by the driver. The doorbell register bits indicate which tags are in use by the hardware. A doorbell bit that is 0 is because the bit has yet to be set by the driver, or because the task is complete. It is only possible to disambiguate the 2 cases, if reading/writing the doorbell register is synchronized with reading/writing hba->outstanding_tasks. For that reason, reading REG_UTP_TASK_REQ_DOOR_BELL must be done under spinlock. Link: https://lore.kernel.org/r/20211108064815.569494-3-adrian.hunter@intel.com Fixes: f5ef336fd2e4 ("scsi: ufs: core: Fix task management completion") Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-16scsi: ufs: core: Fix task management completion timeout raceAdrian Hunter
__ufshcd_issue_tm_cmd() clears req->end_io_data after timing out, which races with the completion function ufshcd_tmc_handler() which expects req->end_io_data to have a value. Note __ufshcd_issue_tm_cmd() and ufshcd_tmc_handler() are already synchronized using hba->tmf_rqs and hba->outstanding_tasks under the host_lock spinlock. It is also not necessary (nor typical) to clear req->end_io_data because the block layer does it before allocating out requests e.g. via blk_get_request(). So fix by not clearing it. Link: https://lore.kernel.org/r/20211108064815.569494-2-adrian.hunter@intel.com Fixes: f5ef336fd2e4 ("scsi: ufs: core: Fix task management completion") Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-16scsi: core: sysfs: Fix hang when device state is set via sysfsMike Christie
This fixes a regression added with: commit f0f82e2476f6 ("scsi: core: Fix capacity set to zero after offlinining device") The problem is that after iSCSI recovery, iscsid will call into the kernel to set the dev's state to running, and with that patch we now call scsi_rescan_device() with the state_mutex held. If the SCSI error handler thread is just starting to test the device in scsi_send_eh_cmnd() then it's going to try to grab the state_mutex. We are then stuck, because when scsi_rescan_device() tries to send its I/O scsi_queue_rq() calls -> scsi_host_queue_ready() -> scsi_host_in_recovery() which will return true (the host state is still in recovery) and I/O will just be requeued. scsi_send_eh_cmnd() will then never be able to grab the state_mutex to finish error handling. To prevent the deadlock move the rescan-related code to after we drop the state_mutex. This also adds a check for if we are already in the running state. This prevents extra scans and helps the iscsid case where if the transport class has already onlined the device during its recovery process then we don't need userspace to do it again plus possibly block that daemon. Link: https://lore.kernel.org/r/20211105221048.6541-3-michael.christie@oracle.com Fixes: f0f82e2476f6 ("scsi: core: Fix capacity set to zero after offlinining device") Cc: Bart Van Assche <bvanassche@acm.org> Cc: lijinlin <lijinlin3@huawei.com> Cc: Wu Bo <wubo40@huawei.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Wu Bo <wubo40@huawei.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-16scsi: iscsi: Unblock session then wake up error handlerMike Christie
We can race where iscsi_session_recovery_timedout() has woken up the error handler thread and it's now setting the devices to offline, and session_recovery_timedout()'s call to scsi_target_unblock() is also trying to set the device's state to transport-offline. We can then get a mix of states. For the case where we can't relogin we want the devices to be in transport-offline so when we have repaired the connection __iscsi_unblock_session() can set the state back to running. Set the device state then call into libiscsi to wake up the error handler. Link: https://lore.kernel.org/r/20211105221048.6541-2-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-16scsi: ufs: core: Improve SCSI abort handlingBart Van Assche
The following has been observed on a test setup: WARNING: CPU: 4 PID: 250 at drivers/scsi/ufs/ufshcd.c:2737 ufshcd_queuecommand+0x468/0x65c Call trace: ufshcd_queuecommand+0x468/0x65c scsi_send_eh_cmnd+0x224/0x6a0 scsi_eh_test_devices+0x248/0x418 scsi_eh_ready_devs+0xc34/0xe58 scsi_error_handler+0x204/0x80c kthread+0x150/0x1b4 ret_from_fork+0x10/0x30 That warning is triggered by the following statement: WARN_ON(lrbp->cmd); Fix this warning by clearing lrbp->cmd from the abort handler. Link: https://lore.kernel.org/r/20211104181059.4129537-1-bvanassche@acm.org Fixes: 7a3e97b0dc4b ("[SCSI] ufshcd: UFS Host controller driver") Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-17ata: libata: improve ata_read_log_page() error messageDamien Le Moal
If ata_read_log_page() fails to read a log page, the ata_dev_err() error message only print the page number, omitting the log number. In case of error, facilitate debugging by also printing the log number. Cc: stable@kernel.org # 5.15 Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Tested-by: Matthew Perkowski <mgperkow@gmail.com>
2021-11-16xen: don't continue xenstore initialization in case of errorsStefano Stabellini
In case of errors in xenbus_init (e.g. missing xen_store_gfn parameter), we goto out_error but we forget to reset xen_store_domain_type to XS_UNKNOWN. As a consequence xenbus_probe_initcall and other initcalls will still try to initialize xenstore resulting into a crash at boot. [ 2.479830] Call trace: [ 2.482314] xb_init_comms+0x18/0x150 [ 2.486354] xs_init+0x34/0x138 [ 2.489786] xenbus_probe+0x4c/0x70 [ 2.498432] xenbus_probe_initcall+0x2c/0x7c [ 2.503944] do_one_initcall+0x54/0x1b8 [ 2.507358] kernel_init_freeable+0x1ac/0x210 [ 2.511617] kernel_init+0x28/0x130 [ 2.516112] ret_from_fork+0x10/0x20 Cc: <Stable@vger.kernel.org> Cc: jbeulich@suse.com Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> Link: https://lore.kernel.org/r/20211115222719.2558207-1-sstabellini@kernel.org Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-11-16xen/privcmd: make option visible in KconfigJuergen Gross
This configuration option provides a misc device as an API to userspace. Make this API usable without having to select the module as a transitive dependency. This also fixes an issue where localyesconfig would select CONFIG_XEN_PRIVCMD=m because it was not visible and defaulted to building as module. [boris: clarified help message per Jan's suggestion] Based-on-patch-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20211116143323.18866-1-jgross@suse.com Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-11-16net/mlx5: E-Switch, return error if encap isn't supportedRaed Salem
On regular ConnectX HCAs getting encap mode isn't supported when the E-Switch is in NONE mode. Current code would return no error code when trying to get encap mode in such case which is wrong. Fix by returning error value to indicate failure to caller in such case. Fixes: 8e0aa4bc959c ("net/mlx5: E-switch, Protect eswitch mode changes") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16net/mlx5: Lag, update tracker when state change event receivedMaher Sanalla
Currently, In NETDEV_CHANGELOWERSTATE/NETDEV_CHANGEUPPERSTATE events handling, tracking is not fully completed if the LAG device is not ready at the time the events occur. But, we must keep track of the upper and lower states after receiving the events because RoCE needs this info in mlx5_lag_get_roce_netdev() - in order to return the corresponding port that its running on. Returning the wrong (not most recent) port will lead to gids table being incorrect. For example: If during the attachment of a slave to the bond, the other non-attached port performs pci_reload, then the LAG device is not ready, but that should not result in dismissing attached slave tracker update automatically (which is performed in mlx5_handle_changelowerstate()), Since these events might not come later, which can lead to both bond ports having tx_enabled=0 - which is not a valid state of LAG bond. Fixes: 9b412cc35f00 ("net/mlx5e: Add LAG warning if bond slave is not lag master") Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16net/mlx5e: CT, Fix multiple allocations and memleak of mod actsRoi Dayan
CT clear action offload adds additional mod hdr actions to the flow's original mod actions in order to clear the registers which hold ct_state. When such flow also includes encap action, a neigh update event can cause the driver to unoffload the flow and then reoffload it. Each time this happens, the ct clear handling adds that same set of mod hdr actions to reset ct_state until the max of mod hdr actions is reached. Also the driver never releases the allocated mod hdr actions and causing a memleak. Fix above two issues by moving CT clear mod acts allocation into the parsing actions phase and only use it when offloading the rule. The release of mod acts will be done in the normal flow_put(). backtrace: [<000000007316e2f3>] krealloc+0x83/0xd0 [<00000000ef157de1>] mlx5e_mod_hdr_alloc+0x147/0x300 [mlx5_core] [<00000000970ce4ae>] mlx5e_tc_match_to_reg_set_and_get_id+0xd7/0x240 [mlx5_core] [<0000000067c5fa17>] mlx5e_tc_match_to_reg_set+0xa/0x20 [mlx5_core] [<00000000d032eb98>] mlx5_tc_ct_entry_set_registers.isra.0+0x36/0xc0 [mlx5_core] [<00000000fd23b869>] mlx5_tc_ct_flow_offload+0x272/0x1f10 [mlx5_core] [<000000004fc24acc>] mlx5e_tc_offload_fdb_rules.part.0+0x150/0x620 [mlx5_core] [<00000000dc741c17>] mlx5e_tc_encap_flows_add+0x489/0x690 [mlx5_core] [<00000000e92e49d7>] mlx5e_rep_update_flows+0x6e4/0x9b0 [mlx5_core] [<00000000f60f5602>] mlx5e_rep_neigh_update+0x39a/0x5d0 [mlx5_core] Fixes: 1ef3018f5af3 ("net/mlx5e: CT: Support clear action") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16net/mlx5: Fix flow counters SF bulk query lenAvihai Horon
When doing a flow counters bulk query, the number of counters to query must be aligned to 4. Current SF bulk query len is not aligned to 4, which leads to an error when trying to query more than 4 counters. Fix it by aligning SF bulk query len to 4. Fixes: 2fdeb4f4c2ae ("net/mlx5: Reduce flow counters bulk query buffer size for SFs") Signed-off-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16net/mlx5: E-Switch, rebuild lag only when neededMark Bloch
A user can enable VFs without changing E-Switch mode, this can happen when a user moves straight to switchdev mode and only once in switchdev VFs are enabled via the sysfs interface. The cited commit assumed this isn't possible and exposed a single API function where the E-switch calls into the lag code, breaks the lag and prevents any other lag operations to take place until the E-switch update has ended. Breaking the hardware lag when it isn't needed can make it such that hardware lag can't be enabled again. In the sysfs call path check if the current E-Switch mode is NONE, in the context of the function it can only mean the E-Switch is moving out of NONE mode and the hardware lag should be disabled and enabled once the mode change has ended. If the mode isn't NONE it means VFs are about to be enabled and such operation doesn't require toggling the hardware lag. Fixes: cac1eb2cf2e3 ("net/mlx5: Lag, properly lock eswitch if needed") Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16net/mlx5: Update error handler for UCTX and UMEMNeta Ostrovsky
In the fast unload flow, the device state is set to internal error, which indicates that the driver started the destroy process. In this case, when a destroy command is being executed, it should return MLX5_CMD_STAT_OK. Fix MLX5_CMD_OP_DESTROY_UCTX and MLX5_CMD_OP_DESTROY_UMEM to return OK instead of EIO. This fixes a call trace in the umem release process - [ 2633.536695] Call Trace: [ 2633.537518] ib_uverbs_remove_one+0xc3/0x140 [ib_uverbs] [ 2633.538596] remove_client_context+0x8b/0xd0 [ib_core] [ 2633.539641] disable_device+0x8c/0x130 [ib_core] [ 2633.540615] __ib_unregister_device+0x35/0xa0 [ib_core] [ 2633.541640] ib_unregister_device+0x21/0x30 [ib_core] [ 2633.542663] __mlx5_ib_remove+0x38/0x90 [mlx5_ib] [ 2633.543640] auxiliary_bus_remove+0x1e/0x30 [auxiliary] [ 2633.544661] device_release_driver_internal+0x103/0x1f0 [ 2633.545679] bus_remove_device+0xf7/0x170 [ 2633.546640] device_del+0x181/0x410 [ 2633.547606] mlx5_rescan_drivers_locked.part.10+0x63/0x160 [mlx5_core] [ 2633.548777] mlx5_unregister_device+0x27/0x40 [mlx5_core] [ 2633.549841] mlx5_uninit_one+0x21/0xc0 [mlx5_core] [ 2633.550864] remove_one+0x69/0xe0 [mlx5_core] [ 2633.551819] pci_device_remove+0x3b/0xc0 [ 2633.552731] device_release_driver_internal+0x103/0x1f0 [ 2633.553746] unbind_store+0xf6/0x130 [ 2633.554657] kernfs_fop_write+0x116/0x190 [ 2633.555567] vfs_write+0xa5/0x1a0 [ 2633.556407] ksys_write+0x4f/0xb0 [ 2633.557233] do_syscall_64+0x5b/0x1a0 [ 2633.558071] entry_SYSCALL_64_after_hwframe+0x65/0xca [ 2633.559018] RIP: 0033:0x7f9977132648 [ 2633.559821] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 55 6f 2d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55 [ 2633.562332] RSP: 002b:00007fffb1a83888 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 2633.563472] RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f9977132648 [ 2633.564541] RDX: 000000000000000c RSI: 000055b90546e230 RDI: 0000000000000001 [ 2633.565596] RBP: 000055b90546e230 R08: 00007f9977406860 R09: 00007f9977a54740 [ 2633.566653] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f99774056e0 [ 2633.567692] R13: 000000000000000c R14: 00007f9977400880 R15: 000000000000000c [ 2633.568725] ---[ end trace 10b4fe52945e544d ]--- Fixes: 6a6fabbfa3e8 ("net/mlx5: Update pci error handler entries and command translation") Signed-off-by: Neta Ostrovsky <netao@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16net/mlx5: DR, Fix check for unsupported fields in match paramYevgeny Kliteynik
The existing loop doesn't cast the buffer while scanning it, which results in out-of-bounds read and failure to create the matcher. Fixes: 941f19798a11 ("net/mlx5: DR, Add check for unsupported fields in match param") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16net/mlx5: DR, Handle eswitch manager and uplink vports separatelyYevgeny Kliteynik
When querying eswitch manager vport capabilities as "other = 1", we encounter a FW compatibility issue with older FW versions. To maintain backward compatibility, eswitch manager vport should be queried as "other = 0" vport both for ECPF and non-ECPF cases. This patch fixes these queries and improves the code readability by handling eswitch manager and uplink vports separately, avoiding the excessive 'if' conditions. Also, uplink caps are stored similar to esw manager and not as part of xarray. Fixes: dd4acb2a0954 ("net/mlx5: DR, Add missing query for vport 0") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16net/mlx5e: nullify cq->dbg pointer in mlx5_debug_cq_remove()Valentine Fatiev
Prior to this patch in case mlx5_core_destroy_cq() failed it proceeds to rest of destroy operations. mlx5_core_destroy_cq() could be called again by user and cause additional call of mlx5_debug_cq_remove(). cq->dbg was not nullify in previous call and cause the crash. Fix it by nullify cq->dbg pointer after removal. Also proceed to destroy operations only if FW return 0 for MLX5_CMD_OP_DESTROY_CQ command. general protection fault, probably for non-canonical address 0x2000300004058: 0000 [#1] SMP PTI CPU: 5 PID: 1228 Comm: python Not tainted 5.15.0-rc5_for_upstream_min_debug_2021_10_14_11_06 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:lockref_get+0x1/0x60 Code: 5d e9 53 ff ff ff 48 8d 7f 70 e8 0a 2e 48 00 c7 85 d0 00 00 00 02 00 00 00 c6 45 70 00 fb 5d c3 c3 cc cc cc cc cc cc cc cc 53 <48> 8b 17 48 89 fb 85 d2 75 3d 48 89 d0 bf 64 00 00 00 48 89 c1 48 RSP: 0018:ffff888137dd7a38 EFLAGS: 00010206 RAX: 0000000000000000 RBX: ffff888107d5f458 RCX: 00000000fffffffe RDX: 000000000002c2b0 RSI: ffffffff8155e2e0 RDI: 0002000300004058 RBP: ffff888137dd7a88 R08: 0002000300004058 R09: ffff8881144a9f88 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8881141d4000 R13: ffff888137dd7c68 R14: ffff888137dd7d58 R15: ffff888137dd7cc0 FS: 00007f4644f2a4c0(0000) GS:ffff8887a2d40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055b4500f4380 CR3: 0000000114f7a003 CR4: 0000000000170ea0 Call Trace: simple_recursive_removal+0x33/0x2e0 ? debugfs_remove+0x60/0x60 debugfs_remove+0x40/0x60 mlx5_debug_cq_remove+0x32/0x70 [mlx5_core] mlx5_core_destroy_cq+0x41/0x1d0 [mlx5_core] devx_obj_cleanup+0x151/0x330 [mlx5_ib] ? __pollwait+0xd0/0xd0 ? xas_load+0x5/0x70 ? xa_load+0x62/0xa0 destroy_hw_idr_uobject+0x20/0x80 [ib_uverbs] uverbs_destroy_uobject+0x3b/0x360 [ib_uverbs] uobj_destroy+0x54/0xa0 [ib_uverbs] ib_uverbs_cmd_verbs+0xaf2/0x1160 [ib_uverbs] ? uverbs_finalize_object+0xd0/0xd0 [ib_uverbs] ib_uverbs_ioctl+0xc4/0x1b0 [ib_uverbs] __x64_sys_ioctl+0x3e4/0x8e0 Fixes: 94b960b9deff ("net/mlx5e: Fix memory leak in mlx5_core_destroy_cq() error path") Signed-off-by: Valentine Fatiev <valentinef@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16net/mlx5: E-Switch, Fix resetting of encap mode when entering switchdevPaul Blakey
E-Switch encap mode is relevant only when in switchdev mode. The RDMA driver can query the encap configuration via mlx5_eswitch_get_encap_mode(). Make sure it returns the currently used mode and not the set one. This reverts the cited commit which reset the encap mode on entering switchdev and fixes the original issue properly. Fixes: 9a64144d683a ("net/mlx5: E-Switch, Fix default encap mode") Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16net/mlx5e: Wait for concurrent flow deletion during neigh/fib eventsVlad Buslov
Function mlx5e_take_tmp_flow() skips flows with zero reference count. This can cause syndrome 0x179e84 when the called from neigh or route update code and the skipped flow is not removed from the hardware by the time underlying encap/decap resource is deleted. Add new completion 'del_hw_done' that is completed when flow is unoffloaded. This is safe to do because flow with reference count zero needs to be detached from encap/decap entry before its memory is deallocated, which requires taking the encap_tbl_lock mutex that is held by the event handlers code. Fixes: 8914add2c9e5 ("net/mlx5e: Handle FIB events to update tunnel endpoint device") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16net/mlx5e: kTLS, Fix crash in RX resync flowTariq Toukan
For the TLS RX resync flow, we maintain a list of TLS contexts that require some attention, to communicate their resync information to the HW. Here we fix list corruptions, by protecting the entries against movements coming from resync_handle_seq_match(), until their resync handling in napi is fully completed. Fixes: e9ce991bce5b ("net/mlx5e: kTLS, Add resiliency to RX resync failures") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16thermal: core: Reset previous low and high trip during thermal zone initManaf Meethalavalappu Pallikunhi
During the suspend is in process, thermal_zone_device_update bails out thermal zone re-evaluation for any sensor trip violation without setting next valid trip to that sensor. It assumes during resume it will re-evaluate same thermal zone and update trip. But when it is in suspend temperature goes down and on resume path while updating thermal zone if temperature is less than previously violated trip, thermal zone set trip function evaluates the same previous high and previous low trip as new high and low trip. Since there is no change in high/low trip, it bails out from thermal zone set trip API without setting any trip. It leads to a case where sensor high trip or low trip is disabled forever even though thermal zone has a valid high or low trip. During thermal zone device init, reset thermal zone previous high and low trip. It resolves above mentioned scenario. Signed-off-by: Manaf Meethalavalappu Pallikunhi <manafm@codeaurora.org> Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-16thermal: int340x: Limit Kconfig to 64-bitArnd Bergmann
32-bit processors cannot generally access 64-bit MMIO registers atomically, and it is unknown in which order the two halves of this registers would need to be read: drivers/thermal/intel/int340x_thermal/processor_thermal_mbox.c: In function 'send_mbox_cmd': drivers/thermal/intel/int340x_thermal/processor_thermal_mbox.c:79:37: error: implicit declaration of function 'readq'; did you mean 'readl'? [-Werror=implicit-function-declaration] 79 | *cmd_resp = readq((void __iomem *) (proc_priv->mmio_base + MBOX_OFFSET_DATA)); | ^~~~~ | readl The driver already does not build for anything other than x86, so limit it further to x86-64. Fixes: aeb58c860dc5 ("thermal/drivers/int340x: processor_thermal: Suppot 64 bit RFIM responses") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-16IB/hfi1: Properly allocate rdma counter desc memoryDennis Dalessandro
When optional counter support was added the allocation of the memory holding the counter descriptors was not cleared properly. This caused WARN_ON()s in the IB/sysfs code to be hit. This is because the uninitialized memory made some of the counters wrongly look like optional counters. Use kzalloc. While here change the sizeof() calls to use the pointer rather than the name of the type. WARNING: CPU: 0 PID: 32644 at drivers/infiniband/core/sysfs.c:1064 ib_setup_port_attrs+0x7e1/0x890 [ib_core] CPU: 0 PID: 32644 Comm: kworker/0:2 Tainted: G S W 5.15.0+ #36 Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.0018.C4.072020161249 07/20/2016 Workqueue: events work_for_cpu_fn RIP: 0010:ib_setup_port_attrs+0x7e1/0x890 [ib_core] RSP: 0018:ffffc90006ea3c40 EFLAGS: 00010202 RAX: 0000000000000068 RBX: ffff888106ad8000 RCX: 0000000000000138 RDX: ffff888126c84c00 RSI: ffff888103c41000 RDI: 0000000000000124 RBP: ffff88810f63a801 R08: ffff888126c8a000 R09: 0000000000000001 R10: ffffffffa09acf20 R11: 0000000000000065 R12: ffff88810f63a800 R13: ffff88810f63a800 R14: ffff88810f63a8e0 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff888667a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005590102cb078 CR3: 000000000240a003 CR4: 00000000001706f0 Call Trace: ib_register_device.cold.44+0x23e/0x2d0 [ib_core] rvt_register_device+0xfa/0x230 [rdmavt] hfi1_register_ib_device+0x623/0x690 [hfi1] init_one.cold.36+0x2d1/0x49b [hfi1] local_pci_probe+0x45/0x80 work_for_cpu_fn+0x16/0x20 process_one_work+0x1b1/0x360 worker_thread+0x1d4/0x3a0 kthread+0x11a/0x140 ret_from_fork+0x22/0x30 Fixes: 5e2ddd1e5982 ("RDMA/counter: Add optional counter support") Link: https://lore.kernel.org/r/20211115200913.124104.47770.stgit@awfm-01.cornelisnetworks.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>