summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2024-12-18zram: fix uninitialized ZRAM not releasing backing deviceKairui Song
Setting backing device is done before ZRAM initialization. If we set the backing device, then remove the ZRAM module without initializing the device, the backing device reference will be leaked and the device will be hold forever. Fix this by always reset the ZRAM fully on rmmod or reset store. Link: https://lkml.kernel.org/r/20241209165717.94215-3-ryncsn@gmail.com Fixes: 013bf95a83ec ("zram: add interface to specif backing device") Signed-off-by: Kairui Song <kasong@tencent.com> Reported-by: Desheng Wu <deshengwu@tencent.com> Suggested-by: Sergey Senozhatsky <senozhatsky@chromium.org> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-18zram: refuse to use zero sized block device as backing deviceKairui Song
Patch series "zram: fix backing device setup issue", v2. This series fixes two bugs of backing device setting: - ZRAM should reject using a zero sized (or the uninitialized ZRAM device itself) as the backing device. - Fix backing device leaking when removing a uninitialized ZRAM device. This patch (of 2): Setting a zero sized block device as backing device is pointless, and one can easily create a recursive loop by setting the uninitialized ZRAM device itself as its own backing device by (zram0 is uninitialized): echo /dev/zram0 > /sys/block/zram0/backing_dev It's definitely a wrong config, and the module will pin itself, kernel should refuse doing so in the first place. By refusing to use zero sized device we avoided misuse cases including this one above. Link: https://lkml.kernel.org/r/20241209165717.94215-1-ryncsn@gmail.com Link: https://lkml.kernel.org/r/20241209165717.94215-2-ryncsn@gmail.com Fixes: 013bf95a83ec ("zram: add interface to specif backing device") Signed-off-by: Kairui Song <kasong@tencent.com> Reported-by: Desheng Wu <deshengwu@tencent.com> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-18Merge tag 'linux-can-fixes-for-6.13-20241218' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2024-12-18 There are 2 patches by Matthias Schiffer for the m_can_pci driver that handles the m_can cores found on the Intel Elkhart Lake processor. They fix the initialization and the interrupt handling under high CAN bus load. * tag 'linux-can-fixes-for-6.13-20241218' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: m_can: fix missed interrupts with m_can_pci can: m_can: set init flag earlier in probe ==================== Link: https://patch.msgid.link/20241218121722.2311963-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18net: usb: qmi_wwan: add Quectel RG255CMartin Hou
Add support for Quectel RG255C which is based on Qualcomm SDX35 chip. The composition is DM / NMEA / AT / QMI. T: Bus=01 Lev=01 Prnt=01 Port=04 Cnt=01 Dev#= 2 Spd=480 MxCh= 0 D: Ver= 2.01 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=2c7c ProdID=0316 Rev= 5.15 S: Manufacturer=Quectel S: Product=RG255C-CN S: SerialNumber=c68192c1 C:* #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=86(I) Atr=03(Int.) MxPS= 8 Ivl=32ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms Signed-off-by: Martin Hou <martin.hou@foxmail.com> Link: https://patch.msgid.link/tencent_17DDD787B48E8A5AB8379ED69E23A0CD9309@qq.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18net: phy: avoid undefined behavior in *_led_polarity_set()Arnd Bergmann
gcc runs into undefined behavior at the end of the three led_polarity_set() callback functions if it were called with a zero 'modes' argument and it just ends the function there without returning from it. This gets flagged by 'objtool' as a function that continues on to the next one: drivers/net/phy/aquantia/aquantia_leds.o: warning: objtool: aqr_phy_led_polarity_set+0xf: can't find jump dest instruction at .text+0x5d9 drivers/net/phy/intel-xway.o: warning: objtool: xway_gphy_led_polarity_set() falls through to next function xway_gphy_config_init() drivers/net/phy/mxl-gpy.o: warning: objtool: gpy_led_polarity_set() falls through to next function gpy_led_hw_control_get() There is no point to micro-optimize the behavior here to save a single-digit number of bytes in the kernel, so just change this to a "return -EINVAL" as we do when any unexpected bits are set. Fixes: 1758af47b98c ("net: phy: intel-xway: add support for PHY LEDs") Fixes: 9d55e68b19f2 ("net: phy: aquantia: correctly describe LED polarity override") Fixes: eb89c79c1b8f ("net: phy: mxl-gpy: correctly describe LED polarity") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241217081056.238792-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-19power: supply: bq24190: Fix BQ24296 Vbus regulator supportHans de Goede
There are 2 issues with bq24296_set_otg_vbus(): 1. When writing the OTG_CONFIG bit it uses POC_CHG_CONFIG_SHIFT which should be POC_OTG_CONFIG_SHIFT. 2. When turning the regulator off it never turns charging back on. Note this must be done through bq24190_charger_set_charge_type(), to ensure that the charge_type property value of none/trickle/fast is honored. Resolve both issues to fix BQ24296 Vbus regulator support not working. Fixes: b150a703b56f ("power: supply: bq24190_charger: Add support for BQ24296") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20241116203648.169100-2-hdegoede@redhat.com Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-12-18Merge tag 'cxl-fixes-6.13-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl fixes from Ira Weiny: - prevent probe failure when non-critical RAS unmasking fails - fix CXL 1.1 link status sysfs attribute - fix 4 way (and greater) switch interleave region creation * tag 'cxl-fixes-6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/region: Fix region creation for greater than x2 switches cxl/pci: Check dport->regs.rcd_pcie_cap availability before accessing cxl/pci: Fix potential bogus return value upon successful probing
2024-12-18drm/amdgpu/nbio7.0: fix IP version checkAlex Deucher
Use the helper function rather than reading it directly. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 0ec43fbece784215d3c4469973e4556d70bce915) Cc: stable@vger.kernel.org
2024-12-18drm/amd: Update strapping for NBIO 2.5.0Mario Limonciello
This helps to avoid a spurious PME event on hotplug to Azalia. Cc: Vijendar Mukunda <Vijendar.Mukunda@amd.com> Reported-and-tested-by: ionut_n2001@yahoo.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=215884 Tested-by: Gabriel Marcano <gabemarcano@yahoo.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20241211024414.7840-1-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 3f6f237b9dd189e1fb85b8a3f7c97a8f27c1e49a) Cc: stable@vger.kernel.org
2024-12-18ACPI: EC: Enable EC support on LoongArch by defaultHuacai Chen
Commit a6021aa24f6417416d933 ("ACPI: EC: make EC support compile-time conditional") only enable ACPI_EC on X86 by default, but the embedded controller is also widely used on LoongArch laptops so we also enable ACPI_EC for LoongArch. The laptop driver cannot work without EC, so also update the dependency of LOONGSON_LAPTOP to let it depend on APCI_EC. Fixes: a6021aa24f6417416d933 ("ACPI: EC: make EC support compile-time conditional") Reported-by: Xiaotian Wu <wuxiaotian@loongson.cn> Tested-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Link: https://patch.msgid.link/20241217073704.3339587-1-chenhuacai@loongson.cn [ rjw: Added Fixes: ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-12-18drm/amdgpu: Handle NULL bo->tbo.resource (again) in amdgpu_vm_bo_updateMichel Dänzer
Third time's the charm, I hope? Fixes: d3116756a710 ("drm/ttm: rename bo->mem and make it a pointer") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3837 Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 695c2c745e5dff201b75da8a1d237ce403600d04) Cc: stable@vger.kernel.org
2024-12-18drm/amdgpu: fix amdgpu_coredumpChristian König
The VM pointer might already be outdated when that function is called. Use the PASID instead to gather the information instead. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 57f812d171af4ba233d3ed7c94dfa5b8e92dcc04) Cc: stable@vger.kernel.org
2024-12-18drm/amdgpu/smu14.0.2: fix IP version checkAlex Deucher
Use the helper function rather than reading it directly. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 8f2cd1067afe68372a1723e05e19b68ed187676a) Cc: stable@vger.kernel.org
2024-12-18drm/amdgpu/gfx12: fix IP version checkAlex Deucher
Use the helper function rather than reading it directly. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f1fd1d0f40272948aa6ab82a3a82ecbbc76dff53) Cc: stable@vger.kernel.org
2024-12-18drm/amdgpu/mmhub4.1: fix IP version checkAlex Deucher
Use the helper function rather than reading it directly. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 63bfd24088b42c6f55c2096bfc41b50213d419b2) Cc: stable@vger.kernel.org
2024-12-18drm/amdgpu/nbio7.11: fix IP version checkAlex Deucher
Use the helper function rather than reading it directly. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 2c8eeaaa0fe5841ccf07a0eb51b1426f34ef39f7) Cc: stable@vger.kernel.org
2024-12-18drm/amdgpu/nbio7.7: fix IP version checkAlex Deucher
Use the helper function rather than reading it directly. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 22b9555bc90df22b585bdd1f161b61584b13af51) Cc: stable@vger.kernel.org
2024-12-18wifi: cw1200: Fix potential NULL dereferenceLinus Walleij
A recent refactoring was identified by smatch to cause another potential NULL dereference: drivers/net/wireless/st/cw1200/cw1200_spi.c:440 cw1200_spi_disconnect() error: we previously assumed 'self' could be null (see line 433) Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/202411271742.Xa7CNVh1-lkp@intel.com/ Fixes: 2719a9e7156c ("wifi: cw1200: Convert to GPIO descriptors") Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20241217-cw1200-fix-v1-1-911e6b5823ec@linaro.org
2024-12-18drm/amdgpu: don't access invalid schedPierre-Eric Pelloux-Prayer
Since 2320c9e6a768 ("drm/sched: memset() 'job' in drm_sched_job_init()") accessing job->base.sched can produce unexpected results as the initialisation of (*job)->base.sched done in amdgpu_job_alloc is overwritten by the memset. This commit fixes an issue when a CS would fail validation and would be rejected after job->num_ibs is incremented. In this case, amdgpu_ib_free(ring->adev, ...) will be called, which would crash the machine because the ring value is bogus. To fix this, pass a NULL pointer to amdgpu_ib_free(): we can do this because the device is actually not used in this function. The next commit will remove the ring argument completely. Fixes: 2320c9e6a768 ("drm/sched: memset() 'job' in drm_sched_job_init()") Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 2ae520cb12831d264ceb97c61f72c59d33c0dbd7)
2024-12-18drm/amd: Require CONFIG_HOTPLUG_PCI_PCIE for BOCOMario Limonciello
If the kernel hasn't been compiled with PCIe hotplug support this can lead to problems with dGPUs that use BOCO because they effectively drop off the bus. To prevent issues, disable BOCO support when compiled without PCIe hotplug. Reported-by: Gabriel Marcano <gabemarcano@yahoo.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1707#note_2696862 Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20241211155601.3585256-1-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1ad5bdc28bafa66db0f041cc6cdd278a80426aae)
2024-12-18Merge tag 'hyperv-fixes-signed-20241217' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Various fixes to Hyper-V tools in the kernel tree (Dexuan Cui, Olaf Hering, Vitaly Kuznetsov) - Fix a bug in the Hyper-V TSC page based sched_clock() (Naman Jain) - Two bug fixes in the Hyper-V utility functions (Michael Kelley) - Convert open-coded timeouts to secs_to_jiffies() in Hyper-V drivers (Easwar Hariharan) * tag 'hyperv-fixes-signed-20241217' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: tools/hv: reduce resource usage in hv_kvp_daemon tools/hv: add a .gitignore file tools/hv: reduce resouce usage in hv_get_dns_info helper hv/hv_kvp_daemon: Pass NIC name to hv_get_dns_info as well Drivers: hv: util: Avoid accessing a ringbuffer not initialized yet Drivers: hv: util: Don't force error code to ENODEV in util_probe() tools/hv: terminate fcopy daemon if read from uio fails drivers: hv: Convert open-coded timeouts to secs_to_jiffies() tools: hv: change permissions of NetworkManager configuration file x86/hyperv: Fix hv tsc page based sched_clock for hibernation tools: hv: Fix a complier warning in the fcopy uio daemon
2024-12-18wifi: iwlwifi: mvm: Fix __counted_by usage in cfg80211_wowlan_nd_*Kees Cook
Both struct cfg80211_wowlan_nd_match and struct cfg80211_wowlan_nd_info pre-allocate space for channels and matches, but then may end up using fewer that the full allocation. Shrink the associated counter (n_channels and n_matches) after counting the results. This avoids compile-time (and run-time) warnings from __counted_by. (The counter member needs to be updated _before_ accessing the array index.) Seen with coming GCC 15: drivers/net/wireless/intel/iwlwifi/mvm/d3.c: In function 'iwl_mvm_query_set_freqs': drivers/net/wireless/intel/iwlwifi/mvm/d3.c:2877:66: warning: operation on 'match->n_channels' may be undefined [-Wsequence-point] 2877 | match->channels[match->n_channels++] = | ~~~~~~~~~~~~~~~~~^~ drivers/net/wireless/intel/iwlwifi/mvm/d3.c:2885:66: warning: operation on 'match->n_channels' may be undefined [-Wsequence-point] 2885 | match->channels[match->n_channels++] = | ~~~~~~~~~~~~~~~~~^~ drivers/net/wireless/intel/iwlwifi/mvm/d3.c: In function 'iwl_mvm_query_netdetect_reasons': drivers/net/wireless/intel/iwlwifi/mvm/d3.c:2982:58: warning: operation on 'net_detect->n_matches' may be undefined [-Wsequence-point] 2982 | net_detect->matches[net_detect->n_matches++] = match; | ~~~~~~~~~~~~~~~~~~~~~^~ Cc: stable@vger.kernel.org Fixes: aa4ec06c455d ("wifi: cfg80211: use __counted_by where appropriate") Signed-off-by: Kees Cook <kees@kernel.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://patch.msgid.link/20240619211233.work.355-kees@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-12-18spi: rockchip-sfc: Fix error in remove progressJon Lin
Fix error in remove progress: [ 43.026148] Call trace: [ 43.026370] klist_next+0x1c/0x1d4 [ 43.026671] device_for_each_child+0x48/0xac [ 43.027049] spi_unregister_controller+0x30/0x130 [ 43.027469] rockchip_sfc_remove+0x48/0x80 [spi_rockchip_sfc] Signed-off-by: Jon Lin <jon.lin@rock-chips.com> Link: https://patch.msgid.link/20241218154741.901591-1-jon.lin@rock-chips.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-12-18Merge branch 'pci-device-recovery' into featuresAlexander Gordeev
Niklas Schnelle says: =================== This patch series enhances the introspectability of the PCI device recovery for firmware. Until now when Linux performs recovery in response to a firmware error report. For example, until now firmware debug data would have no indication if the recovery was successfull or if it failed, for example due to KVM pass-through. Improve on this by reporting recovery status as well as some debug information such as device driver name and s390dbf/pci_msg/sprintf logs via the SCLP Write Event Data Action Qualifier 2 (Log Data provided) mechanism. =================== Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-12-18Merge tag 'amd-pstate-v6.13-2024-12-11' of ↵Rafael J. Wysocki
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux Merge amd-pstate driver fixes for 6.13-rc4 from Mario Liminciello: "Fix a problem where systems without preferred cores were misdetecting preferred cores. Fix issues with with boost numerator handling leading to inconsistently programmed CPPC max performance values." * tag 'amd-pstate-v6.13-2024-12-11' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux: cpufreq/amd-pstate: Use boost numerator for upper bound of frequencies cpufreq/amd-pstate: Store the boost numerator as highest perf again cpufreq/amd-pstate: Detect preferred core support before driver registration
2024-12-18nvme: use blk_validate_block_size() for max LBA checkLuis Chamberlain
The block layer already has support to validates proper block sizes with blk_validate_block_size(), we can leverage that as well. No functional changes. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20241218020212.3657139-3-mcgrof@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-18x86/cpu: Make all all CPUID leaf names consistentDave Hansen
The leaf names are not consistent. Give them all a CPUID_LEAF_ prefix for consistency and vertical alignment. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Dave Jiang <dave.jiang@intel.com> # for ioatdma bits Link: https://lore.kernel.org/all/20241213205040.7B0C3241%40davehans-spike.ostc.intel.com
2024-12-18x86/tsc: Move away from TSC leaf magic numbersDave Hansen
The TSC code has a bunch of hard-coded references to leaf 0x15. Change them over to the symbolic name. Also zap the 'ART_CPUID_LEAF' definition. It was a duplicate of 'CPUID_TSC_LEAF'. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20241213205034.B79D6224%40davehans-spike.ostc.intel.com
2024-12-18x86/cpu: Refresh DCA leaf reading codeDave Hansen
The DCA leaf number is also hard-coded in the CPUID level dependency table. Move its definition to common code and use it. While at it, fix up the naming and types in the probe code. All CPUID data is provided in 32-bit registers, not 'unsigned long'. Also stop referring to "level_9". Move away from test_bit() because the type is no longer an 'unsigned long'. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/all/20241213205032.476A30FE%40davehans-spike.ostc.intel.com
2024-12-18x86/cpu: Remove unnecessary MwAIT leaf checksDave Hansen
The CPUID leaf dependency checker will remove X86_FEATURE_MWAIT if the CPUID level is below the required level (CPUID_MWAIT_LEAF). Thus, if you check X86_FEATURE_MWAIT you do not need to also check the CPUID level. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20241213205030.9B42B458%40davehans-spike.ostc.intel.com
2024-12-18x86/cpu: Move MWAIT leaf definition to common headerDave Hansen
Begin constructing a common place to keep all CPUID leaf definitions. Move CPUID_MWAIT_LEAF to the CPUID header and include it where needed. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/all/20241213205028.EE94D02A%40davehans-spike.ostc.intel.com
2024-12-18thermal/thresholds: Fix boundaries and detection routineDaniel Lezcano
The current implementation does not work if the thermal zone is interrupt driven only. The boundaries are not correctly checked and computed as it happens only when the temperature is increasing or decreasing. The problem arises because the routine to detect when we cross a threshold is correlated with the computation of the boundaries. We assume we have to recompute the boundaries when a threshold is crossed but actually we should do that even if the it is not the case. Mixing the boundaries computation and the threshold detection for the sake of optimizing the routine is much more complex as it appears intuitively and prone to errors. This fix separates the boundaries computation and the threshold crossing detection into different routines. The result is a code much more simple to understand, thus easier to maintain. The drawback is we browse the thresholds list several time but we can consider that as neglictible because that happens when the temperature is updated. There are certainly some aeras to improve in the temperature update routine but it would be not adequate as this change aims to fix the thresholds for v6.13. Fixes: 445936f9e258 ("thermal: core: Add user thresholds support") Tested-by: Daniel Lezcano <daniel.lezcano@linaro.org> # rock5b, Lenovo x13s Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/20241216212644.1145122-1-daniel.lezcano@linaro.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-12-18pwm: stm32: Fix complementary output in round_waveform_tohw()Fabrice Gasnier
When the timer supports complementary output, the CCxNE bit must be set additionally to the CCxE bit. So to not overwrite the latter use |= instead of = to set the former. Fixes: deaba9cff809 ("pwm: stm32: Implementation of the waveform callbacks") Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com> Link: https://lore.kernel.org/r/20241217150021.2030213-1-fabrice.gasnier@foss.st.com [ukleinek: Slightly improve commit log] Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
2024-12-18can: m_can: fix missed interrupts with m_can_pciMatthias Schiffer
The interrupt line of PCI devices is interpreted as edge-triggered, however the interrupt signal of the m_can controller integrated in Intel Elkhart Lake CPUs appears to be generated level-triggered. Consider the following sequence of events: - IR register is read, interrupt X is set - A new interrupt Y is triggered in the m_can controller - IR register is written to acknowledge interrupt X. Y remains set in IR As at no point in this sequence no interrupt flag is set in IR, the m_can interrupt line will never become deasserted, and no edge will ever be observed to trigger another run of the ISR. This was observed to result in the TX queue of the EHL m_can to get stuck under high load, because frames were queued to the hardware in m_can_start_xmit(), but m_can_finish_tx() was never run to account for their successful transmission. On an Elkhart Lake based board with the two CAN interfaces connected to each other, the following script can reproduce the issue: ip link set can0 up type can bitrate 1000000 ip link set can1 up type can bitrate 1000000 cangen can0 -g 2 -I 000 -L 8 & cangen can0 -g 2 -I 001 -L 8 & cangen can0 -g 2 -I 002 -L 8 & cangen can0 -g 2 -I 003 -L 8 & cangen can0 -g 2 -I 004 -L 8 & cangen can0 -g 2 -I 005 -L 8 & cangen can0 -g 2 -I 006 -L 8 & cangen can0 -g 2 -I 007 -L 8 & cangen can1 -g 2 -I 100 -L 8 & cangen can1 -g 2 -I 101 -L 8 & cangen can1 -g 2 -I 102 -L 8 & cangen can1 -g 2 -I 103 -L 8 & cangen can1 -g 2 -I 104 -L 8 & cangen can1 -g 2 -I 105 -L 8 & cangen can1 -g 2 -I 106 -L 8 & cangen can1 -g 2 -I 107 -L 8 & stress-ng --matrix 0 & To fix the issue, repeatedly read and acknowledge interrupts at the start of the ISR until no interrupt flags are set, so the next incoming interrupt will also result in an edge on the interrupt line. While we have received a report that even with this patch, the TX queue can become stuck under certain (currently unknown) circumstances on the Elkhart Lake, this patch completely fixes the issue with the above reproducer, and it is unclear whether the remaining issue has a similar cause at all. Fixes: cab7ffc0324f ("can: m_can: add PCI glue driver for Intel Elkhart Lake") Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com> Link: https://patch.msgid.link/fdf0439c51bcb3a46c21e9fb21c7f1d06363be84.1728288535.git.matthias.schiffer@ew.tq-group.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-12-18can: m_can: set init flag earlier in probeMatthias Schiffer
While an m_can controller usually already has the init flag from a hardware reset, no such reset happens on the integrated m_can_pci of the Intel Elkhart Lake. If the CAN controller is found in an active state, m_can_dev_setup() would fail because m_can_niso_supported() calls m_can_cccr_update_bits(), which refuses to modify any other configuration bits when CCCR_INIT is not set. To avoid this issue, set CCCR_INIT before attempting to modify any other configuration flags. Fixes: cd5a46ce6fa6 ("can: m_can: don't enable transceiver when probing") Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com> Link: https://patch.msgid.link/e247f331cb72829fcbdfda74f31a59cbad1a6006.1728288535.git.matthias.schiffer@ew.tq-group.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-12-18macintosh: declare ctl_table as constLuis Felipe Hernandez
Since commit 7abc9b53bd51 ("sysctl: allow registration of const struct ctl_table"), the sysctl registration API allows struct ctl_table variables to be placed into read-only memory. mac_hid_files is registered as a sysctl table and should be treated as read-only. By declaring the mac_hid_files structure as const, we ensure that it cannot be accidentally modified. This change improves safety. Suggested-by: Thomas Weißschuh <linux@weissschuh.net> Suggested-by: Ricardo B. Marliere <rbm@suse.com> Reviewed-by: Ricardo B. Marliere <rbm@suse.com> Signed-off-by: Luis Felipe Hernandez <luis.hernandez093@gmail.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20241217-constify_ctl_table-v1-1-402ebceaeb8e@gmail.com
2024-12-17net: netdevsim: fix nsim_pp_hold_write()Eric Dumazet
nsim_pp_hold_write() has two problems: 1) It may return with rtnl held, as found by syzbot. 2) Its return value does not propagate an error if any. Fixes: 1580cbcbfe77 ("net: netdevsim: add some fake page pool use") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241216083703.1859921-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17x86/cpu: Expose only stepping min/max interfaceDave Hansen
The x86_match_cpu() infrastructure can match CPU steppings. Since there are only 16 possible steppings, the matching infrastructure goes all out and stores the stepping match as a bitmap. That means it can match any possible steppings in a single list entry. Fun. But it exposes this bitmap to each of the X86_MATCH_*() helpers when none of them really need a bitmap. It makes up for this by exporting a helper (X86_STEPPINGS()) which converts a contiguous stepping range into the bitmap which every single user leverages. Instead of a bitmap, have the main helper for this sort of thing (X86_MATCH_VFM_STEPS()) just take a stepping range. This ends up actually being even more compact than before. Leave the helper in place (renamed to __X86_STEPPINGS()) to make it more clear what is going on instead of just having a random GENMASK() in the middle of an already complicated macro. One oddity that I hit was this macro: X86_MATCH_VFM_STEPS(vfm, X86_STEPPING_MIN, max_stepping, issues) It *could* have been converted over to take a min/max stepping value for each entry. But that would have been a bit too verbose and would prevent the one oddball in the list (INTEL_COMETLAKE_L stepping 0) from sticking out. Instead, just have it take a *maximum* stepping and imply that the match is from 0=>max_stepping. This is functional for all the cases now and also retains the nice property of having INTEL_COMETLAKE_L stepping 0 stick out like a sore thumb. skx_cpuids[] is goofy. It uses the stepping match but encodes all possible steppings. Just use a normal, non-stepping match helper. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20241213185129.65527B2A%40davehans-spike.ostc.intel.com
2024-12-17idpf: trigger SW interrupt when exiting wb_on_itr modeJoshua Hay
There is a race condition between exiting wb_on_itr and completion write backs. For example, we are in wb_on_itr mode and a Tx completion is generated by HW, ready to be written back, as we are re-enabling interrupts: HW SW | | | | idpf_tx_splitq_clean_all | | napi_complete_done | | | tx_completion_wb | idpf_vport_intr_update_itr_ena_irq That tx_completion_wb happens before the vector is fully re-enabled. Continuing with this example, it is a UDP stream and the tx_completion_wb is the last one in the flow (there are no rx packets). Because the HW generated the completion before the interrupt is fully enabled, the HW will not fire the interrupt once the timer expires and the write back will not happen. NAPI poll won't be called. We have indicated we're back in interrupt mode but nothing else will trigger the interrupt. Therefore, the completion goes unprocessed, triggering a Tx timeout. To mitigate this, fire a SW triggered interrupt upon exiting wb_on_itr. This interrupt will catch the rogue completion and avoid the timeout. Add logic to set the appropriate bits in the vector's dyn_ctl register. Fixes: 9c4a27da0ecc ("idpf: enable WB_ON_ITR") Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-17idpf: add support for SW triggered interruptsJoshua Hay
SW triggered interrupts are guaranteed to fire after their timer expires, unlike Tx and Rx interrupts which will only fire after the timer expires _and_ a descriptor write back is available to be processed by the driver. Add the necessary fields, defines, and initializations to enable a SW triggered interrupt in the vector's dyn_ctl register. Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-17clk: thead: Fix TH1520 emmc and shdci clock rateMaksim Kiselev
In accordance with LicheePi 4A BSP the clock that comes to emmc/sdhci is 198Mhz which is got through frequency division of source clock VIDEO PLL by 4 [1]. But now the AP_SUBSYS driver sets the CLK EMMC SDIO to the same frequency as the VIDEO PLL, equal to 792 MHz. This causes emmc/sdhci to work 4 times slower. Let's fix this issue by adding fixed factor clock that divides VIDEO PLL by 4 for emmc/sdhci. Link: https://github.com/revyos/thead-kernel/blob/7563179071a314f41cdcdbfd8cf6e101e73707f3/drivers/clk/thead/clk-light-fm.c#L454 Fixes: ae81b69fd2b1 ("clk: thead: Add support for T-Head TH1520 AP_SUBSYS clocks") Signed-off-by: Maksim Kiselev <bigunclemax@gmail.com> Link: https://lore.kernel.org/r/20241210083029.92620-1-bigunclemax@gmail.com Tested-by: Xi Ruoyao <xry111@xry111.site> Reviewed-by: Drew Fustini <dfustini@tenstorrent.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2024-12-17drm: rework FB_CORE dependencyArnd Bergmann
The 'select FB_CORE' statement moved from CONFIG_DRM to DRM_CLIENT_LIB, but there are now configurations that have code calling into fb_core as built-in even though the client_lib itself is a loadable module: x86_64-linux-ld: drivers/gpu/drm/drm_fb_helper.o: in function `drm_fb_helper_set_suspend': drm_fb_helper.c:(.text+0x2c6): undefined reference to `fb_set_suspend' x86_64-linux-ld: drivers/gpu/drm/drm_fb_helper.o: in function `drm_fb_helper_resume_worker': drm_fb_helper.c:(.text+0x2e1): undefined reference to `fb_set_suspend' In addition to DRM_CLIENT_LIB, the 'select' needs to be at least in DRM_KMS_HELPER and DRM_GEM_SHMEM_HELPER, so add it here. This patch is the KMS_HELPER part of [1]. Fixes: dadd28d4142f ("drm/client: Add client-lib module") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/series/141411/ # 1 Link: https://patchwork.freedesktop.org/patch/msgid/20241216074450.8590-4-tzimmermann@suse.de Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-12-17drm/fbdev: Select FB_CORE dependency for fbdev on DMA and TTMThomas Zimmermann
Select FB_CORE if GEM's DMA and TTM implementations support fbdev emulation. Fixes linker errors about missing symbols from the fbdev subsystem. Also see [1] for a related SHMEM fix. Fixes: dadd28d4142f ("drm/client: Add client-lib module") Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/series/141411/ # 1 Reviewed-by: Arnd Bergmann <arnd@arndb.de> Link: https://patchwork.freedesktop.org/patch/msgid/20241216074450.8590-3-tzimmermann@suse.de
2024-12-17fbdev: Fix recursive dependencies wrt BACKLIGHT_CLASS_DEVICEThomas Zimmermann
Do not select BACKLIGHT_CLASS_DEVICE from FB_BACKLIGHT. The latter only controls backlight support within fbdev core code and data structures. Make fbdev drivers depend on BACKLIGHT_CLASS_DEVICE and let users select it explicitly. Fixes warnings about recursive dependencies, such as error: recursive dependency detected! symbol BACKLIGHT_CLASS_DEVICE is selected by FB_BACKLIGHT symbol FB_BACKLIGHT is selected by FB_SH_MOBILE_LCDC symbol FB_SH_MOBILE_LCDC depends on FB_DEVICE symbol FB_DEVICE depends on FB_CORE symbol FB_CORE is selected by DRM_GEM_DMA_HELPER symbol DRM_GEM_DMA_HELPER is selected by DRM_PANEL_ILITEK_ILI9341 symbol DRM_PANEL_ILITEK_ILI9341 depends on BACKLIGHT_CLASS_DEVICE BACKLIGHT_CLASS_DEVICE is user-selectable, so making drivers adapt to it is the correct approach in any case. For most drivers, backlight support is also configurable separately. v3: - Select BACKLIGHT_CLASS_DEVICE in PowerMac defconfigs (Christophe) - Fix PMAC_BACKLIGHT module dependency corner cases (Christophe) v2: - s/BACKLIGHT_DEVICE_CLASS/BACKLIGHT_CLASS_DEVICE (Helge) - Fix fbdev driver-dependency corner case (Arnd) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Link: https://patchwork.freedesktop.org/patch/msgid/20241216074450.8590-2-tzimmermann@suse.de
2024-12-17Merge tag 'xsa465+xsa466-6.13-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "Fix xen netfront crash (XSA-465) and avoid using the hypercall page that doesn't do speculation mitigations (XSA-466)" * tag 'xsa465+xsa466-6.13-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: remove hypercall page x86/xen: use new hypercall functions instead of hypercall page x86/xen: add central hypercall functions x86/xen: don't do PV iret hypercall through hypercall page x86/static-call: provide a way to do very early static-call updates objtool/x86: allow syscall instruction x86: make get_cpu_vendor() accessible from Xen code xen/netfront: fix crash when removing device
2024-12-17platform/x86: mlx-platform: call pci_dev_put() to balance the refcountJoe Hattori
mlxplat_pci_fpga_device_init() calls pci_get_device() but does not release the refcount on error path. Call pci_dev_put() on the error path and in mlxplat_pci_fpga_device_exit() to fix this. This bug was found by an experimental static analysis tool that I am developing. Fixes: 02daa222fbdd ("platform: mellanox: Add initial support for PCIe based programming logic device") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Reviewed-by: Vadim Pasternak <vadimp@nvidia.com> Link: https://lore.kernel.org/r/20241216022538.381209-1-joe@pf.is.s.u-tokyo.ac.jp Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-12-17pinctrl: mcp23s08: Fix sleeping in atomic context due to regmap lockingEvgenii Shatokhin
If a device uses MCP23xxx IO expander to receive IRQs, the following bug can happen: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:283 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, ... preempt_count: 1, expected: 0 ... Call Trace: ... __might_resched+0x104/0x10e __might_sleep+0x3e/0x62 mutex_lock+0x20/0x4c regmap_lock_mutex+0x10/0x18 regmap_update_bits_base+0x2c/0x66 mcp23s08_irq_set_type+0x1ae/0x1d6 __irq_set_trigger+0x56/0x172 __setup_irq+0x1e6/0x646 request_threaded_irq+0xb6/0x160 ... We observed the problem while experimenting with a touchscreen driver which used MCP23017 IO expander (I2C). The regmap in the pinctrl-mcp23s08 driver uses a mutex for protection from concurrent accesses, which is the default for regmaps without .fast_io, .disable_locking, etc. mcp23s08_irq_set_type() calls regmap_update_bits_base(), and the latter locks the mutex. However, __setup_irq() locks desc->lock spinlock before calling these functions. As a result, the system tries to lock the mutex whole holding the spinlock. It seems, the internal regmap locks are not needed in this driver at all. mcp->lock seems to protect the regmap from concurrent accesses already, except, probably, in mcp_pinconf_get/set. mcp23s08_irq_set_type() and mcp23s08_irq_mask/unmask() are called under chip_bus_lock(), which calls mcp23s08_irq_bus_lock(). The latter takes mcp->lock and enables regmap caching, so that the potentially slow I2C accesses are deferred until chip_bus_unlock(). The accesses to the regmap from mcp23s08_probe_one() do not need additional locking. In all remaining places where the regmap is accessed, except mcp_pinconf_get/set(), the driver already takes mcp->lock. This patch adds locking in mcp_pinconf_get/set() and disables internal locking in the regmap config. Among other things, it fixes the sleeping in atomic context described above. Fixes: 8f38910ba4f6 ("pinctrl: mcp23s08: switch to regmap caching") Cc: stable@vger.kernel.org Signed-off-by: Evgenii Shatokhin <e.shatokhin@yadro.com> Link: https://lore.kernel.org/20241209074659.1442898-1-e.shatokhin@yadro.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2024-12-17qed: fix possible uninit pointer read in qed_mcp_nvm_info_populate()Gianfranco Trad
Coverity reports an uninit pointer read in qed_mcp_nvm_info_populate(). If EOPNOTSUPP is returned from qed_mcp_bist_nvm_get_num_images() ensure nvm_info.num_images is set to 0 to avoid possible uninit assignment to p_hwfn->nvm_info.image_att later on in out label. Closes: https://scan5.scan.coverity.com/#/project-view/63204/10063?selectedIssue=1636666 Suggested-by: Simon Horman <horms@kernel.org> Signed-off-by: Gianfranco Trad <gianf.trad@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241215011733.351325-2-gianf.trad@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-17net: ethernet: bgmac-platform: fix an OF node reference leakJoe Hattori
The OF node obtained by of_parse_phandle() is not freed. Call of_node_put() to balance the refcount. This bug was found by an experimental static analysis tool that I am developing. Fixes: 1676aba5ef7e ("net: ethernet: bgmac: device tree phy enablement") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241214014912.2810315-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-17net: ethernet: oa_tc6: fix tx skb race condition between reference pointersParthiban Veerasooran
There are two skb pointers to manage tx skb's enqueued from n/w stack. waiting_tx_skb pointer points to the tx skb which needs to be processed and ongoing_tx_skb pointer points to the tx skb which is being processed. SPI thread prepares the tx data chunks from the tx skb pointed by the ongoing_tx_skb pointer. When the tx skb pointed by the ongoing_tx_skb is processed, the tx skb pointed by the waiting_tx_skb is assigned to ongoing_tx_skb and the waiting_tx_skb pointer is assigned with NULL. Whenever there is a new tx skb from n/w stack, it will be assigned to waiting_tx_skb pointer if it is NULL. Enqueuing and processing of a tx skb handled in two different threads. Consider a scenario where the SPI thread processed an ongoing_tx_skb and it moves next tx skb from waiting_tx_skb pointer to ongoing_tx_skb pointer without doing any NULL check. At this time, if the waiting_tx_skb pointer is NULL then ongoing_tx_skb pointer is also assigned with NULL. After that, if a new tx skb is assigned to waiting_tx_skb pointer by the n/w stack and there is a chance to overwrite the tx skb pointer with NULL in the SPI thread. Finally one of the tx skb will be left as unhandled, resulting packet missing and memory leak. - Consider the below scenario where the TXC reported from the previous transfer is 10 and ongoing_tx_skb holds an tx ethernet frame which can be transported in 20 TXCs and waiting_tx_skb is still NULL. tx_credits = 10; /* 21 are filled in the previous transfer */ ongoing_tx_skb = 20; waiting_tx_skb = NULL; /* Still NULL */ - So, (tc6->ongoing_tx_skb || tc6->waiting_tx_skb) becomes true. - After oa_tc6_prepare_spi_tx_buf_for_tx_skbs() ongoing_tx_skb = 10; waiting_tx_skb = NULL; /* Still NULL */ - Perform SPI transfer. - Process SPI rx buffer to get the TXC from footers. - Now let's assume previously filled 21 TXCs are freed so we are good to transport the next remaining 10 tx chunks from ongoing_tx_skb. tx_credits = 21; ongoing_tx_skb = 10; waiting_tx_skb = NULL; - So, (tc6->ongoing_tx_skb || tc6->waiting_tx_skb) becomes true again. - In the oa_tc6_prepare_spi_tx_buf_for_tx_skbs() ongoing_tx_skb = NULL; waiting_tx_skb = NULL; - Now the below bad case might happen, Thread1 (oa_tc6_start_xmit) Thread2 (oa_tc6_spi_thread_handler) --------------------------- ----------------------------------- - if waiting_tx_skb is NULL - if ongoing_tx_skb is NULL - ongoing_tx_skb = waiting_tx_skb - waiting_tx_skb = skb - waiting_tx_skb = NULL ... - ongoing_tx_skb = NULL - if waiting_tx_skb is NULL - waiting_tx_skb = skb To overcome the above issue, protect the moving of tx skb reference from waiting_tx_skb pointer to ongoing_tx_skb pointer and assigning new tx skb to waiting_tx_skb pointer, so that the other thread can't access the waiting_tx_skb pointer until the current thread completes moving the tx skb reference safely. Fixes: 53fbde8ab21e ("net: ethernet: oa_tc6: implement transmit path to transfer tx ethernet frames") Signed-off-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>