summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2023-11-29drm/amd/display: fix a pipe mapping error in dcn32_fpuWenjing Liu
[why] In dcn32 DML pipes are ordered the same as dc pipes but only for used pipes. For example, if dc pipe 1 and 2 are used, their dml pipe indices would be 0 and 1 respectively. However update_pipe_slice_table_with_split_flags doesn't skip indices for free pipes. This causes us to not reference correct dml pipe output when building pipe topology. [how] Use two variables to iterate dc and dml pipes respectively and only increment dml pipe index when current dc pipe is not free. Cc: stable@vger.kernel.org # 6.1+ Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29drm/amd/display: Update DCN35 watermarksNicholas Kazlauskas
[Why & How] Update to the new values per HW team request. Affects both stutter and z8. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29drm/amdgpu: update xgmi num links info post gc9.4.2Jonathan Kim
GC IP 9.4.2 and up support TA reporting of the number of xGMI links between peers. Tested-by: Vignesh Chander <vignesh.chander@amd.com> Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29drm/amd/display: Add z-state support policy for dcn35Nicholas Kazlauskas
[Why] DML2 means that the dcn3x policy for calculating z-state support no longer runs from validate_bandwidth. This means we are unconditionally allowing Z8, the hardware default. [How] Port the policy over to DCN35, but with a few modifications: - Don't use min_dst_y_next_start as a check for Z8/Z10 allow - Add support for overriding the Z10 stutter period per ASIC - Cleanup the code to make the policy assignment more clear Reviewed-by: Charlene Liu <charlene.liu@amd.com> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29drm/amd/display: Include udelay when waiting for INBOX0 ACKAlvin Lee
When waiting for the ACK for INBOX0 message, we have to ensure to include the udelay for proper wait time Cc: stable@vger.kernel.org # 6.1+ Reviewed-by: Samson Tam <samson.tam@amd.com> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29cpufreq/amd-pstate: Only print supported EPP values for performance governorAyush Jain
show_energy_performance_available_preferences() to show only supported values which is performance in performance governor policy. -------Before-------- $ cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_driver amd-pstate-epp $ cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor performance $ cat /sys/devices/system/cpu/cpu1/cpufreq/energy_performance_preference performance $ cat /sys/devices/system/cpu/cpu1/cpufreq/energy_performance_available_preferences default performance balance_performance balance_power power -------After-------- $ cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_driver amd-pstate-epp $ cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor performance $ cat /sys/devices/system/cpu/cpu1/cpufreq/energy_performance_preference performance $ cat /sys/devices/system/cpu/cpu1/cpufreq/energy_performance_available_preferences performance Fixes: ffa5096a7c33 ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors") Suggested-by: Wyes Karny <wyes.karny@amd.com> Signed-off-by: Ayush Jain <ayush.jain3@amd.com> Reviewed-by: Wyes Karny <wyes.karny@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-29dm-flakey: start allocating with MAX_ORDERMikulas Patocka
Commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely") changed the meaning of MAX_ORDER from exclusive to inclusive. So, we can allocate compound pages with up to 1 << MAX_ORDER pages. Reflect this change in dm-flakey and start trying to allocate compound pages with MAX_ORDER. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-11-29dm-verity: align struct dm_verity_fec_io properlyMikulas Patocka
dm_verity_fec_io is placed after the end of two hash digests. If the hash digest has unaligned length, struct dm_verity_fec_io could be unaligned. This commit fixes the placement of struct dm_verity_fec_io, so that it's aligned. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Fixes: a739ff3f543a ("dm verity: add support for forward error correction") Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-11-29dm verity: don't perform FEC for failed readahead IOWu Bo
We found an issue under Android OTA scenario that many BIOs have to do FEC where the data under dm-verity is 100% complete and no corruption. Android OTA has many dm-block layers, from upper to lower: dm-verity dm-snapshot dm-origin & dm-cow dm-linear ufs DM tables have to change 2 times during Android OTA merging process. When doing table change, the dm-snapshot will be suspended for a while. During this interval, many readahead IOs are submitted to dm_verity from filesystem. Then the kverity works are busy doing FEC process which cost too much time to finish dm-verity IO. This causes needless delay which feels like system is hung. After adding debugging it was found that each readahead IO needed around 10s to finish when this situation occurred. This is due to IO amplification: dm-snapshot suspend erofs_readahead // 300+ io is submitted dm_submit_bio (dm_verity) dm_submit_bio (dm_snapshot) bio return EIO bio got nothing, it's empty verity_end_io verity_verify_io forloop range(0, io->n_blocks) // each io->nblocks ~= 20 verity_fec_decode fec_decode_rsb fec_read_bufs forloop range(0, v->fec->rsn) // v->fec->rsn = 253 new_read submit_bio (dm_snapshot) end loop end loop dm-snapshot resume Readahead BIOs get nothing while dm-snapshot is suspended, so all of them will cause verity's FEC. Each readahead BIO needs to verify ~20 (io->nblocks) blocks. Each block needs to do FEC, and every block needs to do 253 (v->fec->rsn) reads. So during the suspend interval(~200ms), 300 readahead BIOs trigger ~1518000 (300*20*253) IOs to dm-snapshot. As readahead IO is not required by userspace, and to fix this issue, it is best to pass readahead errors to upper layer to handle it. Cc: stable@vger.kernel.org Fixes: a739ff3f543a ("dm verity: add support for forward error correction") Signed-off-by: Wu Bo <bo.wu@vivo.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-11-29dm verity: initialize fec io before freeing itWu Bo
If BIO error, verity_end_io() can call verity_finish_io() before verity_fec_init_io(). Therefore, fec_io->rs is not initialized and may crash when doing memory freeing in verity_fec_finish_io(). Crash call stack: die+0x90/0x2b8 __do_kernel_fault+0x260/0x298 do_bad_area+0x2c/0xdc do_translation_fault+0x3c/0x54 do_mem_abort+0x54/0x118 el1_abort+0x38/0x5c el1h_64_sync_handler+0x50/0x90 el1h_64_sync+0x64/0x6c free_rs+0x18/0xac fec_rs_free+0x10/0x24 mempool_free+0x58/0x148 verity_fec_finish_io+0x4c/0xb0 verity_end_io+0xb8/0x150 Cc: stable@vger.kernel.org # v6.0+ Fixes: 5721d4e5a9cd ("dm verity: Add optional "try_verify_in_tasklet" feature") Signed-off-by: Wu Bo <bo.wu@vivo.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-11-29cpufreq/amd-pstate: Fix scaling_min_freq and scaling_max_freq updateWyes Karny
When amd_pstate is running, writing to scaling_min_freq and scaling_max_freq has no effect. These values are only passed to the policy level, but not to the platform level. This means that the platform does not know about the frequency limits set by the user. To fix this, update the min_perf and max_perf values at the platform level whenever the user changes the scaling_min_freq and scaling_max_freq values. Fixes: ffa5096a7c33 ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors") Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Wyes Karny <wyes.karny@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-29drm/panel: nt36523: fix return value check in nt36523_probe()Yang Yingliang
mipi_dsi_device_register_full() never returns NULL pointer, it will return ERR_PTR() when it fails, so replace the check with IS_ERR(). Fixes: 0993234a0045 ("drm/panel: Add driver for Novatek NT36523") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20231129090715.856263-1-yangyingliang@huaweicloud.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231129090715.856263-1-yangyingliang@huaweicloud.com
2023-11-29drm/panel: starry-2081101qfh032011-53g: Fine tune the panel power sequencexiazhengqiao
For the "starry, 2081101qfh032011-53g" panel, it is stipulated in the panel spec that MIPI needs to keep the LP11 state before the lcm_reset pin is pulled high. Fixes: 6069b66cd962 ("drm/panel: support for STARRY 2081101QFH032011-53G MIPI-DSI panel") Signed-off-by: xiazhengqiao <xiazhengqiao@huaqin.corp-partner.google.com> Reviewed-by: Jessica Zhang <quic_jesszhan@quicinc.com> Link: https://lore.kernel.org/r/20231129084115.7918-1-xiazhengqiao@huaqin.corp-partner.google.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231129084115.7918-1-xiazhengqiao@huaqin.corp-partner.google.com
2023-11-29Merge tag 'pinctrl-v6.7-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: - Fix a really interesting potential core bug in the list iterator requireing the use of READ_ONCE() discovered when testing kernel compiles with clang. - Check devm_kcalloc() return value and an array bounds in the STM32 driver. - Fix an exotic string truncation issue in the s32cc driver, found by the kernel test robot (impressive!) - Fix an undocumented struct member in the cy8c95x0 driver. - Fix a symbol overlap with MIPS in the Lochnagar driver, MIPS defines a global symbol "RST" which is a bit too generic and collide with stuff. OK this one should be renamed too, we will fix that as well. - Fix erroneous branch taking in the Realtek driver. - Fix the mail address in MAINTAINERS for the s32g2 driver. * tag 'pinctrl-v6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: dt-bindings: pinctrl: s32g2: change a maintainer email address pinctrl: realtek: Fix logical error when finding descriptor pinctrl: lochnagar: Don't build on MIPS pinctrl: avoid reload of p state in list iteration pinctrl: cy8c95x0: Fix doc warning pinctrl: s32cc: Avoid possible string truncation pinctrl: stm32: fix array read out of bound pinctrl: stm32: Add check for devm_kcalloc
2023-11-29drm/i915: Call intel_pre_plane_updates() also for pipes getting enabledVille Syrjälä
We used to call intel_pre_plane_updates() for any pipe going through a modeset whether the pipe was previously enabled or not. This in fact needed to apply all the necessary clock gating workarounds/etc. Restore the correct behaviour. Fixes: 39919997322f ("drm/i915: Disable all planes before modesetting any pipes") Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231121054324.9988-3-ville.syrjala@linux.intel.com (cherry picked from commit e0d5ce11ed0a21bb2bf328ad82fd261783c7ad88) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2023-11-29drm/i915: Also check for VGA converter in eDP probeVille Syrjälä
Unfortunately even the HPD based detection added in commit cfe5bdfb27fa ("drm/i915: Check HPD live state during eDP probe") fails to detect that the VBT's eDP/DDI-A is a ghost on Asus B360M-A (CFL+CNP). On that board eDP/DDI-A has its HPD asserted despite nothing being actually connected there :( The straps/fuses also indicate that the eDP port is present. So if one boots with a VGA monitor connected the eDP probe will mistake the DP->VGA converter hooked to DDI-E for an eDP panel on DDI-A. As a last resort check what kind of DP device we've detected, and if it looks like a DP->VGA converter then conclude that the eDP port should be ignored. Cc: stable@vger.kernel.org Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9636 Fixes: cfe5bdfb27fa ("drm/i915: Check HPD live state during eDP probe") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231114142333.15799-1-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit fcd479a79120bf0cd507d85f898297a3b868dda6) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2023-11-29drm/i915/gsc: Mark internal GSC engine with reserved uabi classTvrtko Ursulin
The GSC CS is not exposed to the user, so we skipped assigning a uabi class number for it. However, the trace logs use the uabi class and instance to identify the engine, so leaving uabi class unset makes the GSC CS show up as the RCS in those logs. Given that the engine is not exposed to the user, we can't add a new case in the uabi enum, so we insted internally define a kernel internal class as -1. At the same time remove special handling for the name and complete the uabi_classes array so internal class is automatically correctly assigned. Engine will show as 65535:0 other0 in the logs/traces which should be unique enough. v2: * Fix uabi class u8 vs u16 type confusion. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 194babe26bdc ("drm/i915/mtl: don't expose GSC command streamer to the user") Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231116084456.291533-1-tvrtko.ursulin@linux.intel.com (cherry picked from commit dfed6b58d54f3a5d7e6bc1fb060e2c936330eba2) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2023-11-28ravb: Fix races between ravb_tx_timeout_work() and net related opsYoshihiro Shimoda
Fix races between ravb_tx_timeout_work() and functions of net_device_ops and ethtool_ops by using rtnl_trylock() and rtnl_unlock(). Note that since ravb_close() is under the rtnl lock and calls cancel_work_sync(), ravb_tx_timeout_work() should calls rtnl_trylock(). Otherwise, a deadlock may happen in ravb_tx_timeout_work() like below: CPU0 CPU1 ravb_tx_timeout() schedule_work() ... __dev_close_many() // Under rtnl lock ravb_close() cancel_work_sync() // Waiting ravb_tx_timeout_work() rtnl_lock() // This is possible to cause a deadlock If rtnl_trylock() fails, rescheduling the work with sleep for 1 msec. Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/20231127122420.3706751-1-yoshihiro.shimoda.uh@renesas.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29nouveau/gsp: replace zero-length array with flex-array member and use ↵Gustavo A. R. Silva
__counted_by Fake flexible arrays (zero-length and one-element arrays) are deprecated, and should be replaced by flexible-array members. So, replace zero-length array with a flexible-array member in `struct PACKED_REGISTRY_TABLE`. Also annotate array `entries` with `__counted_by()` to prepare for the coming implementation by GCC and Clang of the `__counted_by` attribute. Flexible array members annotated with `__counted_by` can have their accesses bounds-checked at run-time via `CONFIG_UBSAN_BOUNDS` (for array indexing) and `CONFIG_FORTIFY_SOURCE` (for strcpy/memcpy-family functions). This fixes multiple -Warray-bounds warnings: drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1069:29: warning: array subscript 0 is outside array bounds of 'PACKED_REGISTRY_ENTRY[0]' [-Warray-bounds=] drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1070:29: warning: array subscript 0 is outside array bounds of 'PACKED_REGISTRY_ENTRY[0]' [-Warray-bounds=] drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1071:29: warning: array subscript 0 is outside array bounds of 'PACKED_REGISTRY_ENTRY[0]' [-Warray-bounds=] drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1072:29: warning: array subscript 0 is outside array bounds of 'PACKED_REGISTRY_ENTRY[0]' [-Warray-bounds=] While there, also make use of the struct_size() helper, and address checkpatch.pl warning: WARNING: please, no spaces at the start of a line This results in no differences in binary output. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZVZbX7C5suLMiBf+@work
2023-11-29nouveau/gsp/r535: remove a stray unlock in r535_gsp_rpc_send()Dan Carpenter
This unlock doesn't belong here and it leads to a double unlock in the caller, r535_gsp_rpc_push(). Fixes: 176fdcbddfd2 ("drm/nouveau/gsp/r535: add support for booting GSP-RM") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Timur Tabi <ttabi@nvidia.com> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/a0293812-c05d-45f0-a535-3f24fe582c02@moroto.mountain
2023-11-29nouveau: find the smallest page allocation to cover a buffer alloc.Dave Airlie
With the new uapi we don't have the comp flags on the allocation, so we shouldn't be using the first size that works, we should be iterating until we get the correct one. This reduces allocations from 2MB to 64k in lots of places. Fixes dEQP-VK.memory.allocation.basic.size_8KiB.forward.count_4000 on my ampere/gsp system. Cc: stable@vger.kernel.org # v6.6 Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230811031520.248341-1-airlied@gmail.com
2023-11-28powercap: DTPM: Fix unneeded conversions to micro-WattsLukasz Luba
The power values coming from the Energy Model are already in uW. The PowerCap and DTPM frameworks operate on uW, so all places should just use the values from the EM. Fix the code by removing all of the conversion to uW still present in it. Fixes: ae6ccaa65038 (PM: EM: convert power field to micro-Watts precision and align drivers) Cc: 5.19+ <stable@vger.kernel.org> # v5.19+ Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-28cpufreq/amd-pstate: Fix the return value of amd_pstate_fast_switch()Gautham R. Shenoy
cpufreq_driver->fast_switch() callback expects a frequency as a return value. amd_pstate_fast_switch() was returning the return value of amd_pstate_update_freq(), which only indicates a success or failure. Fix this by making amd_pstate_fast_switch() return the target_freq when the call to amd_pstate_update_freq() is successful, and return the current frequency from policy->cur when the call to amd_pstate_update_freq() is unsuccessful. Fixes: 4badf2eb1e98 ("cpufreq: amd-pstate: Add ->fast_switch() callback") Acked-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Wyes Karny <wyes.karny@amd.com> Reviewed-by: Perry Yuan <perry.yuan@amd.com> Cc: 6.4+ <stable@vger.kernel.org> # v6.4+ Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-28r8169: prevent potential deadlock in rtl8169_closeHeiner Kallweit
ndo_stop() is RTNL-protected by net core, and the worker function takes RTNL as well. Therefore we will deadlock when trying to execute a pending work synchronously. To fix this execute any pending work asynchronously. This will do no harm because netif_running() is false in ndo_stop(), and therefore the work function is effectively a no-op. However we have to ensure that no task is running or pending after rtl_remove_one(), therefore add a call to cancel_work_sync(). Fixes: abe5fc42f9ce ("r8169: use RTNL to protect critical sections") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/12395867-1d17-4cac-aa7d-c691938fcddf@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28r8169: fix deadlock on RTL8125 in jumbo mtu modeHeiner Kallweit
The original change results in a deadlock if jumbo mtu mode is used. Reason is that the phydev lock is held when rtl_reset_work() is called here, and rtl_jumbo_config() calls phy_start_aneg() which also tries to acquire the phydev lock. Fix this by calling rtl_reset_work() asynchronously. Fixes: 621735f59064 ("r8169: fix rare issue with broken rx after link-down on RTL8125") Reported-by: Ian Chen <free122448@hotmail.com> Tested-by: Ian Chen <free122448@hotmail.com> Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/caf6a487-ef8c-4570-88f9-f47a659faf33@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28efi/unaccepted: Fix off-by-one when checking for overlapping rangesMichael Roth
When a task needs to accept memory it will scan the accepting_list to see if any ranges already being processed by other tasks overlap with its range. Due to an off-by-one in the range comparisons, a task might falsely determine that an overlapping range is being accepted, leading to an unnecessary delay before it begins processing the range. Fix the off-by-one in the range comparison to prevent this and slightly improve performance. Fixes: 50e782a86c98 ("efi/unaccepted: Fix soft lockups caused by parallel memory acceptance") Link: https://lore.kernel.org/linux-mm/20231101004523.vseyi5bezgfaht5i@amd.com/T/#me2eceb9906fcae5fe958b3fe88e41f920f8335b6 Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-11-28xen/events: fix error code in xen_bind_pirq_msi_to_irq()Dan Carpenter
Return -ENOMEM if xen_irq_init() fails. currently the code returns an uninitialized variable or zero. Fixes: 5dd9ad32d775 ("xen/events: drop xen_allocate_irqs_dynamic()") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Juergen Gross <jgross@ssue.com> Link: https://lore.kernel.org/r/3b9ab040-a92e-4e35-b687-3a95890a9ace@moroto.mountain Signed-off-by: Juergen Gross <jgross@suse.com>
2023-11-28octeontx2-pf: Restore TC ingress police rules when interface is upSubbaraya Sundeep
TC ingress policer rules depends on interface receive queue contexts since the bandwidth profiles are attached to RQ contexts. When an interface is brought down all the queue contexts are freed. This in turn frees bandwidth profiles in hardware causing ingress police rules non-functional after the interface is brought up. Fix this by applying all the ingress police rules config to hardware in otx2_open. Also allow adding ingress rules only when interface is running since no contexts exist for the interface when it is down. Fixes: 68fbff68dbea ("octeontx2-pf: Add police action for TC flower") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://lore.kernel.org/r/1700930217-5707-1-git-send-email-sbhatta@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28octeontx2-pf: Fix adding mbox work queue entry when num_vfs > 64Geetha sowjanya
When more than 64 VFs are enabled for a PF then mbox communication between VF and PF is not working as mbox work queueing for few VFs are skipped due to wrong calculation of VF numbers. Fixes: d424b6c02415 ("octeontx2-pf: Enable SRIOV and added VF mbox handling") Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://lore.kernel.org/r/1700930042-5400-1-git-send-email-sbhatta@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28net: stmmac: xgmac: Disable FPE MMC interruptsFurong Xu
Commit aeb18dd07692 ("net: stmmac: xgmac: Disable MMC interrupts by default") tries to disable MMC interrupts to avoid a storm of unhandled interrupts, but leaves the FPE(Frame Preemption) MMC interrupts enabled, FPE MMC interrupts can cause the same problem. Now we mask FPE TX and RX interrupts to disable all MMC interrupts. Fixes: aeb18dd07692 ("net: stmmac: xgmac: Disable MMC interrupts by default") Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Furong Xu <0x1207@gmail.com> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Link: https://lore.kernel.org/r/20231125060126.2328690-1-0x1207@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28drm/gpuvm: Fix deprecated license identifierThomas Hellström
"GPL-2.0-only" in the license header was incorrectly changed to the now deprecated "GPL-2.0". Fix. Cc: Maxime Ripard <mripard@kernel.org> Cc: Danilo Krummrich <dakr@redhat.com> Reported-by: David Edelsohn <dje.gcc@gmail.com> Closes: https://lore.kernel.org/dri-devel/5lfrhdpkwhpgzipgngojs3tyqfqbesifzu5nf4l5q3nhfdhcf2@25nmiq7tfrew/T/#m5c356d68815711eea30dd94cc6f7ea8cd4344fe3 Fixes: f7749a549b4f ("drm/gpuvm: Dual-licence the drm_gpuvm code GPL-2.0 OR MIT") Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Acked-by: Maxime Ripard <mripard@kernel.org> Acked-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231106114827.62492-1-thomas.hellstrom@linux.intel.com
2023-11-28Revert "drm/bridge: panel: Add a device link between drm device and panel ↵Linus Walleij
device" This reverts commit 199cf07ebd2b0d41185ac79b895547d45610b681. This patch creates bugs on devices where the DRM device is the ancestor of the panel devices. Attempts to fix this have failed because it leads to using device core functionality which is questionable. Reported-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/lkml/CACRpkdaGzXD6HbiX7mVUNJAJtMEPG00Pp6+nJ1P0JrfJ-ArMvQ@mail.gmail.com/T/ Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20231128-revert-panel-fix-v1-3-69bb05048dae@linaro.org Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231128-revert-panel-fix-v1-3-69bb05048dae@linaro.org
2023-11-28Revert "driver core: Export device_is_dependent() to modules"Linus Walleij
This reverts commit 1d5e8f4bf06da86b71cc9169110d1a0e1e7af337. Greg says: "why exactly is this needed? Nothing outside of the driver core should be needing this function, it shouldn't be public at all (I missed that before.) So please, revert it for now, let's figure out why DRM thinks this is needed for it's devices, and yet no other bus/subsystem does." Link: https://lore.kernel.org/dri-devel/2023112739-willing-sighing-6bdd@gregkh/ Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20231128-revert-panel-fix-v1-1-69bb05048dae@linaro.org Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231128-revert-panel-fix-v1-1-69bb05048dae@linaro.org
2023-11-28Revert "drm/bridge: panel: Check device dependency before managing device link"Linus Walleij
This reverts commit 39d5b6a64ace77d0c11c398d272218df5f939abb. This patch was causing build errors by using an unexported function from the device core, which Greg questions the saneness in exporting. Link: https://lore.kernel.org/lkml/CACRpkdaGzXD6HbiX7mVUNJAJtMEPG00Pp6+nJ1P0JrfJ-ArMvQ@mail.gmail.com/T/ Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20231128-revert-panel-fix-v1-2-69bb05048dae@linaro.org Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231128-revert-panel-fix-v1-2-69bb05048dae@linaro.org
2023-11-28octeontx2-af: Fix possible buffer overflowElena Salomatkina
A loop in rvu_mbox_handler_nix_bandprof_free() contains a break if (idx == MAX_BANDPROF_PER_PFFUNC), but if idx may reach MAX_BANDPROF_PER_PFFUNC buffer '(*req->prof_idx)[layer]' overflow happens before that check. The patch moves the break to the beginning of the loop. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: e8e095b3b370 ("octeontx2-af: cn10k: Bandwidth profiles config support"). Signed-off-by: Elena Salomatkina <elena.salomatkina.cmc@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://lore.kernel.org/r/20231124210802.109763-1-elena.salomatkina.cmc@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-27Merge tag 'media/v6.7-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab. * tag 'media/v6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: pci: mgb4: add COMMON_CLK dependency media: v4l2-subdev: Fix a 64bit bug media: mgb4: Added support for T200 card variant media: vsp1: Remove unbalanced .s_stream(0) calls
2023-11-27netkit: Reject IFLA_NETKIT_PEER_INFO in netkit_change_linkDaniel Borkmann
The IFLA_NETKIT_PEER_INFO attribute can only be used during device creation, but not via changelink callback. Hence reject it there. Fixes: 35dfaad7188c ("netkit, bpf: Add bpf programmable net device") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Cc: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/e86a277a1e8d3b19890312779e42f790b0605ea4.1701115314.git.daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-11-27nvme: check for valid nvme_identify_ns() before using itEwan D. Milne
When scanning namespaces, it is possible to get valid data from the first call to nvme_identify_ns() in nvme_alloc_ns(), but not from the second call in nvme_update_ns_info_block(). In particular, if the NSID becomes inactive between the two commands, a storage device may return a buffer filled with zero as per 4.1.5.1. In this case, we can get a kernel crash due to a divide-by-zero in blk_stack_limits() because ns->lba_shift will be set to zero. PID: 326 TASK: ffff95fec3cd8000 CPU: 29 COMMAND: "kworker/u98:10" #0 [ffffad8f8702f9e0] machine_kexec at ffffffff91c76ec7 #1 [ffffad8f8702fa38] __crash_kexec at ffffffff91dea4fa #2 [ffffad8f8702faf8] crash_kexec at ffffffff91deb788 #3 [ffffad8f8702fb00] oops_end at ffffffff91c2e4bb #4 [ffffad8f8702fb20] do_trap at ffffffff91c2a4ce #5 [ffffad8f8702fb70] do_error_trap at ffffffff91c2a595 #6 [ffffad8f8702fbb0] exc_divide_error at ffffffff928506e6 #7 [ffffad8f8702fbd0] asm_exc_divide_error at ffffffff92a00926 [exception RIP: blk_stack_limits+434] RIP: ffffffff92191872 RSP: ffffad8f8702fc80 RFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff95efa0c91800 RCX: 0000000000000001 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001 RBP: 00000000ffffffff R8: ffff95fec7df35a8 R9: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: ffff95fed33c09a8 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #8 [ffffad8f8702fce0] nvme_update_ns_info_block at ffffffffc06d3533 [nvme_core] #9 [ffffad8f8702fd18] nvme_scan_ns at ffffffffc06d6fa7 [nvme_core] This happened when the check for valid data was moved out of nvme_identify_ns() into one of the callers. Fix this by checking in both callers. Link: https://bugzilla.kernel.org/show_bug.cgi?id=218186 Fixes: 0dd6fff2aad4 ("nvme: bring back auto-removal of deleted namespaces during sequential scan") Cc: stable@vger.kernel.org Signed-off-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-11-27Merge branch 'cpufreq/arm/linux-next' of ↵Rafael J. Wysocki
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Merge ARM cpufreq fixes for 6.7-rc4 from Viresh Kumar. These fix issues related to power domains in the qcom cpufreq driver and an OPP-related issue in the imx6q cpufreq driver. * 'cpufreq/arm/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: pmdomain: qcom: rpmpd: Set GENPD_FLAG_ACTIVE_WAKEUP cpufreq: qcom-nvmem: Preserve PM domain votes in system suspend cpufreq: qcom-nvmem: Enable virtual power domain devices cpufreq: imx6q: Don't disable 792 Mhz OPP unnecessarily
2023-11-27ACPI: video: Use acpi_video_device for cooling-dev driver dataHans de Goede
The acpi_video code was storing the acpi_video_device as driver_data in the acpi_device children of the acpi_video_bus acpi_device. But the acpi_video driver only binds to the bus acpi_device. It uses, but does not bind to, the children. Since it is not the driver it should not be using the driver_data of the children's acpi_device-s. Since commit 0d16710146a1 ("ACPI: bus: Set driver_data to NULL every time .add() fails") the childen's driver_data ends up getting set to NULL after a driver fails to bind to the children leading to a NULL pointer deref in video_get_max_state when registering the cooling-dev: [ 3.148958] BUG: kernel NULL pointer dereference, address: 0000000000000090 <snip> [ 3.149015] Hardware name: Sony Corporation VPCSB2X9R/VAIO, BIOS R2087H4 06/15/2012 [ 3.149021] RIP: 0010:video_get_max_state+0x17/0x30 [video] <snip> [ 3.149105] Call Trace: [ 3.149110] <TASK> [ 3.149114] ? __die+0x23/0x70 [ 3.149126] ? page_fault_oops+0x171/0x4e0 [ 3.149137] ? exc_page_fault+0x7f/0x180 [ 3.149147] ? asm_exc_page_fault+0x26/0x30 [ 3.149158] ? video_get_max_state+0x17/0x30 [video 9b6f3f0d19d7b4a0e2df17a2d8b43bc19c2ed71f] [ 3.149176] ? __pfx_video_get_max_state+0x10/0x10 [video 9b6f3f0d19d7b4a0e2df17a2d8b43bc19c2ed71f] [ 3.149192] __thermal_cooling_device_register.part.0+0xf2/0x2f0 [ 3.149205] acpi_video_bus_register_backlight.part.0.isra.0+0x414/0x570 [video 9b6f3f0d19d7b4a0e2df17a2d8b43bc19c2ed71f] [ 3.149227] acpi_video_register_backlight+0x57/0x80 [video 9b6f3f0d19d7b4a0e2df17a2d8b43bc19c2ed71f] [ 3.149245] intel_acpi_video_register+0x68/0x90 [i915 1f3a758130b32ef13d301d4f8f78c7d766d57f2a] [ 3.149669] intel_display_driver_register+0x28/0x50 [i915 1f3a758130b32ef13d301d4f8f78c7d766d57f2a] [ 3.150064] i915_driver_probe+0x790/0xb90 [i915 1f3a758130b32ef13d301d4f8f78c7d766d57f2a] [ 3.150402] local_pci_probe+0x45/0xa0 [ 3.150412] pci_device_probe+0xc1/0x260 <snip> Fix this by directly using the acpi_video_device as devdata for the cooling-device, which avoids the need to set driver-data on the children at all. Fixes: 0d16710146a1 ("ACPI: bus: Set driver_data to NULL every time .add() fails") Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9718 Cc: 6.6+ <stable@vger.kernel.org> # 6.6+ Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-27dma-buf: fix check in dma_resv_add_fenceChristian König
It's valid to add the same fence multiple times to a dma-resv object and we shouldn't need one extra slot for each. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: a3f7c10a269d5 ("dma-buf/dma-resv: check if the new fence is really later") Cc: stable@vger.kernel.org # v5.19+ Link: https://patchwork.freedesktop.org/patch/msgid/20231115093035.1889-1-christian.koenig@amd.com
2023-11-27nvme-core: fix a memory leak in nvme_ns_info_from_identify()Maurizio Lombardi
In case of error, free the nvme_id_ns structure that was allocated by nvme_identify_ns(). Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-11-27nvme: fine-tune sending of first keep-aliveMark O'Donovan
Keep-alive commands are sent half-way through the kato period. This normally works well but fails when the keep-alive system is started when we are more than half way through the kato. This can happen on larger setups or due to host delays. With this change we now time the initial keep-alive command from the controller initialisation time, rather than the keep-alive mechanism activation time. Signed-off-by: Mark O'Donovan <shiftee@posteo.net> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-11-27vfio/pds: Fix possible sleep while in atomic contextBrett Creeley
The driver could possibly sleep while in atomic context resulting in the following call trace while CONFIG_DEBUG_ATOMIC_SLEEP=y is set: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:283 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2817, name: bash preempt_count: 1, expected: 0 RCU nest depth: 0, expected: 0 Call Trace: <TASK> dump_stack_lvl+0x36/0x50 __might_resched+0x123/0x170 mutex_lock+0x1e/0x50 pds_vfio_put_lm_file+0x1e/0xa0 [pds_vfio_pci] pds_vfio_put_save_file+0x19/0x30 [pds_vfio_pci] pds_vfio_state_mutex_unlock+0x2e/0x80 [pds_vfio_pci] pci_reset_function+0x4b/0x70 reset_store+0x5b/0xa0 kernfs_fop_write_iter+0x137/0x1d0 vfs_write+0x2de/0x410 ksys_write+0x5d/0xd0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 This can happen if pds_vfio_put_restore_file() and/or pds_vfio_put_save_file() grab the mutex_lock(&lm_file->lock) while the spin_lock(&pds_vfio->reset_lock) is held, which can happen during while calling pds_vfio_state_mutex_unlock(). Fix this by changing the reset_lock to reset_mutex so there are no such conerns. Also, make sure to destroy the reset_mutex in the driver specific VFIO device release function. This also fixes a spinlock bad magic BUG that was caused by not calling spinlock_init() on the reset_lock. Since, the lock is being changed to a mutex, make sure to call mutex_init() on it. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/kvm/1f9bc27b-3de9-4891-9687-ba2820c1b390@moroto.mountain/ Fixes: bb500dbe2ac6 ("vfio/pds: Add VFIO live migration support") Signed-off-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20231122192532.25791-3-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-11-27vfio/pds: Fix mutex lock->magic != lock warningBrett Creeley
The following BUG was found when running on a kernel with CONFIG_DEBUG_MUTEXES=y set: DEBUG_LOCKS_WARN_ON(lock->magic != lock) RIP: 0010:mutex_trylock+0x10d/0x120 Call Trace: <TASK> ? __warn+0x85/0x140 ? mutex_trylock+0x10d/0x120 ? report_bug+0xfc/0x1e0 ? handle_bug+0x3f/0x70 ? exc_invalid_op+0x17/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? mutex_trylock+0x10d/0x120 ? mutex_trylock+0x10d/0x120 pds_vfio_reset+0x3a/0x60 [pds_vfio_pci] pci_reset_function+0x4b/0x70 reset_store+0x5b/0xa0 kernfs_fop_write_iter+0x137/0x1d0 vfs_write+0x2de/0x410 ksys_write+0x5d/0xd0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 As shown, lock->magic != lock. This is because mutex_init(&pds_vfio->state_mutex) is called in the VFIO open path. So, if a reset is initiated before the VFIO device is opened the mutex will have never been initialized. Fix this by calling mutex_init(&pds_vfio->state_mutex) in the VFIO init path. Also, don't destroy the mutex on close because the device may be re-opened, which would cause mutex to be uninitialized. Fix this by implementing a driver specific vfio_device_ops.release callback that destroys the mutex before calling vfio_pci_core_release_dev(). Fixes: bb500dbe2ac6 ("vfio/pds: Add VFIO live migration support") Signed-off-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20231122192532.25791-2-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-11-27driver core: Export device_is_dependent() to modulesLiu Ying
Export device_is_dependent() since the drm_kms_helper module is starting to use it. Signed-off-by: Liu Ying <victor.liu@nxp.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231127051414.3783108-2-victor.liu@nxp.com
2023-11-27pmdomain: arm: Avoid polling for scmi_perf_domainUlf Hansson
It was a mistake to prefer polling based mode when setting a performance level for a domain. Let's instead rely on the protocol to decide what is best and thus avoid polling when possible. Reported-by: Nikunj Kela <nkela@quicinc.com> Fixes: 2af23ceb8624 ("pmdomain: arm: Add the SCMI performance domain") Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20231127135033.136442-1-ulf.hansson@linaro.org
2023-11-27wifi: mac80211: use wiphy locked debugfs helpers for agg_statusJohannes Berg
The read is currently with RCU and the write can deadlock, convert both for the sake of illustration. Make mac80211 depend on cfg80211 debugfs to get the helpers, but mac80211 debugfs without it does nothing anyway. This also required some adjustments in ath9k. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-11-27iommu/vt-d: Set variable intel_dirty_ops to staticKunwu Chan
Fix the following warning: drivers/iommu/intel/iommu.c:302:30: warning: symbol 'intel_dirty_ops' was not declared. Should it be static? This variable is only used in its defining file, so it should be static. Fixes: f35f22cc760e ("iommu/vt-d: Access/Dirty bit support for SS domains") Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/20231120101025.1103404-1-chentao@kylinos.cn Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-11-27iommu/vt-d: Fix incorrect cache invalidation for mm notificationLu Baolu
Commit 6bbd42e2df8f ("mmu_notifiers: call invalidate_range() when invalidating TLBs") moved the secondary TLB invalidations into the TLB invalidation functions to ensure that all secondary TLB invalidations happen at the same time as the CPU invalidation and added a flush-all type of secondary TLB invalidation for the batched mode, where a range of [0, -1UL) is used to indicates that the range extends to the end of the address space. However, using an end address of -1UL caused an overflow in the Intel IOMMU driver, where the end address was rounded up to the next page. As a result, both the IOTLB and device ATC were not invalidated correctly. Add a flush all helper function and call it when the invalidation range is from 0 to -1UL, ensuring that the entire caches are invalidated correctly. Fixes: 6bbd42e2df8f ("mmu_notifiers: call invalidate_range() when invalidating TLBs") Cc: stable@vger.kernel.org Cc: Huang Ying <ying.huang@intel.com> Cc: Alistair Popple <apopple@nvidia.com> Tested-by: Luo Yuzhang <yuzhang.luo@intel.com> # QAT Tested-by: Tony Zhu <tony.zhu@intel.com> # DSA Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20231117090933.75267-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>