summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-11-26nfc: fdp: Merge the same judgmentwengjianfeng
Combine two judgments that return the same value Signed-off-by: wengjianfeng <wengjianfeng@yulong.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20211126013130.27112-1-samirweng1979@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26net: vlan: fix underflow for the real_dev refcntZiyang Xuan
Inject error before dev_hold(real_dev) in register_vlan_dev(), and execute the following testcase: ip link add dev dummy1 type dummy ip link add name dummy1.100 link dummy1 type vlan id 100 ip link del dev dummy1 When the dummy netdevice is removed, we will get a WARNING as following: ======================================================================= refcount_t: decrement hit 0; leaking memory. WARNING: CPU: 2 PID: 0 at lib/refcount.c:31 refcount_warn_saturate+0xbf/0x1e0 and an endless loop of: ======================================================================= unregister_netdevice: waiting for dummy1 to become free. Usage count = -1073741824 That is because dev_put(real_dev) in vlan_dev_free() be called without dev_hold(real_dev) in register_vlan_dev(). It makes the refcnt of real_dev underflow. Move the dev_hold(real_dev) to vlan_dev_init() which is the call-back of ndo_init(). That makes dev_hold() and dev_put() for vlan's real_dev symmetrical. Fixes: 563bcbae3ba2 ("net: vlan: fix a UAF in vlan_dev_real_dev()") Reported-by: Petr Machata <petrm@nvidia.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Link: https://lore.kernel.org/r/20211126015942.2918542-1-william.xuanziyang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26ptp: fix filter names in the documentationJakub Kicinski
All the filter names are missing _PTP in them. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://lore.kernel.org/r/20211126031921.2466944-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26ethtool: ioctl: fix potential NULL deref in ethtool_set_coalesce()Julian Wiedmann
ethtool_set_coalesce() now uses both the .get_coalesce() and .set_coalesce() callbacks. But the check for their availability is buggy, so changing the coalesce settings on a device where the driver provides only _one_ of the callbacks results in a NULL pointer dereference instead of an -EOPNOTSUPP. Fix the condition so that the availability of both callbacks is ensured. This also matches the netlink code. Note that reproducing this requires some effort - it only affects the legacy ioctl path, and needs a specific combination of driver options: - have .get_coalesce() and .coalesce_supported but no .set_coalesce(), or - have .set_coalesce() but no .get_coalesce(). Here eg. ethtool doesn't cause the crash as it first attempts to call ethtool_get_coalesce() and bails out on error. Fixes: f3ccfda19319 ("ethtool: extend coalesce setting uAPI with CQE mode") Cc: Yufeng Mo <moyufeng@huawei.com> Cc: Huazhong Tan <tanhuazhong@huawei.com> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Link: https://lore.kernel.org/r/20211126175543.28000-1-jwi@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26nfc: virtual_ncidev: change default device permissionsThadeu Lima de Souza Cascardo
Device permissions is S_IALLUGO, with many unnecessary bits. Remove them and also remove read and write permissions from group and others. Before the change: crwsrwsrwt 1 0 0 10, 125 Nov 25 13:59 /dev/virtual_nci After the change: crw------- 1 0 0 10, 125 Nov 25 14:05 /dev/virtual_nci Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Bongsu Jeon <bongsu.jeon@samsung.com> Link: https://lore.kernel.org/r/20211125141457.716921-1-cascardo@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26net/sched: sch_ets: don't peek at classes beyond 'nbands'Davide Caratti
when the number of DRR classes decreases, the round-robin active list can contain elements that have already been freed in ets_qdisc_change(). As a consequence, it's possible to see a NULL dereference crash, caused by the attempt to call cl->qdisc->ops->peek(cl->qdisc) when cl->qdisc is NULL: BUG: kernel NULL pointer dereference, address: 0000000000000018 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 1 PID: 910 Comm: mausezahn Not tainted 5.16.0-rc1+ #475 Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014 RIP: 0010:ets_qdisc_dequeue+0x129/0x2c0 [sch_ets] Code: c5 01 41 39 ad e4 02 00 00 0f 87 18 ff ff ff 49 8b 85 c0 02 00 00 49 39 c4 0f 84 ba 00 00 00 49 8b ad c0 02 00 00 48 8b 7d 10 <48> 8b 47 18 48 8b 40 38 0f ae e8 ff d0 48 89 c3 48 85 c0 0f 84 9d RSP: 0000:ffffbb36c0b5fdd8 EFLAGS: 00010287 RAX: ffff956678efed30 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000002 RSI: ffffffff9b938dc9 RDI: 0000000000000000 RBP: ffff956678efed30 R08: e2f3207fe360129c R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff956678efeac0 R13: ffff956678efe800 R14: ffff956611545000 R15: ffff95667ac8f100 FS: 00007f2aa9120740(0000) GS:ffff95667b800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000018 CR3: 000000011070c000 CR4: 0000000000350ee0 Call Trace: <TASK> qdisc_peek_dequeued+0x29/0x70 [sch_ets] tbf_dequeue+0x22/0x260 [sch_tbf] __qdisc_run+0x7f/0x630 net_tx_action+0x290/0x4c0 __do_softirq+0xee/0x4f8 irq_exit_rcu+0xf4/0x130 sysvec_apic_timer_interrupt+0x52/0xc0 asm_sysvec_apic_timer_interrupt+0x12/0x20 RIP: 0033:0x7f2aa7fc9ad4 Code: b9 ff ff 48 8b 54 24 18 48 83 c4 08 48 89 ee 48 89 df 5b 5d e9 ed fc ff ff 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa <53> 48 83 ec 10 48 8b 05 10 64 33 00 48 8b 00 48 85 c0 0f 85 84 00 RSP: 002b:00007ffe5d33fab8 EFLAGS: 00000202 RAX: 0000000000000002 RBX: 0000561f72c31460 RCX: 0000561f72c31720 RDX: 0000000000000002 RSI: 0000561f72c31722 RDI: 0000561f72c31720 RBP: 000000000000002a R08: 00007ffe5d33fa40 R09: 0000000000000014 R10: 0000000000000000 R11: 0000000000000246 R12: 0000561f7187e380 R13: 0000000000000000 R14: 0000000000000000 R15: 0000561f72c31460 </TASK> Modules linked in: sch_ets sch_tbf dummy rfkill iTCO_wdt intel_rapl_msr iTCO_vendor_support intel_rapl_common joydev virtio_balloon lpc_ich i2c_i801 i2c_smbus pcspkr ip_tables xfs libcrc32c crct10dif_pclmul crc32_pclmul crc32c_intel ahci libahci ghash_clmulni_intel serio_raw libata virtio_blk virtio_console virtio_net net_failover failover sunrpc dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000018 Ensuring that 'alist' was never zeroed [1] was not sufficient, we need to remove from the active list those elements that are no more SP nor DRR. [1] https://lore.kernel.org/netdev/60d274838bf09777f0371253416e8af71360bc08.1633609148.git.dcaratti@redhat.com/ v3: fix race between ets_qdisc_change() and ets_qdisc_dequeue() delisting DRR classes beyond 'nbands' in ets_qdisc_change() with the qdisc lock acquired, thanks to Cong Wang. v2: when a NULL qdisc is found in the DRR active list, try to dequeue skb from the next list item. Reported-by: Hangbin Liu <liuhangbin@gmail.com> Fixes: dcc68b4d8084 ("net: sch_ets: Add a new Qdisc") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Link: https://lore.kernel.org/r/7a5c496eed2d62241620bdbb83eb03fb9d571c99.1637762721.git.dcaratti@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26Merge branch 'acpi-properties'Rafael J. Wysocki
Merge fix and cleanup related to the management of ACPI device properties for 5.16-rc3. * acpi-properties: ACPI: Make acpi_node_get_parent() local ACPI: Get acpi_device's parent from the parent field
2021-11-26Merge branch 'pm-sleep'Rafael J. Wysocki
Merge hibernation-related fixes for 5.16-rc3. * pm-sleep: PM: hibernate: Fix snapshot partial write lengths PM: hibernate: use correct mode for swsusp_close()
2021-11-26net: stmmac: Disable Tx queues when reconfiguring the interfaceYannick Vignon
The Tx queues were not disabled in situations where the driver needed to stop the interface to apply a new configuration. This could result in a kernel panic when doing any of the 3 following actions: * reconfiguring the number of queues (ethtool -L) * reconfiguring the size of the ring buffers (ethtool -G) * installing/removing an XDP program (ip l set dev ethX xdp) Prevent the panic by making sure netif_tx_disable is called when stopping an interface. Without this patch, the following kernel panic can be observed when doing any of the actions above: Unable to handle kernel paging request at virtual address ffff80001238d040 [....] Call trace: dwmac4_set_addr+0x8/0x10 dev_hard_start_xmit+0xe4/0x1ac sch_direct_xmit+0xe8/0x39c __dev_queue_xmit+0x3ec/0xaf0 dev_queue_xmit+0x14/0x20 [...] [ end trace 0000000000000002 ]--- Fixes: 5fabb01207a2d ("net: stmmac: Add initial XDP support") Fixes: aa042f60e4961 ("net: stmmac: Add support to Ethtool get/set ring parameters") Fixes: 0366f7e06a6be ("net: stmmac: add ethtool support for get/set channels") Signed-off-by: Yannick Vignon <yannick.vignon@nxp.com> Link: https://lore.kernel.org/r/20211124154731.1676949-1-yannick.vignon@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26Merge tag 'char-misc-5.16-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fix from Greg KH: "Here is a single binder driver fix for 5.16-rc3. It resolves a problem reported in the set of binder fixes that went into 5.16-rc1. It has been in linux-next for a while with no reported problems" * tag 'char-misc-5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: binder: fix test regression due to sender_euid change
2021-11-26Merge tag 'staging-5.16-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging fixes from Greg KH: "Here are some small staging driver fixes and one driver removal for 5.16-rc3. The fixes resolve a number of small issues found in 5.16-rc1, nothing huge at all. The driver removal was due to a platform being removed in 5.16-rc1, but this driver was forgotten about. It wasn't being built anymore so it's safe to delete. All have been in linux-next for a while with no reported problems" * tag 'staging-5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: rtl8192e: Fix use after free in _rtl92e_pci_disconnect() staging: greybus: Add missing rwsem around snd_ctl_remove() calls staging: Remove Netlogic XLP network driver staging: r8188eu: fix a memory leak in rtw_wx_read32() staging: r8188eu: use GFP_ATOMIC under spinlock staging: r8188eu: Use kzalloc() with GFP_ATOMIC in atomic context staging/fbtft: Fix backlight staging: r8188eu: Fix breakage introduced when 5G code was removed
2021-11-26Merge tag 'usb-5.16-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are a number of small USB fixes for reported problems for 5.16-rc3 They include: - typec driver fixes - new usb-serial driver ids - usb hub enumeration issues that were much reported - gadget driver fixes - dwc3 driver fix - chipidea driver fixe All of these have been in linux-next with no reported problems" * tag 'usb-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: serial: option: add Fibocom FM101-GL variants usb: typec: tipd: Fix initialization sequence for cd321x usb: typec: tipd: Fix typo in cd321x_switch_power_state usb: hub: Fix locking issues with address0_mutex USB: serial: pl2303: fix GC type detection USB: serial: option: add Telit LE910S1 0x9200 composition usb: chipidea: ci_hdrc_imx: fix potential error pointer dereference in probe usb: hub: Fix usb enumeration issue due to address0 race usb: typec: fusb302: Fix masking of comparator and bc_lvl interrupts usb: dwc3: leave default DMA for PCI devices usb: dwc2: hcd_queue: Fix use of floating point literal usb: dwc3: gadget: Fix null pointer exception usb: gadget: udc-xilinx: Fix an error handling path in 'xudc_probe()' usb: xhci: tegra: Check padctrl interrupt presence in device tree usb: dwc2: gadget: Fix ISOC flow for elapsed frames usb: dwc3: gadget: Check for L1/L2/U3 for Start Transfer usb: dwc3: gadget: Ignore NoStream after End Transfer usb: dwc3: core: Revise GHWPARAMS9 offset
2021-11-26Merge tag 'mmc-v5.16-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC host fixes from Ulf Hansson: - mmc_spi: Add SPI IDs to silence warning - sdhci: Fix ADMA for PAGE_SIZE >= 64KiB - sdhci-esdhc-imx: Disable broken CMDQ for imx8qm/imx8qxp/imx8mm * tag 'mmc-v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: spi: Add device-tree SPI IDs mmc: sdhci: Fix ADMA for PAGE_SIZE >= 64KiB mmc: sdhci-esdhc-imx: disable CMDQ support
2021-11-26Merge branch 'i2c/for-current' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "I2C has an interrupt storm fix for the i801, better timeout handling for the new virtio driver, and some documentation fixes this time" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: docs: i2c: smbus-protocol: mention the repeated start condition i2c: virtio: disable timeout handling i2c: i801: Fix interrupt storm from SMB_ALERT signal i2c: i801: Restore INTREN on unload dt-bindings: i2c: imx-lpi2c: Fix i.MX 8QM compatible matching
2021-11-26Merge tag 'for-linus-5.16c-rc3-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - Kconfig fix to make it possible to control building of the privcmd driver - three fixes for issues identified by the kernel test robot - a five-patch series to simplify timeout handling for Xen PV driver initialization - two patches to fix error paths in xenstore/xenbus driver initialization * tag 'for-linus-5.16c-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: make HYPERVISOR_set_debugreg() always_inline xen: make HYPERVISOR_get_debugreg() always_inline xen: detect uninitialized xenbus in xenbus_init xen: flag xen_snd_front to be not essential for system boot xen: flag pvcalls-front to be not essential for system boot xen: flag hvc_xen to be not essential for system boot xen: flag xen_drm_front to be not essential for system boot xen: add "not_essential" flag to struct xenbus_driver xen/pvh: add missing prototype to header xen: don't continue xenstore initialization in case of errors xen/privcmd: make option visible in Kconfig
2021-11-26Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "Three arm64 fixes. The main one is a fix to the way in which we evaluate the macro arguments to our uaccess routines, which we _think_ might be the root cause behind some unkillable tasks we've seen in the Android arm64 CI farm (testing is ongoing). In any case, it's worth fixing. Other than that, we've toned down an over-zealous VM_BUG_ON() and fixed ftrace stack unwinding in a bunch of cases. Summary: - Evaluate uaccess macro arguments outside of the critical section - Tighten up VM_BUG_ON() in pmd_populate_kernel() to avoid false positive - Fix ftrace stack unwinding using HAVE_FUNCTION_GRAPH_RET_ADDR_PTR" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: uaccess: avoid blocking within critical sections arm64: mm: Fix VM_BUG_ON(mm != &init_mm) for trans_pgd arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
2021-11-26drm: msm: fix building without CONFIG_COMMON_CLKArnd Bergmann
When CONFIG_COMMON_CLOCK is disabled, the 8996 specific phy code is left out, which results in a link failure: ld: drivers/gpu/drm/msm/hdmi/hdmi_phy.o:(.rodata+0x3f0): undefined reference to `msm_hdmi_phy_8996_cfg' This was only exposed after it became possible to build test the driver without the clock interfaces. Make COMMON_CLK a hard dependency for compile testing, and simplify it a little based on that. Fixes: b3ed524f84f5 ("drm/msm: allow compile_test on !ARM") Reported-by: Randy Dunlap <rdunlap@infradead.org> Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20211013144308.2248978-1-arnd@kernel.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-11-26zram: only make zram_wb_devops for CONFIG_ZRAM_WRITEBACKJens Axboe
If writeback isn't configured, then we get the following warning when compiling zram: drivers/block/zram/zram_drv.c:1824:45: warning: unused variable 'zram_wb_devops' [-Wunused-const-variable] Make sure we only define the block_device_operations if that option is enabled. Link: https://lore.kernel.org/lkml/202111261614.gCJMqcyh-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-26block: call rq_qos_done() before ref check in batch completionsJens Axboe
We need to call rq_qos_done() regardless of whether or not we're freeing the request or not, as the reference count doesn't cover the IO completion tracking. Fixes: f794f3351f26 ("block: add support for blk_mq_end_request_batch()") Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reported-by: Kenneth R. Crudup <kenny@panix.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-26wilc1000: remove '-Wunused-but-set-variable' warning in chip_wakeup()Ajay Singh
Remove unused variables to avoid the below warnings: drivers/net/wireless/microchip/wilc1000/wlan.c: In function 'chip_wakeup': >> drivers/net/wireless/microchip/wilc1000/wlan.c:620:34: warning: variable 'to_host_from_fw_bit' set but not used [-Wunused-but-set-variable] 620 | u32 to_host_from_fw_reg, to_host_from_fw_bit; | ^~~~~~~~~~~~~~~~~~~ >> drivers/net/wireless/microchip/wilc1000/wlan.c:620:13: warning: variable 'to_host_from_fw_reg' set but not used [-Wunused-but-set-variable] 620 | u32 to_host_from_fw_reg, to_host_from_fw_bit; | ^~~~~~~~~~~~~~~~~~~ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Ajay Singh <ajay.kathat@microchip.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211115102809.1408267-1-ajay.kathat@microchip.com
2021-11-26iwlwifi: mvm: read the rfkill state and feed it to iwlmeiEmmanuel Grumbach
Read the rfkill state upon boot, mac start and mac stop. Reviewed-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211112062814.7502-6-emmanuel.grumbach@intel.com
2021-11-26iwlwifi: mvm: add vendor commands needed for iwlmeiEmmanuel Grumbach
Add the vendor commands that must be used by the network manager to allow proper operation of iwlmei. * Send information on the AP CSME is connected to * Notify the userspace when roaming is forbidden * Allow the userspace to require ownership Co-Developed-by: Ayala Beker <ayala.beker@intel.com> Signed-off-by: Ayala Beker <ayala.beker@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> v6: remove the VENDOR_CMDS Kconfig option and make the whole infra depend on IWLMEI directly v7: remove // comments remove an unneeded function Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211112062814.7502-5-emmanuel.grumbach@intel.com
2021-11-26iwlwifi: integrate with iwlmeiEmmanuel Grumbach
iwlmei needs to know about the follwing events: * Association * De-association * Country Code change * SW Rfkill change * SAR table changes iwlmei can take the device away from us, so report the new rfkill type when this happens. Advertise the required data from the CSME firmware to the usersapce: mostly, the AP that the CSME firmware is currently associated to in case there is an active link protection session. Generate the HOST_ASSOC / HOST_DISSASSOC messages. Don't support WPA1 (non-RSNA) for now. Don't support shared wep either. We can then determine the AUTH parameter by checking the AKM. Feed the cipher from the key installation. SW Rfkill will be implemented later when cfg80211 will allow us to read the SW Rfkill state. Co-Developed-by: Ayala Beker <ayala.beker@intel.com> Signed-off-by: Ayala Beker <ayala.beker@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> v7: Ayala added her signed-off remove pointless function declaration fix a bug due to merge conflict in the HOST_ASSOC message v8: leave a print if we have a SAP connection on a device we do not support (yet) Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211112062814.7502-4-emmanuel.grumbach@intel.com
2021-11-26iwlwifi: mei: add debugfs hooksEmmanuel Grumbach
Add three debugfs hooks: * status: Check if we have a connection with the CSME firwmare. This hook is a read only. * req_ownership: Send a SAP command to request ownership. This flow should be triggered by iwlwifi (from user space through vendor commands really), but being able to trigger an ownership request from debugfs allows us to request ownership without connecting afterwards. This is an "error" flow that the CSME firmware is designed to handle this way: + Grant ownership since the host asked for it + Wait 3 seconds to let the host connect + If the host didn't connect, take the device back (forcefully). + Don't grant any new ownership request in the following 30 seconds. This debugfs hook allows us to test this flow. * send_start_message: Restart the communication with the CSME firmware from the very beginning. At the very beginning (upon iwlwifi start), iwlmei send a special message: SAP_ME_MSG_START. This hook allows to send it again and this will retrigger the whole flow. It is important to test this restart in the middle of normal operation since it can happen (in case the CSME firmware decided to reset for example). Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211112062814.7502-3-emmanuel.grumbach@intel.com
2021-11-26iwlwifi: mei: add the driver to allow cooperation with CSMEEmmanuel Grumbach
CSME in two words ----------------- CSME stands for Converged Security and Management Engine. It is a CPU on the chipset and runs a dedicated firmware. AMT (Active Management Technology) is one of the applications that run on that CPU. AMT allows to control the platform remotely. Here is a partial list of the use cases: * View the screen of the plaform, with keyboard and mouse (KVM) * Attach a remote IDE device * Have a serial console to the device * Query the state of the platform * Reset / shut down / boot the platform Networking in CSME ------------------ For those uses cases, CSME's firmware has an embedded network stack and is able to use the network devices of the system: LAN and WLAN. This is thanks to the CSME's firmware WLAN driver. One can add a profile (SSID / key / certificate) to the CSME's OS and CSME will connect to that profile. Then, one can use the WLAN link to access the applications that run on CSME (AMT is one of them). Note that CSME is active during power state and power state transitions. For example, it is possible to have a KVM session open to the system while the system is rebooting and actually configure the BIOS remotely over WLAN thanks to AMT. How all this is related to Linux -------------------------------- In Linux, there is a driver that allows the OS to talk to the CSME firmware, this driver is drivers/misc/mei. This driver advertises a bus that allows other kernel drivers or even user space) to talk to components inside the CSME firmware. In practice, the system advertises a PCI device that allows to send / receive data to / from the CSME firmware. The mei bus drivers in drivers/misc/mei is an abstration on top of this PCI device. The driver being added here is called iwlmei and talks to the WLAN driver inside the CSME firmware through the mei bus driver. Note that the mei bus driver only gives bus services, it doesn't define the content of the communication. Why do we need this driver? -------------------------- CSME uses the same WLAN device that the OS is expecting to see hence we need an arbitration mechanism. This is what iwlmei is in charge of. iwlmei maintains the communication with the CSME firmware's WLAN driver. The language / protocol that is used between the CSME's firmware WLAN driver and iwlmei is OS agnostic and is called SAP which stands for Software Abritration Protocol. With SAP, iwlmei will be able to tell the CSME firmware's WLAN driver: 1) Please give me the device. 2) Please note that the SW/HW rfkill state change. 3) Please note that I am now associated to X. 4) Please note that I received this packet. etc... There are messages that go the opposite direction as well: 1) Please note that AMT is en/disable. 2) Please note that I believe the OS is broken and hence I'll take the device *now*, whether you like it or not, to make sure that connectivity is preserved. 3) Please note that I am willing to give the device if the OS needs it. 4) Please give me any packet that is sent on UDP / TCP on IP address XX.XX.XX.XX and an port ZZ. 5) Please send this packet. etc... Please check drivers/net/wireless/intel/iwlwifi/mei/sap.h for the full protocol specification. Arbitration is not the only purpose of iwlmei and SAP. SAP also allows to maintain the AMT's functionality even when the OS owns the device. To connect to AMT, one needs to initiate an HTTP connection to port 16992. iwlmei will listen to the Rx path and forward (through SAP) to the CSME firmware the data it got. Then, the embedded HTTP server in the chipset will reply to the request and send a SAP notification to ask iwlmei to send the reply. This way, AMT running on the CSME can still work. In practice this means that all the use cases quoted above (KVM, remote IDE device, etc...) will work even when the OS uses the WLAN device. How to disable all this? --------------------------- iwlmei won't be able to do anything if the CSME's networking stack is not enabled. By default, CSME's networking stack is disabled (this is a BIOS setting). In case the CSME's networking stack is disabled, iwlwifi will just get access to the device because there is no contention with any other actor and, hence, no arbitration is needed. In this patch, I only add the iwlmei driver. Integration with iwlwifi will be implemented in the next one. Co-Developed-by: Ayala Beker <ayala.beker@intel.com> Signed-off-by: Ayala Beker <ayala.beker@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> v2: fix a few warnings raised by the different bots v3: rewrite the commit message v4: put the debugfs content in a different patch v5: fix a NULL pointer dereference upon DHCP TX if SAP is connected since we now have the required cfg80211 bits in wl-drv-next, add the RFKILL handling patch to this series. v6: change the SAP API to inherit the values from iwl-mei.h removing the need to ensure the values are equal with a BUILD_BUG_ON. This was suggested by Arend v7: * fix a locking issue in case of CSME firmware reset: When the CSME firmware resets, we need to unregister the netdev, first take the mutex, and only then, rely on it being taken. * Add a comment to explain why it is ok to have static variables (iwlmei can't have more than a single instance). * Add a define for 26 + 8 + 8 * Add a define SEND_SAP_MAX_WAIT_ITERATION * make struct const * Reword a bit the Kconfig help message * Ayala added her Signed-off * fixed an RCU annotation v8: do not require ownership upfront, use NIC_OWNER instead. This fixes a deadlock when CSME does not have the right WiFi FW. Add more documentation about the owernship transition Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211112062814.7502-2-emmanuel.grumbach@intel.com
2021-11-26mei: bus: add client dma interfaceAlexander Usyskin
Expose the client dma mapping via mei client bus interface. The client dma has to be mapped before the device is enabled, therefore we need to create device linking already during mapping and we need to unmap after the client is disable hence we need to postpone the unlink and flush till unmapping or when destroying the device. Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Co-developed-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20210420172755.12178-1-emmanuel.grumbach@intel.com Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211112062814.7502-1-emmanuel.grumbach@intel.com
2021-11-26mwifiex: Ignore BTCOEX events from the 88W8897 firmwareJonas Dreßler
The firmware of the 88W8897 PCIe+USB card sends those events very unreliably, sometimes bluetooth together with 2.4ghz-wifi is used and no COEX event comes in, and sometimes bluetooth is disabled but the coexistance mode doesn't get disabled. This means we sometimes end up capping the rx/tx window size while bluetooth is not enabled anymore, artifically limiting wifi speeds even though bluetooth is not being used. Since we can't fix the firmware, let's just ignore those events on the 88W8897 device. From some Wireshark capture sessions it seems that the Windows driver also doesn't change the rx/tx window sizes when bluetooth gets enabled or disabled, so this is fairly consistent with the Windows driver. Signed-off-by: Jonas Dreßler <verdre@v0yd.nl> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211103205827.14559-1-verdre@v0yd.nl
2021-11-26mwifiex: Ensure the version string from the firmware is 0-terminatedJonas Dreßler
We assume at a few places that priv->version_str is 0-terminated, but right now we trust the firmware that this is the case with the version string we get from it. Let's rather ensure this ourselves and replace the last character with '\0'. Signed-off-by: Jonas Dreßler <verdre@v0yd.nl> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211103201800.13531-4-verdre@v0yd.nl
2021-11-26mwifiex: Add quirk to disable deep sleep with certain hardware revisionJonas Dreßler
The 88W8897 PCIe+USB card in the hardware revision 20 apparently has a hardware issue where the card wakes up from deep sleep randomly and very often, somewhat depending on the card activity, maybe the hardware has a floating wakeup pin or something. This was found by comparing two MS Surface Book 2 devices, where one devices wifi card experienced spurious wakeups, while the other one didn't. Those continuous wakeups prevent the card from entering host sleep when the computer suspends. And because the host won't answer to events from the card anymore while it's suspended, the firmwares internal power saving state machine seems to get confused and the card can't sleep anymore at all after that. Since we can't work around that hardware bug in the firmware, let's get the hardware revision string from the firmware and match it with known bad revisions. Then disable auto deep sleep for those revisions, which makes sure we no longer get those spurious wakeups. Signed-off-by: Jonas Dreßler <verdre@v0yd.nl> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211103201800.13531-3-verdre@v0yd.nl
2021-11-26mwifiex: Use a define for firmware version string lengthJonas Dreßler
Since the version string we get from the firmware is always 128 characters long, use a define for this size instead of having the number 128 copied all over the place. Signed-off-by: Jonas Dreßler <verdre@v0yd.nl> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211103201800.13531-2-verdre@v0yd.nl
2021-11-26mwifiex: Fix skb_over_panic in mwifiex_usb_recv()Zekun Shen
Currently, with an unknown recv_type, mwifiex_usb_recv just return -1 without restoring the skb. Next time mwifiex_usb_rx_complete is invoked with the same skb, calling skb_put causes skb_over_panic. The bug is triggerable with a compromised/malfunctioning usb device. After applying the patch, skb_over_panic no longer shows up with the same input. Attached is the panic report from fuzzing. skbuff: skb_over_panic: text:000000003bf1b5fa len:2048 put:4 head:00000000dd6a115b data:000000000a9445d8 tail:0x844 end:0x840 dev:<NULL> kernel BUG at net/core/skbuff.c:109! invalid opcode: 0000 [#1] SMP KASAN NOPTI CPU: 0 PID: 198 Comm: in:imklog Not tainted 5.6.0 #60 RIP: 0010:skb_panic+0x15f/0x161 Call Trace: <IRQ> ? mwifiex_usb_rx_complete+0x26b/0xfcd [mwifiex_usb] skb_put.cold+0x24/0x24 mwifiex_usb_rx_complete+0x26b/0xfcd [mwifiex_usb] __usb_hcd_giveback_urb+0x1e4/0x380 usb_giveback_urb_bh+0x241/0x4f0 ? __hrtimer_run_queues+0x316/0x740 ? __usb_hcd_giveback_urb+0x380/0x380 tasklet_action_common.isra.0+0x135/0x330 __do_softirq+0x18c/0x634 irq_exit+0x114/0x140 smp_apic_timer_interrupt+0xde/0x380 apic_timer_interrupt+0xf/0x20 </IRQ> Reported-by: Brendan Dolan-Gavitt <brendandg@nyu.edu> Signed-off-by: Zekun Shen <bruceshenzk@gmail.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/YX4CqjfRcTa6bVL+@Zekuns-MBP-16.fios-router.home
2021-11-26rtw88: add quirk to disable pci caps on HP 250 G7 Notebook PCPing-Ke Shih
8821CE causes random freezes on HP 250 G7 Notebook PC. Add a quirk to disable pci ASPM capability. Reported-by: rtl8821cerfe2 <rtl8821cerfe2@protonmail.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211119052437.8671-1-pkshih@realtek.com
2021-11-26rtw88: add debugfs to force lowest basic rateYu-Yen Ting
The management frame with high rate e.g. 24M may not be transmitted smoothly in long range environment. Add a debugfs to force to use the lowest basic rate in order to debug the reachability of transmitting management frame. obtain current setting cat /sys/kernel/debug/ieee80211/phyX/rtw88/force_lowest_basic_rate force lowest rate: echo 1 > /sys/kernel/debug/ieee80211/phyX/rtw88/force_lowest_basic_rate Signed-off-by: Yu-Yen Ting <steventing@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211102022454.10944-2-pkshih@realtek.com
2021-11-26rtw88: follow the AP basic rates for tx mgmt frameYu-Yen Ting
By default the driver uses the 1M and 6M rate for managemnt frames in 2G and 5G bands respectively. But when the basic rates is configured from the mac80211, we need to send the management frames according to the basic rates. This commit makes the driver use the lowest basic rates to send the management frames. Signed-off-by: Yu-Yen Ting <steventing@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211102022454.10944-1-pkshih@realtek.com
2021-11-26rtw89: add AXIDMA and TX FIFO dump in mac_mem_dumpChia-Yuan Li
The AXIDMA is tx/rx packet transmission between PCIE host and device, and TX FIFO is MAC TX data. We dump them to verify that these memory buffers are correct. Signed-off-by: Chia-Yuan Li <leo.li@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Reviewed-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211122021129.4339-1-pkshih@realtek.com
2021-11-26rtw89: fix potentially access out of range of RF register arrayPing-Ke Shih
The RF register array is used to help firmware to restore RF settings. The original code can potentially access out of range, if the size is between (RTW89_H2C_RF_PAGE_SIZE * RTW89_H2C_RF_PAGE_NUM + 1) to ((RTW89_H2C_RF_PAGE_SIZE + 1) * RTW89_H2C_RF_PAGE_NUM). Fortunately, current used size doesn't fall into the wrong case, and the size will not change if we don't update RF parameter. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211119055729.12826-1-pkshih@realtek.com
2021-11-26rtw89: remove unneeded variableChangcheng Deng
Fix the following coccicheck review: ./drivers/net/wireless/realtek/rtw89/mac.c: 1096: 5-8: Unneeded variable Remove unneeded variable used to store return value. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Changcheng Deng <deng.changcheng@zte.com.cn> Acked-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211110121135.151187-1-deng.changcheng@zte.com.cn
2021-11-26rtw89: remove unnecessary conditional operatorsYe Guojin
The conditional operator is unnecessary while assigning values to the bool variables. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Ye Guojin <ye.guojin@zte.com.cn> Acked-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211104061119.1685-1-ye.guojin@zte.com.cn
2021-11-26rtw89: update rtw89_regulatory map to R58-R31Zong-Zhe Yang
Start to configure entries with RTW89_QATAR, RTW89_UKRAINE, RTW89_CN. Adjust some entries with explicit rtw89_regulatory instead of RTW89_WW. Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211101093106.28848-5-pkshih@realtek.com
2021-11-26rtw89: update tx power limit/limit_ru tables to R54Zong-Zhe Yang
Update tx power limit table and tx power limit_ru table to R54. Configure entries for MEXICO, CN, QATAR, and adjust some values of original entries. Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211101093106.28848-4-pkshih@realtek.com
2021-11-26rtw89: update rtw89 regulation definition to R58-R31Zong-Zhe Yang
Support QATAR in rtw89_regulation_type and reorder the enum to align realtek R58-R31 regulation definition. Besides, if an unassigned entry of limit/limit_ru tables is read, return the corresponding WW value for the unconfigured case. Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211101093106.28848-3-pkshih@realtek.com
2021-11-26rtw89: fill regd field of limit/limit_ru tables by enumZong-Zhe Yang
This modification just replaces the number filled in the regd field with the corresponding enum. No assignment of a value in a table is changed. Doing this first is because the follow-up patches may adjust the order of enum declarations. Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Reviewed-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211101093106.28848-2-pkshih@realtek.com
2021-11-26io_uring: fix link traversal lockingPavel Begunkov
WARNING: inconsistent lock state 5.16.0-rc2-syzkaller #0 Not tainted inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage. ffff888078e11418 (&ctx->timeout_lock ){?.+.}-{2:2} , at: io_timeout_fn+0x6f/0x360 fs/io_uring.c:5943 {HARDIRQ-ON-W} state was registered at: [...] spin_unlock_irq include/linux/spinlock.h:399 [inline] __io_poll_remove_one fs/io_uring.c:5669 [inline] __io_poll_remove_one fs/io_uring.c:5654 [inline] io_poll_remove_one+0x236/0x870 fs/io_uring.c:5680 io_poll_remove_all+0x1af/0x235 fs/io_uring.c:5709 io_ring_ctx_wait_and_kill+0x1cc/0x322 fs/io_uring.c:9534 io_uring_release+0x42/0x46 fs/io_uring.c:9554 __fput+0x286/0x9f0 fs/file_table.c:280 task_work_run+0xdd/0x1a0 kernel/task_work.c:164 exit_task_work include/linux/task_work.h:32 [inline] do_exit+0xc14/0x2b40 kernel/exit.c:832 674ee8e1b4a41 ("io_uring: correct link-list traversal locking") fixed a data race but introduced a possible deadlock and inconsistentcy in irq states. E.g. io_poll_remove_all() spin_lock_irq(timeout_lock) io_poll_remove_one() spin_lock/unlock_irq(poll_lock); spin_unlock_irq(timeout_lock) Another type of problem is freeing a request while holding ->timeout_lock, which may leads to a deadlock in io_commit_cqring() -> io_flush_timeouts() and other places. Having 3 nested locks is also too ugly. Add io_match_task_safe(), which would briefly take and release timeout_lock for race prevention inside, so the actuall request cancellation / free / etc. code doesn't have it taken. Reported-by: syzbot+ff49a3059d49b0ca0eec@syzkaller.appspotmail.com Reported-by: syzbot+847f02ec20a6609a328b@syzkaller.appspotmail.com Reported-by: syzbot+3368aadcd30425ceb53b@syzkaller.appspotmail.com Reported-by: syzbot+51ce8887cdef77c9ac83@syzkaller.appspotmail.com Reported-by: syzbot+3cb756a49d2f394a9ee3@syzkaller.appspotmail.com Fixes: 674ee8e1b4a41 ("io_uring: correct link-list traversal locking") Cc: stable@kernel.org # 5.15+ Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/397f7ebf3f4171f1abe41f708ac1ecb5766f0b68.1637937097.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-26io_uring: fail cancellation for EXITING tasksPavel Begunkov
WARNING: CPU: 1 PID: 20 at fs/io_uring.c:6269 io_try_cancel_userdata+0x3c5/0x640 fs/io_uring.c:6269 CPU: 1 PID: 20 Comm: kworker/1:0 Not tainted 5.16.0-rc1-syzkaller #0 Workqueue: events io_fallback_req_func RIP: 0010:io_try_cancel_userdata+0x3c5/0x640 fs/io_uring.c:6269 Call Trace: <TASK> io_req_task_link_timeout+0x6b/0x1e0 fs/io_uring.c:6886 io_fallback_req_func+0xf9/0x1ae fs/io_uring.c:1334 process_one_work+0x9b2/0x1690 kernel/workqueue.c:2298 worker_thread+0x658/0x11f0 kernel/workqueue.c:2445 kthread+0x405/0x4f0 kernel/kthread.c:327 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 </TASK> We need original task's context to do cancellations, so if it's dying and the callback is executed in a fallback mode, fail the cancellation attempt. Fixes: 89b263f6d56e6 ("io_uring: run linked timeouts from task_work") Cc: stable@kernel.org # 5.15+ Reported-by: syzbot+ab0cfe96c2b3cd1c1153@syzkaller.appspotmail.com Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/4c41c5f379c6941ad5a07cd48cb66ed62199cf7e.1637937097.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-26btrfs: fix the memory leak caused in lzo_compress_pages()Qu Wenruo
[BUG] Fstests generic/027 is pretty easy to trigger a slow but steady memory leak if run with "-o compress=lzo" mount option. Normally one single run of generic/027 is enough to eat up at least 4G ram. [CAUSE] In commit d4088803f511 ("btrfs: subpage: make lzo_compress_pages() compatible") we changed how @page_in is released. But that refactoring makes @page_in only released after all pages being compressed. This leaves error path not releasing @page_in. And by "error path" things like incompressible data will also be treated as an error (-E2BIG). Thus it can cause a memory leak if even nothing wrong happened. [FIX] Add check under @out label to release @page_in when needed, so when we hit any error, the input page is properly released. Reported-by: Josef Bacik <josef@toxicpanda.com> Fixes: d4088803f511 ("btrfs: subpage: make lzo_compress_pages() compatible") Reviewed-and-tested-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-26KVM: X86: Use vcpu->arch.walk_mmu for kvm_mmu_invlpg()Lai Jiangshan
INVLPG operates on guest virtual address, which are represented by vcpu->arch.walk_mmu. In nested virtualization scenarios, kvm_mmu_invlpg() was using the wrong MMU structure; if L2's invlpg were emulated by L0 (in practice, it hardly happen) when nested two-dimensional paging is enabled, the call to ->tlb_flush_gva() would be skipped and the hardware TLB entry would not be invalidated. Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Message-Id: <20211124122055.64424-5-jiangshanlai@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-26KVM: X86: Fix when shadow_root_level=5 && guest root_level<4Lai Jiangshan
If the is an L1 with nNPT in 32bit, the shadow walk starts with pae_root. Fixes: a717a780fc4e ("KVM: x86/mmu: Support shadowing NPT when 5-level paging is enabled in host) Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Message-Id: <20211124122055.64424-2-jiangshanlai@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-26KVM: selftests: Make sure kvm_create_max_vcpus test won't hit RLIMIT_NOFILEVitaly Kuznetsov
With the elevated 'KVM_CAP_MAX_VCPUS' value kvm_create_max_vcpus test may hit RLIMIT_NOFILE limits: # ./kvm_create_max_vcpus KVM_CAP_MAX_VCPU_ID: 4096 KVM_CAP_MAX_VCPUS: 1024 Testing creating 1024 vCPUs, with IDs 0...1023. /dev/kvm not available (errno: 24), skipping test Adjust RLIMIT_NOFILE limits to make sure KVM_CAP_MAX_VCPUS fds can be opened. Note, raising hard limit ('rlim_max') requires CAP_SYS_RESOURCE capability which is generally not needed to run kvm selftests (but without raising the limit the test is doomed to fail anyway). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20211123135953.667434-1-vkuznets@redhat.com> [Skip the test if the hard limit can be raised. - Paolo] Reviewed-by: Sean Christopherson <seanjc@google.com> Tested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-26KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUNVitaly Kuznetsov
Commit 63f5a1909f9e ("KVM: x86: Alert userspace that KVM_SET_CPUID{,2} after KVM_RUN is broken") officially deprecated KVM_SET_CPUID{,2} ioctls after first successful KVM_RUN and promissed to make this sequence forbiden in 5.16. It's time to fulfil the promise. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20211122175818.608220-3-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-26KVM: selftests: Avoid KVM_SET_CPUID2 after KVM_RUN in hyperv_features testVitaly Kuznetsov
hyperv_features's sole purpose is to test access to various Hyper-V MSRs and hypercalls with different CPUID data. As KVM_SET_CPUID2 after KVM_RUN is deprecated and soon-to-be forbidden, avoid it by re-creating test VM for each sub-test. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20211122175818.608220-2-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>