summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-04-08RDMA/addr: Be strict with gid sizeLeon Romanovsky
The nla_len() is less than or equal to 16. If it's less than 16 then end of the "gid" buffer is uninitialized. Fixes: ae43f8286730 ("IB/core: Add IP to GID netlink offload") Link: https://lore.kernel.org/r/20210405074434.264221-1-leon@kernel.org Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08Merge tag 's390-5.12-6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Heiko Carstens: - fix incorrect dereference of the ext_params2 external interrupt parameter, which leads to an instant kernel crash if a pfault interrupt occurs. - add forgotten stack unwinder support, and fix memory leak for the new machine check handler stack. - fix inline assembly register clobbering due to KASAN code instrumentation. * tag 's390-5.12-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/setup: use memblock_free_late() to free old stack s390/irq: fix reading of ext_params2 field from lowcore s390/unwind: add machine check handler stack s390/cpcmd: fix inline assembly register clobbering
2021-04-08arm64: Get rid of CONFIG_ARM64_VHEMarc Zyngier
CONFIG_ARM64_VHE was introduced with ARMv8.1 (some 7 years ago), and has been enabled by default for almost all that time. Given that newer systems that are VHE capable are finally becoming available, and that some systems are even incapable of not running VHE, drop the configuration altogether. Anyone willing to stick to non-VHE on VHE hardware for obscure reasons should use the 'kvm-arm.mode=nvhe' command-line option. Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210408131010.1109027-4-maz@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-04-08arm64: Cope with CPUs stuck in VHE modeMarc Zyngier
It seems that the CPUs part of the SoC known as Apple M1 have the terrible habit of being stuck with HCR_EL2.E2H==1, in violation of the architecture. Try and work around this deplorable state of affairs by detecting the stuck bit early and short-circuit the nVHE dance. Additional filtering code ensures that attempts at switching to nVHE from the command-line are also ignored. It is still unknown whether there are many more such nuggets to be found... Reported-by: Hector Martin <marcan@marcan.st> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210408131010.1109027-3-maz@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-04-08arm64: cpufeature: Allow early filtering of feature overrideMarc Zyngier
Some CPUs are broken enough that some overrides need to be rejected at the earliest opportunity. In some cases, that's right at cpu feature override time. Provide the necessary infrastructure to filter out overrides, and to report such filtered out overrides to the core cpufeature code. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210408131010.1109027-2-maz@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-04-08arm64: Require that system registers at all visible ELs be initializedMark Brown
Currently we require that software at a higher exception level initialise all registers at the exception level the kernel will be entered prior to starting the kernel in order to ensure that there is nothing uninitialised which could result in an UNKNOWN state while running the kernel. The expectation is that the software running at the highest exception levels will be tightly coupled to the system and can ensure that all available features are appropriately initialised and that the kernel can initialise anything else. There is a gap here in the case where new registers are added to lower exception levels that require initialisation but the kernel does not yet understand them. Extend the requirement to also include exception levels below the one where the kernel is entered to cover this. Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210401180942.35815-4-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-04-08arm64: Disable fine grained traps on bootMark Brown
The arm64 FEAT_FGT extension introduces a set of traps to EL2 for accesses to small sets of registers and instructions from EL1 and EL0. Currently Linux makes no use of this feature, ensure that it is not active at boot by disabling the traps during EL2 setup. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210401180942.35815-3-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-04-08arm64: Document requirements for fine grained traps at bootMark Brown
The arm64 FEAT_FGT extension introduces a set of traps to EL2 for accesses to small sets of registers and instructions from EL1 and EL0, access to which is controlled by EL3. Require access to it so that it is available to us in future and so that we can ensure these traps are disabled during boot. Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210401180942.35815-2-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-04-08ice: fix memory leak of aRFS after resuming from suspendYongxin Liu
In ice_suspend(), ice_clear_interrupt_scheme() is called, and then irq_free_descs() will be eventually called to free irq and its descriptor. In ice_resume(), ice_init_interrupt_scheme() is called to allocate new irqs. However, in ice_rebuild_arfs(), struct irq_glue and struct cpu_rmap maybe cannot be freed, if the irqs that released in ice_suspend() were reassigned to other devices, which makes irq descriptor's affinity_notify lost. So call ice_free_cpu_rx_rmap() before ice_clear_interrupt_scheme(), which can make sure all irq_glue and cpu_rmap can be correctly released before corresponding irq and descriptor are released. Fix the following memory leak. unreferenced object 0xffff95bd951afc00 (size 512): comm "kworker/0:1", pid 134, jiffies 4294684283 (age 13051.958s) hex dump (first 32 bytes): 18 00 00 00 18 00 18 00 70 fc 1a 95 bd 95 ff ff ........p....... 00 00 ff ff 01 00 ff ff 02 00 ff ff 03 00 ff ff ................ backtrace: [<0000000072e4b914>] __kmalloc+0x336/0x540 [<0000000054642a87>] alloc_cpu_rmap+0x3b/0xb0 [<00000000f220deec>] ice_set_cpu_rx_rmap+0x6a/0x110 [ice] [<000000002370a632>] ice_probe+0x941/0x1180 [ice] [<00000000d692edba>] local_pci_probe+0x47/0xa0 [<00000000503934f0>] work_for_cpu_fn+0x1a/0x30 [<00000000555a9e4a>] process_one_work+0x1dd/0x410 [<000000002c4b414a>] worker_thread+0x221/0x3f0 [<00000000bb2b556b>] kthread+0x14c/0x170 [<00000000ad2cf1cd>] ret_from_fork+0x1f/0x30 unreferenced object 0xffff95bd81b0a2a0 (size 96): comm "kworker/0:1", pid 134, jiffies 4294684283 (age 13051.958s) hex dump (first 32 bytes): 38 00 00 00 01 00 00 00 e0 ff ff ff 0f 00 00 00 8............... b0 a2 b0 81 bd 95 ff ff b0 a2 b0 81 bd 95 ff ff ................ backtrace: [<00000000582dd5c5>] kmem_cache_alloc_trace+0x31f/0x4c0 [<000000002659850d>] irq_cpu_rmap_add+0x25/0xe0 [<00000000495a3055>] ice_set_cpu_rx_rmap+0xb4/0x110 [ice] [<000000002370a632>] ice_probe+0x941/0x1180 [ice] [<00000000d692edba>] local_pci_probe+0x47/0xa0 [<00000000503934f0>] work_for_cpu_fn+0x1a/0x30 [<00000000555a9e4a>] process_one_work+0x1dd/0x410 [<000000002c4b414a>] worker_thread+0x221/0x3f0 [<00000000bb2b556b>] kthread+0x14c/0x170 [<00000000ad2cf1cd>] ret_from_fork+0x1f/0x30 Fixes: 769c500dcc1e ("ice: Add advanced power mgmt for WoL") Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-08i40e: Fix sparse warning: missing error code 'err'Arkadiusz Kubalewski
Set proper return values inside error checking if-statements. Previously following warning was produced when compiling against sparse. i40e_main.c:15162 i40e_init_recovery_mode() warn: missing error code 'err' Fixes: 4ff0ee1af0169 ("i40e: Introduce recovery mode support") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-08i40e: Fix sparse error: 'vsi->netdev' could be nullArkadiusz Kubalewski
Remove vsi->netdev->name from the trace. This is redundant information. With the devinfo trace, the adapter is already identifiable. Previously following error was produced when compiling against sparse. i40e_main.c:2571 i40e_sync_vsi_filters() error: we previously assumed 'vsi->netdev' could be null (see line 2323) Fixes: b603f9dc20af ("i40e: Log info when PF is entering and leaving Allmulti mode.") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-08i40e: Fix sparse error: uninitialized symbol 'ring'Arkadiusz Kubalewski
Init pointer with NULL in default switch case statement. Previously the error was produced when compiling against sparse. i40e_debugfs.c:582 i40e_dbg_dump_desc() error: uninitialized symbol 'ring'. Fixes: 44ea803e2fa7 ("i40e: introduce new dump desc XDP command") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-08i40e: Fix sparse errors in i40e_txrx.cArkadiusz Kubalewski
Remove error handling through pointers. Instead use plain int to return value from i40e_run_xdp(...). Previously: - sparse errors were produced during compilation: i40e_txrx.c:2338 i40e_run_xdp() error: (-2147483647) too low for ERR_PTR i40e_txrx.c:2558 i40e_clean_rx_irq() error: 'skb' dereferencing possible ERR_PTR() - sk_buff* was used to return value, but it has never had valid pointer to sk_buff. Returned value was always int handled as a pointer. Fixes: 0c8493d90b6b ("i40e: add XDP support for pass and drop actions") Fixes: 2e6893123830 ("i40e: split XDP_TX tail and XDP_REDIRECT map flushing") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-08i40e: Fix parameters in aq_get_phy_register()Grzegorz Siwik
Change parameters order in aq_get_phy_register() due to wrong statistics in PHY reported by ethtool. Previously all PHY statistics were exactly the same for all interfaces Now statistics are reported correctly - different for different interfaces Fixes: 0514db37dd78 ("i40e: Extend PHY access with page change flag") Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-08arm64: mte: Remove unused mte_assign_mem_tag_range()Vincenzo Frascino
mte_assign_mem_tag_range() was added in commit 85f49cae4dfc ("arm64: mte: add in-kernel MTE helpers") in 5.11 but moved out of mte.S by commit 2cb34276427a ("arm64: kasan: simplify and inline MTE functions") in 5.12 and renamed to mte_set_mem_tag_range(). 2cb34276427a did not delete the old function prototypes in mte.h. Remove the unused prototype from mte.h. Cc: Will Deacon <will@kernel.org> Reported-by: Derrick McKee <derrick.mckee@gmail.com> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Link: https://lore.kernel.org/r/20210407133817.23053-1-vincenzo.frascino@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-04-08arm64: Add __init section marker to some functionsJisheng Zhang
They are not needed after booting, so mark them as __init to move them to the .init section. Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20210330135449.4dcffd7f@xhacker.debian Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-04-08arm64/sve: Rework SVE access trap to convert state in registersMark Brown
When we enable SVE usage in userspace after taking a SVE access trap we need to ensure that the portions of the register state that are not shared with the FPSIMD registers are zeroed. Currently we do this by forcing the FPSIMD registers to be saved to the task struct and converting them there. This is wasteful in the common case where the task state is loaded into the registers and we will immediately return to userspace since we can initialise the SVE state directly in registers instead of accessing multiple copies of the register state in memory. Instead in that common case do the conversion in the registers and update the task metadata so that we can return to userspace without spilling the register state to memory unless there is some other reason to do so. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20210312190313.24598-1-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-04-08Merge tag 'sound-5.12-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "This batch became unexpectedly bigger due to the pending ASoC patches, but all look small and fine device-specific fixes. Many of the commits are for ASoC Intel drivers, while the rest are for ASoC small codec/platform fixes and HD-audio quirks" * tag 'sound-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (21 commits) ALSA: hda/realtek: Fix speaker amp setup on Acer Aspire E1 ALSA: aloop: Fix initialization of controls ALSA: hda/conexant: Apply quirk for another HP ZBook G5 model ASoC: fsl_esai: Fix TDM slot setup for I2S mode ASoC: codecs: lpass-rx-macro: set npl clock rate correctly ASoC: codecs: lpass-tx-macro: set npl clock rate correctly ASoC: sunxi: sun4i-codec: fill ASoC card owner ASoC: cygnus: fix for_each_child.cocci warnings ASoC: max98373: Added 30ms turn on/off time delay ASoC: max98373: Changed amp shutdown register as volatile ASoC: intel: atom: Remove 44100 sample-rate from the media and deep-buffer DAI descriptions ASoC: intel: atom: Stop advertising non working S24LE support ASoC: wm8960: Fix wrong bclk and lrclk with pll enabled for some chips ASoC: SOF: Intel: move ELH chip info ASoC: SOF: Intel: APL: set shutdown callback to hda_dsp_shutdown ASoC: SOF: Intel: CNL: set shutdown callback to hda_dsp_shutdown ASoC: SOF: Intel: ICL: set shutdown callback to hda_dsp_shutdown ASoC: SOF: Intel: TGL: set shutdown callback to hda_dsp_shutdown ASoC: SOF: Intel: TGL: fix EHL ops ASoC: SOF: core: harden shutdown helper ...
2021-04-08Merge tag 'omap-for-v5.12/fixes-rc6-signed' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes Fixes for omaps for v5.12-rc cycle Fix swapped mmc device order also for omap3 that got changed with the recent PROBE_PREFER_ASYNCHRONOUS changes. While eventually the aliases should be board specific, all the mmc device instances are all there in the SoC, and we do probe them by default so that PM runtime can idle the devices if left enabled from the bootloader. Also included are two compiler warning fixes. * tag 'omap-for-v5.12/fixes-rc6-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: OMAP2+: Fix uninitialized sr_inst ARM: dts: Fix swapped mmc order for omap3 ARM: OMAP2+: Fix warning for omap_init_time_of() ARM: OMAP4: PM: update ROM return address for OSWR and OFF ARM: OMAP4: Fix PMIC voltage domains for bionic ARM: dts: Fix moving mmc devices with aliases for omap4 & 5 ARM: dts: Drop duplicate sha2md5_fck to fix clk_disable race Link: https://lore.kernel.org/r/pull-1617702755-711306@atomide.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-04-08Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fix from Paolo Bonzini: "A lone x86 patch, for a bug found while developing a backport to stable versions" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86/mmu: preserve pending TLB flush across calls to kvm_tdp_mmu_zap_sp
2021-04-08Merge tag 'for-linus-2021-04-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull close_range() fix from Christian Brauner: "Syzbot reported a bug in close_range. Debugging this showed we didn't recalculate the current maximum fd number for CLOSE_RANGE_UNSHARE | CLOSE_RANGE_CLOEXEC after we unshared the file descriptors table. As a result, max_fd could exceed the current fdtable maximum causing us to set excessive bits. As a concrete example, let's say the user requested everything from fd 4 to ~0UL to be closed and their current fdtable size is 256 with their highest open fd being 4. With CLOSE_RANGE_UNSHARE the caller will end up with a new fdtable which has room for 64 file descriptors since that is the lowest fdtable size we accept. But now max_fd will still point to 255 and needs to be adjusted. Fix this by retrieving the correct maximum fd value in __range_cloexec(). I've carried this fix for a little while but since there was no linux-next release over easter I waited until now. With this change close_range() can be further simplified but imho we are in no hurry to do that and so I'll defer this for the 5.13 merge window" * tag 'for-linus-2021-04-08' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: file: fix close_range() for unshare+cloexec
2021-04-08Merge tag 'qcom-drivers-fixes-for-5.12' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes Qualcomm fix for 5.12 This bypasses the, recently introduced, interconnect handling in the GENI (serial engine) driver when running off ACPI, as this causes the GENI probe to fail and the Lenovo Yoga C630 to boot without keyboard and touchpad. * tag 'qcom-drivers-fixes-for-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: soc: qcom: geni: shield geni_icc_get() for ACPI boot Link: https://lore.kernel.org/r/20210404155604.712236-1-bjorn.andersson@linaro.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-04-08Merge tag 'sunxi-fixes-for-5.12-1' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into arm/fixes One 32kHz clock fix for the beelink gs1, a CD polarity fix for the SoPine, some MAINTAINERS maintainance, and a clk / reset switch to our headers. * tag 'sunxi-fixes-for-5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux: arm64: dts: allwinner: h6: beelink-gs1: Remove ext. 32 kHz osc reference MAINTAINERS: Match on allwinner keyword MAINTAINERS: Add our new mailing-list arm64: dts: allwinner: Fix SD card CD GPIO for SOPine systems arm64: dts: allwinner: h6: Switch to macros for RSB clock/reset indices Link: https://lore.kernel.org/r/9972a85e-60b7-49f4-a246-db3396dd4764.lettre@localhost Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-04-08Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull umount fix from Al Viro: "Brown paperbag time: dumb braino in the series that went into 5.7 broke the 'don't step into ->d_weak_revalidate() when umount(2) looks the victim up' behaviour. Spotted only now - saw if (!err && unlikely(nd->flags & LOOKUP_MOUNTPOINT)) { err = handle_lookup_down(nd); nd->flags &= ~LOOKUP_JUMPED; // no d_weak_revalidate(), please... } and went "why do we clear that flag here - nothing below that point is going to check it anyway" / "wait a minute, what is it doing *after* complete_walk() (which is where we check that flag and call ->d_weak_revalidate())" / "how could that possibly _not_ break?", followed by reproducing the breakage and verifying that the obvious fix of that braino does, indeed, fix it. The reproducer is (assuming that $DIR exists and is exported r/w to localhost) mkdir $DIR/a mkdir /tmp/foo mount --bind /tmp/foo /tmp/foo mkdir /tmp/foo/a mkdir /tmp/foo/b mount -t nfs4 localhost:$DIR/a /tmp/foo/a mount -t nfs4 localhost:$DIR /tmp/foo/b rmdir /tmp/foo/b/a umount /tmp/foo/b umount /tmp/foo/a umount -l /tmp/foo # will get everything under /tmp/foo, no matter what Correct behaviour is successful umount; broken kernels (5.7-rc1 and later) get umount.nfs4: /tmp/foo/a: Stale file handle Note that bind mount is there to be able to recover - on broken kernels we'd get stuck with impossible-to-umount filesystem if not for that. FWIW, that braino had been posted for review back then, at least twice. Unfortunately, the call of complete_walk() was outside of diff context, so the bogosity hadn't been immediately obvious from the patch alone ;-/" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: LOOKUP_MOUNTPOINT: we are cleaning "jumped" flag too late
2021-04-08x86/sgx: Do not update sgx_nr_free_pages in sgx_setup_epc_section()Jarkko Sakkinen
The commit in Fixes: changed the SGX EPC page sanitization to end up in sgx_free_epc_page() which puts clean and sanitized pages on the free list. This was done for the reason that it is best to keep the logic to assign available-for-use EPC pages to the correct NUMA lists in a single location. sgx_nr_free_pages is also incremented by sgx_free_epc_pages() but those pages which are being added there per EPC section do not belong to the free list yet because they haven't been sanitized yet - they land on the dirty list first and the sanitization happens later when ksgxd starts massaging them. So remove that addition there and have sgx_free_epc_page() do that solely. [ bp: Sanitize commit message too. ] Fixes: 51ab30eb2ad4 ("x86/sgx: Replace section->init_laundry_list with sgx_dirty_page_list") Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210408092924.7032-1-jarkko@kernel.org
2021-04-08nl80211: fix beacon head validationJohannes Berg
If the beacon head attribute (NL80211_ATTR_BEACON_HEAD) is too short to even contain the frame control field, we access uninitialized data beyond the buffer. Fix this by checking the minimal required size first. We used to do this until S1G support was added, where the fixed data portion has a different size. Reported-and-tested-by: syzbot+72b99dcf4607e8c770f3@syzkaller.appspotmail.com Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Fixes: 1d47f1198d58 ("nl80211: correctly validate S1G beacon head") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20210408154518.d9b06d39b4ee.Iff908997b2a4067e8d456b3cb96cab9771d252b8@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-04-08dt-bindings: timer: nuvoton,npcm7xx: Add wpcm450-timerJonathan Neuschäfer
Add a compatible string for WPCM450, which has essentially the same timer controller. Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210320181610.680870-6-j.neuschaefer@gmx.net
2021-04-08clocksource/drivers/arm_arch_timer: Add __ro_after_init and __initJisheng Zhang
Some functions are not needed after booting, so mark them as __init to move them to the .init section. Some global variables are never modified after init, so can be __ro_after_init. Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210330140444.4fb2a7cb@xhacker.debian
2021-04-08clocksource/drivers/timer-ti-dm: Handle dra7 timer wrap errata i940Tony Lindgren
There is a timer wrap issue on dra7 for the ARM architected timer. In a typical clock configuration the timer fails to wrap after 388 days. To work around the issue, we need to use timer-ti-dm percpu timers instead. Let's configure dmtimer3 and 4 as percpu timers by default, and warn about the issue if the dtb is not configured properly. Let's do this as a single patch so it can be backported to v5.8 and later kernels easily. Note that this patch depends on earlier timer-ti-dm systimer posted mode fixes, and a preparatory clockevent patch "clocksource/drivers/timer-ti-dm: Prepare to handle dra7 timer wrap issue". For more information, please see the errata for "AM572x Sitara Processors Silicon Revisions 1.1, 2.0": https://www.ti.com/lit/er/sprz429m/sprz429m.pdf The concept is based on earlier reference patches done by Tero Kristo and Keerthy. Cc: Keerthy <j-keerthy@ti.com> Cc: Tero Kristo <kristo@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210323074326.28302-3-tony@atomide.com
2021-04-08bpf, x86: Validate computation of branch displacements for x86-32Piotr Krysiuk
The branch displacement logic in the BPF JIT compilers for x86 assumes that, for any generated branch instruction, the distance cannot increase between optimization passes. But this assumption can be violated due to how the distances are computed. Specifically, whenever a backward branch is processed in do_jit(), the distance is computed by subtracting the positions in the machine code from different optimization passes. This is because part of addrs[] is already updated for the current optimization pass, before the branch instruction is visited. And so the optimizer can expand blocks of machine code in some cases. This can confuse the optimizer logic, where it assumes that a fixed point has been reached for all machine code blocks once the total program size stops changing. And then the JIT compiler can output abnormal machine code containing incorrect branch displacements. To mitigate this issue, we assert that a fixed point is reached while populating the output image. This rejects any problematic programs. The issue affects both x86-32 and x86-64. We mitigate separately to ease backporting. Signed-off-by: Piotr Krysiuk <piotras@gmail.com> Reviewed-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2021-04-08bpf, x86: Validate computation of branch displacements for x86-64Piotr Krysiuk
The branch displacement logic in the BPF JIT compilers for x86 assumes that, for any generated branch instruction, the distance cannot increase between optimization passes. But this assumption can be violated due to how the distances are computed. Specifically, whenever a backward branch is processed in do_jit(), the distance is computed by subtracting the positions in the machine code from different optimization passes. This is because part of addrs[] is already updated for the current optimization pass, before the branch instruction is visited. And so the optimizer can expand blocks of machine code in some cases. This can confuse the optimizer logic, where it assumes that a fixed point has been reached for all machine code blocks once the total program size stops changing. And then the JIT compiler can output abnormal machine code containing incorrect branch displacements. To mitigate this issue, we assert that a fixed point is reached while populating the output image. This rejects any problematic programs. The issue affects both x86-32 and x86-64. We mitigate separately to ease backporting. Signed-off-by: Piotr Krysiuk <piotras@gmail.com> Reviewed-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2021-04-08platform/surface: aggregator: move to use request_irq by IRQF_NO_AUTOEN flagTian Tao
disable_irq() after request_irq() still has a time gap in which interrupts can come. request_irq() with IRQF_NO_AUTOEN flag will disable IRQ auto-enable because of requesting. this patch is made base on "add IRQF_NO_AUTOEN for request_irq" which is being merged: https://lore.kernel.org/patchwork/patch/1388765/ Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Reviewed-by: Maximilian Luz <luzmaximilian@gmail.com> Link: https://lore.kernel.org/r/1617778852-26492-1-git-send-email-tiantao6@hisilicon.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-04-08platform/mellanox: mlxreg-hotplug: move to use request_irq by IRQF_NO_AUTOEN ↵Tian Tao
flag disable_irq() after request_irq() still has a time gap in which interrupts can come. request_irq() with IRQF_NO_AUTOEN flag will disable IRQ auto-enable because of requesting. this patch is made base on "add IRQF_NO_AUTOEN for request_irq" which is being merged: https://lore.kernel.org/patchwork/patch/1388765/ Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Acked-by: Vadim Pasternak <vadimp@nvidia.com> Link: https://lore.kernel.org/r/1617785983-28878-1-git-send-email-tiantao6@hisilicon.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-04-08Merge tag 'irq-no-autoen-2021-03-25' into review-hansHans de Goede
Tag for the input subsystem to pick up
2021-04-08clocksource/drivers/timer-ti-dm: Prepare to handle dra7 timer wrap issueTony Lindgren
There is a timer wrap issue on dra7 for the ARM architected timer. In a typical clock configuration the timer fails to wrap after 388 days. To work around the issue, we need to use timer-ti-dm timers instead. Let's prepare for adding support for percpu timers by adding a common dmtimer_clkevt_init_common() and call it from dmtimer_clockevent_init(). This patch makes no intentional functional changes. Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210323074326.28302-2-tony@atomide.com
2021-04-08bus: mhi: pci_generic: Constify mhi_controller_config struct definitionsManivannan Sadhasivam
"mhi_controller_config" struct is not modified inside "mhi_pci_dev_info" struct. So constify the instances. Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2021-04-08bus: mhi: pci_generic: Introduce Foxconn T99W175 supportJarvis Jiang
Add support for T99W175 modems, this modem series is based on SDX55 qcom chip. The modem is mainly based on MBIM protocol for both the data and control path. This patch adds support for below modems: - T99W175(based on sdx55), Both for eSIM and Non-eSIM - DW5930e(based on sdx55), With eSIM, It's also T99W175 - DW5930e(based on sdx55), Non-eSIM, It's also T99W175 This patch was tested with Ubuntu 20.04 X86_64 PC as host Signed-off-by: Jarvis Jiang <jarvis.w.jiang@gmail.com> Reviewed-by: Loic Poulain <loic.poulain@linaro.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20210408095524.3559-1-jarvis.w.jiang@gmail.com Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2021-04-08drm/vc4: crtc: Reduce PV fifo threshold on hvs4Dom Cobley
Experimentally have found PV on hvs4 reports fifo full error with expected settings and does not with one less This appears as: [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:82:crtc-3] flip_done timed out with bit 10 of PV_STAT set "HVS driving pixels when the PV FIFO is full" Fixes: c8b75bca92cb ("drm/vc4: Add KMS support for Raspberry Pi.") Signed-off-by: Dom Cobley <popcornmix@gmail.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20210318161328.1471556-3-maxime@cerno.tech
2021-04-08drm/vc4: plane: Remove redundant assignmentMaxime Ripard
The vc4_plane_atomic_async_update function assigns twice in a row the src_h field in the drm_plane_state structure to the same value. Remove the second one. Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20210318161328.1471556-2-maxime@cerno.tech
2021-04-08nl80211: fix potential leak of ACL paramsJohannes Berg
In case nl80211_parse_unsol_bcast_probe_resp() results in an error, need to "goto out" instead of just returning to free possibly allocated data. Fixes: 7443dcd1f171 ("nl80211: Unsolicited broadcast probe response support") Link: https://lore.kernel.org/r/20210408142833.d8bc2e2e454a.If290b1ba85789726a671ff0b237726d4851b5b0f@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-04-08cfg80211: check S1G beacon compat element lengthJohannes Berg
We need to check the length of this element so that we don't access data beyond its end. Fix that. Fixes: 9eaffe5078ca ("cfg80211: convert S1G beacon to scan results") Link: https://lore.kernel.org/r/20210408142826.f6f4525012de.I9fdeff0afdc683a6024e5ea49d2daa3cd2459d11@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-04-08arm64: entry: Enable random_kstack_offset supportKees Cook
Allow for a randomized stack offset on a per-syscall basis, with roughly 5 bits of entropy. (And include AAPCS rationale AAPCS thanks to Mark Rutland.) In order to avoid unconditional stack canaries on syscall entry (due to the use of alloca()), also disable stack protector to avoid triggering needless checks and slowing down the entry path. As there is no general way to control stack protector coverage with a function attribute[1], this must be disabled at the compilation unit level. This isn't a problem here, though, since stack protector was not triggered before: examining the resulting syscall.o, there are no changes in canary coverage (none before, none now). [1] a working __attribute__((no_stack_protector)) has been added to GCC and Clang but has not been released in any version yet: https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=346b302d09c1e6db56d9fe69048acb32fbb97845 https://reviews.llvm.org/rG4fbf84c1732fca596ad1d6e96015e19760eb8a9b Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210401232347.2791257-6-keescook@chromium.org
2021-04-08lkdtm: Add REPORT_STACK for checking stack offsetsKees Cook
For validating the stack offset behavior, report the offset from a given process's first seen stack address. Add s script to calculate the results to the LKDTM kselftests. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210401232347.2791257-7-keescook@chromium.org
2021-04-08x86/entry: Enable random_kstack_offset supportKees Cook
Allow for a randomized stack offset on a per-syscall basis, with roughly 5-6 bits of entropy, depending on compiler and word size. Since the method of offsetting uses macros, this cannot live in the common entry code (the stack offset needs to be retained for the life of the syscall, which means it needs to happen at the actual entry point). Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210401232347.2791257-5-keescook@chromium.org
2021-04-08stack: Optionally randomize kernel stack offset each syscallKees Cook
This provides the ability for architectures to enable kernel stack base address offset randomization. This feature is controlled by the boot param "randomize_kstack_offset=on/off", with its default value set by CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT. This feature is based on the original idea from the last public release of PaX's RANDKSTACK feature: https://pax.grsecurity.net/docs/randkstack.txt All the credit for the original idea goes to the PaX team. Note that the design and implementation of this upstream randomize_kstack_offset feature differs greatly from the RANDKSTACK feature (see below). Reasoning for the feature: This feature aims to make harder the various stack-based attacks that rely on deterministic stack structure. We have had many such attacks in past (just to name few): https://jon.oberheide.org/files/infiltrate12-thestackisback.pdf https://jon.oberheide.org/files/stackjacking-infiltrate11.pdf https://googleprojectzero.blogspot.com/2016/06/exploiting-recursion-in-linux-kernel_20.html As Linux kernel stack protections have been constantly improving (vmap-based stack allocation with guard pages, removal of thread_info, STACKLEAK), attackers have had to find new ways for their exploits to work. They have done so, continuing to rely on the kernel's stack determinism, in situations where VMAP_STACK and THREAD_INFO_IN_TASK_STRUCT were not relevant. For example, the following recent attacks would have been hampered if the stack offset was non-deterministic between syscalls: https://repositorio-aberto.up.pt/bitstream/10216/125357/2/374717.pdf (page 70: targeting the pt_regs copy with linear stack overflow) https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html (leaked stack address from one syscall as a target during next syscall) The main idea is that since the stack offset is randomized on each system call, it is harder for an attack to reliably land in any particular place on the thread stack, even with address exposures, as the stack base will change on the next syscall. Also, since randomization is performed after placing pt_regs, the ptrace-based approach[1] to discover the randomized offset during a long-running syscall should not be possible. Design description: During most of the kernel's execution, it runs on the "thread stack", which is pretty deterministic in its structure: it is fixed in size, and on every entry from userspace to kernel on a syscall the thread stack starts construction from an address fetched from the per-cpu cpu_current_top_of_stack variable. The first element to be pushed to the thread stack is the pt_regs struct that stores all required CPU registers and syscall parameters. Finally the specific syscall function is called, with the stack being used as the kernel executes the resulting request. The goal of randomize_kstack_offset feature is to add a random offset after the pt_regs has been pushed to the stack and before the rest of the thread stack is used during the syscall processing, and to change it every time a process issues a syscall. The source of randomness is currently architecture-defined (but x86 is using the low byte of rdtsc()). Future improvements for different entropy sources is possible, but out of scope for this patch. Further more, to add more unpredictability, new offsets are chosen at the end of syscalls (the timing of which should be less easy to measure from userspace than at syscall entry time), and stored in a per-CPU variable, so that the life of the value does not stay explicitly tied to a single task. As suggested by Andy Lutomirski, the offset is added using alloca() and an empty asm() statement with an output constraint, since it avoids changes to assembly syscall entry code, to the unwinder, and provides correct stack alignment as defined by the compiler. In order to make this available by default with zero performance impact for those that don't want it, it is boot-time selectable with static branches. This way, if the overhead is not wanted, it can just be left turned off with no performance impact. The generated assembly for x86_64 with GCC looks like this: ... ffffffff81003977: 65 8b 05 02 ea 00 7f mov %gs:0x7f00ea02(%rip),%eax # 12380 <kstack_offset> ffffffff8100397e: 25 ff 03 00 00 and $0x3ff,%eax ffffffff81003983: 48 83 c0 0f add $0xf,%rax ffffffff81003987: 25 f8 07 00 00 and $0x7f8,%eax ffffffff8100398c: 48 29 c4 sub %rax,%rsp ffffffff8100398f: 48 8d 44 24 0f lea 0xf(%rsp),%rax ffffffff81003994: 48 83 e0 f0 and $0xfffffffffffffff0,%rax ... As a result of the above stack alignment, this patch introduces about 5 bits of randomness after pt_regs is spilled to the thread stack on x86_64, and 6 bits on x86_32 (since its has 1 fewer bit required for stack alignment). The amount of entropy could be adjusted based on how much of the stack space we wish to trade for security. My measure of syscall performance overhead (on x86_64): lmbench: /usr/lib/lmbench/bin/x86_64-linux-gnu/lat_syscall -N 10000 null randomize_kstack_offset=y Simple syscall: 0.7082 microseconds randomize_kstack_offset=n Simple syscall: 0.7016 microseconds So, roughly 0.9% overhead growth for a no-op syscall, which is very manageable. And for people that don't want this, it's off by default. There are two gotchas with using the alloca() trick. First, compilers that have Stack Clash protection (-fstack-clash-protection) enabled by default (e.g. Ubuntu[3]) add pagesize stack probes to any dynamic stack allocations. While the randomization offset is always less than a page, the resulting assembly would still contain (unreachable!) probing routines, bloating the resulting assembly. To avoid this, -fno-stack-clash-protection is unconditionally added to the kernel Makefile since this is the only dynamic stack allocation in the kernel (now that VLAs have been removed) and it is provably safe from Stack Clash style attacks. The second gotcha with alloca() is a negative interaction with -fstack-protector*, in that it sees the alloca() as an array allocation, which triggers the unconditional addition of the stack canary function pre/post-amble which slows down syscalls regardless of the static branch. In order to avoid adding this unneeded check and its associated performance impact, architectures need to carefully remove uses of -fstack-protector-strong (or -fstack-protector) in the compilation units that use the add_random_kstack() macro and to audit the resulting stack mitigation coverage (to make sure no desired coverage disappears). No change is visible for this on x86 because the stack protector is already unconditionally disabled for the compilation unit, but the change is required on arm64. There is, unfortunately, no attribute that can be used to disable stack protector for specific functions. Comparison to PaX RANDKSTACK feature: The RANDKSTACK feature randomizes the location of the stack start (cpu_current_top_of_stack), i.e. including the location of pt_regs structure itself on the stack. Initially this patch followed the same approach, but during the recent discussions[2], it has been determined to be of a little value since, if ptrace functionality is available for an attacker, they can use PTRACE_PEEKUSR/PTRACE_POKEUSR to read/write different offsets in the pt_regs struct, observe the cache behavior of the pt_regs accesses, and figure out the random stack offset. Another difference is that the random offset is stored in a per-cpu variable, rather than having it be per-thread. As a result, these implementations differ a fair bit in their implementation details and results, though obviously the intent is similar. [1] https://lore.kernel.org/kernel-hardening/2236FBA76BA1254E88B949DDB74E612BA4BC57C1@IRSMSX102.ger.corp.intel.com/ [2] https://lore.kernel.org/kernel-hardening/20190329081358.30497-1-elena.reshetova@intel.com/ [3] https://lists.ubuntu.com/archives/ubuntu-devel/2019-June/040741.html Co-developed-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210401232347.2791257-4-keescook@chromium.org
2021-04-08init_on_alloc: Optimize static branchesKees Cook
The state of CONFIG_INIT_ON_ALLOC_DEFAULT_ON (and ...ON_FREE...) did not change the assembly ordering of the static branches: they were always out of line. Use the new jump_label macros to check the CONFIG settings to default to the "expected" state, which slightly optimizes the resulting assembly code. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexander Potapenko <glider@google.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Link: https://lore.kernel.org/r/20210401232347.2791257-3-keescook@chromium.org
2021-04-08jump_label: Provide CONFIG-driven build state defaultsKees Cook
As shown in the comment in jump_label.h, choosing the initial state of static branches changes the assembly layout. If the condition is expected to be likely it's inline, and if unlikely it is out of line via a jump. A few places in the kernel use (or could be using) a CONFIG to choose the default state, which would give a small performance benefit to their compile-time declared default. Provide the infrastructure to do this. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210401232347.2791257-2-keescook@chromium.org
2021-04-08KVM: x86/mmu: preserve pending TLB flush across calls to kvm_tdp_mmu_zap_spPaolo Bonzini
Right now, if a call to kvm_tdp_mmu_zap_sp returns false, the caller will skip the TLB flush, which is wrong. There are two ways to fix it: - since kvm_tdp_mmu_zap_sp will not yield and therefore will not flush the TLB itself, we could change the call to kvm_tdp_mmu_zap_sp to use "flush |= ..." - or we can chain the flush argument through kvm_tdp_mmu_zap_sp down to __kvm_tdp_mmu_zap_gfn_range. Note that kvm_tdp_mmu_zap_sp will neither yield nor flush, so flush would never go from true to false. This patch does the former to simplify application to stable kernels, and to make it further clearer that kvm_tdp_mmu_zap_sp will not flush. Cc: seanjc@google.com Fixes: 048f49809c526 ("KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping") Cc: <stable@vger.kernel.org> # 5.10.x: 048f49809c: KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping Cc: <stable@vger.kernel.org> # 5.10.x: 33a3164161: KVM: x86/mmu: Don't allow TDP MMU to yield when recovering NX pages Cc: <stable@vger.kernel.org> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-08clocksource/drivers/dw_apb_timer_of: Add handling for potential memory leakDinh Nguyen
Add calls to disable the clock and unmap the timer base address in case of any failures. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210322121844.2271041-1-dinguyen@kernel.org
2021-04-08clocksource/drivers/npcm: Add support for WPCM450Jonathan Neuschäfer
Add a compatible string for WPCM450, which has essentially the same timer controller. Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210320181610.680870-11-j.neuschaefer@gmx.net