Age Commit message Author
2020-10-07 x86/mce: Add _ASM_EXTABLE_CPY for copy user access Youquan Song
_ASM_EXTABLE_UA is a general exception entry to record the exception fixup for all exception spots between kernel and user space access. To enable recovery from machine checks while copying data from user addresses it is necessary to be able to distinguish the places that are looping copying data from those that copy a single byte/word/etc. Add a new macro _ASM_EXTABLE_CPY and use it in place of _ASM_EXTABLE_UA in the copy functions. Record the exception reason number to regs->ax at ex_handler_uaccess, which is used to check whether an MCE was triggered. The new fixup routine ex_handler_copy() is almost an exact copy of ex_handler_uaccess(). The difference is that it sets regs->ax to the trap number. Following patches use this to avoid trying to copy remaining bytes from the tail of the copy and possibly hitting the poison again. New mce.kflags bit MCE_IN_KERNEL_COPYIN will be used by mce_severity() calculation to indicate that a machine check is recoverable because the kernel was copying from user space. Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201006210910.21062-4-tony.luck@intel.com
2020-10-07 x86/mce: Provide method to find out the type of an exception handler Tony Luck
Avoid a proliferation of ex_has_*_handler() functions by having just one function that returns the type of the handler (if any). Drop the __visible attribute for this function. It is not called from assembler so the attribute is not necessary. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201006210910.21062-3-tony.luck@intel.com
2020-10-07 x86/mce: Pass pointer to saved pt_regs to severity calculation routines Youquan Song
New recovery features require additional information about processor state when a machine check occurred. Pass pt_regs down to the routines that need it. No functional change. Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201006210910.21062-2-tony.luck@intel.com
2020-10-07 pinctrl: mediatek: Free eint data on failure Enric Balletbo i Serra
The pinctrl driver can work without the EINT resource, but, if it is expected to have this resource but the mtk_build_eint() function fails after allocating their data (because can't get the resource or can't map the irq), the data is not freed and you end with a NULL pointer dereference. Fix this by freeing the data if mtk_build_eint() fails, so pinctrl still works and doesn't hang. This is noticeable after commit f97dbf48ca43 ("irqchip/mtk-sysirq: Convert to a platform driver") on MT8183 because, due this commit, the pinctrl driver fails to map the irq and spots the following bug: [ 1.947597] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000004 [ 1.956404] Mem abort info: [ 1.959203] ESR = 0x96000004 [ 1.962259] EC = 0x25: DABT (current EL), IL = 32 bits [ 1.967565] SET = 0, FnV = 0 [ 1.970613] EA = 0, S1PTW = 0 [ 1.973747] Data abort info: [ 1.976619] ISV = 0, ISS = 0x00000004 [ 1.980447] CM = 0, WnR = 0 [ 1.983410] [0000000000000004] user address but active_mm is swapper [ 1.989759] Internal error: Oops: 96000004 [#1] PREEMPT SMP [ 1.995322] Modules linked in: [ 1.998371] CPU: 7 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc1+ #44 [ 2.004715] Hardware name: MediaTek krane sku176 board (DT) [ 2.010280] pstate: 60000005 (nZCv daif -PAN -UAO BTYPE=--) [ 2.015850] pc : mtk_eint_set_debounce+0x48/0x1b8 [ 2.020546] lr : mtk_eint_set_debounce+0x34/0x1b8 [ 2.025239] sp : ffff80001008baa0 [ 2.028544] x29: ffff80001008baa0 x28: ffff0000ff7ff790 [ 2.033847] x27: ffff0000f9ec34b0 x26: ffff0000f9ec3480 [ 2.039150] x25: ffff0000fa576410 x24: ffff0000fa502800 [ 2.044453] x23: 0000000000001388 x22: ffff0000fa635f80 [ 2.049755] x21: 0000000000000008 x20: 0000000000000000 [ 2.055058] x19: 0000000000000071 x18: 0000000000000001 [ 2.060360] x17: 0000000000000000 x16: 0000000000000000 [ 2.065662] x15: ffff0000facc8470 x14: ffffffffffffffff [ 2.070965] x13: 0000000000000001 x12: 00000000000000c0 [ 2.076267] x11: 0000000000000040 x10: 0000000000000070 [ 
2.081569] x9 : ffffaec0063d24d8 x8 : ffff0000fa800270 [ 2.086872] x7 : 0000000000000000 x6 : 0000000000000011 [ 2.092174] x5 : ffff0000fa800248 x4 : ffff0000fa800270 [ 2.097476] x3 : ffff8000100c5000 x2 : 0000000000000000 [ 2.102778] x1 : 0000000000000000 x0 : 0000000000000000 [ 2.108081] Call trace: [ 2.110520] mtk_eint_set_debounce+0x48/0x1b8 [ 2.114870] mtk_gpio_set_config+0x5c/0x78 [ 2.118958] gpiod_set_config+0x5c/0x78 [ 2.122786] gpiod_set_debounce+0x18/0x28 [ 2.126789] gpio_keys_probe+0x50c/0x910 [ 2.130705] platform_drv_probe+0x54/0xa8 [ 2.134705] really_probe+0xe4/0x3b0 [ 2.138271] driver_probe_device+0x58/0xb8 [ 2.142358] device_driver_attach+0x74/0x80 [ 2.146532] __driver_attach+0x58/0xe0 [ 2.150274] bus_for_each_dev+0x70/0xc0 [ 2.154100] driver_attach+0x24/0x30 [ 2.157666] bus_add_driver+0x14c/0x1f0 [ 2.161493] driver_register+0x64/0x120 [ 2.165319] __platform_driver_register+0x48/0x58 [ 2.170017] gpio_keys_init+0x1c/0x28 [ 2.173672] do_one_initcall+0x54/0x1b4 [ 2.177499] kernel_init_freeable+0x1d0/0x238 [ 2.181848] kernel_init+0x14/0x118 [ 2.185328] ret_from_fork+0x10/0x34 [ 2.188899] Code: a9438ac1 12001266 f94006c3 121e766a (b9400421) [ 2.194991] ---[ end trace 168cf7b3324b6570 ]--- [ 2.199611] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b [ 2.207260] SMP: stopping secondary CPUs [ 2.211294] Kernel Offset: 0x2ebff4800000 from 0xffff800010000000 [ 2.217377] PHYS_OFFSET: 0xffffb50500000000 [ 2.221551] CPU features: 0x0240002,2188200c [ 2.225811] Memory Limit: none [ 2.228860] ---[ end Kernel panic - not syncing: Attempted to kill init! 
exitcode=0x0000000b ]--- Fixes: 89132dd8ffd2 ("pinctrl: mediatek: extend eint build to pinctrl-mtk-common-v2.c") Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Acked-by: Sean Wang <sean.wang@kernel.org> Link: https://lore.kernel.org/r/20201001142511.3560143-1-enric.balletbo@collabora.com [rebased on changed infrastructure] Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
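The fix described above (free the optional data when building it fails, so later users see it as absent) can be illustrated with a small userspace sketch. This is not the driver's code: `struct pinctrl_hw`, `build_eint()` and `set_debounce()` are hypothetical stand-ins, and plain `calloc()`/`free()` replace whatever allocator the driver actually uses.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's structures. */
struct eint_data {
    int irq;
    unsigned int debounce_us;
};

struct pinctrl_hw {
    struct eint_data *eint;   /* optional: NULL means "no EINT support" */
};

/* Allocate the optional data; on a later failure, free it and reset the
 * pointer so nobody dereferences half-initialized state. */
int build_eint(struct pinctrl_hw *hw, int irq)
{
    hw->eint = calloc(1, sizeof(*hw->eint));
    if (!hw->eint)
        return -ENOMEM;

    if (irq < 0) {            /* models "cannot map the irq" */
        free(hw->eint);
        hw->eint = NULL;
        return -ENODEV;
    }

    hw->eint->irq = irq;
    return 0;
}

/* A later user treats a NULL pointer as "feature not available". */
int set_debounce(struct pinctrl_hw *hw, unsigned int usec)
{
    assert(hw != NULL);
    if (!hw->eint)
        return -ENOTSUP;
    hw->eint->debounce_us = usec;
    return 0;
}
```

With the pointer reset to NULL, a debounce request after a failed build returns an error instead of touching stale data.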
2020-10-07 x86/platform/uv: Update Copyrights to conform to HPE standards Mike Travis
Add Copyrights to those files that have been updated for UV5 changes. Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201005203929.148656-14-mike.travis@hpe.com
2020-10-07 x86/platform/uv: Update for UV5 NMI MMR changes Mike Travis
The UV NMI MMR addresses and fields moved between UV4 and UV5 necessitating a rewrite of the UV NMI handler. Adjust references to accommodate those changes. Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lkml.kernel.org/r/20201005203929.148656-13-mike.travis@hpe.com
2020-10-07 x86/platform/uv: Update UV5 TSC checking Mike Travis
Update check of BIOS TSC sync status to include both possible "invalid" states provided by newer UV5 BIOS. Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lkml.kernel.org/r/20201005203929.148656-12-mike.travis@hpe.com
2020-10-07 x86/platform/uv: Update node present counting Mike Travis
The changes in the UV5 arch shrunk the NODE PRESENT table to just 2x64 entries (128 total), so they are now in two 64-bit MMRs instead of a depth of 64 bits in an array. Adjust references when counting up the nodes present. Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lkml.kernel.org/r/20201005203929.148656-11-mike.travis@hpe.com
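Counting set bits across a small fixed number of 64-bit words can be sketched as follows. This is a userspace illustration only: `NODE_PRESENT_WORDS` and `count_nodes_present()` are hypothetical names, and the register values are passed in as a plain array rather than read from hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: two 64-bit words, 128 node bits in total. */
#define NODE_PRESENT_WORDS 2

/* Count how many node-present bits are set across both words. */
unsigned int count_nodes_present(const uint64_t regs[NODE_PRESENT_WORDS])
{
    unsigned int count = 0;

    assert(regs != NULL);
    for (int w = 0; w < NODE_PRESENT_WORDS; w++) {
        uint64_t v = regs[w];

        /* Clear the lowest set bit on each iteration. */
        while (v) {
            v &= v - 1;
            count++;
        }
    }
    return count;
}
```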
2020-10-07 x86/platform/uv: Update UV5 MMR references in UV GRU Mike Travis
Make modifications to the GRU mappings to accommodate changes for UV5. Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lkml.kernel.org/r/20201005203929.148656-10-mike.travis@hpe.com
2020-10-07 x86/platform/uv: Adjust GAM MMR references affected by UV5 updates Mike Travis
Make modifications to the GAM MMR mappings to accommodate changes for UV5. Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lkml.kernel.org/r/20201005203929.148656-9-mike.travis@hpe.com
2020-10-07 x86/platform/uv: Update MMIOH references based on new UV5 MMRs Mike Travis
Make modifications to the MMIOH mappings to accommodate changes for UV5. [ Fix W=1 build warnings. ] Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lkml.kernel.org/r/20201005203929.148656-8-mike.travis@hpe.com
2020-10-07 x86/platform/uv: Add and decode Arch Type in UVsystab Mike Travis
When the UV BIOS starts the kernel it passes the UVsystab info struct to the kernel which contains information elements more specific than ACPI, and generally pertinent only to the MMRs. These are read only fields so information is passed one way only. A new field starting with UV5 is the UV architecture type so the ACPI OEM_ID field can be used for other purposes going forward. The UV Arch Type selects the entirety of the MMRs available, with their addresses and fields defined in uv_mmrs.h. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lkml.kernel.org/r/20201005203929.148656-7-mike.travis@hpe.com
2020-10-07 x86/platform/uv: Add UV5 direct references Mike Travis
Add new references to UV5 (and UVY class) system MMR addresses and fields primarily caused by the expansion from 46 to 52 bits of physical memory address. Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lkml.kernel.org/r/20201005203929.148656-6-mike.travis@hpe.com
2020-10-07 x86/platform/uv: Update UV MMRs for UV5 Mike Travis
Update UV MMRs in uv_mmrs.h for UV5 based on Verilog output from the UV Hub hardware design files. This is the next UV architecture with a new class (UVY) being defined for 52 bit physical address masks. Uses a bitmask for UV arch identification so a single test can cover multiple versions. Includes other adjustments to match the uv_mmrs.h file to keep from encountering compile errors. New UV5 functionality is added in the patches that follow. [ Fix W=1 build warnings. ] Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lkml.kernel.org/r/20201005203929.148656-5-mike.travis@hpe.com
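The idea of identifying the architecture with a bitmask, so that one test can cover several versions, can be sketched like this. The bit assignments, the `ARCH_*` names and the grouping of older versions into a separate class are assumptions made for illustration; they are not taken from uv_mmrs.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical one-bit-per-version encoding. */
enum {
    ARCH_UV2 = 1u << 0,
    ARCH_UV3 = 1u << 1,
    ARCH_UV4 = 1u << 2,
    ARCH_UV5 = 1u << 3,
};

/* Hypothetical class masks grouping several versions. */
#define ARCH_CLASS_OLD (ARCH_UV2 | ARCH_UV3 | ARCH_UV4)
#define ARCH_CLASS_NEW (ARCH_UV5)

/* True if the detected version is any of the versions in 'mask'. */
bool arch_is(unsigned int detected, unsigned int mask)
{
    /* Exactly one version bit is expected in 'detected'. */
    assert(detected != 0 && (detected & (detected - 1)) == 0);
    return (detected & mask) != 0;
}
```

A caller can then write `arch_is(detected, ARCH_UV3 | ARCH_UV4)` instead of chaining one comparison per version.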
2020-10-07 drivers/misc/sgi-xp: Adjust references in UV kernel modules Mike Travis
Remove the define of is_uv() as is_uv_system and just use the latter as is. This removes a conflict with a new symbol in the generated uv_mmrs.h file (is_uv()). Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lkml.kernel.org/r/20201005203929.148656-4-mike.travis@hpe.com
2020-10-07 x86/platform/uv: Remove SCIR MMR references for UV systems Mike Travis
UV class systems no longer use System Controller for monitoring of CPU activity provided by this driver. Other methods have been developed for BIOS and the management controller (BMC). Remove that supporting code. Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com> Link: https://lkml.kernel.org/r/20201005203929.148656-3-mike.travis@hpe.com
2020-10-07 x86/platform/uv: Remove UV BAU TLB Shootdown Handler Mike Travis
The Broadcast Assist Unit (BAU) TLB shootdown handler is being rewritten to become the UV BAU APIC driver. It is designed to speed up sending IPIs to selective CPUs within the system. Remove the current TLB shootdown handler (tlb_uv.c) file and a couple of kernel hooks in the interim. Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com> Link: https://lkml.kernel.org/r/20201005203929.148656-2-mike.travis@hpe.com
2020-10-07 nvme-core: remove extra condition for vwc Chaitanya Kulkarni
In nvme_set_queue_limits() we initialize vwc to false and later add a condition to set vwc true. The value of vwc can be initialized at its declaration, which makes all the blk_queue_XXX() calls uniform. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-07 nvme-core: remove extra variable Chaitanya Kulkarni
In nvme_validate_ns() the extra variable ctrl is used only twice. Using ns->ctrl directly still maintains the readability and original length of the lines in the code. Get rid of the extra variable ctrl and use ns->ctrl directly. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-07 nvme: remove nvme_identify_ns_list Christoph Hellwig
Just fold it into the only caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
2020-10-07 nvme: refactor nvme_validate_ns Christoph Hellwig
Move the logic to revalidate the block_device size or remove the namespace from the caller into nvme_validate_ns. This removes the return value and thus the status code translation. Additionally it also catches non-permanent errors from nvme_update_ns_info using the existing logic. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
2020-10-07 nvme: move nvme_validate_ns Christoph Hellwig
Move nvme_validate_ns just above its only remaining caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
2020-10-07 nvme: query namespace identifiers before adding the namespace Christoph Hellwig
Check the namespace identifier list first thing when scanning namespaces. This keeps the code to query the CSI common between the alloc and validate path, and helps to structure the code better for multiple command set support. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
2020-10-07 nvme: revalidate zone bitmaps in nvme_update_ns_info Christoph Hellwig
Consolidate the two calls into a single place. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-07 nvme: remove nvme_update_formats Christoph Hellwig
Now that the queue is frozen before updating ->lba_shift we can't hit the invalid references mentioned in the comment any more. More importantly this code would not have helped us if the format was changed by another controller or through implementation defined back channels. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
2020-10-07 nvme: update the known admin effects Christoph Hellwig
A Format NVM command can change the capabilities of namespaces, while Sanitize does change the Logical Block Content and must be serialized. Also remove CSUPP bit for Format - it is not a mandatory command, and we don't check for the bit anyway. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
2020-10-07 nvme: set the queue limits in nvme_update_ns_info Christoph Hellwig
Only set the queue limits once we have the real block size. This also updates the limits on a rescan if needed. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
2020-10-07 nvme: remove the 0 lba_shift check in nvme_update_ns_info Christoph Hellwig
We can no longer reach this code if Identify Namespace failed. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
2020-10-07 nvme: clean up the check for too large logic block sizes Christoph Hellwig
Use a single statement to set both the capacity and fake block size instead of two. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-07 nvme: freeze the queue over ->lba_shift updates Christoph Hellwig
Ensure that there can't be any I/O in flight when we change the disk geometry in nvme_update_ns_info, most notably the LBA size, by lifting the queue freeze from nvme_update_disk_info into the caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
2020-10-07 nvme: factor out a nvme_configure_metadata helper Christoph Hellwig
Factor out a helper from nvme_update_ns_info that configures the per-namespace metadata and PI settings. Also make sure the helpers clear the flags explicitly instead of all of ->features to allow for potentially reusing ->features for future non-metadata flags. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
2020-10-07 nvme: call nvme_identify_ns as the first thing in nvme_alloc_ns_block Christoph Hellwig
Check if the namespace actually exists as the very first thing and don't bother with any extra work if not. This should speed up and simplify the sequential scanning for NVMe 1.0 devices. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
2020-10-07 nvme: lift the check for an unallocated namespace into nvme_identify_ns Christoph Hellwig
Move the check from the two callers into the common helper. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2020-10-07 nvme: rename __nvme_revalidate_disk Christoph Hellwig
Rename __nvme_revalidate_disk to nvme_update_ns_info and pass a namespace instead of the gendisk. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
2020-10-07 nvme: rename _nvme_revalidate_disk Christoph Hellwig
Rename _nvme_revalidate_disk to nvme_validate_ns to better describe what the function does, and pass the struct nvme_ns instead of the gendisk to better match the call chain. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
2020-10-07 nvme: rename nvme_validate_ns to nvme_validate_or_alloc_ns Christoph Hellwig
Use a slightly more descriptive name to enable reusing nvme_validate_ns in the next patch for a lower level function. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
2020-10-07 nvme: remove the disk argument to nvme_update_zone_info Christoph Hellwig
The queue can trivially be derived from the nvme_ns structure. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
2020-10-07 nvme: fix initialization of the zone bitmaps Christoph Hellwig
The removal of the ->revalidate_disk method broke the initialization of the zone bitmaps, as nvme_revalidate_disk now never gets called during initialization. Move the zone related code from nvme_revalidate_disk into a new helper in zns.c, and call it from nvme_alloc_ns in addition to nvme_validate_ns to ensure the zone bitmaps are initialized during probe. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
2020-10-07 block: optimize blk_queue_zoned_model for !CONFIG_BLK_DEV_ZONED Christoph Hellwig
Always return BLK_ZONED_NONE if zoned device support is not enabled. This allows various compiler optimizations including the dead code elimination that we so like for avoiding ifdefs. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
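The optimization relies on an accessor that returns a compile-time constant when the feature is disabled, so callers can use an ordinary `if` and let the compiler drop the dead branch. A minimal userspace sketch, assuming a hypothetical `FEATURE_ZONED` macro in place of the Kconfig option and hypothetical `queue_zoned_model()`/`setup_zones()` helpers:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a build-time configuration option. */
#ifndef FEATURE_ZONED
#define FEATURE_ZONED 0
#endif

enum zoned_model { ZONED_NONE, ZONED_HOST_AWARE, ZONED_HOST_MANAGED };

struct queue {
    enum zoned_model zoned;
};

/* With FEATURE_ZONED == 0 this always returns ZONED_NONE, which an
 * optimizing compiler can propagate into every caller. */
static inline enum zoned_model queue_zoned_model(const struct queue *q)
{
    if (FEATURE_ZONED)
        return q->zoned;
    return ZONED_NONE;
}

/* Returns 1 if zone-specific setup would run, 0 otherwise. No #ifdef is
 * needed here: the zoned branch becomes unreachable when the feature is
 * compiled out. */
int setup_zones(const struct queue *q)
{
    assert(q != NULL);
    if (queue_zoned_model(q) == ZONED_NONE)
        return 0;
    return 1;
}
```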
2020-10-07 nvme-loop: don't put ctrl on nvme_init_ctrl error Chaitanya Kulkarni
The function nvme_init_ctrl() gets the ctrl reference and, when it fails, puts the ctrl reference in its error unwind code. When creating a loop ctrl in nvme_loop_create_ctrl(), if nvme_init_ctrl() returns a non-zero (i.e. error) value it jumps to the "out_put_ctrl" label which calls nvme_put_ctrl(), which leads to a double ctrl put in the error unwind path. Update nvme_loop_create_ctrl() to remove the "out_put_ctrl" label, add a new "out" label after nvme_put_ctrl() in the error unwind path, and jump to the newly added label when the nvme_init_ctrl() call returns an error. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
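The double-put hazard and the label ordering that avoids it can be modelled in userspace. Everything here is a hypothetical stand-in (`struct ctrl`, `ctrl_init()`, `ctrl_setup()`, `create_ctrl()` and the label names); the only property carried over from the commit message is that the init routine drops its own reference on failure.

```c
#include <assert.h>

/* Hypothetical reference-counted object. */
struct ctrl {
    int refs;
};

void ctrl_put(struct ctrl *c)
{
    assert(c->refs > 0);   /* a double put would trip this check */
    c->refs--;
}

/* Takes a reference; on failure it drops that reference itself. */
int ctrl_init(struct ctrl *c, int fail)
{
    c->refs++;
    if (fail) {
        ctrl_put(c);
        return -1;
    }
    return 0;
}

/* A later setup step that can fail after init succeeded. */
int ctrl_setup(struct ctrl *c, int fail)
{
    (void)c;
    return fail ? -1 : 0;
}

int create_ctrl(struct ctrl *c, int init_fails, int setup_fails)
{
    int ret;

    ret = ctrl_init(c, init_fails);
    if (ret)
        goto out;          /* init already put its reference */

    ret = ctrl_setup(c, setup_fails);
    if (ret)
        goto err_put;      /* we still hold the reference here */

    return 0;

err_put:
    ctrl_put(c);
out:
    return ret;
}
```

Placing the plain `out` label after the put means a failed init skips the put, while later failures still release the reference exactly once.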
2020-10-07 nvme-core: put ctrl ref when module ref get fail Chaitanya Kulkarni
When try_module_get() fails in the nvme_dev_open() it returns without releasing the ctrl reference which was taken earlier. Put the ctrl reference which is taken before calling the try_module_get() in the error return code path. Fixes: 52a3974feb1a "nvme-core: get/put ctrl and transport module in nvme_dev_open/release()" Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-07 drm/nouveau/mem: guard against NULL pointer access in mem_del Karol Herbst
Other drivers seem to do something similar. Signed-off-by: Karol Herbst <kherbst@redhat.com> Cc: dri-devel <dri-devel@lists.freedesktop.org> Cc: Dave Airlie <airlied@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201006220528.13925-2-kherbst@redhat.com
2020-10-07 drm/nouveau/device: return error for unknown chipsets Karol Herbst
Previously the code relied on device->pri to be NULL and to fail probing later. We really should just return an error inside nvkm_device_ctor for unsupported GPUs. Fixes: 24d5ff40a732 ("drm/nouveau/device: rework mmio mapping code to get rid of second map") Signed-off-by: Karol Herbst <kherbst@redhat.com> Cc: dann frazier <dann.frazier@canonical.com> Cc: dri-devel <dri-devel@lists.freedesktop.org> Cc: Dave Airlie <airlied@redhat.com> Cc: stable@vger.kernel.org Reviewed-by: Jeremy Cline <jcline@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201006220528.13925-1-kherbst@redhat.com
2020-10-07 exfat: fix use of uninitialized spinlock on error path Namjae Jeon
syzbot reported warning message: Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1d6/0x29e lib/dump_stack.c:118 register_lock_class+0xf06/0x1520 kernel/locking/lockdep.c:893 __lock_acquire+0xfd/0x2ae0 kernel/locking/lockdep.c:4320 lock_acquire+0x148/0x720 kernel/locking/lockdep.c:5029 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:354 [inline] exfat_cache_inval_inode+0x30/0x280 fs/exfat/cache.c:226 exfat_evict_inode+0x124/0x270 fs/exfat/inode.c:660 evict+0x2bb/0x6d0 fs/inode.c:576 exfat_fill_super+0x1e07/0x27d0 fs/exfat/super.c:681 get_tree_bdev+0x3e9/0x5f0 fs/super.c:1342 vfs_get_tree+0x88/0x270 fs/super.c:1547 do_new_mount fs/namespace.c:2875 [inline] path_mount+0x179d/0x29e0 fs/namespace.c:3192 do_mount fs/namespace.c:3205 [inline] __do_sys_mount fs/namespace.c:3413 [inline] __se_sys_mount+0x126/0x180 fs/namespace.c:3390 do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 If exfat_read_root() returns an error, spinlock is used in exfat_evict_inode() without initialization. This patch combines exfat_cache_init_inode() with exfat_inode_init_once() to initialize spinlock by slab constructor. Fixes: c35b6810c495 ("exfat: add exfat cache") Cc: stable@vger.kernel.org # v5.7+ Reported-by: syzbot <syzbot+b91107320911a26c9a95@syzkaller.appspotmail.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-10-07 exfat: fix pointer error checking Tetsuhiro Kohada
Fix missing result check of exfat_build_inode(). And use PTR_ERR_OR_ZERO instead of PTR_ERR. Signed-off-by: Tetsuhiro Kohada <kohada.t2@gmail.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
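Why PTR_ERR_OR_ZERO is the safer choice can be seen in a userspace analogue of the kernel's error-pointer helpers (ERR_PTR, IS_ERR, PTR_ERR and PTR_ERR_OR_ZERO from linux/err.h). The lowercase functions below are simplified re-implementations written for this sketch, not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Error codes are encoded in the top MAX_ERRNO values of the address space. */
#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
void *err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* True if the pointer carries an encoded error rather than an address. */
int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Blindly reinterpret the pointer as an error code. For a valid pointer
 * this yields a meaningless non-zero value. */
long ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* Error code if the pointer encodes one, otherwise 0 for "success". */
int ptr_err_or_zero(const void *p)
{
    return is_err(p) ? (int)ptr_err(p) : 0;
}
```

Returning `ptr_err(p)` unconditionally would report a failure even for a valid pointer, whereas `ptr_err_or_zero(p)` yields 0 in that case.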
2020-10-07 arm/arm64: xen: Fix to convert percpu address to gfn correctly Masami Hiramatsu
Use per_cpu_ptr_to_phys() instead of virt_to_phys() for per-cpu address conversion. In xen_starting_cpu(), per-cpu xen_vcpu_info address is converted to gfn by virt_to_gfn() macro. However, since the virt_to_gfn(v) assumes the given virtual address is in linear mapped kernel memory area, it can not convert the per-cpu memory if it is allocated on vmalloc area. This depends on CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK. If it is enabled, the first chunk of percpu memory is linear mapped. In the other case, that is allocated from vmalloc area. Moreover, if the first chunk of percpu has run out until allocating xen_vcpu_info, it will be allocated on the 2nd chunk, which is based on kernel memory or vmalloc memory (depends on CONFIG_NEED_PER_CPU_KM). Without this fix and kernel configured to use vmalloc area for the percpu memory, the Dom0 kernel will fail to boot with following errors. [ 0.466172] Xen: initializing cpu0 [ 0.469601] ------------[ cut here ]------------ [ 0.474295] WARNING: CPU: 0 PID: 1 at arch/arm64/xen/../../arm/xen/enlighten.c:153 xen_starting_cpu+0x160/0x180 [ 0.484435] Modules linked in: [ 0.487565] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc4+ #4 [ 0.493895] Hardware name: Socionext Developer Box (DT) [ 0.499194] pstate: 00000005 (nzcv daif -PAN -UAO BTYPE=--) [ 0.504836] pc : xen_starting_cpu+0x160/0x180 [ 0.509263] lr : xen_starting_cpu+0xb0/0x180 [ 0.513599] sp : ffff8000116cbb60 [ 0.516984] x29: ffff8000116cbb60 x28: ffff80000abec000 [ 0.522366] x27: 0000000000000000 x26: 0000000000000000 [ 0.527754] x25: ffff80001156c000 x24: fffffdffbfcdb600 [ 0.533129] x23: 0000000000000000 x22: 0000000000000000 [ 0.538511] x21: ffff8000113a99c8 x20: ffff800010fe4f68 [ 0.543892] x19: ffff8000113a9988 x18: 0000000000000010 [ 0.549274] x17: 0000000094fe0f81 x16: 00000000deadbeef [ 0.554655] x15: ffffffffffffffff x14: 0720072007200720 [ 0.560037] x13: 0720072007200720 x12: 0720072007200720 [ 0.565418] x11: 0720072007200720 x10: 0720072007200720 [ 0.570801] 
x9 : ffff8000100fbdc0 x8 : ffff800010715208 [ 0.576182] x7 : 0000000000000054 x6 : ffff00001b790f00 [ 0.581564] x5 : ffff800010bbf880 x4 : 0000000000000000 [ 0.586945] x3 : 0000000000000000 x2 : ffff80000abec000 [ 0.592327] x1 : 000000000000002f x0 : 0000800000000000 [ 0.597716] Call trace: [ 0.600232] xen_starting_cpu+0x160/0x180 [ 0.604309] cpuhp_invoke_callback+0xac/0x640 [ 0.608736] cpuhp_issue_call+0xf4/0x150 [ 0.612728] __cpuhp_setup_state_cpuslocked+0x128/0x2c8 [ 0.618030] __cpuhp_setup_state+0x84/0xf8 [ 0.622192] xen_guest_init+0x324/0x364 [ 0.626097] do_one_initcall+0x54/0x250 [ 0.630003] kernel_init_freeable+0x12c/0x2c8 [ 0.634428] kernel_init+0x1c/0x128 [ 0.637988] ret_from_fork+0x10/0x18 [ 0.641635] ---[ end trace d95b5309a33f8b27 ]--- [ 0.646337] ------------[ cut here ]------------ [ 0.651005] kernel BUG at arch/arm64/xen/../../arm/xen/enlighten.c:158! [ 0.657697] Internal error: Oops - BUG: 0 [#1] SMP [ 0.662548] Modules linked in: [ 0.665676] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 5.9.0-rc4+ #4 [ 0.673398] Hardware name: Socionext Developer Box (DT) [ 0.678695] pstate: 00000005 (nzcv daif -PAN -UAO BTYPE=--) [ 0.684338] pc : xen_starting_cpu+0x178/0x180 [ 0.688765] lr : xen_starting_cpu+0x144/0x180 [ 0.693188] sp : ffff8000116cbb60 [ 0.696573] x29: ffff8000116cbb60 x28: ffff80000abec000 [ 0.701955] x27: 0000000000000000 x26: 0000000000000000 [ 0.707344] x25: ffff80001156c000 x24: fffffdffbfcdb600 [ 0.712718] x23: 0000000000000000 x22: 0000000000000000 [ 0.718107] x21: ffff8000113a99c8 x20: ffff800010fe4f68 [ 0.723481] x19: ffff8000113a9988 x18: 0000000000000010 [ 0.728863] x17: 0000000094fe0f81 x16: 00000000deadbeef [ 0.734245] x15: ffffffffffffffff x14: 0720072007200720 [ 0.739626] x13: 0720072007200720 x12: 0720072007200720 [ 0.745008] x11: 0720072007200720 x10: 0720072007200720 [ 0.750390] x9 : ffff8000100fbdc0 x8 : ffff800010715208 [ 0.755771] x7 : 0000000000000054 x6 : ffff00001b790f00 [ 0.761153] x5 : ffff800010bbf880 x4 : 0000000000000000 
[ 0.766534] x3 : 0000000000000000 x2 : 00000000deadbeef [ 0.771916] x1 : 00000000deadbeef x0 : ffffffffffffffea [ 0.777304] Call trace: [ 0.779819] xen_starting_cpu+0x178/0x180 [ 0.783898] cpuhp_invoke_callback+0xac/0x640 [ 0.788325] cpuhp_issue_call+0xf4/0x150 [ 0.792317] __cpuhp_setup_state_cpuslocked+0x128/0x2c8 [ 0.797619] __cpuhp_setup_state+0x84/0xf8 [ 0.801779] xen_guest_init+0x324/0x364 [ 0.805683] do_one_initcall+0x54/0x250 [ 0.809590] kernel_init_freeable+0x12c/0x2c8 [ 0.814016] kernel_init+0x1c/0x128 [ 0.817583] ret_from_fork+0x10/0x18 [ 0.821226] Code: d0006980 f9427c00 cb000300 17ffffea (d4210000) [ 0.827415] ---[ end trace d95b5309a33f8b28 ]--- [ 0.832076] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b [ 0.839815] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]--- Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Link: https://lore.kernel.org/r/160196697165.60224.17470743378683334995.stgit@devnote2 Signed-off-by: Juergen Gross <jgross@suse.com>
2020-10-06 riscv: Fixup bootup failure with HARDENED_USERCOPY Guo Ren
6184358da000 ("riscv: Fixup static_obj() fail") attempted to elide a lockdep failure by rearranging our kernel image to place all initdata within [_stext, _end], thus triggering lockdep to treat these as static objects. These objects are released and eventually reallocated, causing check_kernel_text_object() to trigger a BUG(). This backs out the change to make [_stext, _end] all-encompassing, instead just moving initdata. This results in initdata being outside of [__init_begin, __init_end], which means initdata can't be freed. Link: https://lore.kernel.org/linux-riscv/1593266228-61125-1-git-send-email-guoren@kernel.org/T/#t Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Reported-by: Aurelien Jarno <aurelien@aurel32.net> Tested-by: Aurelien Jarno <aurelien@aurel32.net> [Palmer: Clean up commit text] Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-10-06 scsi: hisi_sas: Recover PHY state according to the status before reset Xiang Chen
Currently the PHY state is set according to the state of the PHYs after reset. This is invalid as the PHYs are already re-initialized. Set PHY state according to the state before the reset instead of after. Link: https://lore.kernel.org/r/1601649038-25534-8-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
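The ordering described above (record the PHY state first, reset, then recover from the recorded state) can be illustrated with a small model. This is only a sketch under stated assumptions: `struct hba_model`, `phy_up_mask`, `model_reset()` and `reset_and_get_recovery_mask()` are hypothetical names invented for illustration and are not the driver's actual structures or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the controller; not the driver's struct hisi_hba. */
struct hba_model {
    uint32_t phy_up_mask;   /* bit n set => PHY n is currently up */
};

/* Models a controller reset: every PHY is re-initialized, so the live
 * state no longer says which PHYs were up beforehand. */
static void model_reset(struct hba_model *hba)
{
    hba->phy_up_mask = 0;
}

/* Snapshot the PHY state first, then reset, and return the snapshot so
 * that recovery is driven by the pre-reset state. */
static uint32_t reset_and_get_recovery_mask(struct hba_model *hba)
{
    assert(hba != NULL);
    uint32_t before = hba->phy_up_mask;   /* taken before the reset */
    model_reset(hba);
    return before;
}
```

Reading the mask after `model_reset()` instead would always yield the freshly re-initialized state, which is the problem the commit message describes.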
2020-10-06 scsi: hisi_sas: Filter out new PHY up events during suspend Xiang Chen
Currently sas_resume_ha() is called while resuming the controller to wait for all suspended PHYs to come up and all the libsas events to be completed. There is a scenario that causes a hung task: for direct attach with two disks connected to two PHYs, disable phy0 before suspending the disk on phy1 and the controller, then enable phy0 and resume the controller. The task hang then occurs as follows:

[ 591.901463] hisi_sas_v3_hw 0000:b4:02.0: resuming from operating state [D0]
[ 593.113525] hisi_sas_v3_hw 0000:b4:02.0: neither _PS0 nor _PR0 is defined
[ 593.120301] hisi_sas_v3_hw 0000:b4:02.0: waiting up to 25 seconds for 1 phy to resume
[ 593.120836] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy0 link_rate=10(sata)
[ 593.134680] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy1 link_rate=10(sata)
[ 593.134733] sas: phy-2:0 added to port-2:0, phy_mask:0x1 (5000000000000200)
[ 593.148350] sas: DOING DISCOVERY on port 0, pid:948
[ 593.153227] hisi_sas_v3_hw 0000:b4:02.0: dev[3:5] found
[ 593.159840] sas: Enter sas_scsi_recover_host busy: 0 failed: 0
[ 593.165663] sas: ata7: end_device-2:0: dev error handler
[ 593.165730] sas: ata2: end_device-2:1: dev error handler
[ 593.172532] hisi_sas_v3_hw 0000:b4:02.0: phydown: phy0 phy_state=0x2
[ 593.182570] hisi_sas_v3_hw 0000:b4:02.0: ignore flutter phy0 down
[ 593.331277] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy0 link_rate=10(sata)
[ 593.498956] ata7.00: ATA-11: SAMSUNG MZ7LH960HAJR-00005, HXT7404Q, max UDMA/133
[ 593.506235] ata7.00: 1875385008 sectors, multi 16: LBA48 NCQ (depth 32)
[ 593.514295] ata7.00: configured for UDMA/133
[ 593.518557] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
[ 593.528613] sas: ata7: end_device-2:0: model:SAMSUNG MZ7LH960HAJR-00005 serial:S45NNA0M712225
[ 593.537520] device_link_add 316: dev=2:0:2:0 supplier:2 consumer:0
[ 593.543674] device_link_add 324
[ 593.546801] device_link_add 352
[ 593.549930] device_link_add 406
[ 593.553058] device_link_add 440: dev=2:0:2:0 supplier:2 consumer:0
[ 593.559208] device_link_add 444
[ 593.562335] device_link_add 455
[ 593.565517] scsi 2:0:2:0: Direct-Access ATA SAMSUNG MZ7LH960 404Q PQ: 0 ANSI: 5
[ 620.057464] phy-2:1: resume timeout
[ 738.841445] INFO: task kworker/u256:0:8 blocked for more than 120 seconds.
[ 738.848295] Not tainted 5.8.0-rc1-76154-g0d52b59-dirty #744
[ 738.854361] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 738.862155] kworker/u256:0 D 0 8 2 0x00000028
[ 738.867626] Workqueue: 0000:b4:02.0_event_q sas_port_event_worker
[ 738.873693] Call trace:
[ 738.876133] __switch_to+0xf4/0x148
[ 738.879613] __schedule+0x270/0x5d8
[ 738.883091] schedule+0x78/0x110
[ 738.886307] schedule_timeout+0x1ac/0x280
[ 738.890299] wait_for_completion+0x94/0x138
[ 738.894472] flush_workqueue+0x114/0x438
[ 738.898377] sas_porte_bytes_dmaed+0x400/0x500
[ 738.902801] sas_port_event_worker+0x28/0x40
[ 738.907053] process_one_work+0x1e8/0x360
[ 738.911046] worker_thread+0x44/0x478
[ 738.914698] kthread+0x150/0x158
[ 738.917915] ret_from_fork+0x10/0x1c
[ 738.921534] INFO: task kworker/u256:1:948 blocked for more than 120 seconds.
[ 738.928550] Not tainted 5.8.0-rc1-76154-g0d52b59-dirty #744
[ 738.934614] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 738.942408] kworker/u256:1 D 0 948 2 0x00000028
[ 738.947873] Workqueue: 0000:b4:02.0_disco_q sas_discover_domain
[ 738.953766] Call trace:
[ 738.956203] __switch_to+0xf4/0x148
[ 738.959678] __schedule+0x270/0x5d8
[ 738.963152] schedule+0x78/0x110
[ 738.966368] rpm_resume+0xcc/0x550
[ 738.969757] __pm_runtime_resume+0x3c/0x88
[ 738.973836] rpm_get_suppliers+0x50/0x148
[ 738.977829] __pm_runtime_set_status+0x124/0x2f0
[ 738.982427] scsi_sysfs_add_sdev+0x1a0/0x2a8
[ 738.986679] scsi_probe_and_add_lun+0x888/0xab0
[ 738.991190] __scsi_scan_target+0xec/0x520
[ 738.995268] scsi_scan_target+0x11c/0x128
[ 738.999261] sas_rphy_add+0x15c/0x1e8
[ 739.002907] sas_probe_devices+0xe4/0x150
[ 739.006899] sas_discover_domain+0x33c/0x588
[ 739.011150] process_one_work+0x1e8/0x360
[ 739.015143] worker_thread+0x44/0x478
[ 739.018789] kthread+0x150/0x158
[ 739.022003] ret_from_fork+0x10/0x1c
...

If an extra phy0 up event occurs while the SAS controller is resuming, it emits new libsas events (PORTE_BYTES_DMAED and DISCE_DISCOVER_DOMAIN). Handling DISCE_DISCOVER_DOMAIN calls scsi_sysfs_add_sdev(), which calls __pm_runtime_set_status() to resume the supplier (the host controller). In the runtime PM core, if a device is already in the resuming state, a later resume request for that device waits synchronously for the earlier request to complete. At that point the controller is still resuming because it is waiting for all libsas events to complete, while the DISCE_DISCOVER_DOMAIN event is blocked because the controller is still resuming. The result is a deadlock.

To avoid the issue, filter out new PHY up events while the controller is suspended.

Link: https://lore.kernel.org/r/1601649038-25534-7-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
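The fix described above amounts to a gate on the PHY-up path: while the controller is marked as suspended, a new PHY-up event is dropped instead of being turned into libsas events that would later block on the controller's own resume. The following is a minimal sketch of that idea only. `struct ctrl_model`, its fields, and `model_phy_up()` are hypothetical names chosen for illustration, and the real driver's flag handling and locking are not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the controller state relevant to the filter. */
struct ctrl_model {
    bool suspended;      /* true while the controller is suspended */
    int events_queued;   /* count of events forwarded to the upper layer */
};

/* Handle a PHY-up notification.
 * Returns true if the event was forwarded, false if it was filtered out. */
static bool model_phy_up(struct ctrl_model *c)
{
    assert(c != NULL);
    if (c->suspended)
        return false;    /* filtered: no new event while suspended */
    c->events_queued++;  /* normal path: forward the event */
    return true;
}
```

With `suspended` set, a call to `model_phy_up()` leaves `events_queued` unchanged, so nothing is queued that could wait on the resume in progress.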
2020-10-06 scsi: hisi_sas: Add device link between SCSI devices and hisi_hba Xiang Chen
Runtime PM of SCSI devices is already supported in the SCSI layer, so every SCSI device can be suspended and resumed separately. But if there is no link between hisi_hba and the SCSI devices or SCSI targets, problems arise if the controller is suspended while SCSI devices are still resuming. The controller can be suspended only when all the SCSI devices under it are suspended. Add the device link between the SCSI devices and the controller. Link: https://lore.kernel.org/r/1601649038-25534-6-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
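The rule a device link enforces here (the supplier, the controller, may suspend only once every consumer, each SCSI device, is suspended) can be modelled with a simple counter. This is an illustrative sketch rather than kernel code: in the kernel the link itself is created with device_link_add(), whereas `struct supplier_model` and the three helper functions below are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a supplier (controller) tracking active consumers. */
struct supplier_model {
    int active_consumers;   /* consumers (SCSI devices) not yet suspended */
};

/* A consumer becomes active, which keeps the supplier awake. */
static void consumer_resume(struct supplier_model *s)
{
    assert(s != NULL);
    s->active_consumers++;
}

/* A consumer suspends and releases its hold on the supplier. */
static void consumer_suspend(struct supplier_model *s)
{
    assert(s != NULL && s->active_consumers > 0);
    s->active_consumers--;
}

/* The supplier may suspend only when no consumer is still active. */
static bool supplier_may_suspend(const struct supplier_model *s)
{
    assert(s != NULL);
    return s->active_consumers == 0;
}
```

Without such a relationship nothing stops the controller from suspending while a disk under it is still resuming, which is the situation the commit message warns about.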