summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2021-02-10i40e: Add netlink callbacks support for software based DCBArkadiusz Kubalewski
Add callbacks used by software based LLDP agent, which allows to configure DCB feature from userspace. Update copyright dates as appropriate. If LLDP agent is turned off in BIOS, or after setting private flag ("disable-fw-lldp on"). The driver initialized DCB functionality with default values, one traffic class with 100% bandwidth allocated. The new netlink callbacks are required for software LLDP agent, it must be able to acquire current DCB configuration of a network port and apply DCB configuration changes, if required. Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-10i40e: Add init and default config of software based DCBArkadiusz Kubalewski
Add extra handling on changing the "disable-fw-lldp" private flag to properly initialize software based DCB feature. Add default configuration of DCB functionality when Firmware LLDP agent is turned off, in case of driver probe and device reset on reconfiguration. Update copyright dates as appropriate. Software based DCB is a brand-new feature in i40e driver. Before, DCB was implemented by Firmware LLDP agent only. The agent was responsible for handling incoming DCB-related LLDP frames and applying received DCB configuration to hardware. Default configuration and new initialization flow for software based DCB is required. If LLDP agent is turned off in BIOS, or after setting private flag ("disable-fw-lldp on"). The driver initializes DCB functionality with default values, one traffic class with 100% bandwidth allocated. Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-10i40e: Add hardware configuration for software based DCBArkadiusz Kubalewski
Add registers and definitions required for applying DCB related hardware configuration. Add functions responsible for calculating and setting proper hardware configuration values for software based DCB functionality. Add function responsible for invoking Admin Queue command, which results in applying new DCB configuration to the hardware. Update copyright dates as appropriate. Software based DCB is a brand-new feature in i40e driver. Before, DCB was implemented by Firmware LLDP agent only. The agent was responsible for handling incoming DCB-related LLDP frames and applying received DCB configuration to hardware. New communication channel between software and hardware is required for software driver. It must be able to calculate and configure all the registers related for DCB feature. Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
2021-02-10media: i2c: max9271: Add MODULE_* macrosJacopo Mondi
Since commit 7f03d9fefcc5 ("media: i2c: Kconfig: Make MAX9271 a module") the max9271 library is built as a module but no MODULE_*() attributes were specified, causing a build error due to missing license information. ERROR: modpost: missing MODULE_LICENSE() in drivers/media/i2c/max9271.o Fix this by adding MODULE attributes to the driver. Fixes: 7f03d9fefcc5 ("media: i2c: Kconfig: Make MAX9271 a module") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2021-02-10Merge tag 'pm-5.11-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "Address a performance regression related to scale-invariance on x86 that may prevent turbo CPU frequencies from being used in certain workloads on systems using acpi-cpufreq as the CPU performance scaling driver and schedutil as the scaling governor" * tag 'pm-5.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: ACPI: Update arch scale-invariance max perf ratio if CPPC is not there cpufreq: ACPI: Extend frequency tables to cover boost frequencies
2021-02-10Merge tag 'acpi-5.11-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Revert a problematic ACPICA commit that changed the code to attempt to update memory regions which may be read-only on some systems (Ard Biesheuvel)" * tag 'acpi-5.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: Revert "ACPICA: Interpreter: fix memory leak by using existing buffer"
2021-02-10Merge tag 'dmaengine-fix2-5.11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine fixes from Vinod Koul: "Some late fixes for dmaengine: Core: - fix channel device_node deletion Driver fixes: - dw: revert of runtime pm enabling - idxd: device state fix, interrupt completion and list corruption - ti: resource leak * tag 'dmaengine-fix2-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: dmaengine dw: Revert "dmaengine: dw: Enable runtime PM" dmaengine: idxd: check device state before issue command dmaengine: ti: k3-udma: Fix a resource leak in an error handling path dmaengine: move channel device_node deletion to driver dmaengine: idxd: fix misc interrupt completion dmaengine: idxd: Fix list corruption in description completion
2021-02-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: "Another pile of networing fixes: 1) ath9k build error fix from Arnd Bergmann 2) dma memory leak fix in mediatec driver from Lorenzo Bianconi. 3) bpf int3 kprobe fix from Alexei Starovoitov. 4) bpf stackmap integer overflow fix from Bui Quang Minh. 5) Add usb device ids for Cinterion MV31 to qmi_qwwan driver, from Christoph Schemmel. 6) Don't update deleted entry in xt_recent netfilter module, from Jazsef Kadlecsik. 7) Use after free in nftables, fix from Pablo Neira Ayuso. 8) Header checksum fix in flowtable from Sven Auhagen. 9) Validate user controlled length in qrtr code, from Sabyrzhan Tasbolatov. 10) Fix race in xen/netback, from Juergen Gross, 11) New device ID in cxgb4, from Raju Rangoju. 12) Fix ring locking in rxrpc release call, from David Howells. 13) Don't return LAPB error codes from x25_open(), from Xie He. 14) Missing error returns in gsi_channel_setup() from Alex Elder. 15) Get skb_copy_and_csum_datagram working properly with odd segment sizes, from Willem de Bruijn. 16) Missing RFS/RSS table init in enetc driver, from Vladimir Oltean. 17) Do teardown on probe failure in DSA, from Vladimir Oltean. 18) Fix compilation failures of txtimestamp selftest, from Vadim Fedorenko. 19) Limit rx per-napi gro queue size to fix latency regression, from Eric Dumazet. 20) dpaa_eth xdp fixes from Camelia Groza. 21) Missing txq mode update when switching CBS off, in stmmac driver, from Mohammad Athari Bin Ismail. 22) Failover pending logic fix in ibmvnic driver, from Sukadev Bhattiprolu. 23) Null deref fix in vmw_vsock, from Norbert Slusarek. 24) Missing verdict update in xdp paths of ena driver, from Shay Agroskin. 25) seq_file iteration fix in sctp from Neil Brown. 26) bpf 32-bit src register truncation fix on div/mod, from Daniel Borkmann. 27) Fix jmp32 pruning in bpf verifier, from Daniel Borkmann. 28) Fix locking in vsock_shutdown(), from Stefano Garzarella. 29) Various missing index bound checks in hns3 driver, from Yufeng Mo. 30) Flush ports on .phylink_mac_link_down() in dsa felix driver, from Vladimir Oltean. 31) Don't mix up stp and mrp port states in bridge layer, from Horatiu Vultur. 32) Fix locking during netif_tx_disable(), from Edwin Peer" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits) bpf: Fix 32 bit src register truncation on div/mod bpf: Fix verifier jmp32 pruning decision logic bpf: Fix verifier jsgt branch analysis on max bound vsock: fix locking in vsock_shutdown() net: hns3: add a check for index in hclge_get_rss_key() net: hns3: add a check for tqp_index in hclge_get_ring_chain_from_mbx() net: hns3: add a check for queue_id in hclge_reset_vf_queue() net: dsa: felix: implement port flushing on .phylink_mac_link_down switchdev: mrp: Remove SWITCHDEV_ATTR_ID_MRP_PORT_STAT bridge: mrp: Fix the usage of br_mrp_port_switchdev_set_state net: watchdog: hold device global xmit lock during tx disable netfilter: nftables: relax check for stateful expressions in set definition netfilter: conntrack: skip identical origin tuple in same zone only vsock/virtio: update credit only if socket is not closed net: fix iteration for sctp transport seq_files net: ena: Update XDP verdict upon failure net/vmw_vsock: improve locking in vsock_connect_timeout() net/vmw_vsock: fix NULL pointer dereference ibmvnic: Clear failover_pending if unable to schedule net: stmmac: set TxQ mode back to DCB after disabling CBS ...
2021-02-10drivers/perf: Replace spin_lock_irqsave to spin_lockQi Liu
There is no need to do spin_lock_irqsave in context of hard IRQ, so replace them with spin_lock. Signed-off-by: Qi Liu <liuqi115@huawei.com> Link: https://lore.kernel.org/r/1612863742-1551-1-git-send-email-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-02-10dm era: Use correct value size in equality function of writeset treeNikos Tsironis
Fix the writeset tree equality test function to use the right value size when comparing two btree values. Fixes: eec40579d84873 ("dm: add era target") Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com> Reviewed-by: Ming-Hung Tsai <mtsai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-02-10dm era: Fix bitset memory leaksNikos Tsironis
Deallocate the memory allocated for the in-core bitsets when destroying the target and in error paths. Fixes: eec40579d84873 ("dm: add era target") Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com> Reviewed-by: Ming-Hung Tsai <mtsai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-02-10dm era: Verify the data block size hasn't changedNikos Tsironis
dm-era doesn't support changing the data block size of existing devices, so check explicitly that the requested block size for a new target matches the one stored in the metadata. Fixes: eec40579d84873 ("dm: add era target") Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com> Reviewed-by: Ming-Hung Tsai <mtsai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-02-10dm era: Reinitialize bitset cache before digesting a new writesetNikos Tsironis
In case of devices with at most 64 blocks, the digestion of consecutive eras uses the writeset of the first era as the writeset of all eras to digest, leading to lost writes. That is, we lose the information about what blocks were written during the affected eras. The digestion code uses a dm_disk_bitset object to access the archived writesets. This structure includes a one word (64-bit) cache to reduce the number of array lookups. This structure is initialized only once, in metadata_digest_start(), when we kick off digestion. But, when we insert a new writeset into the writeset tree, before the digestion of the previous writeset is done, or equivalently when there are multiple writesets in the writeset tree to digest, then all these writesets are digested using the same cache and the cache is not re-initialized when moving from one writeset to the next. For devices with more than 64 blocks, i.e., the size of the cache, the cache is indirectly invalidated when we move to a next set of blocks, so we avoid the bug. But for devices with at most 64 blocks we end up using the same cached data for digesting all archived writesets, i.e., the cache is loaded when digesting the first writeset and it never gets reloaded, until the digestion is done. As a result, the writeset of the first era to digest is used as the writeset of all the following archived eras, leading to lost writes. Fix this by reinitializing the dm_disk_bitset structure, and thus invalidating the cache, every time the digestion code starts digesting a new writeset. Fixes: eec40579d84873 ("dm: add era target") Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-02-10dm era: Update in-core bitset after committing the metadataNikos Tsironis
In case of a system crash, dm-era might fail to mark blocks as written in its metadata, although the corresponding writes to these blocks were passed down to the origin device and completed successfully. Consider the following sequence of events: 1. We write to a block that has not been yet written in the current era 2. era_map() checks the in-core bitmap for the current era and sees that the block is not marked as written. 3. The write is deferred for submission after the metadata have been updated and committed. 4. The worker thread processes the deferred write (process_deferred_bios()) and marks the block as written in the in-core bitmap, **before** committing the metadata. 5. The worker thread starts committing the metadata. 6. We do more writes that map to the same block as the write of step (1) 7. era_map() checks the in-core bitmap and sees that the block is marked as written, **although the metadata have not been committed yet**. 8. These writes are passed down to the origin device immediately and the device reports them as completed. 9. The system crashes, e.g., power failure, before the commit from step (5) finishes. When the system recovers and we query the dm-era target for the list of written blocks it doesn't report the aforementioned block as written, although the writes of step (6) completed successfully. The issue is that era_map() decides whether to defer or not a write based on non committed information. The root cause of the bug is that we update the in-core bitmap, **before** committing the metadata. Fix this by updating the in-core bitmap **after** successfully committing the metadata. Fixes: eec40579d84873 ("dm: add era target") Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-02-10dm era: Recover committed writeset after crashNikos Tsironis
Following a system crash, dm-era fails to recover the committed writeset for the current era, leading to lost writes. That is, we lose the information about what blocks were written during the affected era. dm-era assumes that the writeset of the current era is archived when the device is suspended. So, when resuming the device, it just moves on to the next era, ignoring the committed writeset. This assumption holds when the device is properly shut down. But, when the system crashes, the code that suspends the target never runs, so the writeset for the current era is not archived. There are three issues that cause the committed writeset to get lost: 1. dm-era doesn't load the committed writeset when opening the metadata 2. The code that resizes the metadata wipes the information about the committed writeset (assuming it was loaded at step 1) 3. era_preresume() starts a new era, without taking into account that the current era might not have been archived, due to a system crash. To fix this: 1. Load the committed writeset when opening the metadata 2. Fix the code that resizes the metadata to make sure it doesn't wipe the loaded writeset 3. Fix era_preresume() to check for a loaded writeset and archive it, before starting a new era. Fixes: eec40579d84873 ("dm: add era target") Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-02-10Merge back ACPICA material for v5.12.Rafael J. Wysocki
2021-02-10Merge back cpufreq updates for v5.12.Rafael J. Wysocki
2021-02-10ACPI: OSL: Clean up printing messagesRafael J. Wysocki
Replace the ACPI_DEBUG_PRINT() instance in osl.c unrelated to the ACPICA debug with acpi_handle_debug(), add a pr_fmt() definition to osl.c and replace direct printk() usage in that file with the suitable pr_*() calls. While at it, add a physical address value to the message in acpi_os_map_iomem() and reword a couple of messages to avoid using function names in them. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-02-10staging: rtl8723bs: remove blank line from include/autoconf.hPhillip Potter
Remove additional blank line from include/autoconf.h, fixes one checkpatch check notice. Signed-off-by: Phillip Potter <phil@philpotter.co.uk> Link: https://lore.kernel.org/r/20210210170024.100937-1-phil@philpotter.co.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-10spi: pxa2xx: Add IDs for the controllers found on Intel LynxpointAndy Shevchenko
Add IDs for the controllers found on Intel Lynxpoint. In particular it's Macbook Air 6,2 devices. Cc: Leif Liddy <leif.liddy@gmail.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20210208163816.22147-2-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-02-10spi: pxa2xx: Fix the controller numbering for Wildcat PointAndy Shevchenko
Wildcat Point has two SPI controllers and added one is actually second one. Fix the numbering by adding the description of the first one. Fixes: caba248db286 ("spi: spi-pxa2xx-pci: Add ID and driver type for WildcatPoint PCH") Cc: Leif Liddy <leif.liddy@gmail.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20210208163816.22147-1-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-02-10nbd: Convert to DEFINE_SHOW_ATTRIBUTELiao Pingfang
Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code. Signed-off-by: Liao Pingfang <winndows@163.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-10nvme: add 48-bit DMA address quirk for Amazon NVMe controllersFilippo Sironi
Some Amazon NVMe controllers do not follow the NVMe specification and are limited to 48-bit DMA addresses. Add a quirk to force bounce buffering if needed and limit the IOVA allocation for these devices. This affects all current Amazon NVMe controllers that expose EBS volumes (0x0061, 0x0065, 0x8061) and local instance storage (0xcd00, 0xcd01, 0xcd02). Signed-off-by: Filippo Sironi <sironi@amazon.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-10nvme-hwmon: rework to avoid devm allocationHannes Reinecke
The original design to use device-managed resource allocation doesn't really work as the NVMe controller has a vastly different lifetime than the hwmon sysfs attributes, causing warning about duplicate sysfs entries upon reconnection. This patch reworks the hwmon allocation to avoid device-managed resource allocation, and uses the NVMe controller as parent for the sysfs attributes. Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Hannes Reinecke <hare@suse.de> Tested-by: Enzo Matsumiya <ematsumiya@suse.de> Tested-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-10nvmet: remove else at the end of the functionChaitanya Kulkarni
The function nvmet_parse_io_cmd() returns value from nvmet_file_parse_io_cmd() or nvmet_bdev_parse_io_cmd() based on which backend is set for the request. Remove the else and just return the value from nvmet_bdev_parse_io_cmd(). Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-10nvmet: add nvmet_req_subsys() helperChaitanya Kulkarni
Just like what we have to get the passthru ctrl from the req, add an helper to get the subsystem associated with the nvmet_req() instead of open coding the chain of structures. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-10nvmet: use min of device_path and disk lenChaitanya Kulkarni
In function __assign_req_name() instead of using the DEVICE_NAME_LEN in strncpy() use min of DISK_NAME_LEN and strlen(req->ns->device_path). This is needed to turn off the following warnings:- In file included from drivers/nvme/target/core.c:14: In function ‘__assign_req_name’, inlined from ‘trace_event_raw_event_nvmet_req_init’ at drivers/nvme/target/./trace.h:58:1: drivers/nvme/target/trace.h:52:3: warning: ‘strncpy’ specified bound 32 equals destination size [-Wstringop-truncation] strncpy(name, req->ns->device_path, DISK_NAME_LEN); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In function ‘__assign_req_name’, inlined from ‘perf_trace_nvmet_req_complete’ at drivers/nvme/target/./trace.h:100:1: drivers/nvme/target/trace.h:52:3: warning: ‘strncpy’ specified bound 32 equals destination size [-Wstringop-truncation] strncpy(name, req->ns->device_path, DISK_NAME_LEN); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In function ‘__assign_req_name’, inlined from ‘perf_trace_nvmet_req_init’ at drivers/nvme/target/./trace.h:58:1: drivers/nvme/target/trace.h:52:3: warning: ‘strncpy’ specified bound 32 equals destination size [-Wstringop-truncation] strncpy(name, req->ns->device_path, DISK_NAME_LEN); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In function ‘__assign_req_name’, inlined from ‘trace_event_raw_event_nvmet_req_complete’ at drivers/nvme/target/./trace.h:100:1: drivers/nvme/target/trace.h:52:3: warning: ‘strncpy’ specified bound 32 equals destination size [-Wstringop-truncation] strncpy(name, req->ns->device_path, DISK_NAME_LEN); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-10nvmet: use invalid cmd opcode helperChaitanya Kulkarni
In the NVMeOF block device backend, file backend, and passthru backend we reject and report the commands if opcode is not handled. Use the previously introduced helper in the passthru backend to make the error message uniform. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-10nvmet: use invalid cmd opcode helperChaitanya Kulkarni
In the NVMeOF block device backend, file backend, and passthru backend we reject and report the commands if opcode is not handled. Use the previously introduced helper in file backend to reduce the duplicate code and make the error message uniform. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-10nvmet: add helper to report invalid opcodeChaitanya Kulkarni
In the NVMeOF block device backend, file backend, and passthru backend we reject and report the commands if opcode is not handled. Add an helper and use it in block device backend to keep the code and error message uniform. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-10nvmet: remove extra variable in id-ns handlerChaitanya Kulkarni
In nvmet_execute_identify_ns() local variable ctrl is accessed only in one place, remove that and directly use it from nvmet_req->sq->ctrl. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-10nvmet: make nvmet_find_namespace() req basedChaitanya Kulkarni
The six callers of nvmet_find_namespace() duplicate the error log page update and status setting code for each call on failure. All callers are nvmet requests based functions, so we can pass req to the nvmet_find_namesapce() & derive ctrl from req, that'll allow us to update the error log page in nvmet_find_namespace(). Now that we pass the request we can also get rid of the local variable in nvmet_find_namespace() and use the req->ns and return the error code. Replace the ctrl parameter with nvmet_req for nvmet_find_namespace(), centralize the error log page update for non allocated namesapces, and return uniform error for non-allocated namespace. The nvmet_find_namespace() takes nsid parameter which is from NVMe commands structures such as get_log_page, identify, rw and common. All these commands have same offset for the nsid field. Derive nsid from req->cmd->common.nsid) & remove the extra parameter from the nvmet_find_namespace(). Lastly now we associate the ns to the req parameter that we pass to the nvmet_find_namespace(), rename nvmet_find_namespace() to nvmet_req_find_ns(). Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-10nvmet: return uniform error for invalid nsChaitanya Kulkarni
For nvmet_find_namespace() error case we have inconsistent error code mapping in the function nvmet_get_smart_log_nsid() and nvmet_set_feat_write_protect(). There is no point in retrying for the invalid namesapce from the host side. Set the error code to the NVME_SC_INVALID_NS | NVME_SC_DNR which matches what we have in nvmet_execute_identify_desclist(). Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-10nvmet: set status to 0 in case for invalid nsidChaitanya Kulkarni
For unallocated namespace in nvmet_execute_identify_ns() don't set the status to NVME_SC_INVALID_NS, set it to zero. Fixes: bffcd507780e ("nvmet: set right status on error in id-ns handler") Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-10nvmet-fc: add a missing __rcu annotation to nvmet_fc_tgt_assoc.queuesChristoph Hellwig
Make sparse happy after the recent conversion to RCU lookups. Fixes: 4e2f02bf77da ("nvmet-fc: use RCU proctection for assoc_list") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: James Smart <james.smart@broadcom.com>
2021-02-10nvme-multipath: set nr_zones for zoned namespacesKeith Busch
The bio based drivers only require the request_queue's nr_zones is set, so set this field in the head if the namespace path is zoned. Fixes: 240e6ee272c07 ("nvme: support for zoned namespaces") Reported-by: Minwoo Im <minwoo.im.dev@gmail.com> Cc: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-10nvmet-tcp: fix potential race of tcp socket closing accept_workSagi Grimberg
When we accept a TCP connection and allocate an nvmet-tcp queue we should make sure not to fully establish it or reference it as the connection may be already closing, which triggers queue release work, which does not fence against queue establishment. In order to address such a race, we make sure to check the sk_state and contain the queue reference to be done underneath the sk_callback_lock such that the queue release work correctly fences against it. Fixes: 872d26a391da ("nvmet-tcp: add NVMe over TCP target driver") Reported-by: Elad Grupi <elad.grupi@dell.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-10nvmet-tcp: fix receive data digest calculation for multiple h2cdata PDUsSagi Grimberg
When a host sends multiple h2cdata PDUs for a single command, we should verify the data digest calculation per PDU and not per command. Fixes: 872d26a391da ("nvmet-tcp: add NVMe over TCP target driver") Reported-by: Narayan Ayalasomayajula <Narayan.Ayalasomayajula@wdc.com> Tested-by: Narayan Ayalasomayajula <Narayan.Ayalasomayajula@wdc.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-10nvme-rdma: handle nvme_rdma_post_send failures betterChao Leng
nvme_rdma_post_send failing is a path related error and should bounce to another path when using nvme-multipath. Call nvme_host_path_error when nvme_rdma_post_send returns -EIO to ensure nvme_complete_rq gets invoked to fail over to another path if there is one. Signed-off-by: Chao Leng <lengchao@huawei.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-10nvme-fabrics: avoid double completions in nvmf_fail_nonready_commandChao Leng
When reconnecting, the request may be completed with NVME_SC_HOST_PATH_ERROR in nvmf_fail_nonready_command, which currently set the state of the request to MQ_RQ_IN_FLIGHT before calling nvme_complete_rq. When this happens for a request that is freed by the caller, such as nvme_submit_user_cmd, in the worst case the request could be completed again in tear down process. Instead of calling blk_mq_start_request from nvmf_fail_nonready_command, just use the new nvme_host_path_error helper to complete the command without starting it. Signed-off-by: Chao Leng <lengchao@huawei.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-10nvme: introduce a nvme_host_path_error helperChao Leng
When using nvme native multipathing, if a path related error occurs during ->queue_rq, the request needs to be completed with NVME_SC_HOST_PATH_ERROR so that the request can be failed over. Introduce a helper to complete the command from ->queue_rq in a wait that invokes nvme_complete_rq. Signed-off-by: Chao Leng <lengchao@huawei.com> [hch: renamed, added a return value to clean up the callers a bit] Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-10nvme: convert sysfs sprintf/snprintf family to sysfs_emitJiapeng Chong
Fix the following coccicheck warning: ./drivers/nvme/host/core.c:3580:8-16: WARNING: use scnprintf or sprintf. ./drivers/nvme/host/core.c:3570:8-16: WARNING: use scnprintf or sprintf. ./drivers/nvme/host/core.c:3560:8-16: WARNING: use scnprintf or sprintf. ./drivers/nvme/host/core.c:3526:8-16: WARNING: use scnprintf or sprintf. ./drivers/nvme/host/core.c:2833:8-16: WARNING: use scnprintf or sprintf. Reported-by: Abaci Robot<abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-10bcache: Avoid comma separated statementsJoe Perches
Use semicolons and braces. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-10bcache: Move journal work to new flush wqKai Krakow
This is potentially long running and not latency sensitive, let's get it out of the way of other latency sensitive events. As observed in the previous commit, the `system_wq` comes easily congested by bcache, and this fixes a few more stalls I was observing every once in a while. Let's not make this `WQ_MEM_RECLAIM` as it showed to reduce performance of boot and file system operations in my tests. Also, without `WQ_MEM_RECLAIM`, I no longer see desktop stalls. This matches the previous behavior as `system_wq` also does no memory reclaim: > // workqueue.c: > system_wq = alloc_workqueue("events", 0, 0); Cc: Coly Li <colyli@suse.de> Cc: stable@vger.kernel.org # 5.4+ Signed-off-by: Kai Krakow <kai@kaishome.de> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-10bcache: Give btree_io_wq correct semantics againKai Krakow
Before killing `btree_io_wq`, the queue was allocated using `create_singlethread_workqueue()` which has `WQ_MEM_RECLAIM`. After killing it, it no longer had this property but `system_wq` is not single threaded. Let's combine both worlds and make it multi threaded but able to reclaim memory. Cc: Coly Li <colyli@suse.de> Cc: stable@vger.kernel.org # 5.4+ Signed-off-by: Kai Krakow <kai@kaishome.de> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-10Revert "bcache: Kill btree_io_wq"Kai Krakow
This reverts commit 56b30770b27d54d68ad51eccc6d888282b568cee. With the btree using the `system_wq`, I seem to see a lot more desktop latency than I should. After some more investigation, it looks like the original assumption of 56b3077 no longer is true, and bcache has a very high potential of congesting the `system_wq`. In turn, this introduces laggy desktop performance, IO stalls (at least with btrfs), and input events may be delayed. So let's revert this. It's important to note that the semantics of using `system_wq` previously mean that `btree_io_wq` should be created before and destroyed after other bcache wqs to keep the same assumptions. Cc: Coly Li <colyli@suse.de> Cc: stable@vger.kernel.org # 5.4+ Signed-off-by: Kai Krakow <kai@kaishome.de> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-10bcache: Fix register_device_aync typoKai Krakow
Should be `register_device_async`. Cc: Coly Li <colyli@suse.de> Signed-off-by: Kai Krakow <kai@kaishome.de> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-10bcache: consider the fragmentation when update the writeback ratedongdong tao
Current way to calculate the writeback rate only considered the dirty sectors, this usually works fine when the fragmentation is not high, but it will give us unreasonable small rate when we are under a situation that very few dirty sectors consumed a lot dirty buckets. In some case, the dirty bucekts can reached to CUTOFF_WRITEBACK_SYNC while the dirty data(sectors) not even reached the writeback_percent, the writeback rate will still be the minimum value (4k), thus it will cause all the writes to be stucked in a non-writeback mode because of the slow writeback. We accelerate the rate in 3 stages with different aggressiveness, the first stage starts when dirty buckets percent reach above BCH_WRITEBACK_FRAGMENT_THRESHOLD_LOW (50), the second is BCH_WRITEBACK_FRAGMENT_THRESHOLD_MID (57), the third is BCH_WRITEBACK_FRAGMENT_THRESHOLD_HIGH (64). By default the first stage tries to writeback the amount of dirty data in one bucket (on average) in (1 / (dirty_buckets_percent - 50)) second, the second stage tries to writeback the amount of dirty data in one bucket in (1 / (dirty_buckets_percent - 57)) * 100 millisecond, the third stage tries to writeback the amount of dirty data in one bucket in (1 / (dirty_buckets_percent - 64)) millisecond. the initial rate at each stage can be controlled by 3 configurable parameters writeback_rate_fp_term_{low|mid|high}, they are by default 1, 10, 1000, the hint of IO throughput that these values are trying to achieve is described by above paragraph, the reason that I choose those value as default is based on the testing and the production data, below is some details: A. When it comes to the low stage, there is still a bit far from the 70 threshold, so we only want to give it a little bit push by setting the term to 1, it means the initial rate will be 170 if the fragment is 6, it is calculated by bucket_size/fragment, this rate is very small, but still much reasonable than the minimum 8. For a production bcache with unheavy workload, if the cache device is bigger than 1 TB, it may take hours to consume 1% buckets, so it is very possible to reclaim enough dirty buckets in this stage, thus to avoid entering the next stage. B. If the dirty buckets ratio didn't turn around during the first stage, it comes to the mid stage, then it is necessary for mid stage to be more aggressive than low stage, so i choose the initial rate to be 10 times more than low stage, that means 1700 as the initial rate if the fragment is 6. This is some normal rate we usually see for a normal workload when writeback happens because of writeback_percent. C. If the dirty buckets ratio didn't turn around during the low and mid stages, it comes to the third stage, and it is the last chance that we can turn around to avoid the horrible cutoff writeback sync issue, then we choose 100 times more aggressive than the mid stage, that means 170000 as the initial rate if the fragment is 6. This is also inferred from a production bcache, I've got one week's writeback rate data from a production bcache which has quite heavy workloads, again, the writeback is triggered by the writeback percent, the highest rate area is around 100000 to 240000, so I believe this kind aggressiveness at this stage is reasonable for production. And it should be mostly enough because the hint is trying to reclaim 1000 bucket per second, and from that heavy production env, it is consuming 50 bucket per second on average in one week's data. Option writeback_consider_fragment is to control whether we want this feature to be on or off, it's on by default. Lastly, below is the performance data for all the testing result, including the data from production env: https://docs.google.com/document/d/1AmbIEa_2MhB9bqhC3rfga9tp7n9YX9PLn0jSUxscVW0/edit?usp=sharing Signed-off-by: dongdong tao <dongdong.tao@canonical.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-10usb: quirks: add quirk to start video capture on ELMO L-12F document camera ↵Stefan Ursella
reliable Without this quirk starting a video capture from the device often fails with kernel: uvcvideo: Failed to set UVC probe control : -110 (exp. 34). Signed-off-by: Stefan Ursella <stefan.ursella@wolfvision.net> Link: https://lore.kernel.org/r/20210210140713.18711-1-stefan.ursella@wolfvision.net Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>