summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-06-01ASoC: topology: Move skl-tplg-interface.h to uapiGuenter Roeck
skl-tplg-interface.h describes firmware format details for Skylake topology files. It is part of the ABI and should reside in the uapi directory. While moving the file, also replace the license boilerplate with the SPDX License Identifier. No functional change. Signed-off-by: Guenter Roeck <groeck@chromium.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-06-01ASoC: topology: Move v4 manifest header data structures to uapiGuenter Roeck
Topology manifest v4 is still part of the ABI. Move its data structures into the uapi header file. No functional change. Signed-off-by: Guenter Roeck <groeck@chromium.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-06-01ASoC: topology: Improve backwards compatibility with v4 topology filesGuenter Roeck
Commit dc31e741db49 ("ASoC: topology: ABI - Add the types for BE DAI") introduced sound topology files version 5. Initially, this change made the topology code incompatible with v4 topology files. Backwards compatibility with v4 configuration files was subsequently added with commit 288b8da7e992 ("ASoC: topology: Support topology file of ABI v4"). Unfortunately, backwards compatibility was never fully implemented. First, the manifest size in (Skylake) v4 configuration files is set to 0, which causes manifest_new_ver() to bail out with error messages similar to the following. snd_soc_skl 0000:00:1f.3: ASoC: invalid manifest size snd_soc_skl 0000:00:1f.3: tplg component load failed-22 snd_soc_skl 0000:00:1f.3: Failed to init topology! snd_soc_skl 0000:00:1f.3: ASoC: failed to probe component -22 skl_n88l25_m98357a skl_n88l25_m98357a: ASoC: failed to instantiate card -22 skl_n88l25_m98357a: probe of skl_n88l25_m98357a failed with error -22 After this problem is fixed, the following error message is seen instead. snd_soc_skl 0000:00:1f.3: ASoC: old version of manifest snd_soc_skl 0000:00:1f.3: Invalid descriptor token 1093938482 snd_soc_skl 0000:00:1f.3: ASoC: failed to load widget media0_in cpr 0 snd_soc_skl 0000:00:1f.3: tPlg component load failed-22 This message is seen because backwards compatibility for loading widgets was never implemented. The lack of audio support when running the upstream kernel on recent Chromebooks has been reported in various forums, and can be traced back to this problem. Attempts to fix the problem, usually by providing v5 configuration files, were only partially successful. Let's implement backward compatibility properly to solve the problem for good. Signed-off-by: Guenter Roeck <groeck@chromium.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-06-01hwmon: (asus_atk0110) Make use of device managed memoryBastian Germann
Use devm_* variants of kstrdup and kzalloc. Get rid of kfree cleanups. Signed-off-by: Bastian Germann <bastiangermann@fishpost.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-06-01hwmon: (asus_atk0110) Replace deprecated device register callBastian Germann
Make the asus_atk0110 driver use hwmon_device_register_with_groups instead of the deprecated hwmon_device_register. Construct the expected attribute_group array from the sensor list which contains all needed attributes. Remove the manual sysfs file creation and deletion that are now taken care of by the (un)register calls via the attribute_group array. Signed-off-by: Bastian Germann <bastiangermann@fishpost.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-06-01tools/power turbostat: fix MSR_IA32_MISC_ENABLE MWAIT printoutLen Brown
MSR_IA32_MISC_ENABLE[18] is the MWAIT ENABLE bit, not DISABLE bit... so MSR_IA32_MISC_ENABLE: 0x00850089 (TCC EIST No-MWAIT PREFETCH TURBO) should print as: MSR_IA32_MISC_ENABLE: 0x00850089 (TCC EIST MWAIT PREFETCH TURBO) Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01tools/power turbostat: fix printing on inputArtem Bityutskiy
The recent patch that implements table printing on a keypress introduced a regression - turbostat prints the table almost continuously if it is run from a daemon program. The problem is also easy to reproduce like this: echo | turbostat The reason is that we cannot assume that stdin is always a TTY. It can be many things. This patch adds fixes the problem by limiting the new keypress functionality to TTYs only. If stdin is not a TTY, we just sleep for the full interval time. While on it, clean-up 'do_sleep()' to return no value, as callers do not expect that anyway. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01tools/power turbostat: end current interval upon newline inputLen Brown
In turbostat interval mode, a newline typed on standard input will now conclude the current interval. Data will immediately be collected and printed for that interval, and the next interval will be started. This is similar to the recently added SIGUSR1 feature. But that is for use by programs, while this is for interactive use. Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01tools/power turbostat: on SIGUSR1: sample, print and continueLen Brown
Interval-mode turbostat now catches and discards SIGUSR1. Thus, SIGUSR1 can be used to tell turbostat to cut short the current measurement interval. Turbostat will then start the next measurement interval using the regular interval length. This can be used to give turbostat variable intervals. Invoke turbostat with --interval LARGE_NUMBER_SEC and have a program that has permission to send it a SIGUSR1 always before LARGE_NUMBER_SEC expires. It may also be useful to use "--enable Time_Of_Day_Seconds" to observe the actual interval length. Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01tools/power turbostat: on SIGINT: sample, print and exitLen Brown
When running in interval-mode, catch interrupts and print a final data record before exiting. Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01tools/power turbostat: add --enable Time_Of_Day_SecondsLen Brown
Add a Time_Of_Day_Seconds column showing when measurement for each row was completed. Units are [sec.subsec] since Epoch, as reported by gettimeofday(2). While useful to correlate turbostat output with other tools, this built-in column is disabled, by default. Add the "--enable" option to enable such disabled-by-default built-in columns: "--enable Time_Of_Day_Seconds" "--enable usec" "--enable all", will enable all disabled-by-defauilt built-in counters. When "--debug" is used, all disabled-by-default columns are enabled, unless explicitly skipped using "--hide" Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01tools/power turbostat: fix Skylake Xeon package C-state displayArtem Bityutskiy
Turbostat neglects to display all package C-states for some Skylake Xeon BIOS configurations. This is due to a typo in the table decoding MSR_PKG_CST_CONFIG_CONTROL (0x000000e2) Here we fix that typo, according to Intel SDM, vol 4, Table 2-41 - "MSRs Supported by Intel® Xeon® Processor Scalable Family with DisplayFamily_DisplayModel 06_55H". Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01MAINTAINERS: add turbostat utilityLen Brown
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01xfs: fix error handling in xfs_refcount_insert()Dave Chinner
generic/475 fired an assert failure just after the filesystem was shut down: XFS: Assertion failed: fs_is_ok, file: fs/xfs/libxfs/xfs_refcount.c, line: 182 ..... Call Trace: xfs_refcount_insert+0x151/0x190 xfs_refcount_adjust_extents.constprop.11+0x9c/0x470 xfs_refcount_adjust.constprop.10+0xb0/0x270 xfs_refcount_finish_one+0x25a/0x420 xfs_trans_log_finish_refcount_update+0x2a/0x40 xfs_refcount_update_finish_item+0x35/0xa0 xfs_defer_finish+0x15e/0x4d0 xfs_reflink_remap_extent+0x1bc/0x610 xfs_reflink_remap_blocks+0x6e/0x280 xfs_reflink_remap_range+0x311/0x530 vfs_clone_file_range+0x119/0x200 .... If xfs_btree_insert() returns an error, the corruption check fires instead of passing the error back the caller. The corruption check should be after we've checked for an error, not before, thereby avoiding assert failures if the filesystem shuts down during a refcount btree record insert. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-01xfs: fix xfs_rtalloc_rec unitsDarrick J. Wong
All the realtime allocation functions deal with space on the rtdev in units of realtime extents. However, struct xfs_rtalloc_rec confusingly uses the word 'block' in the name, even though they're really extents. Fix the naming problem and fix all the unit handling problems in the two existing users. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com>
2018-06-01xfs: strengthen rtalloc query range checksDarrick J. Wong
Strengthen the rtalloc range query checks to make sure that the keys do not run off the end of the realtime device inappropriately. Note that the query range functions require units of rt extents, not blocks, despite the type name. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com>
2018-06-01xfs: xfs_rtbuf_get should check the bmapi_read resultsDarrick J. Wong
The xfs_rtbuf_get function should check the block mapping it gets back from bmapi_read. If there are no mappings or the mapping isn't a real extent, we should return -EFSCORRUPTED rather than trying to read a garbage value. We also require realtime bitmap blocks to be real, written allocations. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com>
2018-06-01xfs: xfs_rtword_t should be unsigned, not signedDarrick J. Wong
xfs_rtword_t is used for bit manipulations in the realtime bitmap file. Since we're performing bit shifts with this type, we don't want sign extension and we don't want to be left shifting negative quantities because that's undefined behavior. This also shuts up these UBSAN warnings: UBSAN: Undefined behaviour in fs/xfs/libxfs/xfs_rtbitmap.c:833:48 signed integer overflow: -2147483648 - 1 cannot be represented in type 'int' Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com>
2018-06-01hwmon: (k10temp) Make function get_raw_temp staticColin Ian King
The function get_raw_temp is local to the source and does not need to be in global scope, so make it static. Cleans up sparse warning: drivers/hwmon/k10temp.c:149:14: warning: symbol 'get_raw_temp' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-06-02powerpc/mm: Fix kernel crash on page table freeAneesh Kumar K.V
Fix the below crash on Book3E 64. pgtable_page_dtor expects struct page *arg. Also call the destructor on non book3s platforms correctly. This frees up the split PTL locks correctly if we had allocated them before. Call Trace: .kmem_cache_free+0x9c/0x44c (unreliable) .ptlock_free+0x1c/0x30 .tlb_remove_table+0xdc/0x224 .free_pgd_range+0x298/0x500 .shift_arg_pages+0x10c/0x1e0 .setup_arg_pages+0x200/0x25c .load_elf_binary+0x450/0x16c8 .search_binary_handler.part.11+0x9c/0x248 .do_execveat_common.isra.13+0x868/0xc18 .run_init_process+0x34/0x4c .try_to_run_init_process+0x1c/0x68 .kernel_init+0xdc/0x130 .ret_from_kernel_thread+0x58/0x7c Fixes: 702346768 ("powerpc/mm/nohash: Remove pte fragment dependency from nohash") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-02powerpc/prom: Fix %u/%llx usage since prom_printf() changeMathieu Malaterre
In commit eae5f709a4d7 ("powerpc: Add __printf verification to prom_printf") __printf attribute was added to prom_printf(), which means GCC started warning about type/format mismatches. As part of that commit we changed some "%lx" formats to "%llx" where the type is actually unsigned long long. Unfortunately prom_printf() doesn't know how to print "%llx", it just prints a literal "lx", eg: reserved memory map: lx - lx lx - lx prom_printf() also doesn't know how to print "%u" (only "%lu"), it just prints a literal "u", eg: Max number of cores passed to firmware: u (NR_CPUS = 2048) Instead of: Max number of cores passed to firmware: 2048 (NR_CPUS = 2048) This commit adds support for the missing formatters. Fixes: eae5f709a4d7 ("powerpc: Add __printf verification to prom_printf") Reported-by: Michael Ellerman <mpe@ellerman.id.au> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Mathieu Malaterre <malat@debian.org> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-02cxl: Configure PSL to not use APC virtual machinesVaibhav Jain
APC virtual machines arent used on POWER-9 chips and are already disabled in on-chip CAPP. They also need to be disabled on the PSL via 'PSL Data Send Control Register' by setting bit(47). This forces the PSL to send commands to CAPP with queue.id == 0. Fixes: 5632874311db ("cxl: Add support for POWER9 DD2") Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Alastair D'Silva <alastair@d-silva.org> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-02cxl: Disable prefault_mode in Radix modeVaibhav Jain
Currently we see a kernel-oops reported on Power-9 while attaching a context to an AFU, with radix-mode and sysfs attr 'prefault_mode' set to anything other than 'none'. The backtrace of the oops is of this form: Unable to handle kernel paging request for data at address 0x00000080 Faulting instruction address: 0xc00800000bcf3b20 cpu 0x1: Vector: 300 (Data Access) at [c00000037f003800] pc: c00800000bcf3b20: cxl_load_segment+0x178/0x290 [cxl] lr: c00800000bcf39f0: cxl_load_segment+0x48/0x290 [cxl] sp: c00000037f003a80 msr: 9000000000009033 dar: 80 dsisr: 40000000 current = 0xc00000037f280000 paca = 0xc0000003ffffe600 softe: 3 irq_happened: 0x01 pid = 3529, comm = afp_no_int <snip> cxl_prefault+0xfc/0x248 [cxl] process_element_entry_psl9+0xd8/0x1a0 [cxl] cxl_attach_dedicated_process_psl9+0x44/0x130 [cxl] native_attach_process+0xc0/0x130 [cxl] afu_ioctl+0x3f4/0x5e0 [cxl] do_vfs_ioctl+0xdc/0x890 ksys_ioctl+0x68/0xf0 sys_ioctl+0x40/0xa0 system_call+0x58/0x6c The issue is caused as on Power-8 the AFU attr 'prefault_mode' was used to improve initial storage fault performance by prefaulting process segments. However on Power-9 with radix mode we don't have Storage-Segments that we can prefault. Also prefaulting process Pages will be too costly and fine-grained. Hence, since the prefaulting mechanism doesn't makes sense of radix-mode, this patch updates prefault_mode_store() to not allow any other value apart from CXL_PREFAULT_NONE when radix mode is enabled. Fixes: f24be42aab37 ("cxl: Add psl9 specific code") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-01ALSA: pci/hda: Remove unused, broken, header fileBen Hutchings
sound/pci/hda/local.h seems to be an earlier version of sound/hda/local.h; it was added at the same time but doesn't seem to have ever been used (within the git history). Most of its macros depend on a hdac_read_parm() function which is not defined anywhere. Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-06-01net: mvpp2: Split the PPv2 driver to a dedicated directoryMaxime Chevallier
As the mvpp2 driver is growing, move this driver to a dedicated directory and split it into several files. Since this driver has a lot of register defines and structure definitions, it can benefit from having all of this into a dedicated header file, named mvpp2.h. A good chunk of the mvpp2 code is dedicated to Header Parser handling, so we introduce mvpp2_prs.h where all Header Parser definitions are located, and mvpp2_prs.c containing the related code. In the same way, mvpp2_cls.h and mvpp2_cls.c are created to contain Classifier and RSS related code. The former 'mvpp2.c' file is renamed 'mvpp2_main.c' so that we can keep the driver binary named 'mvpp2'. This commit is only about spliting the driver into multiple files and doesn't introduce any new function, feature or fix besides removing 'static' keywords when needed. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Tested-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01mm: Remove return value of zap_vma_ptes()Leon Romanovsky
All callers of zap_vma_ptes() are not interested in the return value of that function, so let's simplify its interface and drop the return value. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-06-01RDMA/hns_roce: Don't check return value of zap_vma_ptes()Doug Ledford
There is no need to check return value of zap_vma_ptes() because there is nothing to do with this knowledge. Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-06-01RDMA/mlx4: Don't crash machine if zap_vma_ptes() failsLeon Romanovsky
The failure reported by zap_vma_ptes() means that wrong VMA pages were supplied, however it is impossible for this type of address. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-06-01RDMA/mlx5: Don't check return value of zap_vma_ptes()Leon Romanovsky
There is no need to check return value of zap_vma_ptes() because there is nothing to do with this knowledge. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-06-01RDMA/mad: Convert BUG_ONs to error flowsLeon Romanovsky
Let's perform checks in-place instead of BUG_ONs. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-06-01RDMA/mad: Delete inaccessible BUG_ONLeon Romanovsky
There is no need to check existence of mad_queue, because we already did pointer dereference before call to dequeue_mad(). Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-06-01RDMA/cma: Ignore unknown eventLeon Romanovsky
There is no need to bring down the whole machine, just because unknown event was received. It is better to ignore it silently. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-06-01RDMA/cm: Abort loop in case of CM dequeueLeon Romanovsky
In case CM work list is empty, the work pointer will be NULL, so instead of kernel crash it is better to abort processing of works. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-06-01RDMA/cxgb3: Don't crash kernel just because IDR is fullLeon Romanovsky
cxgb3 driver properly handles errors returned by IDR, so there is no need to have special case (kernel crash) just because IDR is full. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-06-01RDMA/mlx4: Discard unknown SQP work requestsLeon Romanovsky
There is no need to crash the machine if unknown work request was received in SQP MAD. Cc: <stable@vger.kernel.org> # 3.6 Fixes: 37bfc7c1e83f ("IB/mlx4: SR-IOV multiplex and demultiplex MADs") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-06-01RDMA/mlx4: Catch FW<->SW misalignment without machine crashLeon Romanovsky
Any steering QP is supposed be above steering_qp_base, see function mlx4_ib_steer_qp_alloc() for it, however in case of misalignment between SW and FW, this qp_base can be wrong. Use WARN() to catch such situation without killing the machine. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-06-01net: dsa: b53: Add BCM5389 supportDamien Thébault
This patch adds support for the BCM5389 switch connected through MDIO. Signed-off-by: Damien Thébault <damien.thebault@vitec.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01net: sched: split tc_ctl_tfilter into three handlersVlad Buslov
tc_ctl_tfilter handles three netlink message types: RTM_NEWTFILTER, RTM_DELTFILTER, RTM_GETTFILTER. However, implementation of this function involves a lot of branching on specific message type because most of the code is message-specific. This significantly complicates adding new functionality and doesn't provide much benefit of code reuse. Split tc_ctl_tfilter to three standalone functions that handle filter new, delete and get requests. The only truly protocol independent part of tc_ctl_tfilter is code that looks up queue, class, and block. Refactor this code to standalone tcf_block_find function that is used by all three new handlers. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01lightnvm: pblk: take bitmap alloc. out of critical sectionJavier González
pblk allocates line bitmaps within the line lock unnecessarily. In order to take pressure out of the fast patch, allocate line bitmaps outside of this lock and refactor accordingly. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: kick writer on new flush pointsHans Holmberg
Unless we kick the writer directly when setting a new flush point, the user risks having to wait for up to one second (the default timeout for the write thread to be kicked) for the IO to complete. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: only try to recover lines with written smetaHans Holmberg
When switching between different lun configurations, there is no guarantee that all lines that contain closed/open chunks have some valid data to recover. Check that the smeta chunk has been written to instead. Also skip bad lines (that does not have enough good chunks). Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: remove unnecessary bio_get/putJavier González
In the read path, pblk gets a reference to the incoming bio and puts it after ending the bio. Though this behavior is correct, it is unnecessary since pblk is the one putting the bio, therefore, it cannot disappear underneath it. Removing this reference, allows to clean up rqd->bio and avoids pointer bouncing for the different read paths. Now, the incoming bio always resides in the read context and pblk's internal bios (if any) reside in rqd->bio. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: add possibility to set write buffer size manuallyMarcin Dziegielewski
In some cases, users can want set write buffer size manually, e.g. to adjust it to specific workload. This patch provides the possibility to set write buffer size via module parameter feature. Signed-off-by: Marcin Dziegielewski <marcin.dziegielewski@intel.com> Signed-off-by: Igor Konopko <igor.j.konopko@intel.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: fix partial read error pathIgor Konopko
When error occurs during bio_add_page on partial read path, pblk tries to free pages twice. Signed-off-by: Igor Konopko <igor.j.konopko@intel.com> Signed-off-by: Marcin Dziegielewski <marcin.dziegielewski@intel.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: proper error handling for pblk_bio_add_pagesIgor Konopko
Currently in case of error caused by bio_pc_add_page in pblk_bio_add_pages two issues occur when calling from pblk_rb_read_to_bio(). First one is in pblk_bio_free_pages, since we are trying to free pages not allocated from our mempool. Second one is the warn from dma_pool_free, that we are trying to free NULL pointer dma. This commit fix both issues. Signed-off-by: Igor Konopko <igor.j.konopko@intel.com> Signed-off-by: Marcin Dziegielewski <marcin.dziegielewski@intel.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: fix smeta write error pathHans Holmberg
Smeta write errors were previously ignored. Skip these lines instead and throw them back on the free list, so the chunks will go through a reset cycle before we attempt to use the line again. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: garbage collect lines with failed writesHans Holmberg
Write failures should not happen under normal circumstances, so in order to bring the chunk back into a known state as soon as possible, evacuate all the valid data out of the line and let the fw judge if the block can be written to in the next reset cycle. Do this by introducing a new gc list for lines with failed writes, and ensure that the rate limiter allocates a small portion of the write bandwidth to get the job done. The lba list is saved in memory for use during gc as we cannot gurantee that the emeta data is readable if a write error occurred. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: rework write error recovery pathHans Holmberg
The write error recovery path is incomplete, so rework the write error recovery handling to do resubmits directly from the write buffer. When a write error occurs, the remaining sectors in the chunk are mapped out and invalidated and the request inserted in a resubmit list. The writer thread checks if there are any requests to resubmit, scans and invalidates any lbas that have been overwritten by later writes and resubmits the failed entries. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01arm64: signal: Report signal frame size to userspace via auxvDave Martin
Stateful CPU architecture extensions may require the signal frame to grow to a size that exceeds the arch's MINSIGSTKSZ #define. However, changing this #define is an ABI break. To allow userspace the option of determining the signal frame size in a more forwards-compatible way, this patch adds a new auxv entry tagged with AT_MINSIGSTKSZ, which provides the maximum signal frame size that the process can observe during its lifetime. If AT_MINSIGSTKSZ is absent from the aux vector, the caller can assume that the MINSIGSTKSZ #define is sufficient. This allows for a consistent interface with older kernels that do not provide AT_MINSIGSTKSZ. The idea is that libc could expose this via sysconf() or some similar mechanism. There is deliberately no AT_SIGSTKSZ. The kernel knows nothing about userspace's own stack overheads and should not pretend to know. For arm64: The primary motivation for this interface is the Scalable Vector Extension, which can require at least 4KB or so of extra space in the signal frame for the largest hardware implementations. To determine the correct value, a "Christmas tree" mode (via the add_all argument) is added to setup_sigframe_layout(), to simulate addition of all possible records to the signal frame at maximum possible size. If this procedure goes wrong somehow, resulting in a stupidly large frame layout and hence failure of sigframe_alloc() to allocate a record to the frame, then this is indicative of a kernel bug. In this case, we WARN() and no attempt is made to populate AT_MINSIGSTKSZ for userspace. For arm64 SVE: The SVE context block in the signal frame needs to be considered too when computing the maximum possible signal frame size. Because the size of this block depends on the vector length, this patch computes the size based not on the thread's current vector length but instead on the maximum possible vector length: this determines the maximum size of SVE context block that can be observed in any signal frame for the lifetime of the process. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-06-01arm64/sve: Thin out initialisation sanity-checks for sve_max_vlDave Martin
Now that the kernel SVE support is reasonably mature, it is excessive to default sve_max_vl to the invalid value -1 and then sprinkle WARN_ON()s around the place to make sure it has been initialised before use. The cpufeatures code already runs pretty early, and will ensure sve_max_vl gets initialised. This patch initialises sve_max_vl to something sane that will be supported by every SVE implementation, and removes most of the sanity checks. The checks in find_supported_vector_length() are retained for now. If anything goes horribly wrong, we are likely to trip a check here sooner or later. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>