Age | Commit message (Collapse) | Author |
|
'kind' is an enum, thus cast of pointer on 64-bit compile test with W=1
causes:
lm75.c:581:10: error: cast to smaller integer type 'enum lm75_type' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast]
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20230810093157.94244-6-krzysztof.kozlowski@linaro.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
'kind' is an enum, thus cast of pointer on 64-bit compile test with W=1
causes:
lm63.c:1108:16: error: cast to smaller integer type 'enum chips' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast]
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20230810093157.94244-5-krzysztof.kozlowski@linaro.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
'chip' is an enum, thus cast of pointer on 64-bit compile test with W=1
causes:
ina2xx.c:627:10: error: cast to smaller integer type 'enum ina2xx_ids' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast]
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20230810093157.94244-4-krzysztof.kozlowski@linaro.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
'chip' is an enum, thus cast of pointer on 64-bit compile test with W=1
causes:
ads7828.c:142:10: error: cast to smaller integer type 'enum ads7828_chips' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast]
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20230810093157.94244-3-krzysztof.kozlowski@linaro.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
'type' is an enum, thus cast of pointer on 64-bit compile test with W=1
causes:
ad7418.c:256:16: error: cast to smaller integer type 'enum chips' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast]
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20230810093157.94244-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
'chip' is an enum, thus cast of pointer on 64-bit compile test with W=1
causes:
adt7475.c:1655:10: error: cast to smaller integer type 'enum chips' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast]
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20230810093157.94244-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Use maxim,max6639 as compatible string for the driver.
Signed-off-by: Naresh Solanki <Naresh.Solanki@9elements.com>
Link: https://lore.kernel.org/r/20230803144401.1151065-2-Naresh.Solanki@9elements.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Additional TEMP registers for nct6798d, nct6799d-r and nct6796d-s
This allows the max/max_hyst/crit attributes to be shown/stored
* Increase NUM_TEMP from 10 to 12
* Separate TEMP/MON_TEMP/OVER/HYST/CRIT registers
* Rename "PECI Calibration" to include "TSI" too
* Update ALARM/BEEP bits for temps for 6799
* For 6799, keep temp_fixed_num at 6, but increase
num_temp_alarms/num_temp_beeps to 7/8
Tested with NCT6799D-R showing additional sysfs attributes:
* temp3-temp8: max/max_hyst/beep/alarm
* temp3-temp6: crit/offset
Signed-off-by: Ahmad Khalifa <ahmad@khalifa.ws>
Link: https://lore.kernel.org/r/20230802185820.3642399-1-ahmad@khalifa.ws
[groeck: Addressed cosmetic checkpatch complaints]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Add base support for Renesas HS3001 temperature
and humidity sensors and its compatibles HS3002,
HS3003 and HS3004.
The sensor has a fix I2C address 0x44. The resolution
is fixed to 14bit (ref. Missing feature).
Missing feature:
- Accessing non-volatile memory: Custom board has no
possibility to control voltage supply of sensor. Thus,
we cannot send the necessary control commands within
the first 10ms after power-on.
Signed-off-by: Andre Werner <andre.werner@systec-electronic.com>
Link: https://lore.kernel.org/r/20230725042207.22310-2-andre.werner@systec-electronic.com
[groeck: Cosmetic documentation fixup; added documentation to index;
replaced probe_new with probe dropped unused variable]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Add binding for Renesas HS3001 Temperature and Relative Humidity Sensor.
Signed-off-by: Andre Werner <andre.werner@systec-electronic.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20230726051635.9778-1-andre.werner@systec-electronic.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
The following warning is given by the Smatch static checker:
drivers/hwmon/hp-wmi-sensors.c:1937 hp_wmi_sensors_init()
error: uninitialized symbol 'pevents'.
If there are no instances of the HPBIOS_PlatformEvents WMI object
available, init_platform_events() never initializes this pointer,
which may then be passed to hp_wmi_debugfs_init() uninitialized.
The impact should be limited because hp_wmi_debugfs_init() uses this
pointer only if the count of HPBIOS_PlatformEvents instances is _not_
zero, while conversely, it will be uninitialized only if the count of
such instances _is_ zero. However, passing it uninitialized still
constitutes a bug.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/linux-hwmon/f72c129b-8c57-406a-bf41-bd889b65ea0f@moroto.mountain/
Signed-off-by: James Seo <james@equiv.tech>
Link: https://lore.kernel.org/r/20230725094817.588640-1-james@equiv.tech
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Commit 2a2b13ae50cf ("platform/x86: wmi: Allow retrieving the number of
WMI object instances") means we no longer need to find this ourselves.
Signed-off-by: James Seo <james@equiv.tech>
Link: https://lore.kernel.org/r/20230722172513.9324-2-james@equiv.tech
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
* Add additional VIN/IN_MIN/IN_MAX register values
* Separate ALARM/BEEP bits for nct6799
* Update scaling factors for nct6799
Registers/alarms match for NCT6796D-S and NCT6799D-R
Tested on NCT6799D-R for new IN/MIN/MAX and ALARMS
Signed-off-by: Ahmad Khalifa <ahmad@khalifa.ws>
Link: https://lore.kernel.org/r/20230719224142.411237-1-ahmad@khalifa.ws
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
update_interval, temperature/humidity max/min and hyst
were moved to new hwmon interface, and only heater and
repeatability were reserved as non-stardard sysfs interface.
Signed-off-by: JuenKit Yip <JuenKit_Yip@hotmail.com>
Link: https://lore.kernel.org/r/DB4PR10MB626157BC697F2CD6100431359229A@DB4PR10MB6261.EURPRD10.PROD.OUTLOOK.COM
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
MP2973 & MP2971 return PGOOD instead of PB_STATUS_POWER_GOOD_N.
Fix that in the read_word_data hook.
MP2975 should not be affected, but that has not been confirmed with
hardware.
Signed-off-by: Patrick Rudolph <patrick.rudolph@9elements.com>
Signed-off-by: Naresh Solanki <Naresh.Solanki@9elements.com>
Link: https://lore.kernel.org/r/20230731092204.2933045-1-Naresh.Solanki@9elements.com
[groeck: Rephrased description to indicate that MP2975 is likely not affected]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Add support for PMBUS_IOUT_OC_FAULT_LIMIT.
Add a helper function to convert the limit to LINEAR11 format
and read data->info.phases[0] on MP2971 and MP2973 as well.
Signed-off-by: Patrick Rudolph <patrick.rudolph@9elements.com>
Signed-off-by: Naresh Solanki <Naresh.Solanki@9elements.com>
Link: https://lore.kernel.org/r/20230714135124.2645339-8-Naresh.Solanki@9elements.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Add support to expose the PMBUS regulator.
Tested on MP2973 and MP2971.
Signed-off-by: Patrick Rudolph <patrick.rudolph@9elements.com>
Signed-off-by: Naresh Solanki <Naresh.Solanki@9elements.com>
Link: https://lore.kernel.org/r/20230714135124.2645339-7-Naresh.Solanki@9elements.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Add support for MP2971 and MP2973, the successor of MP2975.
The major differences are:
- On MP2973 and MP2971 the Vref cannot be read and thus most of
the OVP/current calculations won't work.
- MP2973 and MP2971 also support LINEAR format for VOUT
- MP2973 and MP2971 do not support OVP2
- On MP2973 and MP2971 most registers are in LINEAR format
- The IMVP9_EN bit has a different position
- Per phase current sense haven't been implemented.
As on MP2975 most of the FAULT_LIMIT and WARN_LIMIT registers
have been redefined and doesn't provide the functionality as
defined in PMBUS spec.
Tested on MP2973 and MP2971.
Signed-off-by: Patrick Rudolph <patrick.rudolph@9elements.com>
Signed-off-by: Naresh Solanki <Naresh.Solanki@9elements.com>
Link: https://lore.kernel.org/r/20230714135124.2645339-6-Naresh.Solanki@9elements.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
In order to add support for MP2973 and MP2971 replace hardcoded
phase count for both channels by a variable.
Signed-off-by: Patrick Rudolph <patrick.rudolph@9elements.com>
Signed-off-by: Naresh Solanki <Naresh.Solanki@9elements.com>
Link: https://lore.kernel.org/r/20230714135124.2645339-5-Naresh.Solanki@9elements.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
In order to upstream MP2973/MP2971 simplify the code by removing support
for various VOUT formats. The MP2973 and MP2971 supports all PMBUS
supported formats for VOUT, while the MP2975 only support DIRECT and
VID for VOUT.
In DIRECT mode all chips report the voltage in 1mV/LSB.
Configure the chip to use DIRECT format for VOUT and drop the code
conversion code for other formats. The to be added chips MP2973/MP2971
will be configured to also report VOUT in DIRECT format.
The maximum voltage that can be reported in DIRECT format is 32768mV.
This is sufficient as the maximum output voltage for VR12/VR13 is
3040 mV.
Signed-off-by: Patrick Rudolph <patrick.rudolph@9elements.com>
Signed-off-by: Naresh Solanki <Naresh.Solanki@9elements.com>
Link: https://lore.kernel.org/r/20230714135124.2645339-4-Naresh.Solanki@9elements.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Add support for differntiating between the chips.
The following commits will make use of this mechanism.
Signed-off-by: Patrick Rudolph <patrick.rudolph@9elements.com>
Signed-off-by: Naresh Solanki <Naresh.Solanki@9elements.com>
Link: https://lore.kernel.org/r/20230714135124.2645339-3-Naresh.Solanki@9elements.com
[groeck: double-cast of_device_get_match_data() to make gcc happy]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Add Monolithic Power Systems MP2971 & MP2973 to trivial devices.
Signed-off-by: Patrick Rudolph <patrick.rudolph@9elements.com>
Signed-off-by: Naresh Solanki <Naresh.Solanki@9elements.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20230714135124.2645339-2-Naresh.Solanki@9elements.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Fix whitespace error reported by checkpatch.pl
Signed-off-by: Patrick Rudolph <patrick.rudolph@9elements.com>
Signed-off-by: Naresh Solanki <Naresh.Solanki@9elements.com>
Link: https://lore.kernel.org/r/20230714135124.2645339-1-Naresh.Solanki@9elements.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Add test for sensor type AMDTSI which is available on certain recent
chipsets.
Signed-off-by: Frank Crawford <frank@crawford.emu.id.au>
Link: https://lore.kernel.org/r/20230707123005.956415-4-frank@crawford.emu.id.au
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Add test if thermistor sensor type attribute should be visible, i.e.
test if the attribute is defined.
Signed-off-by: Frank Crawford <frank@crawford.emu.id.au>
Link: https://lore.kernel.org/r/20230707123005.956415-3-frank@crawford.emu.id.au
[groeck: Dropped unnecessary 'type' variable in it87_temp_is_visible()]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
The temperature sensor type will need to be used in multiple places, so
split it out into its own function.
Signed-off-by: Frank Crawford <frank@crawford.emu.id.au>
Link: https://lore.kernel.org/r/20230707123005.956415-2-frank@crawford.emu.id.au
[groeck: Dropped unnecessary 'type' variable in show_temp_type()]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Move detection logic to the start of init() function so we won't
instantiate the driver if the board is not compatible.
Signed-off-by: Joaquín Ignacio Aramendía <samsagax@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20230717222526.229984-3-samsagax@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
* Increase available bits, IN: 16 to 24, FAN: 8 to 12,
TEMP: 6 to 12
* Reorder alarm/beep definitions to match in order to allow
additional inputs in the future
* Remove comments about 'unused' bits as probe() is a better
reference
Testing note:
* Tested on nct6799 with IN/FAN/TEMP, and changing min/max/high/hyst,
that triggers the corresponding alarms correctly. Good confirmation
on the original mapping of the registers and masks.
As to be expected, only 4 fans and 2 temps (fixed) have limits
currently on nct6799 on my board.
* Trouble with testing intrusion alarms and beeps, no way to confirm
those. As I understand now, intrusion/caseopen is probably not
connected on my board.
And I haven't seen a buzzer on a board in ages.
Signed-off-by: Ahmad Khalifa <ahmad@khalifa.ws>
Link: https://lore.kernel.org/r/20230717201050.1657809-1-ahmad@khalifa.ws
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
SMM665 and related chips are power controller/sequencer chips from
Summit Microelectronics. The company was acquired by Qualcomm in 2012,
and support for the chip series stopped.
The chips are long since gone from active use, making the driver
unsupportable and just consuming space and compile time. Remove it.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
nct6799d-r and nct6796d-s are very similar and chip_id is only
different in the version nibblet.
Since both will be detected by the driver anyway due to the
chipid mask, they should be labeled together for dmesg msg.
Signed-off-by: Ahmad Khalifa <ahmad@khalifa.ws>
Link: https://lore.kernel.org/r/20230715195244.1334723-1-ahmad@khalifa.ws
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Use devm_platform_ioremap_resource() to simplify code.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Link: https://lore.kernel.org/r/20230704094306.21933-1-frank.li@vivo.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
This patch introduces support for handling more than 32 DIMMs by
utilizing bitmap operations. The changes ensure that the driver can
handle a higher number of DIMMs efficiently.
Signed-off-by: Patrick Rudolph <patrick.rudolph@9elements.com>
Link: https://lore.kernel.org/r/20230711152144.755177-1-Naresh.Solanki@9elements.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
strlcpy() reads the entire source buffer first.
This read may exceed the destination size limit.
This is both inefficient and can lead to linear read
overflows if a source string is not NUL-terminated [1].
In an effort to remove strlcpy() completely [2], replace
strlcpy() here with direct assignment.
strlcpy in this file is used to copy fixed-length strings which can be
completely avoided by direct assignment and is safe to do so. strlen()
is used to return the length of @tbuf.
[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
[2] https://github.com/KSPP/linux/issues/89
Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230712214307.2424810-1-azeemshaikh38@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
The DT of_device.h and of_platform.h date back to the separate
of_platform_bus_type before it as merged into the regular platform bus.
As part of that merge prepping Arm DT support 13 years ago, they
"temporarily" include each other. They also include platform_device.h
and of.h. As a result, there's a pretty much random mix of those include
files used throughout the tree. In order to detangle these headers and
replace the implicit includes with struct declarations, users need to
explicitly include the correct includes.
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230714174607.4057185-1-robh@kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Use the devm_clk_get_enabled() helper function instead of hand-writing it.
It saves some line of codes.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/25f2ab4c61d4fc48e8200f8754bb446f2b89ea89.1688795527.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Like the IBM CFFPS driver, export the PSU's firmware version to a
debugfs attribute as reported in the manufacturer register.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20230628153453.122213-1-eajames@linux.ibm.com
[groeck: Dropped unused variable; changed buffer from char to u8]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Since commit 03c835f498b5 ("i2c: Switch .probe() to not take an id
parameter") .probe() is the recommended callback to implement an i2c
driver (again). Reflect this in the documentation and don't mention
.probe_new() which will be dropped soon.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20230627064948.593804-1-u.kleine-koenig@pengutronix.de
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
A user reported that they observe race condition warning [1] and after
looking once again into the DSDT source it was found that wrong mutex
was used.
[1] https://github.com/zeule/asus-ec-sensors/issues/43
Fixes: 790dec13c012 ("hwmon: (asus-ec-sensors) add ROG Crosshair X670E Hero.")
Signed-off-by: Eugene Shalygin <eugene.shalygin@gmail.com>
Link: https://lore.kernel.org/r/20230821115418.25733-2-eugene.shalygin@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
This reproduces the bug fixed by "btrfs: fix incorrect splitting in
btrfs_drop_extent_map_range", we were improperly calculating the range
for the split extent. Add a test that exercises this scenario and
validates that we get the correct resulting extent_maps in our tree.
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
This helper is different from the normal add_extent_mapping in that it
will stuff an em into a gap that exists between overlapping em's in the
tree. It appeared there was a bug so I wrote a self test to validate it
did the correct thing when it worked with two side by side ems.
Thankfully it is correct, but more testing is better.
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
While investigating weird problems with the extent_map I wrote a self
test testing the various edge cases of btrfs_drop_extent_map_range.
This can split in different ways and behaves different in each case, so
test the various edge cases to make sure everything is functioning
properly.
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
scrub_stripe_read_repair_worker()
Currently the scrub_stripe_read_repair_worker() only does reads to
rebuild the corrupted sectors, it doesn't do any writeback.
The design is mostly to put writeback into a more ordered manner, to
co-operate with dev-replace with zoned mode, which requires every write
to be submitted in their bytenr order.
However the writeback for repaired sectors into the original mirror
doesn't need such strong sync requirement, as it can only happen for
non-zoned devices.
This patch would move the writeback for repaired sectors into
scrub_stripe_read_repair_worker(), which removes two calls sites for
repaired sectors writeback. (one from flush_scrub_stripes(), one from
scrub_raid56_parity_stripe())
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The workqueue fs_info->scrub_worker would go ordered workqueue if it's a
device replace operation.
However the scrub is relying on multiple workers to do data csum
verification, and we always submit several read requests in a row.
Thus there is no need to use ordered workqueue just for dev-replace.
We have extra synchronization (the main thread will always
submit-and-wait for dev-replace writes) to handle it for zoned devices.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
[REGRESSION]
There are several regression reports about the scrub performance with
v6.4 kernel.
On a PCIe 3.0 device, the old v6.3 kernel can go 3GB/s scrub speed, but
v6.4 can only go 1GB/s, an obvious 66% performance drop.
[CAUSE]
Iostat shows a very different behavior between v6.3 and v6.4 kernel:
Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz aqu-sz %util
nvme0n1p3 9731.00 3425544.00 17237.00 63.92 2.18 352.02 21.18 100.00
nvme0n1p3 15578.00 993616.00 5.00 0.03 0.09 63.78 1.32 100.00
The upper one is v6.3 while the lower one is v6.4.
There are several obvious differences:
- Very few read merges
This turns out to be a behavior change that we no longer do bio
plug/unplug.
- Very low aqu-sz
This is due to the submit-and-wait behavior of flush_scrub_stripes(),
and extra extent/csum tree search.
Both behaviors are not that obvious on SATA SSDs, as SATA SSDs have NCQ
to merge the reads, while SATA SSDs can not handle high queue depth well
either.
[FIX]
For now this patch focuses on the read speed fix. Dev-replace replace
speed needs more work.
For the read part, we go two directions to fix the problems:
- Re-introduce blk plug/unplug to merge read requests
This is pretty simple, and the behavior is pretty easy to observe.
This would enlarge the average read request size to 512K.
- Introduce multi-group reads and no longer wait for each group
Instead of the old behavior, which submits 8 stripes and waits for
them, here we would enlarge the total number of stripes to 16 * 8.
Which is 8M per device, the same limit as the old scrub in-flight
bios size limit.
Now every time we fill a group (8 stripes), we submit them and
continue to next stripes.
Only when the full 16 * 8 stripes are all filled, we submit the
remaining ones (the last group), and wait for all groups to finish.
Then submit the repair writes and dev-replace writes.
This should enlarge the queue depth.
This would greatly improve the merge rate (thus read block size) and
queue depth:
Before (with regression, and cached extent/csum path):
Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz aqu-sz %util
nvme0n1p3 20666.00 1318240.00 10.00 0.05 0.08 63.79 1.63 100.00
After (with all patches applied):
nvme0n1p3 5165.00 2278304.00 30557.00 85.54 0.55 441.10 2.81 100.00
i.e. 1287 to 2224 MB/s.
CC: stable@vger.kernel.org # 6.4+
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
One of the bottleneck of the new scrub code is the extra csum tree
search.
The old code would only do the csum tree search for each scrub bio,
which can be as large as 512KiB, thus they can afford to allocate a new
path each time.
But the new scrub code is doing csum tree search for each stripe, which
is only 64KiB, this means we'd better re-use the same csum path during
each search.
This patch would introduce a per-sctx path for csum tree search, as we
don't need to re-allocate the path every time we need to do a csum tree
search.
With this change we can further improve the queue depth and improve the
scrub read performance:
Before (with regression and cached extent tree path):
Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz aqu-sz %util
nvme0n1p3 15875.00 1013328.00 12.00 0.08 0.08 63.83 1.35 100.00
After (with both cached extent/csum tree path):
nvme0n1p3 17759.00 1133280.00 10.00 0.06 0.08 63.81 1.50 100.00
Fixes: e02ee89baa66 ("btrfs: scrub: switch scrub_simple_mirror() to scrub_stripe infrastructure")
CC: stable@vger.kernel.org # 6.4+
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Since commit e02ee89baa66 ("btrfs: scrub: switch scrub_simple_mirror()
to scrub_stripe infrastructure"), scrub no longer re-use the same path
for extent tree search.
This can lead to unnecessary extent tree search, especially for the new
stripe based scrub, as we have way more stripes to prepare.
This patch would re-introduce a shared path for extent tree search, and
properly release it when the block group is scrubbed.
This change alone can improve scrub performance slightly by reducing the
time spend preparing the stripe thus improving the queue depth.
Before (with regression):
Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz aqu-sz %util
nvme0n1p3 15578.00 993616.00 5.00 0.03 0.09 63.78 1.32 100.00
After (with this patch):
nvme0n1p3 15875.00 1013328.00 12.00 0.08 0.08 63.83 1.35 100.00
Fixes: e02ee89baa66 ("btrfs: scrub: switch scrub_simple_mirror() to scrub_stripe infrastructure")
CC: stable@vger.kernel.org # 6.4+
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
btrfs supports creating nested subvolumes however snapshots are not
recursive. When a snapshot is taken of a volume which contains a
subvolume the subvolume is replaced with a stub subvolume which has the
same name and uses inode number 2[1]. The stub subvolume kept the
directory name but did not set the time or permissions of the stub
subvolume. This resulted in all time information being the current time
and ownership defaulting to root. When subvolumes and snapshots are
created using unshare this results in a snapshot directory the user
created but has no permissions for.
Test case:
[vmuser@archvm ~]# sudo -i
[root@archvm ~]# mkdir -p /mnt/btrfs/test
[root@archvm ~]# chown vmuser:users /mnt/btrfs/test/
[root@archvm ~]# exit
logout
[vmuser@archvm ~]$ cd /mnt/btrfs/test
[vmuser@archvm test]$ unshare --user --keep-caps --map-auto --map-root-user
[root@archvm test]# btrfs subvolume create subvolume
Create subvolume './subvolume'
[root@archvm test]# btrfs subvolume create subvolume/subsubvolume
Create subvolume 'subvolume/subsubvolume'
[root@archvm test]# btrfs subvolume snapshot subvolume snapshot
Create a snapshot of 'subvolume' in './snapshot'
[root@archvm test]# exit
logout
[vmuser@archvm test]$ tree -ug
[vmuser users ] .
├── [vmuser users ] snapshot
│ └── [vmuser users ] subsubvolume <-- Without patch perm is root:root
└── [vmuser users ] subvolume
└── [vmuser users ] subsubvolume
5 directories, 0 files
[1] https://btrfs.readthedocs.io/en/latest/btrfs-subvolume.html#nested-subvolumes
Signed-off-by: Lee Trager <lee@trager.us>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
At btrfs_readdir_delayed_dir_index(), called when reading a directory, we
have this check for an empty list to return immediately, but it's not
needed since list_for_each_entry_safe(), called immediately after, is
prepared to deal with an empty list, it simply does nothing. So remove
the empty list check.
Besides shorter source code, it also slightly reduces the binary text
size:
Before this change:
$ size fs/btrfs/btrfs.ko
text data bss dec hex filename
1609408 167269 16864 1793541 1b5e05 fs/btrfs/btrfs.ko
After this change:
$ size fs/btrfs/btrfs.ko
text data bss dec hex filename
1609392 167269 16864 1793525 1b5df5 fs/btrfs/btrfs.ko
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
fs_devices::metadata_uuid value is already updated based on the
super_block::METADATA_UUID flag for either fsid or metadata_uuid as
appropriate. So, fs_devices::metadata_uuid can be used directly.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The function btrfs_validate_super() should verify the metadata_uuid in
the provided superblock argument. Because, all its callers expect it to
do that.
Such as in the following stacks:
write_all_supers()
sb = fs_info->super_for_commit;
btrfs_validate_write_super(.., sb)
btrfs_validate_super(.., sb, ..)
scrub_one_super()
btrfs_validate_super(.., sb, ..)
And
check_dev_super()
btrfs_validate_super(.., sb, ..)
However, it currently verifies the fs_info::super_copy::metadata_uuid
instead. Fix this using the correct metadata_uuid in the superblock
argument.
CC: stable@vger.kernel.org # 5.4+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|