Age | Commit message (Collapse) | Author |
|
There are fields 'last_hwmon_state' and 'last_thermal_state' in the
structure 'emc2305_cdev_data', which respectively store the cooling state
set by the 'hwmon' and 'thermal' subsystem, and the driver author hopes
that if the state set by 'hwmon' is lower than the value set by 'thermal',
the driver will just save it without actually setting the pwm. Currently,
the 'last_thermal_state' also be updated by 'hwmon', which will cause the
cooling state to never be set to a lower value. This patch fixes that.
Signed-off-by: Xingjiang Qiao <nanpuyue@gmail.com>
Link: https://lore.kernel.org/r/20221206055331.170459-2-nanpuyue@gmail.com
Fixes: 0d8400c5a2ce1 ("hwmon: (emc2305) add support for EMC2301/2/3/5 RPM-based PWM Fan Speed Controller.")
[groeck: renamed emc2305_set_cur_state_shim -> __emc2305_set_cur_state]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
readl() already handles endian conversion. That's the main difference
between readl() and __raw_readl(). This is benign on little-endian
systems, but big endian systems will end up byte-swabbing twice.
Fixes: 2905cb5236cb ("cxl/pci: Add (hopeful) error handling support")
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/167030092025.4045167.10651070153523351093.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The first argument to the CXL AER trace points is the source device.
Pass a 'const struct device *' rather than a 'const char *' for more
type precision / safety.
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/167030091477.4045167.15174636482098463885.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The definitions of 'EMC2305_REG_PRODUCT_ID' and 'EMC2305_REG_DEVICE' are
both '0xfd', they actually return the same value, but the values returned
by emc2301/2/3/5 are different, so probe emc2301/2/3 will fail, This patch
fixes that.
Signed-off-by: Xingjiang Qiao <nanpuyue@gmail.com>
Link: https://lore.kernel.org/r/20221206055331.170459-1-nanpuyue@gmail.com
Fixes: 0d8400c5a2ce1 ("hwmon: (emc2305) add support for EMC2301/2/3/5 RPM-based PWM Fan Speed Controller.")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
CXL PMEM security operations are routed through the NVDIMM sysfs
interface. For this reason the corresponding commands are marked
"exclusive" to preclude collisions between the ioctl ABI and the sysfs
ABI. However, a better way to preclude that collision is to simply
remove the ioctl ABI (command-id definitions) for those operations.
Now that cxl_internal_send_cmd() (formerly cxl_mbox_send_cmd()) no
longer needs to talk the cxl_mem_commands array, all of the uapi
definitions for the security commands can be dropped.
These never appeared in a released kernel, so no regression risk.
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/167030056464.4044561.11486507095384253833.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
cxl_internal_send_cmd() skips output size validation for variable output
commands which is not ideal. Most of the time internal usages want to
fail if the output size does not match what was requested. For other
commands where the caller cannot predict the size there is usually a
a header that conveys how much vaild data is in the payload. For those
cases add @min_out as a parameter to specify what the minimum response
payload needs to be for the caller to parse the rest of the payload.
In this patch only Get Supported Logs has that behavior, but going
forward records retrieval commands like Get Poison List and Get Event
Records can use @min_out to retrieve a variable amount of records.
Critically, this validation scheme skips the needs to interrogate the
cxl_mem_commands array which in turn frees up the implementation to
support internal command enabling without also enabling external / user
commands.
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/167030055918.4044561.10339573829837910505.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Internally cxl_mbox_send_cmd() converts all passed-in parameters to a
'struct cxl_mbox_cmd' instance and sends that to cxlds->mbox_send(). It
then teases the possibilty that the caller can validate the output size.
However, they cannot since the resulting output size is not conveyed to
the called. Fix that by making the caller pass in a constructed 'struct
cxl_mbox_cmd'. This prepares for a future patch to add output size
validation on a per-command basis.
Given the change in signature, also change the name to differentiate it
from the user command submission path that performs more validation
before generating the 'struct cxl_mbox_cmd' instance to execute.
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/167030055370.4044561.17788093375112783036.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Multi-byte integer values in CXL mailbox payloads are little endian. Add
a definition of the Get Security State output payload and convert the
value before testing flags.
Fixes: 328281155539 ("cxl/pmem: Introduce nvdimm_security_ops with ->get_flags() operation")
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/167030054822.4044561.4917796262037689553.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
include/linux/lsm_hooks.h reports the result of the LSM infrastructure to
the callers, not what LSMs should return to the LSM infrastructure.
Clarify that and add that if all LSMs return a positive value
__vm_enough_memory() will be called with cap_sys_admin set. If at least one
LSM returns 0 or negative, it will be called with cap_sys_admin cleared.
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
"make dtbs_check":
arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dtb: i2cswitch@70: $nodename:0: 'i2cswitch@70' does not match '^(i2c-?)?mux'
From schema: Documentation/devicetree/bindings/i2c/i2c-mux-pca954x.yaml
arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dtb: i2cswitch@70: Unevaluated properties are not allowed ('#address-cells', '#size-cells', 'i2c@0', 'i2c@1', 'i2c@2', 'i2c@3', 'i2c@4', 'i2c@5', 'i2c@6', 'i2c@7' were unexpected)
From schema: Documentation/devicetree/bindings/i2c/i2c-mux-pca954x.yaml
Fix this by renaming the PCA9548 node to "i2c-mux", to match the I2C bus
multiplexer/switch DT bindings and the Generic Names Recommendation in
the Devicetree Specification.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
|
|
Emails to Jee Heng Sia bounce ("550 #5.1.0 Address rejected."). Add
Keembay platform maintainers as Keembay I2S maintainers.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221205164254.36418-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
The examples' cache nodes are incomplete as 'cache-unified' and
'cache-level' are required cache properties.
Acked-by: Amit Kucheria <amitk@kernel.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221104162450.1982114-1-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Convert the SPI IR LED bindings to DT schema.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221204104323.117974-3-krzysztof.kozlowski@linaro.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Convert the PWM IR LED bindings to DT schema.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221204104323.117974-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Convert the GPIO IR LED bindings to DT schema.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221204104323.117974-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
The binding allows two type of LEDs - single and multi-color. They
differ with properties, so fix the bindings to accept both cases.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221127204058.57111-6-krzysztof.kozlowski@linaro.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
The binding allows two type of LEDs - single and multi-color. They
differ with properties, so fix the bindings to accept both cases.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221127204058.57111-5-krzysztof.kozlowski@linaro.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
The preferred name suffix for properties with single and multiple GPIOs
is "gpios". Linux GPIO core code supports both. The DTS has mixed
usage, so switch to preferred naming:
omap3-n900.dtb: lp5523@32: 'enable-gpios' does not match any of the regexes: '^led@[0-8]$', '^multi-led@[0-8]$', 'pinctrl-[0-9]+'
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221127204058.57111-4-krzysztof.kozlowski@linaro.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
The Linux driver and at least one upstream board use 'label' property:
qcom/msm8996-xiaomi-gemini.dtb: lp5562@30: 'label' does not match any of the regexes: '^led@[0-8]$', '^multi-led@[0-8]$', 'pinctrl-[0-9]+'
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221127204058.57111-3-krzysztof.kozlowski@linaro.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
The common.yaml schema allows further properties, so the bindings using
it should restrict it with unevaluatedProperties:false.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221127204058.57111-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Document compatible for tsens on Qualcomm SM6115 platform
according to downstream dts it ship v2.4 of IP
Signed-off-by: Adam Skladowski <a39.skl@gmail.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221130200950.144618-3-a39.skl@gmail.com
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
"linux,initrd-start" and "linux,initrd-end" can be 32-bit values even on
a 64-bit platform. Ideally, the size should be based on
'#address-cells', but that has never been enforced in the kernel's FDT
boot parsing code (early_init_dt_check_for_initrd()). Bootloader
behavior is known to vary. For example, kexec always writes these as
64-bit. The result of incorrectly reading 32-bit values is most likely
the reserved memory for the original initrd will still be reserved
for the new kernel. The original arm64 equivalent of this code failed to
release the initrd reserved memory in *all* cases.
Use of_read_number() to mirror the early_init_dt_check_for_initrd()
code.
Fixes: b30be4dc733e ("of: Add a common kexec FDT setup function")
Cc: stable@vger.kernel.org
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Link: https://lore.kernel.org/r/20221128202440.1411895-1-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Compared to the txt description this adds clocks and clock-names to
match reality.
Note that fsl,imx-lcdc was picked as the new name as this is the actual
hardware's name. There will be a new binding implementing the saner drm
concept that is supposed to supersede this legacy fb binding
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221129180414.2729091-1-u.kleine-koenig@pengutronix.de
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
json-schema patterns by default will match anywhere in a string, so
typically we want at least the start or end anchored. Fix the obvious
cases where the anchors were forgotten.
Acked-by: Matti Vaittinen <mazziesaccount@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20221118223728.1721589-1-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Document the compatibles that are already in use in the upstream Linux
kernel to resolve dtbs_check warnings.
Signed-off-by: Luca Weiss <luca.weiss@fairphone.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221013091208.356739-1-luca.weiss@fairphone.com
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Add devicetree binding for Orient Chip OCP8110 charge pump used for
camera flash LEDs.
Signed-off-by: André Apitzsch <git@apitzsch.eu>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220505185344.10067-1-git@apitzsch.eu
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Fixed string node names should be under 'properties' rather than
'patternProperties'. Additionally, without beginning and end of line
anchors, any prefix or suffix is allowed on the specified node name.
These cases don't appear to want a prefix or suffix, so move them under
'properties'.
In some cases, the diff turns out to look like we're moving some
patterns rather than the fixed string properties.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221118223708.1721134-1-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
In struct i2c_driver, field new_probe replaces the soon to be deprecated
field probe. Update unittest for this change. The probe function
doesn't make use of the i2c_device_id * parameter so it can be trivially
converted.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Frank Rowand <frowand.list@gmail.com>
Link: https://lore.kernel.org/r/20221118224540.619276-510-uwe@kleine-koenig.org
[robh: Add Frank's commit msg addition]
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
'cpus' is a common property, and it is now defined in dtschema schemas,
so drop the type references in the tree.
Acked-by: Suzuki K Poulose <suzuki.poulse@arm.com>
Acked-by: Bjorn Andersson <andersson@kernel.org>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20221111212857.4104308-1-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
The reference by path (&{/cpus/cpu@101/thermal-idle}) in the example causes
an error with new version of dtc:
FATAL ERROR: Can't generate fixup for reference to path &{/cpus/cpu@100/thermal-idle}
This is because the examples are built as an overlay and absolute paths
are not valid as references must be by label. The path was also not
resolvable because, by default, examples are placed under 'example-N'
nodes.
As the example contains top-level nodes, the root node must be explicit for
the example to be extracted as-is. This changes the indentation for the
whole example, but the existing indentation is a mess of of random amounts.
Clean this up to be 4 spaces everywhere.
Link: https://lore.kernel.org/r/20221111162729.3381835-1-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
It is useful to use vmlinux.h in the xfrm_info test like other kfunc
tests do. In particular, it is common for kfunc bpf prog that requires
to use other core kernel structures in vmlinux.h
Although vmlinux.h is preferred, it needs a ___local flavor of
struct bpf_xfrm_info in order to build the bpf selftests
when CONFIG_XFRM_INTERFACE=[m|n].
Cc: Eyal Birger <eyal.birger@gmail.com>
Fixes: 90a3a05eb33f ("selftests/bpf: add xfrm_info tests")
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20221206193554.1059757-1-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
strdup() allocates memory for path. We need to release the memory in the
following error path. Add free() to avoid memory leak.
Fixes: 8f184732b60b ("bpftool: Switch to libbpf's hashmap for pinned paths of BPF objects")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20221206071906.806384-1-linmq006@gmail.com
|
|
For BPF_PSEUDO_FUNC instruction, verifier will refill imm with
correct addresses of bpf_calls and then run last pass of JIT.
Since the emit_imm of RV64 is variable-length, which will emit
appropriate length instructions accorroding to the imm, it may
broke ctx->offset, and lead to unpredictable problem, such as
inaccurate jump. So let's fix it with fixed-length instructions.
Fixes: 69c087ba6225 ("bpf: Add bpf_for_each_map_elem() helper")
Suggested-by: Björn Töpel <bjorn@rivosinc.com>
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/bpf/20221206091410.1584784-1-pulehui@huaweicloud.com
|
|
Now that we have everything to support the PRE_COPY state,
enable it.
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Link: https://lore.kernel.org/r/20221123113236.896-5-shameerali.kolothum.thodi@huawei.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Instead of waiting till data transfer is complete to perform dev
compatibility, do it as soon as we have enough data to perform the
check. This will be useful when we enable the support for PRE_COPY.
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Link: https://lore.kernel.org/r/20221123113236.896-4-shameerali.kolothum.thodi@huawei.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
The saving_migf is open in PRE_COPY state if it is supported and reads
initial device match data. hisi_acc_vf_stop_copy() is refactored to
make use of common code.
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Link: https://lore.kernel.org/r/20221123113236.896-3-shameerali.kolothum.thodi@huawei.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
PRECOPY IOCTL in the case of HiSiIicon ACC driver can be used to
perform the device compatibility check earlier during migration.
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Link: https://lore.kernel.org/r/20221123113236.896-2-shameerali.kolothum.thodi@huawei.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Now that everything has been set up for MIGRATION_PRE_COPY, enable it.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20221206083438.37807-15-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Before a SAVE command is issued, a QUERY command is issued in order to
know the device data size.
In case PRE_COPY is used, the above commands are issued while the device
is running. Thus, it is possible that between the QUERY and the SAVE
commands the state of the device will be changed significantly and thus
the SAVE will fail.
Currently, if a SAVE command is failing, the driver will fail the
migration. In the above case, don't fail the migration, but don't allow
for new SAVEs to be executed while the device is in a RUNNING state.
Once the device will be moved to STOP_COPY, SAVE can be executed again
and the full device state will be read.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20221206083438.37807-14-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
In order to support PRE_COPY, mlx5 driver transfers multiple states
(images) of the device. e.g.: the source VF can save and transfer
multiple states, and the target VF will load them by that order.
This patch implements the changes for the target VF to decompose the
header for each state and to write and load multiple states.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20221206083438.37807-13-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
During PRE_COPY the migration data FD may have a temporary "end of
stream" that is reached when the initial_bytes were read and no other
dirty data exists yet.
For instance, this may indicate that the device is idle and not
currently dirtying any internal state. When read() is done on this
temporary end of stream the kernel driver should return ENOMSG from
read(). Userspace can wait for more data or consider moving to
STOP_COPY.
To not block the user upon read() and let it get ENOMSG we add a new
state named MLX5_MIGF_STATE_PRE_COPY on the migration file.
In addition, we add the MLX5_MIGF_STATE_SAVE_LAST state to block the
read() once we call the last SAVE upon moving to STOP_COPY.
Any further error will be marked with MLX5_MIGF_STATE_ERROR and the user
won't be blocked.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20221206083438.37807-12-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
vfio precopy ioctl returns an estimation of data available for
transferring from the device.
Whenever a user is using VFIO_MIG_GET_PRECOPY_INFO, track the current
state of the device, and if needed, append the dirty data to the
transfer FD data. This is done by saving a middle state.
As mlx5 runs the SAVE command asynchronously, make sure to query for
incremental data only once there is no active save command.
Running both in parallel, might end-up with a failure in the incremental
query command on un-tracked vhca.
Also, a middle state will be saved only after the previous state has
finished its SAVE command and has been fully transferred, this prevents
endless use resources.
Co-developed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20221206083438.37807-11-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
As mentioned in the previous patches, mlx5 is transferring multiple
states when the PRE_COPY protocol is used. This states mechanism
requires the target VM to know the states' size in order to execute
multiple loads. Therefore, add SW header, with the needed information,
for each saved state the source VM is transferring to the target VM.
This patch implements the source VM handling of the headers, following
patch will implement the target VM handling of the headers.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20221206083438.37807-10-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
In order to support PRE_COPY, mlx5 driver is transferring multiple
states (images) of the device. e.g.: the source VF can save and transfer
multiple states, and the target VF will load them by that order.
The device is saving three kinds of states:
1) Initial state - when the device moves to PRE_COPY state.
2) Middle state - during PRE_COPY phase via VFIO_MIG_GET_PRECOPY_INFO.
There can be multiple states of this type.
3) Final state - when the device moves to STOP_COPY state.
After moving to PRE_COPY state, user is holding the saving migf FD and
can use it. For example: user can start transferring data via read()
callback. Also, user can switch from PRE_COPY to STOP_COPY whenever he
sees it fits. This will invoke saving of final state.
This means that mlx5 VFIO device can be switched to STOP_COPY without
transferring any data in PRE_COPY state. Therefore, when the device
moves to STOP_COPY, mlx5 will store the final state on a dedicated queue
entry on the list.
Co-developed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20221206083438.37807-9-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Refactor to use queue based data chunks on the migration file.
The SAVE command adds a chunk to the tail of the queue while the read()
API finds the required chunk and returns its data.
In case the queue is empty but the state of the migration file is
MLX5_MIGF_STATE_COMPLETE, read() may not be blocked but will return 0 to
indicate end of file.
This is a step towards maintaining multiple images and their meta data
(i.e. headers) on the migration file as part of next patches from the
series.
Note:
At that point, we still use a single chunk on the migration file but
becomes ready to support multiple.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20221206083438.37807-8-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Refactor migration file state to be an emum which is mutual exclusive.
As of that dropped the 'disabled' state as 'error' is the same from
functional point of view.
Next patches from the series will extend this enum for other relevant
states.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20221206083438.37807-7-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
This patch refactors MKEY usage such as its life cycle will be as of the
migration file instead of allocating/destroying it upon each
SAVE/LOAD command.
This is a preparation step towards the PRE_COPY series where multiple
images will be SAVED/LOADED.
We achieve it by having a new struct named mlx5_vhca_data_buffer which
holds the mkey and its related stuff as of sg_append_table,
allocated_length, etc.
The above fields were taken out from the migration file main struct,
into mlx5_vhca_data_buffer dedicated struct with the proper helpers in
place.
For now we have a single mlx5_vhca_data_buffer per migration file.
However, in coming patches we'll have multiple of them to support
multiple images.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20221206083438.37807-6-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
This patch refactors PD usage such as its life cycle will be as of the
migration file instead of allocating/destroying it upon each SAVE/LOAD
command.
This is a preparation step towards the PRE_COPY series where multiple
images will be SAVED/LOADED and a single PD can be simply reused.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20221206083438.37807-5-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Enforce a single SAVE command at a time.
As the SAVE command is an asynchronous one, we must enforce running only
a single command at a time.
This will preserve ordering between multiple calls and protect from
races on the migration file data structure.
This is a must for the next patches from the series where as part of
PRE_COPY we may have multiple images to be saved and multiple SAVE
commands may be issued from different flows.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20221206083438.37807-4-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
The optional PRE_COPY states open the saving data transfer FD before
reaching STOP_COPY and allows the device to dirty track internal state
changes with the general idea to reduce the volume of data transferred
in the STOP_COPY stage.
While in PRE_COPY the device remains RUNNING, but the saving FD is open.
Only if the device also supports RUNNING_P2P can it support PRE_COPY_P2P,
which halts P2P transfers while continuing the saving FD.
PRE_COPY, with P2P support, requires the driver to implement 7 new arcs
and exists as an optional FSM branch between RUNNING and STOP_COPY:
RUNNING -> PRE_COPY -> PRE_COPY_P2P -> STOP_COPY
A new ioctl VFIO_MIG_GET_PRECOPY_INFO is provided to allow userspace to
query the progress of the precopy operation in the driver with the idea it
will judge to move to STOP_COPY at least once the initial data set is
transferred, and possibly after the dirty size has shrunk appropriately.
This ioctl is valid only in PRE_COPY states and kernel driver should
return -EINVAL from any other migration state.
Compared to the v1 clarification, STOP_COPY -> PRE_COPY is blocked
and to be defined in future.
We also split the pending_bytes report into the initial and sustaining
values, e.g.: initial_bytes and dirty_bytes.
initial_bytes: Amount of initial precopy data.
dirty_bytes: Device state changes relative to data previously retrieved.
These fields are not required to have any bearing to STOP_COPY phase.
It is recommended to leave PRE_COPY for STOP_COPY only after the
initial_bytes field reaches zero. Leaving PRE_COPY earlier might make
things slower.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20221206083438.37807-3-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|