Age | Commit message (Collapse) | Author |
|
This simplifies the code flow in read_one_chunk and makes error handling
when handling missing devices a bit simpler by reducing it to a single
check if something went wrong. No functional changes.
Reviewed-by: Su Yue <l@damenly.su>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
When logging a directory we are trying to log subdirectories that were
changed in the current transaction and created in a past transaction.
This type of behaviour was introduced by commit 2f2ff0ee5e4303 ("Btrfs:
fix metadata inconsistencies after directory fsync"), to fix some metadata
inconsistencies that in the meanwhile no longer need this behaviour due to
numerous other changes that happened throughout the years.
This behaviour, besides not needed anymore, it's also undesirable because:
1) It's not reliable because it's only triggered for the directories
of dentries (dir items) that happen to be present on a leaf that
was changed in the current transaction. If a dentry that points to
a directory resides on a leaf that was not changed in the current
transaction, then it's not logged, as at log_dir_items() and
log_new_dir_dentries() we use btrfs_search_forward();
2) It's not required by posix or any standard, it's undefined territory.
The only way to guarantee a subdirectory is logged, it to explicitly
fsync it;
Making the behaviour guaranteed would require scanning all directory
items, check which point to a directory, and then fsync each subdirectory
which was modified in the current transaction. This could be very
expensive for large directories with many subdirectories and/or large
subdirectories.
So remove that obsolete logic.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
When logging a directory, we go over every leaf of the subvolume tree that
was changed in the current transaction and copy all its dir index keys to
the log tree.
That includes copying dir index keys created in past transactions. This is
done mostly for simplicity, as after logging the keys we log an item that
specifies the start and end ranges of the keys we logged. That item is
then used during log replay to figure out which keys need to be deleted -
every key in that range that we find in the subvolume tree and is not in
the log tree, needs to be deleted.
Now that we log only dir index keys, and not dir item keys anymore, when
we remove dentries from a directory (due to unlink and rename operations),
we can get entire leaves that we changed only for deleting old dir index
keys, or that have few dir index keys that are new - this is due to the
fact that the offset for new index keys comes from a monotonically
increasing counter.
We can avoid logging dir index keys from past transactions, and in order
to track the deletions, only log range items (BTRFS_DIR_LOG_INDEX_KEY key
type) when we find gaps between consecutive index keys. This massively
reduces the amount of logged metadata when we have deleted directory
entries, even if it's a small percentage of the total number of entries.
The reduction comes from both less items that are logged and instead of
logging many dir index items (struct btrfs_dir_item), which have a size
of 30 bytes plus a file name, we typically log just a few range items
(struct btrfs_dir_log_item), which take only 8 bytes each.
Even if no entries were deleted from a directory and only new entries
were added, we typically still get a reduction on the amount of logged
metadata, because it's very likely the first leaf that got the new
dir index entries also has several old dir index entries.
So change the logging logic to not log dir index keys created in past
transactions and log a range item for every gap it finds between each
pair of consecutive index keys, to ensure deletions are tracked and
replayed on log replay.
This patch is part of a patchset comprised of the following patches:
1/4 btrfs: don't log unnecessary boundary keys when logging directory
2/4 btrfs: put initial index value of a directory in a constant
3/4 btrfs: stop copying old dir items when logging a directory
4/4 btrfs: stop trying to log subdirectories created in past transactions
The following test was run on a branch without this patchset and on a
branch with the first three patches applied:
$ cat test.sh
#!/bin/bash
DEV=/dev/nvme0n1
MNT=/mnt/nvme0n1
NUM_FILES=1000000
NUM_FILE_DELETES=10000
MKFS_OPTIONS="-O no-holes -R free-space-tree"
MOUNT_OPTIONS="-o ssd"
mkfs.btrfs -f $MKFS_OPTIONS $DEV
mount $MOUNT_OPTIONS $DEV $MNT
mkdir $MNT/testdir
for ((i = 1; i <= $NUM_FILES; i++)); do
echo -n > $MNT/testdir/file_$i
done
sync
del_inc=$(( $NUM_FILES / $NUM_FILE_DELETES ))
for ((i = 1; i <= $NUM_FILES; i += $del_inc)); do
rm -f $MNT/testdir/file_$i
done
start=$(date +%s%N)
xfs_io -c "fsync" $MNT/testdir
end=$(date +%s%N)
dur=$(( (end - start) / 1000000 ))
echo "dir fsync took $dur ms after deleting $NUM_FILE_DELETES files"
echo
umount $MNT
The test was run on a non-debug kernel (Debian's default kernel config),
and the results were the following for various values of NUM_FILES and
NUM_FILE_DELETES:
** before, NUM_FILES = 1 000 000, NUM_FILE_DELETES = 10 000 **
dir fsync took 585 ms after deleting 10000 files
** after, NUM_FILES = 1 000 000, NUM_FILE_DELETES = 10 000 **
dir fsync took 34 ms after deleting 10000 files (-94.2%)
** before, NUM_FILES = 100 000, NUM_FILE_DELETES = 1 000 **
dir fsync took 50 ms after deleting 1000 files
** after, NUM_FILES = 100 000, NUM_FILE_DELETES = 1 000 **
dir fsync took 7 ms after deleting 1000 files (-86.0%)
** before, NUM_FILES = 10 000, NUM_FILE_DELETES = 100 **
dir fsync took 9 ms after deleting 100 files
** after, NUM_FILES = 10 000, NUM_FILE_DELETES = 100 **
dir fsync took 5 ms after deleting 100 files (-44.4%)
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
At btrfs_set_inode_index_count() we refer twice to the number 2 as the
initial index value for a directory (when it's empty), with a proper
comment explaining the reason for that value. In the next patch I'll
have to use that magic value in the directory logging code, so put
the value in a #define at btrfs_inode.h, to avoid hardcoding the
magic value again at tree-log.c.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Before we start to log dir index keys from a leaf, we check if there is a
previous index key, which normally is at the end of a leaf that was not
changed in the current transaction. Then we log that key and set the start
of logged range (item of type BTRFS_DIR_LOG_INDEX_KEY) to the offset of
that key. This is to ensure that if there were deleted index keys between
that key and the first key we are going to log, those deletions are
replayed in case we need to replay to the log after a power failure.
However we really don't need to log that previous key, we can just set the
start of the logged range to that key's offset plus 1. This achieves the
same and avoids logging one dir index key.
The same logic is performed when we finish logging the index keys of a
leaf and we find that the next leaf has index keys and was not changed in
the current transaction. We are logging the first key of that next leaf
and use its offset as the end of range we log. This is just to ensure that
if there were deleted index keys between the last index key we logged and
the first key of that next leaf, those index keys are deleted if we end
up replaying the log. However that is not necessary, we can avoid logging
that first index key of the next leaf and instead set the end of the
logged range to match the offset of that index key minus 1.
So avoid logging those index keys at the boundaries and adjust the start
and end offsets of the logged ranges as described above.
This patch is part of a patchset comprised of the following patches:
1/4 btrfs: don't log unnecessary boundary keys when logging directory
2/4 btrfs: put initial index value of a directory in a constant
3/4 btrfs: stop copying old dir items when logging a directory
4/4 btrfs: stop trying to log subdirectories created in past transactions
Performance test results are listed in the changelog of patch 3/4.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
btrfs_ioctl already contains pointers to the inode and btrfs_root
structs, so we can pass them into the subfunctions instead of the
toplevel struct file.
Signed-off-by: Sahil Kang <sahil.kang@asilaycomputing.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The ->write and ->wait fields of struct walk_control, used for log trees,
are not used since 2008, more specifically since commit d0c803c4049c5c
("Btrfs: Record dirty pages tree-log pages in an extent_io tree") and
since commit d0c803c4049c5c ("Btrfs: Record dirty pages tree-log pages in
an extent_io tree"). So just remove them, along with the function
btrfs_write_tree_block(), which is also not used anymore after removing
the ->write member.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Commit 5f9c55c8066b ("ipv6: check return value of ipv6_skip_exthdr")
introduced an incorrect check, which leads to all ESP packets over
either TCPv6 or UDPv6 encapsulation being dropped. In this particular
case, offset is negative, since skb->data points to the ESP header in
the following chain of headers, while skb->network_header points to
the IPv6 header:
IPv6 | ext | ... | ext | UDP | ESP | ...
That doesn't seem to be a problem, especially considering that if we
reach esp6_input_done2, we're guaranteed to have a full set of headers
available (otherwise the packet would have been dropped earlier in the
stack). However, it means that the return value will (intentionally)
be negative. We can make the test more specific, as the expected
return value of ipv6_skip_exthdr will be the (negated) size of either
a UDP header, or a TCP header with possible options.
In the future, we should probably either make ipv6_skip_exthdr
explicitly accept negative offsets (and adjust its return value for
error cases), or make ipv6_skip_exthdr only take non-negative
offsets (and audit all callers).
Fixes: 5f9c55c8066b ("ipv6: check return value of ipv6_skip_exthdr")
Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Ioana Ciornei says:
====================
dpaa2-mac: add support for changing the protocol at runtime
This patch set adds support for changing the Ethernet protocol at
runtime on Layerscape SoCs which have the Lynx 28G SerDes block.
The first two patches add a new generic PHY driver for the Lynx 28G and
the bindings file associated. The driver reads the PLL configuration at
probe time (the frequency provided to the lanes) and determines what
protocols can be supported.
Based on this the driver can deny or approve a request from the
dpaa2-mac to setup a new protocol.
The next 2 patches add some MC APIs for inquiring what is the running
version of firmware and setting up a new protocol on the MAC.
Moving along, we extract the code for setting up the supported
interfaces on a MAC on a different function since in the next patches
will update the logic.
In the next patch, the dpaa2-mac is updated so that it retrieves the
SerDes PHY based on the OF node and in case of a major reconfig, call
the PHY driver to set up the new protocol on the associated lane and the
MC firmware to reconfigure the MAC side of things.
Finally, the LX2160A dtsi is annotated with the SerDes PHY nodes for the
1st SerDes block. Beside this, the LX2160A Clearfog dtsi is annotated
with the 'phys' property for the exposed SFP cages.
Changes in v2:
- 1/8: add MODULE_LICENSE
Changes in v3:
- 2/8: fix 'make dt_binding_check' errors
- 7/8: reverse order of dpaa2_mac_start() and phylink_start()
- 7/8: treat all RGMII variants in dpmac_eth_if_mode
- 7/8: remove the .mac_prepare callback
- 7/8: ignore PHY_INTERFACE_MODE_NA in validate
Changes in v4:
- 1/8: remove the DT nodes parsing
- 1/8: add an xlate function
- 2/8: remove the children phy nodes for each lane
- 7/8: rework the of_phy_get if statement
- 8/8: remove the DT nodes for each lane and the lane id in the
phys phandle
Changes in v5:
- 2/8: use phy as the name of the DT node in the example
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Describe the SerDes block #1 using the generic phys infrastructure. This
way, the ethernet nodes can each reference their serdes lanes
individually using the 'phys' dts property.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch integrates the dpaa2-eth driver with the generic PHY
infrastructure in order to search, find and reconfigure the SerDes lanes
in case of a protocol change.
On the .mac_config() callback, the phy_set_mode_ext() API is called so
that the Lynx 28G SerDes PHY driver can change the lane's configuration.
In the same phylink callback the MC firmware is called so that it
reconfigures the MAC side to run using the new protocol.
The consumer drivers - dpaa2-eth and dpaa2-switch - are updated to call
the dpaa2_mac_start/stop functions newly added which will
power_on/power_off the associated SerDes lane.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The logic to setup the supported interfaces will get annotated based on
what the configuration of the SerDes PLLs supports. Move the current
setup into a separate function just to try to keep it clean.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Retrieve the API version running on the firmware and based on it detect
which features are available for usage.
The first one to be listed is the capability to change the MAC protocol
at runtime.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The MC firmware gained recently a new command which can reconfigure the
running protocol on the underlying MAC. Add this new command which will
be used in the next patches in order to do a major reconfig on the
interface.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The dpmac_get_api_version command will be used in the next patches to
determine if the current firmware is capable or not to change the
Ethernet protocol running on the MAC.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add device tree binding for the Lynx 28G SerDes PHY driver used on
Layerscape based SoCs.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds a new generic PHY driver to support the Lynx 28G SerDes
block found on some of the Layerscape SoCs such as LX2160A.
At the moment, only the following Ethernet protocols are supported:
SGMII/1000Base-X and 10GBaseR.
SerDes lanes which are not running an Ethernet protocol or a currently
supported Ethenet protocol will be left as it was configured through the
RCW (Reset Configuration Word) at boot time.
At probe time, the platform driver will read the current
configuration of both PLLs found on a SerDes block and will determine
what protocols are supported using that PLL.
For example, if a PLL is configured to generate a clock net (frate) of
5GHz the only protocols sustained by that PLL are SGMII/1000Base-X
(using a quarter of the full clock rate) and QSGMII using the full clock
net frequency on the lane.
On the .set_mode() callback, the PHY driver will first check if the
requested operating mode (protocol) is even supported by the current PLL
configuration and will error out if not.
Then, the lane is reconfigured to run on the requested protocol.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Vladimir Oltean says:
====================
Basic QoS classification on Felix DSA switch using dcbnl
Basic QoS classification for Ocelot switches means port-based default
priority, DSCP-based and VLAN PCP based. This is opposed to advanced QoS
classification which is done through the VCAP IS1 TCAM based engine.
The patch set is a logical continuation of this RFC which attempted to
describe the default-prio as a matchall entry placed at the end of a
series of offloaded tc filters:
https://patchwork.kernel.org/project/netdevbpf/cover/20210113154139.1803705-1-olteanv@gmail.com/
I have tried my best to satisfy the feedback that we should cater for
pre-configured QoS profiles. Ironically, the only pre-configured QoS
profile that the Felix switch driver has is for VLAN PCP (1:1 mapping
with QoS class), yet IEEE 802.1Q or dcbnl offer no mechanism for
reporting or changing that.
Testing was done with the iproute2 dcb app. The qos_class of packets was
dumped from net/dsa/tag_ocelot.c.
(1) $ dcb app show dev swp3
default-prio 0
(2) $ dcb app replace dev swp3 default-prio 3
(3) $ dcb app replace dev swp3 dscp-prio CS3:5
(4) $ dcb app replace dev swp3 dscp-prio CS2:2
(5) $ dcb app show dev swp3
default-prio 3
dscp-prio CS2:2 CS3:5
Traffic sent with "ping -Q 64 <ipaddr>", which means CS2.
These packets match qos_class 0 after command (1),
qos_class 3 after command (2),
qos_class 3 after command (3), and
qos_class 2 after command (2).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Follow the established programming model for this driver and provide
shims in the felix DSA driver which call the implementations from the
ocelot switch lib. The ocelot switchdev driver wasn't integrated with
dcbnl due to lack of hardware availability.
The switch doesn't have any fancy QoS classification enabled by default.
The provided getters will create a default-prio app table entry of 0,
and no dscp entry. However, the getters have been made to actually
retrieve the hardware configuration rather than static values, to be
future proof in case DSA will need this information from more call paths.
For default-prio, there is a single field per port, in ANA_PORT_QOS_CFG,
called QOS_DEFAULT_VAL.
DSCP classification is enabled per-port, again via ANA_PORT_QOS_CFG
(field QOS_DSCP_ENA), and individual DSCP values are configured as
trusted or not through register ANA_DSCP_CFG (replicated 64 times).
An untrusted DSCP value falls back to other QoS classification methods.
If trusted, the selected ANA_DSCP_CFG register also holds the QoS class
in the QOS_DSCP_VAL field.
The hardware also supports DSCP remapping (DSCP value X is translated to
DSCP value Y before the QoS class is determined based on the app table
entry for Y) and DSCP packet rewriting. The dcbnl framework, for being
so flexible in other useless areas, doesn't appear to support this.
So this functionality has been left out.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Similar to the port-based default priority, IEEE 802.1Q-2018 allows the
Application Priority Table to define QoS classes (0 to 7) per IP DSCP
value (0 to 63).
In the absence of an app table entry for a packet with DSCP value X,
QoS classification for that packet falls back to other methods (VLAN PCP
or port-based default). The presence of an app table for DSCP value X
with priority Y makes the hardware classify the packet to QoS class Y.
As opposed to the default-prio where DSA exposes only a "set" in
dsa_switch_ops (because the port-based default is the fallback, it
always exists, either implicitly or explicitly), for DSCP priorities we
expose an "add" and a "del". The addition of a DSCP entry means trusting
that DSCP priority, the deletion means ignoring it.
Drivers that already trust (at least some) DSCP values can describe
their configuration in dsa_switch_ops :: port_get_dscp_prio(), which is
called for each DSCP value from 0 to 63.
Again, there can be more than one dcbnl app table entry for the same
DSCP value, DSA chooses the one with the largest configured priority.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The port-based default QoS class is assigned to packets that lack a
VLAN PCP (or the port is configured to not trust the VLAN PCP),
an IP DSCP (or the port is configured to not trust IP DSCP), and packets
on which no tc-skbedit action has matched.
Similar to other drivers, this can be exposed to user space using the
DCB Application Priority Table. IEEE 802.1Q-2018 specifies in Table
D-8 - Sel field values that when the Selector is 1, the Protocol ID
value of 0 denotes the "Default application priority. For use when
application priority is not otherwise specified."
The way in which the dcbnl integration in DSA has been designed has to
do with its requirements. Andrew Lunn explains that SOHO switches are
expected to come with some sort of pre-configured QoS profile, and that
it is desirable for this to come pre-loaded into the DSA slave interfaces'
DCB application priority table.
In the dcbnl design, this is possible because calls to dcb_ieee_setapp()
can be initiated by anyone including being self-initiated by this device
driver.
However, what makes this challenging to implement in DSA is that the DSA
core manages the net_devices (effectively hiding them from drivers),
while drivers manage the hardware. The DSA core has no knowledge of what
individual drivers' QoS policies are. DSA could export to drivers a
wrapper over dcb_ieee_setapp() and these could call that function to
pre-populate the app priority table, however drivers don't have a good
moment in time to do this. The dsa_switch_ops :: setup() method gets
called before the net_devices are created (dsa_slave_create), and so is
dsa_switch_ops :: port_setup(). What remains is dsa_switch_ops ::
port_enable(), but this gets called upon each ndo_open. If we add app
table entries on every open, we'd need to remove them on close, to avoid
duplicate entry errors. But if we delete app priority entries on close,
what we delete may not be the initial, driver pre-populated entries, but
rather user-added entries.
So it is clear that letting drivers choose the timing of the
dcb_ieee_setapp() call is inappropriate. The alternative which was
chosen is to introduce hardware-specific ops in dsa_switch_ops, and
effectively hide dcbnl details from drivers as well. For pre-populating
the application table, dsa_slave_dcbnl_init() will call
ds->ops->port_get_default_prio() which is supposed to read from
hardware. If the operation succeeds, DSA creates a default-prio app
table entry. The method is called as soon as the slave_dev is
registered, but before we release the rtnl_mutex. This is done such that
user space sees the app table entries as soon as it sees the interface
being registered.
The fact that we populate slave_dev->dcbnl_ops with a non-NULL pointer
changes behavior in dcb_doit() from net/dcb/dcbnl.c, which used to
return -EOPNOTSUPP for any dcbnl operation where netdev->dcbnl_ops is
NULL. Because there are still dcbnl-unaware DSA drivers even if they
have dcbnl_ops populated, the way to restore the behavior is to make all
dcbnl_ops return -EOPNOTSUPP on absence of the hardware-specific
dsa_switch_ops method.
The dcbnl framework absurdly allows there to be more than one app table
entry for the same selector and protocol (in other words, more than one
port-based default priority). In the iproute2 dcb program, there is a
"replace" syntactical sugar command which performs an "add" and a "del"
to hide this away. But we choose the largest configured priority when we
call ds->ops->port_set_default_prio(), using __fls(). When there is no
default-prio app table entry left, the port-default priority is restored
to 0.
Link: https://patchwork.kernel.org/project/netdevbpf/patch/20210113154139.1803705-2-olteanv@gmail.com/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In case the controller is transitioning to L1 in rcar_pcie_config_access(),
any read/write access to PCIECDR triggers asynchronous external abort. This
is because the transition to L1 link state must be manually finished by the
driver. The PCIe IP can transition back from L1 state to L0 on its own.
The current asynchronous external abort hook implementation restarts
the instruction which finally triggered the fault, which can be a
different instruction than the read/write instruction which started
the faulting access. Usually the instruction which finally triggers
the fault is one which has some data dependency on the result of the
read/write. In case of read, the read value after fixup is undefined,
while a read value of faulting read should be PCI_ERROR_RESPONSE.
It is possible to enforce the fault using 'isb' instruction placed
right after the read/write instruction which started the faulting
access. Add custom register accessors which perform the read/write
followed immediately by 'isb'.
This way, the fault always happens on the 'isb' and in case of read,
which is located one instruction before the 'isb', it is now possible
to fix up the return value of the read in the asynchronous external
abort hook and make that read return PCI_ERROR_RESPONSE.
Link: https://lore.kernel.org/r/20220312212349.781799-2-marek.vasut@gmail.com
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Marek Vasut <marek.vasut+renesas@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Krzysztof Wilczyński <kw@linux.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Wolfram Sang <wsa@the-dreams.de>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: linux-renesas-soc@vger.kernel.org
|
|
In case the controller is transitioning to L1 in rcar_pcie_config_access(),
any read/write access to PCIECDR triggers asynchronous external abort. This
is because the transition to L1 link state must be manually finished by the
driver. The PCIe IP can transition back from L1 state to L0 on its own.
Avoid triggering the abort in rcar_pcie_config_access() by checking whether
the controller is in the transition state, and if so, finish the transition
right away. This prevents a lot of unnecessary exceptions, although not all
of them.
Link: https://lore.kernel.org/r/20220312212349.781799-1-marek.vasut@gmail.com
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Marek Vasut <marek.vasut+renesas@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Krzysztof Wilczyński <kw@linux.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Wolfram Sang <wsa@the-dreams.de>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: linux-renesas-soc@vger.kernel.org
|
|
Some tests, such as Test d052: Add 1M filters with the same action, may
not work with a small timeout value.
Increase timeout to 24 seconds.
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Acked-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
____napi_schedule() needs to be invoked with disabled interrupts due to
__raise_softirq_irqoff (in order not to corrupt the per-CPU list).
____napi_schedule() needs also to be invoked from an interrupt context
so that the raised-softirq is processed while the interrupt context is
left.
Add lockdep asserts for both conditions.
While this is the second time the irq/softirq check is needed, provide a
generic lockdep_assert_softirq_will_run() which is used by both caller.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add spi_device_id tables to avoid logs like "SPI driver ksz9477-switch
has no spi_device_id".
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Ziyang Xuan says:
====================
net: macvlan: fix potential UAF problem for lowerdev
Add the reference operation to lowerdev of macvlan to avoid
the potential UAF problem under the following known scenario:
Someone module puts the NETDEV_UNREGISTER event handler to a
work, and lowerdev is accessed in the work handler. But when
the work is excuted, lowerdev has been destroyed because upper
macvlan did not get reference to lowerdev correctly.
In addition, add net device refcount tracker to macvlan.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add net device refcount tracker to macvlan.
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add the reference operation to lowerdev of macvlan to avoid
the potential UAF problem under the following known scenario:
Someone module puts the NETDEV_UNREGISTER event handler to a
work, and lowerdev is accessed in the work handler. But when
the work is excuted, lowerdev has been destroyed because upper
macvlan did not get reference to lowerdev correctly.
That likes as the scenario occurred by
commit 563bcbae3ba2 ("net: vlan: fix a UAF in vlan_dev_real_dev()").
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Allocating memory with kmalloc and GPF_DMA32 is not allowed, the
allocator will ignore the attribute.
Instead, use dma_alloc_coherent() API as we allocate a small amount of
memory to transfer firmware fragment to the ISH.
On Arcada chromebook, after the patch the warning:
"Unexpected gfp: 0x4 (GFP_DMA32). Fixing up to gfp: 0xcc0 (GFP_KERNEL). Fix your code!"
is gone. The ISH firmware is loaded properly and we can interact with
the ISH:
> ectool --name cros_ish version
...
Build info: arcada_ish_v2.0.3661+3c1a1c1ae0 2022-02-08 05:37:47 @localhost
Tool version: v2.0.12300-900b03ec7f 2022-02-08 10:01:48 @localhost
Fixes: commit 91b228107da3 ("HID: intel-ish-hid: ISH firmware loader client driver")
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core
Pull irqchip updates from Marc Zyngier:
- Add support for the STM32MP13 variant
- Move parent device away from struct irq_chip
- Remove all instances of non-const strings assigned to
struct irq_chip::name, enabling a nice cleanup for VIC and GIC)
- Simplify the Qualcomm PDC driver
- A bunch of SiFive PLIC cleanups
- Add support for a new variant of the Meson GPIO block
- Add support for the irqchip side of the Apple M1 PMU
- Add support for the Apple M1 Pro/Max AICv2 irqchip
- Add support for the Qualcomm MPM wakeup gadget
- Move the Xilinx driver over to the generic irqdomain handling
- Tiny speedup for IPIs on GICv3 systems
- The usual odd cleanups
Link: https://lore.kernel.org/all/20220313105142.704579-1-maz@kernel.org
|
|
https://git.linaro.org/people/daniel.lezcano/linux into timers/core
Pull clocksource/events updates from Daniel Lezcano:
- Fix return error code check for the timer-of layer when getting
the base address (Guillaume Ranquet)
- Remove MMIO dependency, add notrace annotation for sched_clock
and increase the timer resolution for the Microchip
PIT64b (Claudiu Beznea)
- Convert DT bindings to yaml for the Tegra timer (David Heidelberg)
- Fix compilation error on architecture other than ARM for the
i.MX TPM (Nathan Chancellor)
- Add support for the event stream scaling for 1GHz counter on
the arch ARM timer (Marc Zyngier)
- Support a higher number of interrupts by the Exynos MCT timer
driver (Alim Akhtar)
- Detect and prevent memory corruption when the specified number
of interrupts in the DTS is greater than the array size in the
code for the Exynos MCT timer (Krzysztof Kozlowski)
- Fix regression from a previous errata fix on the TI DM
timer (Drew Fustini)
- Several fixes and code improvements for the i.MX TPM
driver (Peng Fan)
Link: https://lore.kernel.org/all/a8cd9be9-7d70-80df-2b74-1a8226a215e1@linaro.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into timers/core
Pull tick/NOHZ updates from Frederic Weisbecker:
- A fix for rare jiffies update stalls that were reported by Paul McKenney
- Tick side cleanups after RCU_FAST_NO_HZ removal
- Handle softirqs on idle more gracefully
Link: https://lore.kernel.org/all/20220307233034.34550-1-frederic@kernel.org
|
|
In order to better organize the platform/Kconfig, place
exynos-gsc-specific config stuff on a separate Kconfig file.
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
|
|
In order to better organize the platform/Kconfig, place
coda-specific config stuff on a separate Kconfig file.
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
|
|
In order to better organize the platform/Kconfig, place
amphion-specific config stuff on a separate Kconfig file.
Reviewed-by: Shijie Qin <shijie.qin@nxp.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
|
|
In order to better organize the platform/Kconfig, place
allegro-dvt-specific config stuff on a separate Kconfig file.
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
|
|
In order to cleanup the main platform media directory, move Renesas
driver to its own directory.
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
|
|
In order to cleanup the main platform media directory, move Via
driver to its own directory.
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
|
|
In order to cleanup the main platform media directory, move Intel
driver to its own directory.
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
|
|
In order to cleanup the main platform media directory, move NXP
drivers to their own directory.
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
|
|
In order to cleanup the main platform media directory, move Aspeed
driver to its own directory.
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
|
|
Right now, platform dependencies are organized by the type of
the platform driver. Yet, things tend to become very messy with
time. The better seems to organize the drivers per manufacturer,
as other Kernel subsystems are doing.
As a preparation for such purpose, get rid of menuconfigs,
moving the per-menu dependencies to be at the driver-specifig
config entires.
This shoud give flexibility to reorganize the platform drivers
per manufacturer and re-sort them.
This patch removes all "if..endif" options from the platform
Kconfig, converting them into depends on.
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
|
|
There are lots of inconsistencies here: some directories are
included as-is, and others included using one (or more) symbols
that are inside it. Also, its entries are not sorted.
That makes it harder to maintain.
Reorganize it by placing everything on alphabetic order and
providing some hints about how patches for such file is expected.
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
|
|
Alphabetically sort entries at the Makefiles per group,
in ASCII order, e. g., using the output of:
$ LC_ALL=C sort Makefile |grep obj-y
...
$ LC_ALL=C sort Makefile |grep obj.*CONFIG
...
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
|
|
The simple-audio-card and renesas,rsnd bindings used 'patternProperties'
with fixed strings to work-around a dtschema meta-schema limitation. This
is now fixed and the schemas can be fixed to use 'properties' instead.
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20220311234802.417610-1-robh@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
It should be better to reverse the check on codec_dai
and returned early in order to be easier to understand.
Fixes: de2c6f98817f ("ASoC: soc-compress: prevent the potentially use of null pointer")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20220310030041.1556323-1-jiasheng@iscas.ac.cn
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
A missing bounds check in vm_access() can lead to an out-of-bounds read
or write in the adjacent memory area, since the len attribute is not
validated before the memcpy later in the function, potentially hitting:
[ 183.637831] BUG: unable to handle page fault for address: ffffc90000c86000
[ 183.637934] #PF: supervisor read access in kernel mode
[ 183.637997] #PF: error_code(0x0000) - not-present page
[ 183.638059] PGD 100000067 P4D 100000067 PUD 100258067 PMD 106341067 PTE 0
[ 183.638144] Oops: 0000 [#2] PREEMPT SMP NOPTI
[ 183.638201] CPU: 3 PID: 1790 Comm: poc Tainted: G D 5.17.0-rc6-ci-drm-11296+ #1
[ 183.638298] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake H DDR4 RVP, BIOS CNLSFWR1.R00.X208.B00.1905301319 05/30/2019
[ 183.638430] RIP: 0010:memcpy_erms+0x6/0x10
[ 183.640213] RSP: 0018:ffffc90001763d48 EFLAGS: 00010246
[ 183.641117] RAX: ffff888109c14000 RBX: ffff888111bece40 RCX: 0000000000000ffc
[ 183.642029] RDX: 0000000000001000 RSI: ffffc90000c86000 RDI: ffff888109c14004
[ 183.642946] RBP: 0000000000000ffc R08: 800000000000016b R09: 0000000000000000
[ 183.643848] R10: ffffc90000c85000 R11: 0000000000000048 R12: 0000000000001000
[ 183.644742] R13: ffff888111bed190 R14: ffff888109c14000 R15: 0000000000001000
[ 183.645653] FS: 00007fe5ef807540(0000) GS:ffff88845b380000(0000) knlGS:0000000000000000
[ 183.646570] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 183.647481] CR2: ffffc90000c86000 CR3: 000000010ff02006 CR4: 00000000003706e0
[ 183.648384] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 183.649271] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 183.650142] Call Trace:
[ 183.650988] <TASK>
[ 183.651793] vm_access+0x1f0/0x2a0 [i915]
[ 183.652726] __access_remote_vm+0x224/0x380
[ 183.653561] mem_rw.isra.0+0xf9/0x190
[ 183.654402] vfs_read+0x9d/0x1b0
[ 183.655238] ksys_read+0x63/0xe0
[ 183.656065] do_syscall_64+0x38/0xc0
[ 183.656882] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 183.657663] RIP: 0033:0x7fe5ef725142
[ 183.659351] RSP: 002b:00007ffe1e81c7e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 183.660227] RAX: ffffffffffffffda RBX: 0000557055dfb780 RCX: 00007fe5ef725142
[ 183.661104] RDX: 0000000000001000 RSI: 00007ffe1e81d880 RDI: 0000000000000005
[ 183.661972] RBP: 00007ffe1e81e890 R08: 0000000000000030 R09: 0000000000000046
[ 183.662832] R10: 0000557055dfc2e0 R11: 0000000000000246 R12: 0000557055dfb1c0
[ 183.663691] R13: 00007ffe1e81e980 R14: 0000000000000000 R15: 0000000000000000
Changes since v1:
- Updated if condition with range_overflows_t [Chris Wilson]
Fixes: 9f909e215fea ("drm/i915: Implement vm_ops->access for gdb access into mmaps")
Signed-off-by: Mastan Katragadda <mastanx.katragadda@intel.com>
Suggested-by: Adam Zabrocki <adamza@microsoft.com>
Reported-by: Jackson Cody <cody.jackson@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
[mauld: tidy up the commit message and add Cc: stable]
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220303060428.1668844-1-mastanx.katragadda@intel.com
(cherry picked from commit 661412e301e2ca86799aa4f400d1cf0bd38c57c6)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
iput() has already judged the incoming parameter, so there is no need to
repeat the judgment here.
Link: https://lore.kernel.org/r/20220311151240.62045-1-libang.linuxer@gmail.com
Signed-off-by: Bang Li <libang.linuxer@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Instead of using sprintf, use snprintf with buffer size limited to
PAGE_SIZE just like what we have for the rest of the file.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|