Age | Commit message (Collapse) | Author |
|
Make the order of the clocks and clock-names properties match the order
in the device tree bindings. This isn't strictly necessary from a point
of view of the operating system because matching will be done based on
the clock-names, but it makes it easier to validate the device trees
against the DT schema.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
The CML1 and PLL_E clocks are never explicitly used by the AHCI
controller found on Tegra124, so drop them from the corresponding device
tree node.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
The I2C controller found on Tegra124 is not fully compatible with the
Tegra114 version, so drop the fallback compatible string from the list.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
OPP table name now should start with "opp-table" and OPP entries
shouldn't contain commas and @ signs in accordance to the new schema
requirement. Reorganize CPU and EMC OPP table device-tree nodes.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
The DT schema requires that nodes representing thermal zones include a
"-thermal" suffix in their name.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
When the Tegra High-Speed UART is used instead of the regular UART, the
reg-shift property is implied from the compatible string and should not
be explicitly listed.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
In order to be able to pass DT schema validation, change the GPU nodes'
unit-address to the standard notation. Previously this was using a "0,"
prefix that originated from a time when the top-level device tree node
contained #address-cells = <2>.
Note that this technically breaks backwards-compatibility with certain
older versions of the U-Boot bootloader because early versions used a
hard-coded DT path lookup to find the GPU node and perform some fixups
on it. However, this was changed to a compatible string based lookup in
April 2016, so it's reasonable to expect people to update U-Boot on the
systems that they want to use this updated kernel DTB with.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
GPIO hog nodes must have a "hog-" prefix or "-hog" suffix according to
the DT schema. Rename all such nodes to allow validation to pass.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
The Tegra memory controller provides reset controls for hotflush reset,
so the #reset-cells property must be specified.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
The TKE (time-keeping engine) found on Tegra114 and later is no longer
backwards compatible with the version found on Tegra20, so update the
compatible string list accordingly.
Note that while the hardware block is strictly backwards-compatible, an
architectural timer exists on those newer SoCs that is more reliable, so
that should always be preferred.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Regulators defined at the top level in device tree are no longer part of
a simple bus and therefore don't have a reg property. Nodes without a
reg property shouldn't have a unit-address either, so drop the unit
address from the node names. To ensure nodes aren't duplicated (in which
case they would end up merged in the final DTB), append the name of the
regulator to the node name.
[treding@nvidia.com: factored out patch and wrote commit message]
Signed-off-by: David Heidelberg <david@ixit.cz>
Co-developed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Clocks defined at the top level in device tree are no longer part of a
simple bus and therefore don't have a reg property. Nodes without a reg
property shouldn't have a unit-address either, so drop the unit address
from the node names. To ensure nodes aren't duplicated (in which case
they would end up merged in the final DTB), append the name of the clock
to the node name.
[treding@nvidia.com: factored out patch and wrote commit message]
Signed-off-by: David Heidelberg <david@ixit.cz>
Co-developed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
SPI flash chip nodes should be named "flash" instead of "spi-flash".
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
The PMIC found on Dalmore, TN7 and Roth is a TPS65913, so add the
specific compatible string in addition to the generic Palmas series
compatible string.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
The external memory controller should be sorted after the memory
controller to keep the ordering by unit-address intact.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl into arm/drivers
Memory controller drivers for v5.17 - Renesas
Changes to the Renesas RPC-IF driver:
1. Add support for R9A07G044 / RZ/G2L.
2. Several minor fixes and improvements to the driver.
* tag 'memory-controller-drv-renesas-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl:
memory: renesas-rpc-if: refactor MOIIO and IOFV macros
memory: renesas-rpc-if: avoid use of undocumented bits
memory: renesas-rpc-if: simplify register update
memory: renesas-rpc-if: Silence clang warning
memory: renesas-rpc-if: Add support for RZ/G2L
memory: renesas-rpc-if: Drop usage of RPCIF_DIRMAP_SIZE macro
memory: renesas-rpc-if: Return error in case devm_ioremap_resource() fails
dt-bindings: memory: renesas,rpc-if: Add optional interrupts property
dt-bindings: memory: renesas,rpc-if: Add support for the R9A07G044
Link: https://lore.kernel.org/r/20211213105618.5686-1-krzysztof.kozlowski@canonical.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
When logging a directory, once we finish processing a leaf that is full
of dir items, if we find the next leaf was not modified in the current
transaction, we grab the first key of that next leaf and log it as to
mark the end of a key range boundary.
However we did not update the value of ctx->last_dir_item_offset, which
tracks the offset of the last logged key. This can result in subsequent
logging of the same directory in the current transaction to not realize
that key was already logged, and then add it to the middle of a batch
that starts with a lower key, resulting later in a leaf with one key
that is duplicated and at non-consecutive slots. When that happens we get
an error later when writing out the leaf, reporting that there is a pair
of keys in wrong order. The report is something like the following:
Dec 13 21:44:50 kernel: BTRFS critical (device dm-0): corrupt leaf:
root=18446744073709551610 block=118444032 slot=21, bad key order, prev
(704687 84 4146773349) current (704687 84 1063561078)
Dec 13 21:44:50 kernel: BTRFS info (device dm-0): leaf 118444032 gen
91449 total ptrs 39 free space 546 owner 18446744073709551610
Dec 13 21:44:50 kernel: item 0 key (704687 1 0) itemoff 3835
itemsize 160
Dec 13 21:44:50 kernel: inode generation 35532 size
1026 mode 40755
Dec 13 21:44:50 kernel: item 1 key (704687 12 704685) itemoff
3822 itemsize 13
Dec 13 21:44:50 kernel: item 2 key (704687 24 3817753667)
itemoff 3736 itemsize 86
Dec 13 21:44:50 kernel: item 3 key (704687 60 0) itemoff 3728 itemsize 8
Dec 13 21:44:50 kernel: item 4 key (704687 72 0) itemoff 3720 itemsize 8
Dec 13 21:44:50 kernel: item 5 key (704687 84 140445108)
itemoff 3666 itemsize 54
Dec 13 21:44:50 kernel: dir oid 704793 type 1
Dec 13 21:44:50 kernel: item 6 key (704687 84 298800632)
itemoff 3599 itemsize 67
Dec 13 21:44:50 kernel: dir oid 707849 type 2
Dec 13 21:44:50 kernel: item 7 key (704687 84 476147658)
itemoff 3532 itemsize 67
Dec 13 21:44:50 kernel: dir oid 707901 type 2
Dec 13 21:44:50 kernel: item 8 key (704687 84 633818382)
itemoff 3471 itemsize 61
Dec 13 21:44:50 kernel: dir oid 704694 type 2
Dec 13 21:44:50 kernel: item 9 key (704687 84 654256665)
itemoff 3403 itemsize 68
Dec 13 21:44:50 kernel: dir oid 707841 type 1
Dec 13 21:44:50 kernel: item 10 key (704687 84 995843418)
itemoff 3331 itemsize 72
Dec 13 21:44:50 kernel: dir oid 2167736 type 1
Dec 13 21:44:50 kernel: item 11 key (704687 84 1063561078)
itemoff 3278 itemsize 53
Dec 13 21:44:50 kernel: dir oid 704799 type 2
Dec 13 21:44:50 kernel: item 12 key (704687 84 1101156010)
itemoff 3225 itemsize 53
Dec 13 21:44:50 kernel: dir oid 704696 type 1
Dec 13 21:44:50 kernel: item 13 key (704687 84 2521936574)
itemoff 3173 itemsize 52
Dec 13 21:44:50 kernel: dir oid 704704 type 2
Dec 13 21:44:50 kernel: item 14 key (704687 84 2618368432)
itemoff 3112 itemsize 61
Dec 13 21:44:50 kernel: dir oid 704738 type 1
Dec 13 21:44:50 kernel: item 15 key (704687 84 2676316190)
itemoff 3046 itemsize 66
Dec 13 21:44:50 kernel: dir oid 2167729 type 1
Dec 13 21:44:50 kernel: item 16 key (704687 84 3319104192)
itemoff 2986 itemsize 60
Dec 13 21:44:50 kernel: dir oid 704745 type 2
Dec 13 21:44:50 kernel: item 17 key (704687 84 3908046265)
itemoff 2929 itemsize 57
Dec 13 21:44:50 kernel: dir oid 2167734 type 1
Dec 13 21:44:50 kernel: item 18 key (704687 84 3945713089)
itemoff 2857 itemsize 72
Dec 13 21:44:50 kernel: dir oid 2167730 type 1
Dec 13 21:44:50 kernel: item 19 key (704687 84 4077169308)
itemoff 2795 itemsize 62
Dec 13 21:44:50 kernel: dir oid 704688 type 1
Dec 13 21:44:50 kernel: item 20 key (704687 84 4146773349)
itemoff 2727 itemsize 68
Dec 13 21:44:50 kernel: dir oid 707892 type 1
Dec 13 21:44:50 kernel: item 21 key (704687 84 1063561078)
itemoff 2674 itemsize 53
Dec 13 21:44:50 kernel: dir oid 704799 type 2
Dec 13 21:44:50 kernel: item 22 key (704687 96 2) itemoff 2612
itemsize 62
Dec 13 21:44:50 kernel: item 23 key (704687 96 6) itemoff 2551
itemsize 61
Dec 13 21:44:50 kernel: item 24 key (704687 96 7) itemoff 2498
itemsize 53
Dec 13 21:44:50 kernel: item 25 key (704687 96 12) itemoff
2446 itemsize 52
Dec 13 21:44:50 kernel: item 26 key (704687 96 14) itemoff
2385 itemsize 61
Dec 13 21:44:50 kernel: item 27 key (704687 96 18) itemoff
2325 itemsize 60
Dec 13 21:44:50 kernel: item 28 key (704687 96 24) itemoff
2271 itemsize 54
Dec 13 21:44:50 kernel: item 29 key (704687 96 28) itemoff
2218 itemsize 53
Dec 13 21:44:50 kernel: item 30 key (704687 96 62) itemoff
2150 itemsize 68
Dec 13 21:44:50 kernel: item 31 key (704687 96 66) itemoff
2083 itemsize 67
Dec 13 21:44:50 kernel: item 32 key (704687 96 75) itemoff
2015 itemsize 68
Dec 13 21:44:50 kernel: item 33 key (704687 96 79) itemoff
1948 itemsize 67
Dec 13 21:44:50 kernel: item 34 key (704687 96 82) itemoff
1882 itemsize 66
Dec 13 21:44:50 kernel: item 35 key (704687 96 83) itemoff
1810 itemsize 72
Dec 13 21:44:50 kernel: item 36 key (704687 96 85) itemoff
1753 itemsize 57
Dec 13 21:44:50 kernel: item 37 key (704687 96 87) itemoff
1681 itemsize 72
Dec 13 21:44:50 kernel: item 38 key (704694 1 0) itemoff 1521
itemsize 160
Dec 13 21:44:50 kernel: inode generation 35534 size 30
mode 40755
Dec 13 21:44:50 kernel: BTRFS error (device dm-0): block=118444032
write time tree block corruption detected
So fix that by adding the missing update of ctx->last_dir_item_offset with
the offset of the boundary key.
Reported-by: Chris Murphy <lists@colorremedies.com>
Link: https://lore.kernel.org/linux-btrfs/CAJCQCtT+RSzpUjbMq+UfzNUMe1X5+1G+DnAGbHC=OZ=iRS24jg@mail.gmail.com/
Fixes: dc2872247ec0ca ("btrfs: keep track of the last logged keys when logging a directory")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
When creating a subvolume, at create_subvol(), we allocate an anonymous
device and later call btrfs_get_new_fs_root(), which in turn just calls
btrfs_get_root_ref(). There we call btrfs_init_fs_root() which assigns
the anonymous device to the root, but if after that call there's an error,
when we jump to 'fail' label, we call btrfs_put_root(), which frees the
anonymous device and then returns an error that is propagated back to
create_subvol(). Than create_subvol() frees the anonymous device again.
When this happens, if the anonymous device was not reallocated after
the first time it was freed with btrfs_put_root(), we get a kernel
message like the following:
(...)
[13950.282466] BTRFS: error (device dm-0) in create_subvol:663: errno=-5 IO failure
[13950.283027] ida_free called for id=65 which is not allocated.
[13950.285974] BTRFS info (device dm-0): forced readonly
(...)
If the anonymous device gets reallocated by another btrfs filesystem
or any other kernel subsystem, then bad things can happen.
So fix this by setting the root's anonymous device to 0 at
btrfs_get_root_ref(), before we call btrfs_put_root(), if an error
happened.
Fixes: 2dfb1e43f57dd3 ("btrfs: preallocate anon block device at first phase of snapshot creation")
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Line 1169 (#3) allocates a memory chunk for victim_name by kmalloc(),
but when the function returns in line 1184 (#4) victim_name allocated
by line 1169 (#3) is not freed, which will lead to a memory leak.
There is a similar snippet of code in this function as allocating a memory
chunk for victim_name in line 1104 (#1) as well as releasing the memory
in line 1116 (#2).
We should kfree() victim_name when the return value of backref_in_log()
is less than zero and before the function returns in line 1184 (#4).
1057 static inline int __add_inode_ref(struct btrfs_trans_handle *trans,
1058 struct btrfs_root *root,
1059 struct btrfs_path *path,
1060 struct btrfs_root *log_root,
1061 struct btrfs_inode *dir,
1062 struct btrfs_inode *inode,
1063 u64 inode_objectid, u64 parent_objectid,
1064 u64 ref_index, char *name, int namelen,
1065 int *search_done)
1066 {
1104 victim_name = kmalloc(victim_name_len, GFP_NOFS);
// #1: kmalloc (victim_name-1)
1105 if (!victim_name)
1106 return -ENOMEM;
1112 ret = backref_in_log(log_root, &search_key,
1113 parent_objectid, victim_name,
1114 victim_name_len);
1115 if (ret < 0) {
1116 kfree(victim_name); // #2: kfree (victim_name-1)
1117 return ret;
1118 } else if (!ret) {
1169 victim_name = kmalloc(victim_name_len, GFP_NOFS);
// #3: kmalloc (victim_name-2)
1170 if (!victim_name)
1171 return -ENOMEM;
1180 ret = backref_in_log(log_root, &search_key,
1181 parent_objectid, victim_name,
1182 victim_name_len);
1183 if (ret < 0) {
1184 return ret; // #4: missing kfree (victim_name-2)
1185 } else if (!ret) {
1241 return 0;
1242 }
Fixes: d3316c8233bb ("btrfs: Properly handle backref_in_log retval")
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The commit referenced below added fixup_map_timer support (to create a
BPF map containing timers), but failed to increase the size of the
map_fds array, leading to out of bounds write. Fix this by changing
MAX_NR_MAPS to 22.
Fixes: e60e6962c503 ("selftests/bpf: Add tests for restricted helpers")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211214014800.78762-1-memxor@gmail.com
|
|
Do not sleep in poll() when the need_wakeup flag is set. When this
flag is set, the application needs to explicitly wake up the driver
with a syscall (poll, recvmsg, sendmsg, etc.) to guarantee that Rx
and/or Tx processing will be processed promptly. But the current code
in poll(), sleeps first then wakes up the driver. This means that no
driver processing will occur (baring any interrupts) until the timeout
has expired.
Fix this by checking the need_wakeup flag first and if set, wake the
driver and return to the application. Only if need_wakeup is not set
should the process sleep if there is a timeout set in the poll() call.
Fixes: 77cd0d7b3f25 ("xsk: add support for need_wakeup flag in AF_XDP rings")
Reported-by: Keith Wiles <keith.wiles@intel.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20211214102607.7677-1-magnus.karlsson@gmail.com
|
|
The donation calculation logic assumes that the donor has non-zero
after-donation hweight, so the lowest active hweight a donating cgroup can
have is 2 so that it can donate 1 while keeping the other 1 for itself.
Earlier, we only donated from cgroups with sizable surpluses so this
condition was always true. However, with the precise donation algorithm
implemented, f1de2439ec43 ("blk-iocost: revamp donation amount
determination") made the donation amount calculation exact enabling even low
hweight cgroups to donate.
This means that in rare occasions, a cgroup with active hweight of 1 can
enter donation calculation triggering the following warning and then a
divide-by-zero oops.
WARNING: CPU: 4 PID: 0 at block/blk-iocost.c:1928 transfer_surpluses.cold+0x0/0x53 [884/94867]
...
RIP: 0010:transfer_surpluses.cold+0x0/0x53
Code: 92 ff 48 c7 c7 28 d1 ab b5 65 48 8b 34 25 00 ae 01 00 48 81 c6 90 06 00 00 e8 8b 3f fe ff 48 c7 c0 ea ff ff ff e9 95 ff 92 ff <0f> 0b 48 c7 c7 30 da ab b5 e8 71 3f fe ff 4c 89 e8 4d 85 ed 74 0
4
...
Call Trace:
<IRQ>
ioc_timer_fn+0x1043/0x1390
call_timer_fn+0xa1/0x2c0
__run_timers.part.0+0x1ec/0x2e0
run_timer_softirq+0x35/0x70
...
iocg: invalid donation weights in /a/b: active=1 donating=1 after=0
Fix it by excluding cgroups w/ active hweight < 2 from donating. Excluding
these extreme low hweight donations shouldn't affect work conservation in
any meaningful way.
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: f1de2439ec43 ("blk-iocost: revamp donation amount determination")
Cc: stable@vger.kernel.org # v5.10+
Link: https://lore.kernel.org/r/Ybfh86iSvpWKxhVM@slm.duckdns.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
* for-next/perf-user-counter-access:
Documentation: arm64: Document PMU counters access from userspace
arm64: perf: Enable PMU counter userspace access for perf event
arm64: perf: Add userspace counter access disable switch
perf: Add a counter for number of user access events in context
x86: perf: Move RDPMC event flag to a common definition
|
|
* for-next/perf-smmu:
perf/smmuv3: Synthesize IIDR from CoreSight ID registers
perf/smmuv3: Add devicetree support
dt-bindings: Add Arm SMMUv3 PMCG binding
|
|
* for-next/perf-hisi:
drivers/perf: hisi: Add driver for HiSilicon PCIe PMU
docs: perf: Add description for HiSilicon PCIe PMU driver
|
|
* for-next/perf-cn10k:
dt-bindings: perf: Add YAML schemas for Marvell CN10K LLC-TAD pmu bindings
drivers: perf: Add LLC-TAD perf counter support
|
|
* for-next/perf-cmn:
perf/arm-cmn: Add debugfs topology info
perf/arm-cmn: Add CI-700 Support
dt-bindings: perf: arm-cmn: Add CI-700
perf/arm-cmn: Support new IP features
perf/arm-cmn: Demarcate CMN-600 specifics
perf/arm-cmn: Move group validation data off-stack
perf/arm-cmn: Optimise DTC counter accesses
perf/arm-cmn: Optimise DTM counter reads
perf/arm-cmn: Refactor DTM handling
perf/arm-cmn: Streamline node iteration
perf/arm-cmn: Refactor node ID handling
perf/arm-cmn: Drop compile-test restriction
perf/arm-cmn: Account for NUMA affinity
perf/arm-cmn: Fix CPU hotplug unregistration
|
|
Some powers were changed during the jack insert detection
and clk's enable/disable in CCF.
If in parallel, the influence has a chance to detect
the wrong jack type, so add a lock.
Signed-off-by: Derek Fang <derek.fang@realtek.com>
Link: https://lore.kernel.org/r/20211214105033.471-1-derek.fang@realtek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The FEAT_LSE atomic instructions include LD* instructions which return
the original value of a memory location can be used to directly
implement FETCH opertations. Each RETURN op is implemented as a copy of
the corresponding FETCH op with a trailing instruction to generate the
new value of the memory location. We only directly implement
*_fetch_add*(), for which we have a trailing `add` instruction.
As the compiler has no visibility of the `add`, this leads to less than
optimal code generation when consuming the result.
For example, the compiler cannot constant-fold the addition into later
operations, and currently GCC 11.1.0 will compile:
return __lse_atomic_sub_return(1, v) == 0;
As:
mov w1, #0xffffffff
ldaddal w1, w2, [x0]
add w1, w1, w2
cmp w1, #0x0
cset w0, eq // eq = none
ret
This patch improves this by replacing the `add` with C addition after
the inline assembly block, e.g.
ret += i;
This allows the compiler to manipulate `i`. This permits the compiler to
merge the `add` and `cmp` for the above, e.g.
mov w1, #0xffffffff
ldaddal w1, w1, [x0]
cmp w1, #0x1
cset w0, eq // eq = none
ret
With this change the assembly for each RETURN op is identical to the
corresponding FETCH op (including barriers and clobbers) so I've removed
the inline assembly and rewritten each RETURN op in terms of the
corresponding FETCH op, e.g.
| static inline void __lse_atomic_add_return(int i, atomic_t *v)
| {
| return __lse_atomic_fetch_add(i, v) + i
| }
The new construction does not adversely affect the common case, and
before and after this patch GCC 11.1.0 can compile:
__lse_atomic_add_return(i, v)
As:
ldaddal w0, w2, [x1]
add w0, w0, w2
... while having the freedom to do better elsewhere.
This is intended as an optimization and cleanup.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211210151410.2782645-6-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
We have overly conservative assembly constraints for the basic FEAT_LSE
atomic instructions, and using more accurate and permissive constraints
will allow for better code generation.
The FEAT_LSE basic atomic instructions have come in two forms:
LD{op}{order}{size} <Rs>, <Rt>, [<Rn>]
ST{op}{order}{size} <Rs>, [<Rn>]
The ST* forms are aliases of the LD* forms where:
ST{op}{order}{size} <Rs>, [<Rn>]
Is:
LD{op}{order}{size} <Rs>, XZR, [<Rn>]
For either form, both <Rs> and <Rn> are read but not written back to,
and <Rt> is written with the original value of the memory location.
Where (<Rt> == <Rs>) or (<Rt> == <Rn>), <Rt> is written *after* the
other register value(s) are consumed. There are no UNPREDICTABLE or
CONSTRAINED UNPREDICTABLE behaviours when any pair of <Rs>, <Rt>, or
<Rn> are the same register.
Our current inline assembly always uses <Rs> == <Rt>, treating this
register as both an input and an output (using a '+r' constraint). This
forces the compiler to do some unnecessary register shuffling and/or
redundant value generation.
For example, the compiler cannot reuse the <Rs> value, and currently GCC
11.1.0 will compile:
__lse_atomic_add(1, a);
__lse_atomic_add(1, b);
__lse_atomic_add(1, c);
As:
mov w3, #0x1
mov w4, w3
stadd w4, [x0]
mov w0, w3
stadd w0, [x1]
stadd w3, [x2]
We can improve this with more accurate constraints, separating <Rs> and
<Rt>, where <Rs> is an input-only register ('r'), and <Rt> is an
output-only value ('=r'). As <Rt> is written back after <Rs> is
consumed, it does not need to be earlyclobber ('=&r'), leaving the
compiler free to use the same register for both <Rs> and <Rt> where this
is desirable.
At the same time, the redundant 'r' constraint for `v` is removed, as
the `+Q` constraint is sufficient.
With this change, the above example becomes:
mov w3, #0x1
stadd w3, [x0]
stadd w3, [x1]
stadd w3, [x2]
I've made this change for the non-value-returning and FETCH ops. The
RETURN ops have a multi-instruction sequence for which we cannot use the
same constraints, and a subsequent patch will rewrite hte RETURN ops in
terms of the FETCH ops, relying on the ability for the compiler to reuse
the <Rs> value.
This is intended as an optimization.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211210151410.2782645-5-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The FEAT_LSE atomic instructions include atomic bit-clear instructions
(`ldclr*` and `stclr*`) which can be used to directly implement ANDNOT
operations. Each AND op is implemented as a copy of the corresponding
ANDNOT op with a leading `mvn` instruction to apply a bitwise NOT to the
`i` argument.
As the compiler has no visibility of the `mvn`, this leads to less than
optimal code generation when generating `i` into a register. For
example, __lse_atomic_fetch_and(0xf, v) can be compiled to:
mov w1, #0xf
mvn w1, w1
ldclral w1, w1, [x2]
This patch improves this by replacing the `mvn` with NOT in C before the
inline assembly block, e.g.
i = ~i;
This allows the compiler to generate `i` into a register more optimally,
e.g.
mov w1, #0xfffffff0
ldclral w1, w1, [x2]
With this change the assembly for each AND op is identical to the
corresponding ANDNOT op (including barriers and clobbers), so I've
removed the inline assembly and rewritten each AND op in terms of the
corresponding ANDNOT op, e.g.
| static inline void __lse_atomic_and(int i, atomic_t *v)
| {
| return __lse_atomic_andnot(~i, v);
| }
This is intended as an optimization and cleanup.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211210151410.2782645-4-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The FEAT_LSE atomic instructions include atomic ADD instructions
(`stadd*` and `ldadd*`), but do not include atomic SUB instructions, so
we must build all of the SUB operations using the ADD instructions. We
open-code these today, with each SUB op implemented as a copy of the
corresponding ADD op with a leading `neg` instruction in the inline
assembly to negate the `i` argument.
As the compiler has no visibility of the `neg`, this leads to less than
optimal code generation when generating `i` into a register. For
example, __les_atomic_fetch_sub(1, v) can be compiled to:
mov w1, #0x1
neg w1, w1
ldaddal w1, w1, [x2]
This patch improves this by replacing the `neg` with negation in C
before the inline assembly block, e.g.
i = -i;
This allows the compiler to generate `i` into a register more optimally,
e.g.
mov w1, #0xffffffff
ldaddal w1, w1, [x2]
With this change the assembly for each SUB op is identical to the
corresponding ADD op (including barriers and clobbers), so I've removed
the inline assembly and rewritten each SUB op in terms of the
corresponding ADD op, e.g.
| static inline void __lse_atomic_sub(int i, atomic_t *v)
| {
| __lse_atomic_add(-i, v);
| }
For clarity I've moved the definition of each SUB op immediately after
the corresponding ADD op, and used a single macro to create the RETURN
forms of both ops.
This is intended as an optimization and cleanup.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211210151410.2782645-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The code for the atomic ops is formatted inconsistently, and while this
is not a functional problem it is rather distracting when working on
them.
Some have ops have consistent indentation, e.g.
| #define ATOMIC_OP_ADD_RETURN(name, mb, cl...) \
| static inline int __lse_atomic_add_return##name(int i, atomic_t *v) \
| { \
| u32 tmp; \
| \
| asm volatile( \
| __LSE_PREAMBLE \
| " ldadd" #mb " %w[i], %w[tmp], %[v]\n" \
| " add %w[i], %w[i], %w[tmp]" \
| : [i] "+r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp) \
| : "r" (v) \
| : cl); \
| \
| return i; \
| }
While others have negative indentation for some lines, and/or have
misaligned trailing backslashes, e.g.
| static inline void __lse_atomic_##op(int i, atomic_t *v) \
| { \
| asm volatile( \
| __LSE_PREAMBLE \
| " " #asm_op " %w[i], %[v]\n" \
| : [i] "+r" (i), [v] "+Q" (v->counter) \
| : "r" (v)); \
| }
This patch makes the indentation consistent and also aligns the trailing
backslashes. This makes the code easier to read for those (like myself)
who are easily distracted by these inconsistencies.
This is intended as a cleanup.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211210151410.2782645-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Ido Schimmel says:
====================
mlxsw: MAC profiles occupancy fix
Patch #1 fixes a router interface (RIF) MAC profiles occupancy bug that
was merged in the last cycle.
Patch #2 adds a selftest that fails without the fix.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a test case to cover the bug fixed by the previous patch.
Edit the MAC address of one netdev so that it matches the MAC address of
the second netdev. Verify that the two MAC profiles were consolidated by
testing that the MAC profiles occupancy decreased by one.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, when setting a router interface (RIF) MAC address while the
MAC profile is not shared with other RIFs, the profile is edited so that
the new MAC address is assigned to it.
This does not take into account a situation in which the new MAC address
already matches an existing MAC profile. In that situation, two MAC
profiles will be occupied even though they hold MAC addresses from the
same profile.
In order to prevent that, add a check to ensure that editing a MAC
profile takes place only when the new MAC address does not match an
existing profile.
Fixes: 605d25cd782a6 ("mlxsw: spectrum_router: Add RIF MAC profiles support")
Reported-by: Maksym Yaremchuk <maksymy@nvidia.com>
Tested-by: Maksym Yaremchuk <maksymy@nvidia.com>
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
__rds_conn_create() did not release conn->c_path when loop_trans != 0 and
trans->t_prefer_loopback != 0 and is_outgoing == 0.
Fixes: aced3ce57cd3 ("RDS tcp loopback connection can hang")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Reviewed-by: Sharath Srinivasan <sharath.srinivasan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
A fairly large number of fixes this time:
* fix a station info memory leak on insert collisions
* a rate control fix for retransmissions
* two aggregation setup fixes
* reload current regdomain when reloading database
* a locking fix in regulatory work
* a probe request allocation size fix in mac80211
* apply TCP vs. aggregation (sk pacing) on mesh
* fix ordering of channel context update vs. station
state
* set up skb->dev for mesh forwarding properly
* track QoS data frames only for admission control to
avoid out-of-bounds read (found by syzbot)
* validate extended element ID vs. existing data to
avoid out-of-bounds read (found by syzbot)
* fix locking in mac80211 aggregation TX setup
* fix traffic stall after HW restart when TXQs are used
* fix ordering of reconfig/restart after HW restart
* fix interface type for extended aggregation capability
lookup
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2021-12-13
This series contains updates to iavf driver only.
Dan Carpenter fixes some missing mutex unlocking.
Stefan Assmann restores stopping watchdog from overriding to reset state.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We need to return EOPNOTSUPP for the unsupported mpls action type when
setup the flow action.
In the original implement, we will return 0 for the unsupported mpls
action type, actually we do not setup it and the following actions
to the flow action entry.
Fixes: 9838b20a7fb2 ("net: sched: take rtnl lock in tc_setup_flow_action()")
Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When CMD13 is sent after switching to HS400ES mode, the bus
is operating at either MMC_HIGH_26_MAX_DTR or MMC_HIGH_52_MAX_DTR.
To meet Tegra SDHCI requirement at HS400ES mode, force SDHCI
interface clock to MMC_HS200_MAX_DTR (200 MHz) so that host
controller CAR clock and the interface clock are rate matched.
Signed-off-by: Prathamesh Shete <pshete@nvidia.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: dfc9700cef77 ("mmc: tegra: Implement HS400 enhanced strobe")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211214113653.4631-1-pshete@nvidia.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
To replicate the issue:-
1) Add 1 flower filter for VLAN Priority based frame steering:-
$ IFDEVNAME=eth0
$ tc qdisc add dev $IFDEVNAME ingress
$ tc qdisc add dev $IFDEVNAME root mqprio num_tc 8 \
map 0 1 2 3 4 5 6 7 0 0 0 0 0 0 0 0 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0
$ tc filter add dev $IFDEVNAME parent ffff: protocol 802.1Q \
flower vlan_prio 0 hw_tc 0
2) Get the 'pref' id
$ tc filter show dev $IFDEVNAME ingress
3) Delete a specific tc flower record (say pref 49151)
$ tc filter del dev $IFDEVNAME parent ffff: pref 49151
From dmesg, we will observe kernel NULL pointer ooops
[ 197.170464] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 197.171367] #PF: supervisor read access in kernel mode
[ 197.171367] #PF: error_code(0x0000) - not-present page
[ 197.171367] PGD 0 P4D 0
[ 197.171367] Oops: 0000 [#1] PREEMPT SMP NOPTI
<snip>
[ 197.171367] RIP: 0010:tc_setup_cls+0x20b/0x4a0 [stmmac]
<snip>
[ 197.171367] Call Trace:
[ 197.171367] <TASK>
[ 197.171367] ? __stmmac_disable_all_queues+0xa8/0xe0 [stmmac]
[ 197.171367] stmmac_setup_tc_block_cb+0x70/0x110 [stmmac]
[ 197.171367] tc_setup_cb_destroy+0xb3/0x180
[ 197.171367] fl_hw_destroy_filter+0x94/0xc0 [cls_flower]
The above issue is due to previous incorrect implementation of
tc_del_vlan_flow(), shown below, that uses flow_cls_offload_flow_rule()
to get struct flow_rule *rule which is no longer valid for tc filter
delete operation.
struct flow_rule *rule = flow_cls_offload_flow_rule(cls);
struct flow_dissector *dissector = rule->match.dissector;
So, to ensure tc_del_vlan_flow() deletes the right VLAN cls record for
earlier configured RX queue (configured by hw_tc) in tc_add_vlan_flow(),
this patch introduces stmmac_rfs_entry as driver-side flow_cls_offload
record for 'RX frame steering' tc flower, currently used for VLAN
priority. The implementation has taken consideration for future extension
to include other type RX frame steering such as EtherType based.
v2:
- Clean up overly extensive backtrace and rewrite git message to better
explain the kernel NULL pointer issue.
Fixes: 0e039f5cf86c ("net: stmmac: add RX frame steering based on VLAN priority in tc flower")
Tested-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
PCIe PMU Root Complex Integrated End Point(RCiEP) device is supported
to sample bandwidth, latency, buffer occupation etc.
Each PMU RCiEP device monitors multiple Root Ports, and each RCiEP is
registered as a PMU in /sys/bus/event_source/devices, so users can
select target PMU, and use filter to do further sets.
Filtering options contains:
event - select the event.
port - select target Root Ports. Information of Root Ports are
shown under sysfs.
bdf - select requester_id of target EP device.
trig_len - set trigger condition for starting event statistics.
trig_mode - set trigger mode. 0 means starting to statistic when bigger
than trigger condition, and 1 means smaller.
thr_len - set threshold for statistics.
thr_mode - set threshold mode. 0 means count when bigger than threshold,
and 1 means smaller.
Acked-by: Krzysztof Wilczyński <kw@linux.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/20211202080633.2919-3-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
PCIe PMU Root Complex Integrated End Point(RCiEP) device is supported on
HiSilicon HIP09 platform. Document it to provide guidance on how to
use it.
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/20211202080633.2919-2-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
PCI_MSIX_FLAGS_MASKALL is set in the MSI-X control register at MSI-X
interrupt setup time. It's cleared on success, but the error handling path
only clears the PCI_MSIX_FLAGS_ENABLE bit.
That's incorrect as the reset state of the PCI_MSIX_FLAGS_MASKALL bit is
zero. That can be observed via lspci:
Capabilities: [b0] MSI-X: Enable- Count=67 Masked+
Clear the bit in the error path to restore the reset state.
Fixes: 438553958ba1 ("PCI/MSI: Enable and mask MSI-X early")
Reported-by: Stefan Roese <sr@denx.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Stefan Roese <sr@denx.de>
Cc: linux-pci@vger.kernel.org
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Marek Vasut <marex@denx.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87tufevoqx.ffs@tglx
|
|
Masking all unused MSI-X entries is done to ensure that a crash kernel
starts from a clean slate, which correponds to the reset state of the
device as defined in the PCI-E specificion 3.0 and later:
Vector Control for MSI-X Table Entries
--------------------------------------
"00: Mask bit: When this bit is set, the function is prohibited from
sending a message using this MSI-X Table entry.
...
This bit’s state after reset is 1 (entry is masked)."
A Marvell NVME device fails to deliver MSI interrupts after trying to
enable MSI-X interrupts due to that masking. It seems to take the MSI-X
mask bits into account even when MSI-X is disabled.
While not specification compliant, this can be cured by moving the masking
into the success path, so that the MSI-X table entries stay in device reset
state when the MSI-X setup fails.
[ tglx: Move it into the success path, add comment and amend changelog ]
Fixes: aa8092c1d1f1 ("PCI/MSI: Mask all unused MSI-X entries")
Signed-off-by: Stefan Roese <sr@denx.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-pci@vger.kernel.org
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Marek Vasut <marex@denx.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211210161025.3287927-1-sr@denx.de
|
|
Add device tree bindings for Last-level-cache Tag-and-data
(LLC-TAD) unit PMU for Marvell CN10K SoCs.
Signed-off-by: Bhaskara Budiredla <bbudiredla@marvell.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20211115043506.6679-3-bbudiredla@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
This driver adds support for Last-level cache tag-and-data unit
(LLC-TAD) PMU that is featured in some of the Marvell's CN10K
infrastructure silicons.
The LLC is divided into 2N slices distributed across N Mesh tiles
in a single-socket configuration. The driver always configures the
same counter for all of the TADs. The user would end up effectively
reserving one of eight counters in every TAD to look across all TADs.
The occurrences of events are aggregated and presented to the user
at the end of an application run. The driver does not provide a way
for the user to partition TADs so that different TADs are used for
different applications.
The event counters are zeroed to start event counting to avoid any
rollover issues. TAD perf counters are 64-bit, so it's not currently
possible to overflow event counters at current mesh and core
frequencies.
To measure tad pmu events use perf tool stat command. For instance:
perf stat -e tad_dat_msh_in_dss,tad_req_msh_out_any <application>
perf stat -e tad_alloc_any,tad_hit_any,tad_tag_rd <application>
Signed-off-by: Bhaskara Budiredla <bbudiredla@marvell.com>
Link: https://lore.kernel.org/r/20211115043506.6679-2-bbudiredla@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Use the EOR3 instruction to implement xor_blocks() if the instruction is
available, which is the case if the CPU implements the SHA-3 extension.
This is about 20% faster on Apple M1 when using the 5-way version.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20211213140252.2856053-1-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Livepatching a loaded module involves applying relocations through
apply_relocate_add(), which attempts to write to read-only memory when
CONFIG_STRICT_MODULE_RWX=y. Work around this by performing these
writes through the text poke area by using patch_instruction().
R_PPC_REL24 is the only relocation type generated by the kpatch-build
userspace tool or klp-convert kernel tree that I observed applying a
relocation to a post-init module.
A more comprehensive solution is planned, but using patch_instruction()
for R_PPC_REL24 on should serve as a sufficient fix.
This does have a performance impact, I observed ~15% overhead in
module_load() on POWER8 bare metal with checksum verification off.
Fixes: c35717c71e98 ("powerpc: Set ARCH_HAS_STRICT_MODULE_RWX")
Cc: stable@vger.kernel.org # v5.14+
Reported-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
[mpe: Check return codes from patch_instruction()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211214121248.777249-1-mpe@ellerman.id.au
|