Age | Commit message (Collapse) | Author |
|
kernel test robot reports build errors in 3 Xilinx ethernet drivers.
They all use ioremap functions that are only available when HAS_IOMEM
is set/enabled. If it is not enabled, they all have build errors,
so make these 3 drivers depend on HAS_IOMEM.
ld: drivers/net/ethernet/xilinx/xilinx_emaclite.o: in function `xemaclite_of_probe':
xilinx_emaclite.c:(.text+0x9fc): undefined reference to `devm_ioremap_resource'
ld: drivers/net/ethernet/xilinx/xilinx_axienet_main.o: in function `axienet_probe':
xilinx_axienet_main.c:(.text+0x942): undefined reference to `devm_ioremap_resource'
ld: drivers/net/ethernet/xilinx/ll_temac_main.o: in function `temac_probe':
ll_temac_main.c:(.text+0x1283): undefined reference to `devm_platform_ioremap_resource_byname'
ld: ll_temac_main.c:(.text+0x13ad): undefined reference to `devm_of_iomap'
ld: ll_temac_main.c:(.text+0x162e): undefined reference to `devm_platform_ioremap_resource'
Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Zhang Changzhong <zhangchangzhong@huawei.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: stable@vger.kernel.org
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Vladimir Oltean says:
====================
Flow control for NXP ENETC
This patch series contains logic for enabling the lossless mode on the
RX rings of the ENETC, and the PAUSE thresholds on the internal FIFO
memory.
During testing it was found that, with the default FIFO configuration,
a sender which isn't persuaded by our PAUSE frames and keeps sending
will cause some MAC RX frame errors. To mitigate this, we need to ensure
that the FIFO never runs completely full, so we need to fix up a setting
that was supposed to be configured well out of reset. Unfortunately this
requires the addition of a new mini-driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In the ENETC receive path, a frame received by the MAC is first stored
in a 256KB 'FIFO' memory, then transferred to DRAM when enqueuing it to
the RX ring. The FIFO is a shared resource for all ENETC ports, but
every port keeps track of its own memory utilization, on RX and on TX.
There is a setting for RX rings through which they can either operate in
'lossy' mode (where the lack of a free buffer causes an immediate
discard of the frame) or in 'lossless' mode (where the lack of a free
buffer in the ring makes the frame stay longer in the FIFO).
In turn, when the memory utilization of the FIFO exceeds a certain
margin, the MAC can be configured to emit PAUSE frames.
There is enough FIFO memory to buffer up to 3 MTU-sized frames per RX
port while not jeopardizing the other use cases (jumbo frames), and
also not consume bytes from the port TX allocations. Also, 3 MTU-sized
frames worth of memory is enough to ensure zero loss for 64 byte packets
at 1G line rate.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a node describing the address in the SoC memory space for the IERB.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The NXP ENETC is a 4-port Ethernet controller which 'smells' to
operating systems like 4 distinct PCIe PFs with SR-IOV, each PF having
its own driver instance, but in fact there are some hardware resources
which are shared between all ports, like for example the 256 KB SRAM
FIFO between the MACs and the Host Transfer Agent which DMAs frames to
DRAM.
To hide the stuff that cannot be neatly exposed per port, the hardware
designers came up with this idea of having a dedicated register block
which is supposed to be populated by the bootloader, and contains
everything configuration-related: MAC addresses, FIFO partitioning, etc.
When a port is reset using PCIe Function Level Reset, its defaults are
transferred from the IERB configuration. Most of the time, the settings
made through the IERB are read-only in the port's memory space (if they
are even visible), so they cannot be modified at runtime.
Linux doesn't have any advanced FIFO partitioning requirements at all,
but when reading through the hardware manual, it became clear that, even
though there are many good 'recommendations' for default values, many of
them were not actually put in practice on LS1028A. So we end up with a
default configuration that:
(a) does not have enough TX and RX byte credits to support the max MTU
of 9600 (which the Linux driver claims already) properly (at full speed)
(b) allows the FIFO to be overrun with RX traffic, potentially
overwriting internal data structures.
The last part sounds a bit catastrophic, but it isn't. Frames are
supposed to transit the FIFO for a very short time, but they can
actually accumulate there under 2 conditions:
(a) there is very severe congestion on DRAM memory, or
(b) the RX rings visible to the operating system were configured for
lossless operation, and they just ran out of free buffers to copy
the frame to. This is what is used to put backpressure onto the MAC
with flow control.
So since ENETC has not supported flow control thus far, RX FIFO overruns
were never seen with Linux. But with the addition of flow control, we
should configure some registers to prevent this from happening. What we
are trying to protect against are bad actors which continue to send us
traffic despite the fact that we have signaled a PAUSE condition. Of
course we can't be lossless in that case, but it is best to configure
the FIFO to do tail dropping rather than letting it overrun.
So in a nutshell, this driver is a fixup for all the IERB default values
that should have been but aren't.
The IERB configuration needs to be done _before_ the PFs are enabled.
So every PF searches for the presence of the "fsl,ls1028a-enetc-ierb"
node in the device tree, and if it finds it, it "registers" with the
IERB, which means that it requests the IERB to fix up its default
values. This is done through -EPROBE_DEFER. The IERB driver is part of
the fsl_enetc module, but is technically a platform driver, since the
IERB is a good old fashioned MMIO region, as opposed to ENETC ports
which pretend to be PCIe devices.
The driver was already configuring ENETC_PTXMBAR (FIFO allocation for
TX) because due to an omission, TXMBAR is a read/write register in the
PF memory space. But the manual is quite clear that the formula for this
should depend upon the TX byte credits (TXBCR). In turn, the TX byte
credits are only readable/writable through the IERB. So if we want to
ensure that the TXBCR register also has a value that is correct and in
line with TXMBAR, there is simply no way this can be done from the PF
driver, access to the IERB is needed.
I could have modified U-Boot to fix up the IERB values, but that is
quite undesirable, as old U-Boot versions are likely to be floating
around for quite some time from now.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Mention the required compatible string and base address for the
Integrated Endpoint Register Block node.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Even though ENETC interfaces are exposed as individual PCIe PFs with
their own driver instances, the ENETC is still fundamentally a
multi-port Ethernet controller, and some parts of the IP take a port
number (as can be seen in the PSFP implementation).
Create a common helper that can be used outside of the TSN code for
retrieving the ENETC port number based on the PF number. This is only
correct for LS1028A, the only Linux-capable instantiation of ENETC thus
far.
Note that ENETC port 3 is PF 6. The TSN code did not care about this
because ENETC port 3 does not support TSN, so the wrong mapping done by
enetc_get_port for PF 6 could have never been hit.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is a reproducible sequence from the userland that will trigger a WARN_ON()
condition in taprio_get_start_time, which causes kernel to panic if configured
as "panic_on_warn". Catch this condition in parse_taprio_schedule to
prevent this condition.
Reported as bug on syzkaller:
https://syzkaller.appspot.com/bug?extid=d50710fd0873a9c6b40c
Reported-by: syzbot+d50710fd0873a9c6b40c@syzkaller.appspotmail.com
Signed-off-by: Du Cheng <ducheng2@gmail.com>
Acked-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Florent Revest says:
====================
We have a usecase where we want to audit symbol names (if available) in
callback registration hooks. (ex: fentry/nf_register_net_hook)
A few months back, I proposed a bpf_kallsyms_lookup series but it was
decided in the reviews that a more generic helper, bpf_snprintf, would
be more useful.
This series implements the helper according to the feedback received in
https://lore.kernel.org/bpf/20201126165748.1748417-1-revest@google.com/T/#u
- A new arg type guarantees the NULL-termination of string arguments and
lets us pass format strings in only one arg
- A new helper is implemented using that guarantee. Because the format
string is known at verification time, the format string validation is
done by the verifier
- To implement a series of tests for bpf_snprintf, the logic for
marshalling variadic args in a fixed-size array is reworked as per:
https://lore.kernel.org/bpf/20210310015455.1095207-1-revest@chromium.org/T/#u
---
Changes in v5:
- Fixed the bpf_printf_buf_used counter logic in try_get_fmt_tmp_buf
- Added a couple of extra incorrect specifiers tests
- Call test_snprintf_single__destroy unconditionally
- Fixed a C++-style comment
---
Changes in v4:
- Moved bpf_snprintf, bpf_printf_prepare and bpf_printf_cleanup to
kernel/bpf/helpers.c so that they get built without CONFIG_BPF_EVENTS
- Added negative test cases (various invalid format strings)
- Renamed put_fmt_tmp_buf() as bpf_printf_cleanup()
- Fixed a mistake that caused temporary buffers to be unconditionally
freed in bpf_printf_prepare
- Fixed a mistake that caused missing 0 character to be ignored
- Fixed a warning about integer to pointer conversion
- Misc cleanups
---
Changes in v3:
- Simplified temporary buffer acquisition with try_get_fmt_tmp_buf()
- Made zero-termination check more consistent
- Allowed NULL output_buffer
- Simplified the BPF_CAST_FMT_ARG macro
- Three new test cases: number padding, simple string with no arg and
string length extraction only with a NULL output buffer
- Clarified helper's description for edge cases (eg: str_size == 0)
- Lots of cosmetic changes
---
Changes in v2:
- Extracted the format validation/argument sanitization in a generic way
for all printf-like helpers.
- bpf_snprintf's str_size can now be 0
- bpf_snprintf is now exposed to all BPF program types
- We now preempt_disable when using a per-cpu temporary buffer
- Addressed a few cosmetic changes
====================
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Fix the following out-of-bounds warning:
net/ethtool/ioctl.c:492:2: warning: 'memcpy' offset [49, 84] from the object at 'link_usettings' is out of the bounds of referenced subobject 'base' with type 'struct ethtool_link_settings' at offset 0 [-Warray-bounds]
The problem is that the original code is trying to copy data into a
some struct members adjacent to each other in a single call to
memcpy(). This causes a legitimate compiler warning because memcpy()
overruns the length of &link_usettings.base. Fix this by directly
using &link_usettings and _from_ as destination and source addresses,
instead.
This helps with the ongoing efforts to globally enable -Warray-bounds
and get us closer to being able to tighten the FORTIFY_SOURCE routines
on memcpy().
Link: https://github.com/KSPP/linux/issues/109
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The "positive" part tests all format specifiers when things go well.
The "negative" part makes sure that incorrect format strings fail at
load time.
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210419155243.1632274-7-revest@chromium.org
|
|
Similarly to BPF_SEQ_PRINTF, this macro turns variadic arguments into an
array of u64, making it more natural to call the bpf_snprintf helper.
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210419155243.1632274-6-revest@chromium.org
|
|
When initializing the __param array with a one liner, if all args are
const, the initial array value will be placed in the rodata section but
because libbpf does not support relocation in the rodata section, any
pointer in this array will stay NULL.
Fixes: c09add2fbc5a ("tools/libbpf: Add bpf_iter support")
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210419155243.1632274-5-revest@chromium.org
|
|
The implementation takes inspiration from the existing bpf_trace_printk
helper but there are a few differences:
To allow for a large number of format-specifiers, parameters are
provided in an array, like in bpf_seq_printf.
Because the output string takes two arguments and the array of
parameters also takes two arguments, the format string needs to fit in
one argument. Thankfully, ARG_PTR_TO_CONST_STR is guaranteed to point to
a zero-terminated read-only map so we don't need a format string length
arg.
Because the format-string is known at verification time, we also do
a first pass of format string validation in the verifier logic. This
makes debugging easier.
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210419155243.1632274-4-revest@chromium.org
|
|
This type provides the guarantee that an argument is going to be a const
pointer to somewhere in a read-only map value. It also checks that this
pointer is followed by a zero character before the end of the map value.
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210419155243.1632274-3-revest@chromium.org
|
|
Two helpers (trace_printk and seq_printf) have very similar
implementations of format string parsing and a third one is coming
(snprintf). To avoid code duplication and make the code easier to
maintain, this moves the operations associated with format string
parsing (validation and argument sanitization) into one generic
function.
The implementation of the two existing helpers already drifted quite a
bit so unifying them entailed a lot of changes:
- bpf_trace_printk always expected fmt[fmt_size] to be the terminating
NULL character, this is no longer true, the first 0 is terminating.
- bpf_trace_printk now supports %% (which produces the percentage char).
- bpf_trace_printk now skips width formating fields.
- bpf_trace_printk now supports the X modifier (capital hexadecimal).
- bpf_trace_printk now supports %pK, %px, %pB, %pi4, %pI4, %pi6 and %pI6
- argument casting on 32 bit has been simplified into one macro and
using an enum instead of obscure int increments.
- bpf_seq_printf now uses bpf_trace_copy_string instead of
strncpy_from_kernel_nofault and handles the %pks %pus specifiers.
- bpf_seq_printf now prints longs correctly on 32 bit architectures.
- both were changed to use a global per-cpu tmp buffer instead of one
stack buffer for trace_printk and 6 small buffers for seq_printf.
- to avoid per-cpu buffer usage conflict, these helpers disable
preemption while the per-cpu buffer is in use.
- both helpers now support the %ps and %pS specifiers to print symbols.
The implementation is also moved from bpf_trace.c to helpers.c because
the upcoming bpf_snprintf helper will be made available to all BPF
programs and will need it.
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210419155243.1632274-2-revest@chromium.org
|
|
Add a VF driver for Microsoft Azure Network Adapter (MANA) that will be
available in the future.
Co-developed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Co-developed-by: Shachar Raindel <shacharr@microsoft.com>
Signed-off-by: Shachar Raindel <shacharr@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Some prototypes are unnecessary, so delete it.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Ido Schimmel says:
====================
nexthop: Support large scale nexthop flushing
Patch #1 fixes a day-one bug in the nexthop code and allows "ip nexthop
flush" to work correctly with large number of nexthops that do not fit
in a single-part dump.
Patch #2 adds a test case.
Targeting at net-next since this use case never worked, the flow is
pretty obscure and such a large number of nexthops is unlikely to be
used in any real-world scenario.
Tested with fib_nexthops.sh:
Tests passed: 219
Tests failed: 0
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Test that all the nexthops are flushed when a multi-part nexthop dump is
required for the flushing.
Without previous patch:
# ./fib_nexthops.sh
TEST: Large scale nexthop flushing [FAIL]
With previous patch:
# ./fib_nexthops.sh
TEST: Large scale nexthop flushing [ OK ]
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, a multi-part nexthop dump is restarted based on the number of
nexthops that have been dumped so far. This can result in a lot of
nexthops not being dumped when nexthops are simultaneously deleted:
# ip nexthop | wc -l
65536
# ip nexthop flush
Dump was interrupted and may be inconsistent.
Flushed 36040 nexthops
# ip nexthop | wc -l
29496
Instead, restart the dump based on the nexthop identifier (fixed number)
of the last successfully dumped nexthop:
# ip nexthop | wc -l
65536
# ip nexthop flush
Dump was interrupted and may be inconsistent.
Flushed 65536 nexthops
# ip nexthop | wc -l
0
Reported-by: Maksym Yaremchuk <maksymy@nvidia.com>
Tested-by: Maksym Yaremchuk <maksymy@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit 04c53de57cb6435738961dace8b1b71d3ecd3c39.
Nathan Chancellor points out that it should not have been merged into
mainline by itself. It was a fix for "gcov: use kvmalloc()", which is
still in -mm/-next. Merging it alone has broken the build.
Link: https://github.com/ClangBuiltLinux/continuous-integration2/runs/2384465683?check_suite_focus=true
Reported-by: Nathan Chancellor <nathan@kernel.org>
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for v5.13
Second set of patches for v5.13. A lot of iwlwifi and mt76 patches
this time, and also smaller features and fixes all over.
mt76
* mt7915/mt7615 decapsulation offload support
* threaded NAPI support
* new device IDs
* mt7921 device reset support
* rx timestamp support
iwlwifi
* passive scan support for 6GHz
* new hardware support
wilc1000
* CRC support for SPI bus
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
VMCI feature is not supported in conjunction with the vSphere Fault
Tolerance (FT) feature.
VMware Tools can repeatedly try to create a vsock connection. If FT is
enabled the kernel logs is flooded with the following messages:
qp_alloc_hypercall result = -20
Could not attach to queue pair with -20
"qp_alloc_hypercall result = -20" was hidden by commit e8266c4c3307
("VMCI: Stop log spew when qp allocation isn't possible"), but "Could
not attach to queue pair with -20" is still there flooding the log.
Since the error message can be useful in some cases, print it only once.
Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
[BUG]
When running btrfs/071 with inode_need_compress() removed from
compress_file_range(), we got the following crash:
BUG: kernel NULL pointer dereference, address: 0000000000000018
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
Workqueue: btrfs-delalloc btrfs_work_helper [btrfs]
RIP: 0010:compress_file_range+0x476/0x7b0 [btrfs]
Call Trace:
? submit_compressed_extents+0x450/0x450 [btrfs]
async_cow_start+0x16/0x40 [btrfs]
btrfs_work_helper+0xf2/0x3e0 [btrfs]
process_one_work+0x278/0x5e0
worker_thread+0x55/0x400
? process_one_work+0x5e0/0x5e0
kthread+0x168/0x190
? kthread_create_worker_on_cpu+0x70/0x70
ret_from_fork+0x22/0x30
---[ end trace 65faf4eae941fa7d ]---
This is already after the patch "btrfs: inode: fix NULL pointer
dereference if inode doesn't need compression."
[CAUSE]
@pages is firstly created by kcalloc() in compress_file_extent():
pages = kcalloc(nr_pages, sizeof(struct page *), GFP_NOFS);
Then passed to btrfs_compress_pages() to be utilized there:
ret = btrfs_compress_pages(...
pages,
&nr_pages,
...);
btrfs_compress_pages() will initialize each page as output, in
zlib_compress_pages() we have:
pages[nr_pages] = out_page;
nr_pages++;
Normally this is completely fine, but there is a special case which
is in btrfs_compress_pages() itself:
switch (type) {
default:
return -E2BIG;
}
In this case, we didn't modify @pages nor @out_pages, leaving them
untouched, then when we cleanup pages, the we can hit NULL pointer
dereference again:
if (pages) {
for (i = 0; i < nr_pages; i++) {
WARN_ON(pages[i]->mapping);
put_page(pages[i]);
}
...
}
Since pages[i] are all initialized to zero, and btrfs_compress_pages()
doesn't change them at all, accessing pages[i]->mapping would lead to
NULL pointer dereference.
This is not possible for current kernel, as we check
inode_need_compress() before doing pages allocation.
But if we're going to remove that inode_need_compress() in
compress_file_extent(), then it's going to be a problem.
[FIX]
When btrfs_compress_pages() hits its default case, modify @out_pages to
0 to prevent such problem from happening.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=212331
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
In case AUX failures happen unexpectedly during a modeset, the driver
should still complete the modeset. In particular the driver should
perform the link training sequence steps even in case of an AUX failure,
as this sequence also includes port initialization steps. Not doing that
can leave the port/pipe in a broken state and lead for instance to a
flip done timeout.
Fix this by continuing with link training (in a no-LTTPR mode) if the
DPRX DPCD readout failed for some reason at the beginning of link
training. After a successful connector detection we already have the
DPCD read out and cached, so the failed repeated read for it should not
cause a problem. Note that a partial AUX read could in theory partly
overwrite the cached DPCD (and return error) but this overwrite should
not happen if the returned values are corrupted (due to a timeout or
some other IO error).
Kudos to Ville to root cause the problem.
Fixes: 7dffbdedb96a ("drm/i915: Disable LTTPR support when the DPCD rev < 1.4")
References: https://gitlab.freedesktop.org/drm/intel/-/issues/3308
Cc: stable@vger.kernel.org # 5.11
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210412232413.2755054-1-imre.deak@intel.com
(cherry picked from commit e42e7e585984b85b0fb9dd1fefc85ee4800ca629)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[adjusted Fixes: tag]
|
|
Alder Lake RAPL support is the same as previous Sky Lake.
Add Alder Lake model for RAPL.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/1618237865-33448-26-git-send-email-kan.liang@linux.intel.com
|
|
Compared with the Rocket Lake, the CORE C1 Residency Counter is added
for Alder Lake, but the CORE C3 Residency Counter is removed. Other
counters are the same.
Create a new adl_cstates for Alder Lake. Update the comments
accordingly.
The External Design Specification (EDS) is not published yet. It comes
from an authoritative internal source.
The patch has been tested on real hardware.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/1618237865-33448-25-git-send-email-kan.liang@linux.intel.com
|
|
PPERF and SMI_COUNT MSRs are also supported on Alder Lake.
The External Design Specification (EDS) is not published yet. It comes
from an authoritative internal source.
The patch has been tested on real hardware.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/1618237865-33448-24-git-send-email-kan.liang@linux.intel.com
|
|
The uncore subsystem for Alder Lake is similar to the previous Tiger
Lake.
The difference includes:
- New MSR addresses for global control, fixed counters, CBOX and ARB.
Add a new adl_uncore_msr_ops for uncore operations.
- Add a new threshold field for CBOX.
- New PCIIDs for IMC devices.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/1618237865-33448-23-git-send-email-kan.liang@linux.intel.com
|
|
Current Hardware events and Hardware cache events have special perf
types, PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE. The two types don't
pass the PMU type in the user interface. For a hybrid system, the perf
subsystem doesn't know which PMU the events belong to. The first capable
PMU will always be assigned to the events. The events never get a chance
to run on the other capable PMUs.
Extend the two types to become PMU aware types. The PMU type ID is
stored at attr.config[63:32].
Add a new PMU capability, PERF_PMU_CAP_EXTENDED_HW_TYPE, to indicate a
PMU which supports the extended PERF_TYPE_HARDWARE and
PERF_TYPE_HW_CACHE.
The PMU type is only required when searching a specific PMU. The PMU
specific codes will only be interested in the 'real' config value, which
is stored in the low 32 bit of the event->attr.config. Update the
event->attr.config in the generic code, so the PMU specific codes don't
need to calculate it separately.
If a user specifies a PMU type, but the PMU doesn't support the extended
type, error out.
If an event cannot be initialized in a PMU specified by a user, error
out immediately. Perf should not try to open it on other PMUs.
The new PMU capability is only set for the X86 hybrid PMUs for now.
Other architectures, e.g., ARM, may need it as well. The support on ARM
may be implemented later separately.
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1618237865-33448-22-git-send-email-kan.liang@linux.intel.com
|
|
Alder Lake Hybrid system has two different types of core, Golden Cove
core and Gracemont core. The Golden Cove core is registered to
"cpu_core" PMU. The Gracemont core is registered to "cpu_atom" PMU.
The difference between the two PMUs include:
- Number of GP and fixed counters
- Events
- The "cpu_core" PMU supports Topdown metrics.
The "cpu_atom" PMU supports PEBS-via-PT.
The "cpu_core" PMU is similar to the Sapphire Rapids PMU, but without
PMEM.
The "cpu_atom" PMU is similar to Tremont, but with different events,
event_constraints, extra_regs and number of counters.
The mem-loads AUX event workaround only applies to the Golden Cove core.
Users may disable all CPUs of the same CPU type on the command line or
in the BIOS. For this case, perf still register a PMU for the CPU type
but the CPU mask is 0.
Current caps/pmu_name is usually the microarch codename. Assign the
"alderlake_hybrid" to the caps/pmu_name of both PMUs to indicate the
hybrid Alder Lake microarchitecture.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/1618237865-33448-21-git-send-email-kan.liang@linux.intel.com
|
|
Implement filter_match callback for X86, which check whether an event is
schedulable on the current CPU.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/1618237865-33448-20-git-send-email-kan.liang@linux.intel.com
|
|
The attribute_group for Hybrid PMUs should be different from the
previous
cpu PMU. For example, cpumask is required for a Hybrid PMU. The PMU type
should be included in the event and format attribute.
Add hybrid_attr_update for the Hybrid PMU.
Check the PMU type in is_visible() function. Only display the event or
format for the matched Hybrid PMU.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/1618237865-33448-19-git-send-email-kan.liang@linux.intel.com
|
|
Hybrid PMUs have different events and formats. In theory, Hybrid PMU
specific attributes should be maintained in the dedicated struct
x86_hybrid_pmu, but it wastes space because the events and formats are
similar among Hybrid PMUs.
To reduce duplication, all hybrid PMUs will share a group of attributes
in the following patch. To distinguish an attribute from different
Hybrid PMUs, a PMU aware attribute structure is introduced. A PMU type
is required for the attribute structure. The type is internal usage. It
is not visible in the sysfs API.
Hybrid PMUs may support the same event name, but with different event
encoding, e.g., the mem-loads event on an Atom PMU has different event
encoding from a Core PMU. It brings issue if two attributes are
created for them. Current sysfs_update_group finds an attribute by
searching the attr name (aka event name). If two attributes have the
same event name, the first attribute will be replaced.
To address the issue, only one attribute is created for the event. The
event_str is extended and stores event encodings from all Hybrid PMUs.
Each event encoding is divided by ";". The order of the event encodings
must follow the order of the hybrid PMU index. The event_str is internal
usage as well. When a user wants to show the attribute of a Hybrid PMU,
only the corresponding part of the string is displayed.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/1618237865-33448-18-git-send-email-kan.liang@linux.intel.com
|
|
Different hybrid PMUs have different PMU capabilities and events. Perf
should registers a dedicated PMU for each of them.
To check the X86 event, perf has to go through all possible hybrid pmus.
All the hybrid PMUs are registered at boot time. Before the
registration, add intel_pmu_check_hybrid_pmus() to check and update the
counters information, the event constraints, the extra registers and the
unique capabilities for each hybrid PMUs.
Postpone the display of the PMU information and HW check to
CPU_STARTING, because the boot CPU is the only online CPU in the
init_hw_perf_events(). Perf doesn't know the availability of the other
PMUs. Perf should display the PMU information only if the counters of
the PMU are available.
One type of CPUs may be all offline. For this case, users can still
observe the PMU in /sys/devices, but its CPU mask is 0.
All hybrid PMUs have capability PERF_PMU_CAP_HETEROGENEOUS_CPUS.
The PMU name for hybrid PMUs will be "cpu_XXX", which will be assigned
later in a separated patch.
The PMU type id for the core PMU is still PERF_TYPE_RAW. For the other
hybrid PMUs, the PMU type id is not hard code.
The event->cpu must be compatitable with the supported CPUs of the PMU.
Add a check in the x86_pmu_event_init().
The events in a group must be from the same type of hybrid PMU.
The fake cpuc used in the validation must be from the supported CPU of
the event->pmu.
Perf may not retrieve a valid core type from get_this_hybrid_cpu_type().
For example, ADL may have an alternative configuration. With that
configuration, Perf cannot retrieve the core type from the CPUID leaf
0x1a. Add a platform specific get_hybrid_cpu_type(). If the generic way
fails, invoke the platform specific get_hybrid_cpu_type().
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1618237865-33448-17-git-send-email-kan.liang@linux.intel.com
|
|
The PMU capabilities are different among hybrid PMUs. Perf should dump
the PMU capabilities information for each hybrid PMU.
Factor out x86_pmu_show_pmu_cap() which shows the PMU capabilities
information. The function will be reused later when registering a
dedicated hybrid PMU.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/1618237865-33448-16-git-send-email-kan.liang@linux.intel.com
|
|
The temporary pmu assignment in event_init is unnecessary.
The assignment was introduced by commit 8113070d6639 ("perf_events:
Add fast-path to the rescheduling code"). At that time, event->pmu is
not assigned yet when initializing an event. The assignment is required.
However, from commit 7e5b2a01d2ca ("perf: provide PMU when initing
events"), the event->pmu is provided before event_init is invoked.
The temporary pmu assignment in event_init should be removed.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/1618237865-33448-15-git-send-email-kan.liang@linux.intel.com
|
|
Each Hybrid PMU has to check and update its own extra registers before
registration.
The intel_pmu_check_extra_regs will be reused later to check the extra
registers of each hybrid PMU.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/1618237865-33448-14-git-send-email-kan.liang@linux.intel.com
|
|
Each Hybrid PMU has to check and update its own event constraints before
registration.
The intel_pmu_check_event_constraints will be reused later to check
the event constraints of each hybrid PMU.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/1618237865-33448-13-git-send-email-kan.liang@linux.intel.com
|
|
Each Hybrid PMU has to check its own number of counters and mask fixed
counters before registration.
The intel_pmu_check_num_counters will be reused later to check the
number of the counters for each hybrid PMU.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/1618237865-33448-12-git-send-email-kan.liang@linux.intel.com
|
|
Different hybrid PMU may have different extra registers, e.g. Core PMU
may have offcore registers, frontend register and ldlat register. Atom
core may only have offcore registers and ldlat register. Each hybrid PMU
should use its own extra_regs.
An Intel Hybrid system should always have extra registers.
Unconditionally allocate shared_regs for Intel Hybrid system.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/1618237865-33448-11-git-send-email-kan.liang@linux.intel.com
|
|
The events are different among hybrid PMUs. Each hybrid PMU should use
its own event constraints.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/1618237865-33448-10-git-send-email-kan.liang@linux.intel.com
|
|
The hardware cache events are different among hybrid PMUs. Each hybrid
PMU should have its own hw cache event table.
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1618237865-33448-9-git-send-email-kan.liang@linux.intel.com
|
|
The unconstrained value depends on the number of GP and fixed counters.
Each hybrid PMU should use its own unconstrained.
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1618237865-33448-8-git-send-email-kan.liang@linux.intel.com
|
|
The number of GP and fixed counters are different among hybrid PMUs.
Each hybrid PMU should use its own counter related information.
When handling a certain hybrid PMU, apply the number of counters from
the corresponding hybrid PMU.
When reserving the counters in the initialization of a new event,
reserve all possible counters.
The number of counter recored in the global x86_pmu is for the
architecture counters which are available for all hybrid PMUs. KVM
doesn't support the hybrid PMU yet. Return the number of the
architecture counters for now.
For the functions only available for the old platforms, e.g.,
intel_pmu_drain_pebs_nhm(), nothing is changed.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/1618237865-33448-7-git-send-email-kan.liang@linux.intel.com
|
|
The intel_ctrl is the counter mask of a PMU. The PMU counter information
may be different among hybrid PMUs, each hybrid PMU should use its own
intel_ctrl to check and access the counters.
When handling a certain hybrid PMU, apply the intel_ctrl from the
corresponding hybrid PMU.
When checking the HW existence, apply the PMU and number of counters
from the corresponding hybrid PMU as well. Perf will check the HW
existence for each Hybrid PMU before registration. Expose the
check_hw_exists() for a later patch.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/1618237865-33448-6-git-send-email-kan.liang@linux.intel.com
|
|
Some platforms, e.g. Alder Lake, have hybrid architecture. Although most
PMU capabilities are the same, there are still some unique PMU
capabilities for different hybrid PMUs. Perf should register a dedicated
pmu for each hybrid PMU.
Add a new struct x86_hybrid_pmu, which saves the dedicated pmu and
capabilities for each hybrid PMU.
The architecture MSR, MSR_IA32_PERF_CAPABILITIES, only indicates the
architecture features which are available on all hybrid PMUs. The
architecture features are stored in the global x86_pmu.intel_cap.
For Alder Lake, the model-specific features are perf metrics and
PEBS-via-PT. The corresponding bits of the global x86_pmu.intel_cap
should be 0 for these two features. Perf should not use the global
intel_cap to check the features on a hybrid system.
Add a dedicated intel_cap in the x86_hybrid_pmu to store the
model-specific capabilities. Use the dedicated intel_cap to replace
the global intel_cap for thse two features. The dedicated intel_cap
will be set in the following "Add Alder Lake Hybrid support" patch.
Add is_hybrid() to distinguish a hybrid system. ADL may have an
alternative configuration. With that configuration, the
X86_FEATURE_HYBRID_CPU is not set. Perf cannot rely on the feature bit.
Add a new static_key_false, perf_is_hybrid, to indicate a hybrid system.
It will be assigned in the following "Add Alder Lake Hybrid support"
patch as well.
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1618237865-33448-5-git-send-email-kan.liang@linux.intel.com
|
|
Some platforms, e.g. Alder Lake, have hybrid architecture. In the same
package, there may be more than one type of CPU. The PMU capabilities
are different among different types of CPU. Perf will register a
dedicated PMU for each type of CPU.
Add a 'pmu' variable in the struct cpu_hw_events to track the dedicated
PMU of the current CPU.
Current x86_get_pmu() use the global 'pmu', which will be broken on a
hybrid platform. Modify it to apply the 'pmu' of the specific CPU.
Initialize the per-CPU 'pmu' variable with the global 'pmu'. There is
nothing changed for the non-hybrid platforms.
The is_x86_event() will be updated in the later patch ("perf/x86:
Register hybrid PMUs") for hybrid platforms. For the non-hybrid
platforms, nothing is changed here.
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1618237865-33448-4-git-send-email-kan.liang@linux.intel.com
|
|
On processors with Intel Hybrid Technology (i.e., one having more than
one type of CPU in the same package), all CPUs support the same
instruction set and enumerate the same features on CPUID. Thus, all
software can run on any CPU without restrictions. However, there may be
model-specific differences among types of CPUs. For instance, each type
of CPU may support a different number of performance counters. Also,
machine check error banks may be wired differently. Even though most
software will not care about these differences, kernel subsystems
dealing with these differences must know.
Add and expose a new helper function get_this_hybrid_cpu_type() to query
the type of the current hybrid CPU. The function will be used later in
the perf subsystem.
The Intel Software Developer's Manual defines the CPU type as 8-bit
identifier.
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Acked-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/1618237865-33448-3-git-send-email-kan.liang@linux.intel.com
|