|
[Why]
drm_atomic_normalize_zpos() can return an error code when there's
modeset lock contention. This was being ignored.
[How]
Bail out of atomic check if normalize_zpos() returns an error.
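The bail-out pattern described above can be sketched in plain C. Both `normalize_zpos_stub()` and `atomic_check_sketch()` are hypothetical stand-ins, not the amdgpu or DRM code, and the choice of -EDEADLK as the contention error is an assumption made for illustration:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for drm_atomic_normalize_zpos(): reports
 * -EDEADLK when lock contention is simulated, 0 otherwise. */
static int normalize_zpos_stub(int contended)
{
	return contended ? -EDEADLK : 0;
}

/* Hypothetical atomic check: bail out as soon as normalization fails
 * instead of ignoring the error code. */
static int atomic_check_sketch(int contended)
{
	int ret;

	ret = normalize_zpos_stub(contended);
	if (ret)
		return ret;

	/* ... the remaining validation would run here ... */
	return 0;
}
```

Returning the error unchanged lets the caller decide how to handle it, for example by backing off and retrying.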
Fixes: b261509952bc ("drm/amd/display: Fix double cursor on non-video RGB MPO")
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Reviewed-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
When building on OpenBSD/arm64 with clang 15, unaligned access
warnings are seen when a union is embedded inside a packed struct.
drm/amd/pm/powerplay/hwmgr/vega20_pptable.h:136:17: error: field
smcPPTable within 'struct _ATOM_VEGA20_POWERPLAYTABLE' is less aligned
than 'PPTable_t' and is usually due to
'struct _ATOM_VEGA20_POWERPLAYTABLE' being packed, which can lead to
unaligned accesses [-Werror,-Wunaligned-access]
PPTable_t smcPPTable;
^
Make PPTable_t packed to avoid this.
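A minimal sketch of the idea, assuming GCC or Clang attribute syntax. The struct names are hypothetical and much smaller than the real powerplay tables. Packing the embedded type lowers its alignment requirement to 1, so it is no longer more strictly aligned than the packed container:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical embedded table, packed so its alignment becomes 1. */
struct table_packed_sketch {
	uint32_t version;
	uint16_t flags;
} __attribute__((packed));

/* Hypothetical packed container: the table lands at offset 1. */
struct outer_sketch {
	uint8_t revision;
	struct table_packed_sketch tbl;
} __attribute__((packed));

/* Offset of the embedded table inside the container. */
static size_t tbl_offset_sketch(void)
{
	return offsetof(struct outer_sketch, tbl);
}
```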
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
When building on OpenBSD/arm64 with clang 15, unaligned access
warnings are seen when a union is embedded inside a packed struct.
drm/amd/display/dmub/inc/dmub_cmd.h:941:18: error: field
cursor_copy_src within 'struct dmub_rb_cmd_mall' is less aligned than
'union dmub_addr' and is usually due to 'struct dmub_rb_cmd_mall'
being packed, which can lead to unaligned accesses
[-Werror,-Wunaligned-access]
union dmub_addr cursor_copy_src; /**< Cursor copy address */
^
drm/amd/display/dmub/inc/dmub_cmd.h:942:18: error: field cursor_copy_dst
within 'struct dmub_rb_cmd_mall' is less aligned than
'union dmub_addr' and is usually due to 'struct dmub_rb_cmd_mall'
being packed, which can lead to unaligned accesses
[-Werror,-Wunaligned-access]
union dmub_addr cursor_copy_dst; /**< Cursor copy destination */
^
Add pragma pack around dmub_addr to avoid this.
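The pragma-based variant can be sketched as follows, assuming a compiler that supports `#pragma pack` (GCC and Clang do). `addr_sketch` and `cmd_sketch` are hypothetical names, not the DMUB definitions:

```c
#include <assert.h>
#include <stdint.h>

/* Everything declared between push and pop gets alignment 1. */
#pragma pack(push, 1)
union addr_sketch {
	struct {
		uint32_t low_part;
		uint32_t high_part;
	} u;
	uint64_t quad_part;
};
#pragma pack(pop)

/* Hypothetical packed command embedding the union twice. */
struct cmd_sketch {
	uint8_t opcode;
	union addr_sketch src;
	union addr_sketch dst;
} __attribute__((packed));
```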
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Remove duplicate or repeating expressions in the if condition
evaluation. Issue identified using doubletest.cocci Coccinelle semantic
patch.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Deepak R Varma <drv@mailo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2023-02-10
1) From Roi and Mark: MultiPort eswitch support
MultiPort E-Switch builds on newer hardware's capabilities and introduces
a mode where a single E-Switch is used and all the vports and physical
ports on the NIC are connected to it.
The new mode will allow in the future a decrease in the memory used by the
driver and advanced features that aren't possible today.
This represents a big change in the current E-Switch implementation in mlx5.
Currently, by default, each E-Switch manager manages its E-Switch.
Steering rules in each E-Switch can only forward traffic to the native
physical port associated with that E-Switch. There are ways to target
non-native physical ports, for example using a bond or via special TC
rules, but none of them allows a user to configure the driver
to operate in such a mode by default, nor can the driver decide
to move to this mode by default, as it is user configuration-driven right now.
While MultiPort E-Switch single FDB mode is the preferred mode, older
generations of ConnectX hardware couldn't support this mode so it was never
implemented. Now that there is capable hardware present, start the
transition to having this mode by default.
Introduce a devlink parameter to control MultiPort Eswitch single FDB mode.
This will allow users to select this mode on their system right now
and in the future will allow the driver to move to this mode by default.
2) From Jiri: Improvements and fixes for mlx5 netdev's devlink logic
2.1) Cleanups related to mlx5's devlink port logic
2.2) Move devlink port registration to be done before netdev alloc
2.3) Create auxdev devlink instance in the same ns as parent devlink
2.4) Suspend auxiliary devices only in case of PCI device suspend
* tag 'mlx5-updates-2023-02-10' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5: Suspend auxiliary devices only in case of PCI device suspend
net/mlx5: Remove "recovery" arg from mlx5_load_one() function
net/mlx5e: Create auxdev devlink instance in the same ns as parent devlink
net/mlx5e: Move devlink port registration to be done before netdev alloc
net/mlx5e: Move dl_port to struct mlx5e_dev
net/mlx5e: Replace usage of mlx5e_devlink_get_dl_port() by netdev->devlink_port
net/mlx5e: Pass mdev to mlx5e_devlink_port_register()
net/mlx5: Remove outdated comment
net/mlx5e: TC, Remove redundant parse_attr argument
net/mlx5e: Use a simpler comparison for uplink rep
net/mlx5: Lag, Add single RDMA device in multiport mode
net/mlx5: Lag, set different uplink vport metadata in multiport eswitch mode
net/mlx5: E-Switch, rename bond update function to be reused
net/mlx5e: TC, Add peer flow in mpesw mode
net/mlx5: Lag, Control MultiPort E-Switch single FDB mode
====================
Link: https://lore.kernel.org/r/20230214221239.159033-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove duplicate or repeating expressions in the if condition
evaluation. Issue identified using doubletest.cocci Coccinelle semantic
patch.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Deepak R Varma <drv@mailo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Move the variable declarations inside the ifdef guard, as they are only used
inside that guard. This removes some of the
-Wunused-but-set-variable warnings.
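A small sketch of the resulting layout. `CONFIG_SKETCH_FP` and `compute_sketch()` are hypothetical names chosen for illustration:

```c
#include <assert.h>

/* CONFIG_SKETCH_FP is a hypothetical build option. */
static int compute_sketch(int x)
{
	int result = x;
#ifdef CONFIG_SKETCH_FP
	/* Declared inside the guard because only this branch uses it. */
	int scale = 2;

	result = x * scale;
#endif
	return result;
}
```

When the option is not defined, `scale` does not exist at all, so the compiler has nothing to warn about.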
Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Remove arguments present on kernel-doc that are not present on the
function declaration and add the new ones if present.
Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add includes that were previously missing to reduce the number of
-Wmissing-prototypes warnings.
Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add function prototypes to headers to reduce the number of
-Wmissing-prototypes warnings.
Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add function prototypes to headers to reduce the number of
-Wmissing-prototypes warnings.
Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Turn global functions that are only used locally into static ones. This
reduces the number of -Wmissing-prototypes warnings.
Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We don't use this function anywhere, therefore, remove it.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The function resource_validate_ctx_update_pointer_after_copy() is
declared in resource.h but never defined, therefore, remove its
declaration from headers.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
In mod_color_calculate_{degamma/regamma}_params(), a tf variable is
initialized as TRANSFER_FUNCTION_SRGB but tf is only used after tf =
input->tf, therefore, better to just remove this initial value and avoid
misleading interpretations.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Rename mapUserRamp to map_user_ramp and doClamping to do_clamping
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Only Navi1x requires dummy read workaround. Allocate the table in VRAM
only for Navi1x.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This reverts commit d47d2f9392f69f069c31d60ac3088471b1e1c7d4.
A regression was detected with this change. Revert until a
fix is available.
Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This version brings along the following:
- Move domain power control to DMCUB for DCN314
- Enable P-state validation check for DCN314
- Add support for multiple overlay planes
- Fixes in prefetch, k1 k2 divider programming and more
- Code cleanup
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This reverts a part of the
commit 826e7ffaf079c72607bf3199d4e19730eaf8ca00
("drm/amd/display: [FW Promotion] Release 0.0.153.0")
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Ayush Gupta <ayugupta@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-02-14 (ixgbe, i40e)
This series contains updates to ixgbe and i40e drivers.
Jason Xing corrects comparison of frame sizes for setting MTU with XDP on
ixgbe and adjusts frame size to account for a second VLAN header on ixgbe
and i40e.
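The frame-size arithmetic can be sketched as below. The header lengths are the standard Ethernet values, while the helper names and the 3072-byte buffer limit used in the usage note are assumptions made for illustration, not values taken from the drivers:

```c
#include <assert.h>

#define ETH_HLEN_SKETCH     14	/* Ethernet header */
#define ETH_FCS_LEN_SKETCH   4	/* frame check sequence */
#define VLAN_HLEN_SKETCH     4	/* one VLAN tag */

/* Largest on-wire frame for a given MTU, allowing two VLAN headers. */
static int frame_size_for_mtu(int mtu)
{
	return mtu + ETH_HLEN_SKETCH + ETH_FCS_LEN_SKETCH +
	       2 * VLAN_HLEN_SKETCH;
}

/* Hypothetical check: does the frame fit into a buffer of max_frame bytes? */
static int mtu_fits(int mtu, int max_frame)
{
	return frame_size_for_mtu(mtu) <= max_frame;
}
```

With these numbers an MTU of 1500 needs a 1526-byte frame, and an MTU of 3000 still fits into a 3072-byte buffer while 3050 does not.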
* '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ixgbe: add double of VLAN header when computing the max MTU
i40e: add double of VLAN header when computing the max MTU
ixgbe: allow to increase MTU to 3K with XDP enabled
====================
Link: https://lore.kernel.org/r/20230214185146.1305819-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
[WHY & HOW]
- make link_dp_dpia_bw.c available for linux.
- add the verify link peak bw
- clean up code and comment format.
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Mustapha Ghaddar <mghaddar@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
The DMCUB implementation required to work around corruption is
not currently stable and may cause intermittent corruption or hangs.
[How]
Disable PG until the sequence is stable.
Reviewed-by: Hansen Dsouza <hansen.dsouza@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This function has many conditions and all code style issues (indentation,
missing braces, etc.) make reading it really annoying.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
A warning was raised about freeing memory during suspend.
Move the self test out of suspend.
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2151825
Cc: jfalempe@redhat.com
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-and-tested-by: Evan Quan <evan.quan@amd.com>
Tested-by: Jocelyn Falempe <jfalempe@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
'devlink-cleanups-and-move-devlink-health-functionality-to-separate-file'
Moshe Shemesh says:
====================
devlink: cleanups and move devlink health functionality to separate file
This patchset moves devlink health callbacks, helpers and related code
from leftover.c to new file health.c. About 1.3K LoC are moved by this
patchset, covering all devlink health functionality.
In addition this patchset includes a couple of small cleanups in devlink
health code and documentation update.
====================
Link: https://lore.kernel.org/r/1676392686-405892-1-git-send-email-moshe@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix a bug in trace point definition for devlink health report, as
TP_STRUCT_entry of reporter_name should get reporter_name and not msg.
Note no fixes tag as this is a harmless bug as both reporter_name and
msg are strings and TP_fast_assign for this entry is correct.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Update devlink-health.rst file:
- Add devlink formatted message (fmsg) API documentation.
- Add auto-dump as a condition to do dump once error reported.
- Expand OOB to clarify this acronym.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Now that all devlink health callbacks and related code are in file
health.c move common health functions and devlink_health_reporter struct
to be local in health.c file.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move devlink health report test callback from leftover.c to health.c. No
functional change in this patch.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move devlink health report dump callbacks and related code from
leftover.c to health.c. No functional change in this patch.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Devlink fmsg (formatted message) is used by devlink health diagnose,
dump and drivers which support these devlink health callbacks.
Therefore, move devlink fmsg helpers and related code to file health.c.
Move devlink health diagnose to file health.c. No functional change in
this patch.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move devlink health report helper and recover callback and related code
from leftover.c to health.c. No functional change in this patch.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move devlink health get and set callbacks and related code from
leftover.c to health.c. No functional change in this patch.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
devlink_nl_health_reporter_fill() error flow calls nla_nest_end(). Fix
it to call nla_nest_cancel() instead.
Note the bug is harmless as genlmsg_cancel() cancels the entire message,
so no fixes tag added.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move devlink health reporter create/destroy and related dev code to new
file health.c. This file shall include all callbacks and functionality
that are related to devlink health.
In addition, fix kdoc indentation and make reporter create/destroy kdoc
more clear. No functional change in this patch.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The ice driver now supports XDP multi-buffer, so add it to xdp_features.
Check vsi type before setting xdp_features flag.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/8a4781511ab6e3cd280e944eef69158954f1a15f.1676385351.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Set xdp_features flag just for I40E_VSI_MAIN vsi type since XDP is
supported just in this configuration.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/f2b537f86b34fc176fbc6b3d249b46a20a87a2f3.1676405131.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
&xdp_buff and &xdp_frame are bound in a way that
xdp_buff->data_hard_start == xdp_frame
It's always the case and e.g. xdp_convert_buff_to_frame() relies on
this.
IOW, the following:
for (u32 i = 0; i < 0xdead; i++) {
xdpf = xdp_convert_buff_to_frame(&xdp);
xdp_convert_frame_to_buff(xdpf, &xdp);
}
shouldn't ever modify @xdpf's contents or the pointer itself.
However, "live packet" code wrongly treats &xdp_frame as part of its
context placed *before* the data_hard_start. With such flow,
data_hard_start is sizeof(*xdpf) off to the right and no longer points
to the XDP frame.
Instead of replacing `sizeof(ctx)` with `offsetof(ctx, xdpf)` in several
places and praying that there are no more miscalcs left somewhere in the
code, unionize ::frm with ::data in a flex array, so that both start
pointing to the actual data_hard_start and the XDP frame actually starts
being a part of it, i.e. a part of the headroom, not the context.
A nice side effect is that the maximum frame size for this mode gets
increased by 40 bytes, as xdp_buff::frame_sz includes everything from
data_hard_start (-> includes xdpf already) to the end of XDP/skb shared
info.
Also update %MAX_PKT_SIZE accordingly in the selftests code. Leave it
hardcoded for 64 bit && 4k pages, it can be made more flexible later on.
Minor: align `&head->data` with how `head->frm` is assigned for
consistency.
Minor #2: rename 'frm' to 'frame' in &xdp_page_head while at it for
clarity.
(was found while testing XDP traffic generator on ice, which calls
xdp_convert_frame_to_buff() for each XDP frame)
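The layout idea can be sketched in standalone C11. All types here are hypothetical simplifications, and a fixed-size buffer replaces the flex array for portability. The point is only that the frame and the data area share one start address, so data_hard_start points at the frame:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced versions of the XDP structures. */
struct buff_sketch {
	void *data_hard_start;
	void *data;
};

struct frame_sketch {
	void *data;
	unsigned int len;
	unsigned int headroom;
};

struct page_head_sketch {
	struct buff_sketch ctx;
	union {
		struct frame_sketch frame;	/* lives in the headroom */
		unsigned char data[256];	/* data_hard_start points here */
	};
};

/* Point the context at the shared start of frame and data. */
static void head_init(struct page_head_sketch *head)
{
	head->ctx.data_hard_start = head->data;
	head->ctx.data = head->data + sizeof(head->frame);
}
```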
Fixes: b530e9e1063e ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN")
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20230215185440.4126672-1-aleksander.lobakin@intel.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
The max string length for a histogram variable is 256 bytes. The max depth
of a stacktrace is 16. With 8-byte words, that's 16 * 8 = 128 bytes, which can
easily fit in the string variable. The histogram stacktrace is being
stored in the string value (with the given max length), with the
assumption it will fit. To make sure that this is always the case (in the
case that the stack trace depth increases), add a BUILD_BUG_ON() to test
this.
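A userspace analogue of such a build-time check, using C11 `static_assert` in place of the kernel's BUILD_BUG_ON(). The macro names and the exact condition are assumptions for illustration and may differ from what the patch tests:

```c
#include <assert.h>
#include <stddef.h>

#define STR_VAR_LEN_MAX_SKETCH    256	/* bytes in a string variable */
#define STACKTRACE_DEPTH_SKETCH    16	/* max stack trace entries */

/* Compilation fails if a full-depth stack trace no longer fits. */
static_assert(STACKTRACE_DEPTH_SKETCH * sizeof(unsigned long) <=
	      STR_VAR_LEN_MAX_SKETCH,
	      "stack trace does not fit in the string variable");

/* Bytes needed to store a full-depth stack trace. */
static size_t stack_bytes_sketch(void)
{
	return STACKTRACE_DEPTH_SKETCH * sizeof(unsigned long);
}
```

Raising the depth to a value that overflows the buffer would turn a silent runtime overrun into a compile error.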
Link: https://lore.kernel.org/linux-trace-kernel/20230214002418.0103b9e765d3e5c374d2aa7d@kernel.org/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Because stacktraces are saved in dynamic strings,
trace_event_raw_event_synth() uses strlen to determine the length of
the stack. Stacktraces may contain 0-bytes, though, in the saved
addresses, so the length found and passed to reserve() will be too
small.
Fix this by using the first unsigned long in the stack variables to
store the actual number of elements in the stack and have
trace_event_raw_event_synth() use that to determine the length of the
stack.
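The length-prefix scheme can be sketched as follows. The helper names are hypothetical, and counting the length word itself in the reserved size is an assumption of this sketch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEPTH_SKETCH 16

/* Slot 0 holds the number of entries; the addresses follow it. */
static void stack_store(unsigned long *var, const unsigned long *addrs,
			size_t n)
{
	var[0] = n;
	memcpy(&var[1], addrs, n * sizeof(*addrs));
}

/* Bytes to reserve: the count word plus n addresses. Unlike strlen(),
 * this is not fooled by zero bytes inside the addresses. */
static size_t stack_len_bytes(const unsigned long *var)
{
	return (var[0] + 1) * sizeof(unsigned long);
}
```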
Link: https://lkml.kernel.org/r/1ed6906cd9d6477ef2bd8e63c61de20a9ffe64d7.1676063532.git.zanussi@kernel.org
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Anton Protopopov says:
====================
Add a new benchmark for hashmap lookups and fix several typos.
In commit 3 I've patched the bench utility so that now command line options
can be reused by different benchmarks.
The benchmark itself is added in the last commit 7. I was using this benchmark
to test map lookup productivity when using a different hash function [1]. When
run with --quiet, the results can be easily plotted [2]. The results provided
by the benchmark look reasonable and match the results of my different
benchmarks (requiring to patch kernel to get actual statistics on map lookups).
Links:
[1] https://fosdem.org/2023/schedule/event/bpf_hashing/
[2] https://github.com/aspsk/bpf-bench/tree/master/hashmap-bench
Changes,
v1->v2:
- percpu_times_index[] is of wrong size (Martin)
- use base 0 for strtol (Andrii)
- just use -q without argument (Andrii)
- use less hacks when parsing arguments (Andrii)
====================
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
|
|
Add a new benchmark which measures hashmap lookup operations speed. A user can
control the following parameters of the benchmark:
* key_size (max 1024): the key size to use
* max_entries: the hashmap max entries
* nr_entries: the number of entries to insert/lookup
* nr_loops: the number of loops for the benchmark
* map_flags: the hashmap flags passed to BPF_MAP_CREATE
The BPF program performing the benchmarks calls two nested bpf_loop:
bpf_loop(nr_loops/nr_entries)
bpf_loop(nr_entries)
bpf_map_lookup()
So the nr_loops determines the number of actual map lookups. All lookups are
successful.
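The nesting can be mirrored in plain C to see how many lookups result. This sketch only counts iterations and assumes the outer count is computed with truncating integer division, which the message does not state explicitly:

```c
#include <assert.h>

/* Count the lookups performed by the two nested loops. */
static unsigned long total_lookups(unsigned long nr_loops,
				   unsigned long nr_entries)
{
	unsigned long outer = nr_loops / nr_entries;
	unsigned long count = 0;

	for (unsigned long i = 0; i < outer; i++)
		for (unsigned long j = 0; j < nr_entries; j++)
			count++;	/* one map lookup per inner iteration */
	return count;
}
```

Under that assumption the total is nr_loops rounded down to a multiple of nr_entries, so it equals nr_loops exactly only when nr_entries divides it.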
Example (the output is generated on an AMD Ryzen 9 3950X machine):
for nr_entries in `seq 4096 4096 65536`; do echo -n "$((nr_entries*100/65536))% full: "; sudo ./bench -d2 -a bpf-hashmap-lookup --key_size=4 --nr_entries=$nr_entries --max_entries=65536 --nr_loops=1000000 --map_flags=0x40 | grep cpu; done
6% full: cpu01: lookup 50.739M ± 0.018M events/sec (approximated from 32 samples of ~19ms)
12% full: cpu01: lookup 47.751M ± 0.015M events/sec (approximated from 32 samples of ~20ms)
18% full: cpu01: lookup 45.153M ± 0.013M events/sec (approximated from 32 samples of ~22ms)
25% full: cpu01: lookup 43.826M ± 0.014M events/sec (approximated from 32 samples of ~22ms)
31% full: cpu01: lookup 41.971M ± 0.012M events/sec (approximated from 32 samples of ~23ms)
37% full: cpu01: lookup 41.034M ± 0.015M events/sec (approximated from 32 samples of ~24ms)
43% full: cpu01: lookup 39.946M ± 0.012M events/sec (approximated from 32 samples of ~25ms)
50% full: cpu01: lookup 38.256M ± 0.014M events/sec (approximated from 32 samples of ~26ms)
56% full: cpu01: lookup 36.580M ± 0.018M events/sec (approximated from 32 samples of ~27ms)
62% full: cpu01: lookup 36.252M ± 0.012M events/sec (approximated from 32 samples of ~27ms)
68% full: cpu01: lookup 35.200M ± 0.012M events/sec (approximated from 32 samples of ~28ms)
75% full: cpu01: lookup 34.061M ± 0.009M events/sec (approximated from 32 samples of ~29ms)
81% full: cpu01: lookup 34.374M ± 0.010M events/sec (approximated from 32 samples of ~29ms)
87% full: cpu01: lookup 33.244M ± 0.011M events/sec (approximated from 32 samples of ~30ms)
93% full: cpu01: lookup 32.182M ± 0.013M events/sec (approximated from 32 samples of ~31ms)
100% full: cpu01: lookup 31.497M ± 0.016M events/sec (approximated from 32 samples of ~31ms)
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230213091519.1202813-8-aspsk@isovalent.com
|
|
The bench utility will print
Setting up benchmark '<bench-name>'...
Benchmark '<bench-name>' started.
on startup to stdout. Suppress this output if the --quiet option is given. This
makes it simpler to parse benchmark output by a script.
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230213091519.1202813-7-aspsk@isovalent.com
|
|
The "local-storage-tasks-trace" benchmark has a `--quiet` option. Move it to
the list of common options, so that the main code and other benchmarks can use
(new) env.quiet variable. Patch the run_bench_local_storage_rcu_tasks_trace.sh
helper script accordingly.
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230213091519.1202813-6-aspsk@isovalent.com
|
|
The benchs/bench_bpf_hashmap_full_update.c doesn't set a custom argp,
so it shouldn't include the <argp.h> header.
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230213091519.1202813-5-aspsk@isovalent.com
|
|
To parse command line the bench utility uses the argp_parse() function. This
function takes as an argument a parent 'struct argp' structure which defines
common command line options and an array of children 'struct argp' structures
which defines additional command line options for particular benchmarks. This
implementation doesn't allow benchmarks to share option names, e.g., if two
benchmarks want to use, say, the --option option, then only one of them will
succeed (the first one encountered in the array). It would be convenient if the
same option names could be used in different benchmarks (with the same
semantics, e.g., --nr_loops=N).
Fix this by calling the argp_parse() function twice. The first call is the same
as it was before, with all children argps, and helps to find the benchmark name
and to print a combined help message if anything is wrong. Given the name, we
can call the argp_parse the second time, but now the children array points only
to the correct benchmark, thus always calling the correct parsers. (If there is no
specific list of arguments, then only one call to argp_parse will be done.)
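The two-pass idea can be illustrated without argp. The sketch below uses a hand-rolled parser instead of argp_parse(), and the benchmark table and helper names are hypothetical. The first pass only finds the benchmark name, and the second pass hands the shared option to that benchmark alone:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct bench_sketch {
	const char *name;
	long nr_loops;
};

/* Hypothetical benchmarks that both accept --nr_loops=N. */
static struct bench_sketch benches[] = {
	{ "bench-a", 0 },
	{ "bench-b", 0 },
};

/* Pass 1: locate the benchmark name (first matching non-option argument). */
static struct bench_sketch *find_bench(int argc, char **argv)
{
	for (int i = 1; i < argc; i++) {
		if (argv[i][0] == '-')
			continue;
		for (size_t b = 0; b < sizeof(benches) / sizeof(benches[0]); b++)
			if (!strcmp(argv[i], benches[b].name))
				return &benches[b];
	}
	return NULL;
}

/* Pass 2: apply the shared option to the selected benchmark only. */
static void parse_for(struct bench_sketch *bench, int argc, char **argv)
{
	static const char opt[] = "--nr_loops=";

	for (int i = 1; i < argc; i++)
		if (!strncmp(argv[i], opt, sizeof(opt) - 1))
			bench->nr_loops = strtol(argv[i] + sizeof(opt) - 1,
						 NULL, 0);
}
```

Because the second pass sees only one benchmark, the same option name can never be claimed by the wrong parser.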
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230213091519.1202813-4-aspsk@isovalent.com
|
|
The hashmap_report_final callback function defined in the
benchs/bench_bpf_hashmap_full_update.c file should be static.
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230213091519.1202813-3-aspsk@isovalent.com
|
|
To call the bpf_hashmap_full_update benchmark, one should say:
bench bpf-hashmap-ful-update
The patch adds a missing 'l' to the benchmark name.
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230213091519.1202813-2-aspsk@isovalent.com
|