Age | Commit message (Collapse) | Author |
|
gaudi_mmu_invalidate_cache() doesn't use the flags parameter, and thus
it can be set to 0 when the function is called in the gaudi only files.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
Device is now enabled before the hw_init() because part of the
initialization requires communication with the device firmware to get
information that is required for the initialization itself
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
|
|
Create a device MMU-mapped internal command buffer pool, in order to allow
the driver to allocate CBs for the signal/wait operations
that are fetched by the queues when they are configured with the user's
address space ID.
We must pre-map this internal pool due to performance issues.
This pool is needed for future ASIC support and it is currently unused in
GOYA and GAUDI.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
Update the boot interface file from the latest version from firmware.
Defines for secure boot were added.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
|
|
For internal needs of our CI we need to move all the common code into a
common folder instead of putting them in the root folder of the driver.
Same applies to the common header files under include/
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
|
|
In GAUDI we use QMAN0 DMA for clearing the MMU memory region
at initialization. if this operation fails it places the DMA in an error
state and then when trying to initialize QMAN0 we fail and erroneously
assume its the QMAN that failed.
This commit adds a check and clear of such DMA errors at initialization so
we will have a better understanding of what went wrong.
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
In order for the user to be aware of wrong inputs, we must return
error in case the amount of jobs per cs exceeds the corresponding
queue size.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
We identified a possible race during job completion when working
with a single multi-threaded work queue. In order to overcome this
race we suggest using a single threaded work queue per completion
queue, hence we guarantee jobs completion in order.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
Currently the driver halts the device CPU in the halt engines function,
which halts all the engines of the ASIC. The problem is that if later on we
stop the reset process (due to inability to clean memory mappings in time),
the CPU will remain in halt mode. This creates many issues, such as
thermal/power control and FLR handling.
Therefore, move the halting of the device CPU to the very end of the reset
process, just before writing to the registers to initiate the reset. In
addition, the driver now needs to send a message to the device F/W to
disable it from sending interrupts to the host machine because during halt
engines function the driver disables the MSI/MSI-X interrupts.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
|
|
Remove an old hash that is not in use anymore.
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
Instead of using the free slots amount on the compute CQ to determine
whether we can submit work to queues, use the queues pi/ci.
This is needed in future ASICs where we don't have CQ per queue.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
Currently the amount of maximum queues is statically configured.
Using a static value is causing redundunt cycles when traversing
all queues and consumes more memory than actually needed.
In this patch we configure each asic with the exact number of
queues needed.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
Soft-reset isn't supported in GAUDI. Remove the code that performs it and
print error in case the user wants to do it via sysfs.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
|
|
Divide iATU initialization into inbound/outbound methods.
We must separate it in order to enable different match mode
per PCIe region.
In addition, added support for PCI address match mode.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
ECC (Error Correcting Code) interrupts are going to be handled
by the FW. Hence, we define an interface in which the driver can
obtain the relevant ECC information.
This information is needed for monitoring and can also lead
to a hard reset if ECC error is not correctable.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
Add command submission statistics structure which can be obtained
through the info ioctl. Each drop counter describes the reason for
which the command submission was dropped.
This information is needed for the user to be aware of the specific
reason for which the submitted work was dropped. The user can then
utilize the driver more efficiently.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
Extract detection of the cpu boot status to a function
to allow code reuse
Signed-off-by: Christine Gharzuzi <cgharzuzi@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
rephrase some error/warning/notice messages to make them more accessible to
ordinary users.
There is no need to print context ASID as the driver currently doesn't
support multiple contexts.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
|
|
After recent concurrent cs amount increase, we must also
increase queues depth since much more concurrent work can be done.
All external queue depths were increased to 4096 as gaudi's
internal queue depths were also increased to 1024.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
Rephrase F/W error message to make it more understandable to ordinary
users.
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
The profiler needs to know the PLL values for correctly showing the
profiling data. Because our firmware can use different PLL configurations,
we need to read the PLL values from the ASIC to pass them to the profiler.
Signed-off-by: Adam Aharon <aaharon@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
Once there is a 64-bit field in a structure, GCC compiler for ARM aligns
the structure to 8 bytes. In order to avoid confusion when these
structures are being passed between CPUs from different architectures, we
explicitly align the structure to 8 bytes.
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
MAP/UNMAP are done also for device memory.
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
Use proper bitfield masks instead of shifting values when configuring
packets sent to device.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
Currently sync stream is limited only for external queues. We want to
remove this constraint by adding a new queue property dedicated for sync
stream. In addition we move the initialization and reset methods to the
common code since we can re-use them with slight changes.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
Training schemes requires much more concurrent command submissions than
inference does. In addition, training command submissions can be completed
in a non serialized manner. Hence, we add support in which each ASIC will
be able to configure the amount of concurrent pending command submissions,
rather than use a predefined amount. This change will enhance performance
by allowing the user to add more concurrent work without waiting for the
previous work to be completed.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
We no longer need to initialize the rate limiters in GAUDI A1.
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
CONFIG_TIME_NS is dependes on GENERIC_VDSO_TIME_NS.
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Dmitry Safonov <dima@arista.com>
Link: https://lore.kernel.org/r/20200624083321.144975-7-avagin@gmail.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Forbid splitting VVAR VMA resulting in a stricter ABI and reducing the
amount of corner-cases to consider while working further on VDSO time
namespace support.
As the offset from timens to VVAR page is computed compile-time, the pages
in VVAR should stay together and not being partically mremap()'ed.
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Dmitry Safonov <dima@arista.com>
Link: https://lore.kernel.org/r/20200624083321.144975-6-avagin@gmail.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
If a task belongs to a time namespace then the VVAR page which contains
the system wide VDSO data is replaced with a namespace specific page
which has the same layout as the VVAR page.
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Dmitry Safonov <dima@arista.com>
Link: https://lore.kernel.org/r/20200624083321.144975-5-avagin@gmail.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
G14(GA401) series with ALC289
This patch fixes a small typo I accidently submitted with the initial patch. The board should be named GA401 not G401.
Fixes: ff53664daff2 ("ALSA: hda/realtek: enable headset mic of ASUS ROG Zephyrus G14(G401) series with ALC289")
Signed-off-by: Armas Spann <zappel@retarded.farm>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200724140837.302763-1-zappel@retarded.farm
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
with ALC289
This patch adds support for headset mic to the ASUS ROG Zephyrus
G15(GA502) notebook series by adding the corresponding
vendor/pci_device id, as well as adding a new fixup for the used
realtek ALC289. The fixup stets the correct pin to get the headset mic
correctly recognized on audio-jack.
Signed-off-by: Armas Spann <zappel@retarded.farm>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200724140616.298892-1-zappel@retarded.farm
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
into char-misc-next
Georgi writes:
interconnect changes for 5.9
Here are the interconnect changes for the 5.9-rc1 merge window
consisting mostly of changes that give the core more flexibility
in order to support some new provider drivers.
Core changes:
- Export of_icc_get_from_provider()
- Relax requirement in of_icc_get_from_provider()
- Allow inter-provider pairs to be configured
- Mark all dummy functions as static inline
Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
* tag 'icc-5.9-rc1' of https://git.linaro.org/people/georgi.djakov/linux:
interconnect: Mark all dummy functions as static inline
interconnect: Allow inter-provider pairs to be configured
interconnect: Relax requirement in of_icc_get_from_provider()
interconnect: Export of_icc_get_from_provider()
|
|
Export ddebug_exec_queries() for use by modules.
This will allow module authors to control all their *pr_debug*s
dynamically. And since ddebug_exec_queries() is what implements
"echo $query >control", it gives the same per-callsite control.
Virtues of this:
- simplicity. just an export.
- full control over any/all subsets of callsites.
- same "query/command-string" in code and console
- full callsite selectivity with module file line format
Format in particular deserves special attention; it is where
low-hanging fruit will be found.
Consider: drivers/gpu/drm/amd/display/include/logger_types.h:
#define DC_LOG_SURFACE(...) pr_debug("[SURFACE]:"__VA_ARGS__)
#define DC_LOG_HW_LINK_TRAINING(...) pr_debug("[HW_LINK_TRAINING]:"__VA_ARGS__)
.. 9 more ..
Thats 11 string prefixes, used in 804 places in drivers/gpu/**
Clearly this is a systematized classification of those callsites.
And one I'd expect to see repeated often.
Using ddebug_exec_queries(), authors can select on those prefixes
as a unitary set, equivalent to:
echo "module=MODULE_NAME format=^[SURFACE]: +p" >control
Trivially, those sets can be subsected with the other query terms too,
say file=foo, should the author see fit.
Perhaps as important, users can modify the set of enabled callsites,
presumably to aid debugging by enabling helpful debug callsites, and
disabling those that just clutter the info.
Authors could even alter [fmlt] flags, though I dont see a good reason
why they would. Perhaps harnessed by bug-logging automation to get
fuller, or more minimal bug-reports.
DRM
drm has both drm.debug, which defines 32 categories of drm_printk
logging, and entirely separate uses of pr_debug, which are dynamic on
this i915 laptop, running mainline. So I can observe and report on
both.
The i915 driver has 118 dyndbg callsites, with following
"classifications" defined in drivers/gpu/drm/i915/gvt/**
$ grep 915 /proc/dynamic_debug/control | cut -d= -f2 | cut -d: -f1,2 | sort -u
_ "gvt: cmd
_ "gvt: core
_ "gvt: dpy
_ "gvt: el
_ "gvt: irq
_ "gvt: mm
_ "gvt: mmio
_ "gvt: render
_ "gvt: sched
_ "%s for root hub!\012"
_ "Vendor defined info completion code %u\012"
This classification is entirely out-of-band for control by drm.debug,
and is only available to root user at the console. But module authors
can activate them with ddebug_exec_queries(sprintf("format=^%s +p")),
and then decide how to expose the groups to the user for max utility.
drm.debug
drm.debug has 32 bit-flags, and matching enum drm_debug_category
values to classify the ~2943 DRM_DEBUG*() callsites in drivers/gpu
The drm.debug callback could invoke ddebug_exec_queries() with 32
different hardcoded query strings, needing only (bit) ? " +p" : " -p"
added.
I briefly enabled drm.debug=0xff on my i915 laptop, which yielded
these unique prefixes: (dmesg | cut -c17- | cut -d\] -f1 | sort -u)
[drm:drm_atomic_check_only [drm
[drm:drm_atomic_get_crtc_state [drm
[drm:drm_atomic_get_plane_state [drm
[drm:drm_atomic_nonblocking_commit [drm
[drm:drm_atomic_set_fb_for_plane [drm
[drm:drm_atomic_state_default_clear [drm
[drm:__drm_atomic_state_free [drm
[drm:drm_atomic_state_init [drm
[drm:drm_crtc_vblank_helper_get_vblank_timestamp_internal [drm
[drm:drm_handle_vblank [drm
[drm:drm_ioctl [drm
[drm:drm_mode_addfb2 [drm
[drm:drm_mode_object_get [drm
[drm:drm_mode_object_put.part.0 [drm
[drm:drm_update_vblank_count [drm
[drm:drm_vblank_enable [drm
[drm:drm_vblank_restore [drm
[drm:vblank_disable_fn [drm
i915 0000:00:02.0: [drm:gen9_set_dc_state [i915
i915 0000:00:02.0: [drm:intel_atomic_get_global_obj_state [i915
i915 0000:00:02.0: [drm:__intel_display_power_get_domain.part.0 [i915
i915 0000:00:02.0: [drm:__intel_display_power_put_domain [i915
i915 0000:00:02.0: [drm:intel_plane_atomic_calc_changes [i915
i915 0000:00:02.0: [drm:skl_enable_dc6 [i915
Several good format=^prefixes are apparent there, and some misses.
^[drm:drm_atomic_ # misses: [drm:__drm_atomic_state_free [drm
^[drm:drm_ioctl
^[drm:drm_mode
^[drm:drm_vblank_ # misses: [drm:drm_update_vblank_count & [drm:vblank_disable_fn
Its not a perfect 1:1 single format-match per class, but the misses
above can be covered with 1 & 2 additional queries, which can be
concatenated together with ";" separators and submitted with 1 call.
Benefits:
For drm, adapting DRM_DEBUG to use dynamic-debug inside could
replicate (and thereby obsolete) lots of bit-checking in current
DRM_DEBUG callsites, at least with JUMP_LABEL optimized code.
ddebug_exec_queries() and a handful of fixed query-strings can select
and thereby control the already classified callsites.
With the classes mapped to queries, the enum type and parameter can be
eliminated (folded away with macro magic), at least for DYNAMIC_DEBUG
& JUMP_LABEL builds.
Is it safe ?
ddebug_exec_queries() is currently exposed to user space in
several limited ways;
1 it is called from module-load callback, where it implements the
$modname.dyndbg=+p "fake" parameter provided to all modules.
2 it handles query input via >control directly
IOW, it is "fully" exposed to local root user; exposing the same
functionality to other kernel modules is no additional risk.
The other standard issue to check is locking:
dyndbg has a single mutex, taken by ddebug_change to handle >control,
and by ddebug_proc_(start|stop) to span `cat control`. Queries
submitted via export will typically have module specified, which
dramatically cuts the scan by ddebug_change vs "module=* +p".
ISTM this proposed export presents no locking problems.
TLDR;
It would be interesting to see how drm.dyndbg=$QUERY and
drm.debug=$HEXY would interact; it might be order dependent, as
if given as modprobe args or in /etc/modprobe.d/
Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-19-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
For log-message output, reduce column space consumed by current
pr_fmt by dropping __func__ and shortening "dynamic_debug" to
"dyndbg". This improves readability on narrow consoles, and better
matches other kernel boot info messages.
Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-18-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This should work:
echo module=amd* format=^[IF_TRACE]: +p >/proc/dynamic_debug/control
consider drivers/gpu/drm/amd/display/include/logger_types.h:
It has 11 defines like:
#define DC_LOG_IF_TRACE(...) pr_debug("[IF_TRACE]:"__VA_ARGS__)
These defines are used 804 times at recent count; they are a good use
case to evaluate existing format-message based classifications of
*pr_debug*. Those macros prefix the supplied format with a fixed
string, I'd expect most existing message classification schemes to do
something similar.
Hence we want to be able to anchor our match to the beginning of the
format string, allowing easy construction of clear and precise
queries, leveraging the existing classification scheme to enable and
disable those callsites.
Note that unlike other search terms, formats are implicitly floating
substring matches, without the need for explicit wildcards.
This makes no attempt at wider regex features, just the one we need.
TLDR: Using the anchor also means the []s are less helpful for
disamiguating the prefix from a random in-message occurrence, allowing
shorter prefixes.
Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-17-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
flags & mask are used together everywhere, and are passed around
together between multiple functions; they belong together in a struct,
call that struct flag_settings.
Use struct flag_settings to rework 3 functions:
- ddebug_exec_query - declares query and flag-settings,
calls other 2, passing flags
- ddebug_parse_flags - fills flag_settings and returns
- ddebug_change - test all callsites against query,
modify passing sites.
benefits:
- bit-banging always needs flags & mask, best together.
- simpler function signatures
- 1 less parameter, less stack overhead
no functional changes
Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-16-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Current code expects "keyword" "arg" as 2 words, space separated.
Change to also accept "keyword=arg" form as well, and drop !(nwords%2)
requirement. Then in rest of function, use new keyword, arg variables
instead of word[i], word[i+1]
Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-15-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Accept these additional query forms:
echo "file $filestr +_" > control
path/to/file.c:100 # as from control, column 1
path/to/file.c:1-100 # or any legal line-range
path/to/file.c:func_A # as from an editor/browser
path/to/file.c:drm_* # wildcards still work
path/to/file.c:*_foo # lead wildcard too
1st 2 examples are treated as line-ranges, 3-5 are treated as func's
Doc these changes, and sprinkle in a few extra wild-card examples and
trailing # explanation texts.
Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-14-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Make the code-block reusable to later handle "file foo.c:101-200" etc.
This is a 99% code move, with reindent, function wrap&call, +pr_debug.
no functional changes.
Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-13-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
reduce word count via gcc ?: extension, no actual code change.
Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-12-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
loadable modules are the last in on this list, and are the only
modules that could be removed. ddebug_remove_module() searches from
head, but ddebug_add_module() uses list_add_tail(). Change it to
list_add() for a micro-optimization.
Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-11-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
ddebug_exec_query declares an auto var, and passes it to
ddebug_parse_query, which memsets it before using it. Drop that
memset, instead initialize the variable in the caller; let the
compiler decide how to do it.
Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-10-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
this pr_err attempts to print the string after the OP, but the string
has been parsed and chopped up, so looks empty.
Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-9-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
ddebug_describe_flags() currently fills a caller provided string buffer,
after testing its size (also passed) in a BUG_ON. Fix this by
replacing them with a known-big-enough string buffer wrapped in a
struct, and passing that instead.
Also simplify ddebug_describe_flags() flags parameter from a struct to
a member in that struct, and hoist the member deref up to the caller.
This makes the function reusable (soon) where flags are unpacked.
Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-8-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
during dyndbg init, verbose logging prints its ram overhead. It
counted strlens of struct _ddebug's 4 string members, in all callsite
entries, which would be approximately correct if each had been
mallocd. But they are pointers into shared .rodata; for example, all
10 kobject callsites have identical filename, module values.
Its best not to count that memory at all, since we cannot know they
were linked in because of CONFIG_DYNAMIC_DEBUG=y, and we want to
report a number that reflects what ram is saved by deconfiguring it.
Also fix wording and size under-reporting of the __dyndbg section.
Heres my overhead, on a virtme-run VM on a fedora-31 laptop:
dynamic_debug:dynamic_debug_init: 260 modules, 2479 entries \
and 10400 bytes in ddebug tables, 138824 bytes in __dyndbg section
Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-7-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
dyndbg populates its callsite info into __verbose section, change that
to a more specific and descriptive name, __dyndbg.
Also, per checkpatch:
simplify __attribute(..) to __section(__dyndbg) declaration.
and 1 spelling fix, decriptor
Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-6-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The verbose/debug logging done for `cat $MNT/dynamic_debug/control` is
voluminous (2 per control file entry + 2 per PAGE). Moreover, it just
prints pointer and sequence, which is not useful to a dyndbg user.
So just drop them.
Also require verbose>=2 for several other debug printks that are a bit
too chatty for typical needs;
ddebug_change() prints changes, once per modified callsite. Since
queries like "+p" will enable ~2300 callsites in a typical laptop, a
user probably doesn't need to see them often. ddebug_exec_queries()
still summarizes with verbose=1.
ddebug_(add|remove)_module() also print 1 line per action on a module,
not needed by typical modprobe user.
This leaves verbose=1 better focussed on the >control parsing process.
Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-5-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 4bad78c55002 ("lib/dynamic_debug.c: use seq_open_private() instead of seq_open()")'
The commit was one of a tree-wide set which replaced open-coded
boilerplate with a single tail-call. It therefore obsoleted the
comment about that boilerplate, clean that up now.
Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-4-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
since cf964976484 in 2012, initialization is done with early_initcall,
update the Docs, which still say arch_initcall.
Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-3-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|