diff options
Diffstat (limited to 'Documentation/process/debugging')
6 files changed, 1891 insertions, 0 deletions
diff --git a/Documentation/process/debugging/driver_development_debugging_guide.rst b/Documentation/process/debugging/driver_development_debugging_guide.rst new file mode 100644 index 000000000000..46becda8764b --- /dev/null +++ b/Documentation/process/debugging/driver_development_debugging_guide.rst @@ -0,0 +1,235 @@ +.. SPDX-License-Identifier: GPL-2.0 + +======================================== +Debugging advice for driver development +======================================== + +This document serves as a general starting point and lookup for debugging +device drivers. +While this guide focuses on debugging that requires re-compiling the +module/kernel, the :doc:`userspace debugging guide +</process/debugging/userspace_debugging_guide>` will guide +you through tools like dynamic debug, ftrace and other tools useful for +debugging issues and behavior. +For general debugging advice, see the :doc:`general advice document +</process/debugging/index>`. + +.. contents:: + :depth: 3 + +The following sections show you the available tools. + +printk() & friends +------------------ + +These are derivatives of printf() with varying destinations and support for +being dynamically turned on or off, or lack thereof. + +Simple printk() +~~~~~~~~~~~~~~~ + +The classic, can be used to great effect for quick and dirty development +of new modules or to extract arbitrary necessary data for troubleshooting. + +Prerequisite: ``CONFIG_PRINTK`` (usually enabled by default) + +**Pros**: + +- No need to learn anything, simple to use +- Easy to modify exactly to your needs (formatting of the data (See: + :doc:`/core-api/printk-formats`), visibility in the log) +- Can cause delays in the execution of the code (beneficial to confirm whether + timing is a factor) + +**Cons**: + +- Requires rebuilding the kernel/module +- Can cause delays in the execution of the code (which can cause issues to be + not reproducible) + +For the full documentation see :doc:`/core-api/printk-basics` + +Trace_printk +~~~~~~~~~~~~ + +Prerequisite: ``CONFIG_DYNAMIC_FTRACE`` & ``#include <linux/ftrace.h>`` + +It is a tiny bit less comfortable to use than printk(), because you will have +to read the messages from the trace file (See: :ref:`read_ftrace_log` +instead of from the kernel log, but very useful when printk() adds unwanted +delays into the code execution, causing issues to be flaky or hidden.) + +If the processing of this still causes timing issues then you can try +trace_puts(). + +For the full Documentation see trace_printk() + +dev_dbg +~~~~~~~ + +Print statement, which can be targeted by +:ref:`process/debugging/userspace_debugging_guide:dynamic debug` that contains +additional information about the device used within the context. + +**When is it appropriate to leave a debug print in the code?** + +Permanent debug statements have to be useful for a developer to troubleshoot +driver misbehavior. Judging that is a bit more of an art than a science, but +some guidelines are in the :ref:`Coding style guidelines +<process/coding-style:13) printing kernel messages>`. In almost all cases the +debug statements shouldn't be upstreamed, as a working driver is supposed to be +silent. + +Custom printk +~~~~~~~~~~~~~ + +Example:: + + #define core_dbg(fmt, arg...) do { \ + if (core_debug) \ + printk(KERN_DEBUG pr_fmt("core: " fmt), ## arg); \ + } while (0) + +**When should you do this?** + +It is better to just use a pr_debug(), which can later be turned on/off with +dynamic debug. Additionally, a lot of drivers activate these prints via a +variable like ``core_debug`` set by a module parameter. However, Module +parameters `are not recommended anymore +<https://lore.kernel.org/all/2024032757-surcharge-grime-d3dd@gregkh>`_. + +Ftrace +------ + +Creating a custom Ftrace tracepoint +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +A tracepoint adds a hook into your code that will be called and logged when the +tracepoint is enabled. This can be used, for example, to trace hitting a +conditional branch or to dump the internal state at specific points of the code +flow during a debugging session. + +Here is a basic description of :ref:`how to implement new tracepoints +<trace/tracepoints:usage>`. + +For the full event tracing documentation see :doc:`/trace/events` + +For the full Ftrace documentation see :doc:`/trace/ftrace` + +DebugFS +------- + +Prerequisite: ``CONFIG_DEBUG_FS` & `#include <linux/debugfs.h>`` + +DebugFS differs from the other approaches of debugging, as it doesn't write +messages to the kernel log nor add traces to the code. Instead it allows the +developer to handle a set of files. +With these files you can either store values of variables or make +register/memory dumps or you can make these files writable and modify +values/settings in the driver. + +Possible use-cases among others: + +- Store register values +- Keep track of variables +- Store errors +- Store settings +- Toggle a setting like debug on/off +- Error injection + +This is especially useful, when the size of a data dump would be hard to digest +as part of the general kernel log (for example when dumping raw bitstream data) +or when you are not interested in all the values all the time, but with the +possibility to inspect them. + +The general idea is: + +- Create a directory during probe (``struct dentry *parent = + debugfs_create_dir("my_driver", NULL);``) +- Create a file (``debugfs_create_u32("my_value", 444, parent, &my_variable);``) + + - In this example the file is found in + ``/sys/kernel/debug/my_driver/my_value`` (with read permissions for + user/group/all) + - any read of the file will return the current contents of the variable + ``my_variable`` + +- Clean up the directory when removing the device + (``debugfs_remove_recursive(parent);``) + +For the full documentation see :doc:`/filesystems/debugfs`. + +KASAN, UBSAN, lockdep and other error checkers +---------------------------------------------- + +KASAN (Kernel Address Sanitizer) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Prerequisite: ``CONFIG_KASAN`` + +KASAN is a dynamic memory error detector that helps to find use-after-free and +out-of-bounds bugs. It uses compile-time instrumentation to check every memory +access. + +For the full documentation see :doc:`/dev-tools/kasan`. + +UBSAN (Undefined Behavior Sanitizer) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Prerequisite: ``CONFIG_UBSAN`` + +UBSAN relies on compiler instrumentation and runtime checks to detect undefined +behavior. It is designed to find a variety of issues, including signed integer +overflow, array index out of bounds, and more. + +For the full documentation see :doc:`/dev-tools/ubsan` + +lockdep (Lock Dependency Validator) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Prerequisite: ``CONFIG_DEBUG_LOCKDEP`` + +lockdep is a runtime lock dependency validator that detects potential deadlocks +and other locking-related issues in the kernel. +It tracks lock acquisitions and releases, building a dependency graph that is +analyzed for potential deadlocks. +lockdep is especially useful for validating the correctness of lock ordering in +the kernel. + +PSI (Pressure stall information tracking) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Prerequisite: ``CONFIG_PSI`` + +PSI is a measurement tool to identify excessive overcommits on hardware +resources, that can cause performance disruptions or even OOM kills. + +device coredump +--------------- + +Prerequisite: ``CONFIG_DEV_COREDUMP`` & ``#include <linux/devcoredump.h>`` + +Provides the infrastructure for a driver to provide arbitrary data to userland. +It is most often used in conjunction with udev or similar userland application +to listen for kernel uevents, which indicate that the dump is ready. Udev has +rules to copy that file somewhere for long-term storage and analysis, as by +default, the data for the dump is automatically cleaned up after a default +5 minutes. That data is analyzed with driver-specific tools or GDB. + +A device coredump can be created with a vmalloc area, with read/free +methods, or as a scatter/gather list. + +You can find an example implementation at: +`drivers/media/platform/qcom/venus/core.c +<https://elixir.bootlin.com/linux/v6.11.6/source/drivers/media/platform/qcom/venus/core.c#L30>`__, +in the Bluetooth HCI layer, in several wireless drivers, and in several +DRM drivers. + +devcoredump interfaces +~~~~~~~~~~~~~~~~~~~~~~ + +.. kernel-doc:: include/linux/devcoredump.h + +.. kernel-doc:: drivers/base/devcoredump.c + +**Copyright** ©2024 : Collabora diff --git a/Documentation/process/debugging/gdb-kernel-debugging.rst b/Documentation/process/debugging/gdb-kernel-debugging.rst new file mode 100644 index 000000000000..895285c037c7 --- /dev/null +++ b/Documentation/process/debugging/gdb-kernel-debugging.rst @@ -0,0 +1,179 @@ +.. highlight:: none + +Debugging kernel and modules via gdb +==================================== + +The kernel debugger kgdb, hypervisors like QEMU or JTAG-based hardware +interfaces allow to debug the Linux kernel and its modules during runtime +using gdb. Gdb comes with a powerful scripting interface for python. The +kernel provides a collection of helper scripts that can simplify typical +kernel debugging steps. This is a short tutorial about how to enable and use +them. It focuses on QEMU/KVM virtual machines as target, but the examples can +be transferred to the other gdb stubs as well. + + +Requirements +------------ + +- gdb 7.2+ (recommended: 7.4+) with python support enabled (typically true + for distributions) + + +Setup +----- + +- Create a virtual Linux machine for QEMU/KVM (see www.linux-kvm.org and + www.qemu.org for more details). For cross-development, + https://landley.net/aboriginal/bin keeps a pool of machine images and + toolchains that can be helpful to start from. + +- Build the kernel with CONFIG_GDB_SCRIPTS enabled, but leave + CONFIG_DEBUG_INFO_REDUCED off. If your architecture supports + CONFIG_FRAME_POINTER, keep it enabled. + +- Install that kernel on the guest, turn off KASLR if necessary by adding + "nokaslr" to the kernel command line. + Alternatively, QEMU allows to boot the kernel directly using -kernel, + -append, -initrd command line switches. This is generally only useful if + you do not depend on modules. See QEMU documentation for more details on + this mode. In this case, you should build the kernel with + CONFIG_RANDOMIZE_BASE disabled if the architecture supports KASLR. + +- Build the gdb scripts (required on kernels v5.1 and above):: + + make scripts_gdb + +- Enable the gdb stub of QEMU/KVM, either + + - at VM startup time by appending "-s" to the QEMU command line + + or + + - during runtime by issuing "gdbserver" from the QEMU monitor + console + +- cd /path/to/linux-build + +- Start gdb: gdb vmlinux + + Note: Some distros may restrict auto-loading of gdb scripts to known safe + directories. In case gdb reports to refuse loading vmlinux-gdb.py, add:: + + add-auto-load-safe-path /path/to/linux-build + + to ~/.gdbinit. See gdb help for more details. + +- Attach to the booted guest:: + + (gdb) target remote :1234 + + +Examples of using the Linux-provided gdb helpers +------------------------------------------------ + +- Load module (and main kernel) symbols:: + + (gdb) lx-symbols + loading vmlinux + scanning for modules in /home/user/linux/build + loading @0xffffffffa0020000: /home/user/linux/build/net/netfilter/xt_tcpudp.ko + loading @0xffffffffa0016000: /home/user/linux/build/net/netfilter/xt_pkttype.ko + loading @0xffffffffa0002000: /home/user/linux/build/net/netfilter/xt_limit.ko + loading @0xffffffffa00ca000: /home/user/linux/build/net/packet/af_packet.ko + loading @0xffffffffa003c000: /home/user/linux/build/fs/fuse/fuse.ko + ... + loading @0xffffffffa0000000: /home/user/linux/build/drivers/ata/ata_generic.ko + +- Set a breakpoint on some not yet loaded module function, e.g.:: + + (gdb) b btrfs_init_sysfs + Function "btrfs_init_sysfs" not defined. + Make breakpoint pending on future shared library load? (y or [n]) y + Breakpoint 1 (btrfs_init_sysfs) pending. + +- Continue the target:: + + (gdb) c + +- Load the module on the target and watch the symbols being loaded as well as + the breakpoint hit:: + + loading @0xffffffffa0034000: /home/user/linux/build/lib/libcrc32c.ko + loading @0xffffffffa0050000: /home/user/linux/build/lib/lzo/lzo_compress.ko + loading @0xffffffffa006e000: /home/user/linux/build/lib/zlib_deflate/zlib_deflate.ko + loading @0xffffffffa01b1000: /home/user/linux/build/fs/btrfs/btrfs.ko + + Breakpoint 1, btrfs_init_sysfs () at /home/user/linux/fs/btrfs/sysfs.c:36 + 36 btrfs_kset = kset_create_and_add("btrfs", NULL, fs_kobj); + +- Dump the log buffer of the target kernel:: + + (gdb) lx-dmesg + [ 0.000000] Initializing cgroup subsys cpuset + [ 0.000000] Initializing cgroup subsys cpu + [ 0.000000] Linux version 3.8.0-rc4-dbg+ (... + [ 0.000000] Command line: root=/dev/sda2 resume=/dev/sda1 vga=0x314 + [ 0.000000] e820: BIOS-provided physical RAM map: + [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable + [ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved + .... + +- Examine fields of the current task struct(supported by x86 and arm64 only):: + + (gdb) p $lx_current().pid + $1 = 4998 + (gdb) p $lx_current().comm + $2 = "modprobe\000\000\000\000\000\000\000" + +- Make use of the per-cpu function for the current or a specified CPU:: + + (gdb) p $lx_per_cpu("runqueues").nr_running + $3 = 1 + (gdb) p $lx_per_cpu("runqueues", 2).nr_running + $4 = 0 + +- Dig into hrtimers using the container_of helper:: + + (gdb) set $next = $lx_per_cpu("hrtimer_bases").clock_base[0].active.next + (gdb) p *$container_of($next, "struct hrtimer", "node") + $5 = { + node = { + node = { + __rb_parent_color = 18446612133355256072, + rb_right = 0x0 <irq_stack_union>, + rb_left = 0x0 <irq_stack_union> + }, + expires = { + tv64 = 1835268000000 + } + }, + _softexpires = { + tv64 = 1835268000000 + }, + function = 0xffffffff81078232 <tick_sched_timer>, + base = 0xffff88003fd0d6f0, + state = 1, + start_pid = 0, + start_site = 0xffffffff81055c1f <hrtimer_start_range_ns+20>, + start_comm = "swapper/2\000\000\000\000\000\000" + } + + +List of commands and functions +------------------------------ + +The number of commands and convenience functions may evolve over the time, +this is just a snapshot of the initial version:: + + (gdb) apropos lx + function lx_current -- Return current task + function lx_module -- Find module by name and return the module variable + function lx_per_cpu -- Return per-cpu variable + function lx_task_by_pid -- Find Linux task by PID and return the task_struct variable + function lx_thread_info -- Calculate Linux thread_info from task variable + lx-dmesg -- Print Linux kernel log buffer + lx-lsmod -- List currently loaded modules + lx-symbols -- (Re-)load symbols of Linux kernel and currently loaded modules + +Detailed help can be obtained via "help <command-name>" for commands and "help +function <function-name>" for convenience functions. diff --git a/Documentation/process/debugging/index.rst b/Documentation/process/debugging/index.rst new file mode 100644 index 000000000000..387d33d16f5e --- /dev/null +++ b/Documentation/process/debugging/index.rst @@ -0,0 +1,80 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============================================ +Debugging advice for Linux Kernel developers +============================================ + +general guides +-------------- + +.. toctree:: + :maxdepth: 1 + + driver_development_debugging_guide + gdb-kernel-debugging + kgdb + userspace_debugging_guide + +.. only:: subproject and html + +subsystem specific guides +------------------------- + +.. toctree:: + :maxdepth: 1 + + media_specific_debugging_guide + +.. only:: subproject and html + + Indices + ======= + + * :ref:`genindex` + +General debugging advice +======================== + +Depending on the issue, a different set of tools is available to track down the +problem or even to realize whether there is one in the first place. + +As a first step you have to figure out what kind of issue you want to debug. +Depending on the answer, your methodology and choice of tools may vary. + +Do I need to debug with limited access? +--------------------------------------- + +Do you have limited access to the machine or are you unable to stop the running +execution? + +In this case your debugging capability depends on built-in debugging support of +provided distribution kernel. +The :doc:`/process/debugging/userspace_debugging_guide` provides a brief +overview over a range of possible debugging tools in that situation. You can +check the capability of your kernel, in most cases, by looking into config file +within the /boot directory. + +Do I have root access to the system? +------------------------------------ + +Are you easily able to replace the module in question or to install a new +kernel? + +In that case your range of available tools is a lot bigger, you can find the +tools in the :doc:`/process/debugging/driver_development_debugging_guide`. + +Is timing a factor? +------------------- + +It is important to understand if the problem you want to debug manifests itself +consistently (i.e. given a set of inputs you always get the same, incorrect +output), or inconsistently. If it manifests itself inconsistently, some timing +factor might be at play. If inserting delays into the code does change the +behavior, then quite likely timing is a factor. + +When timing does alter the outcome of the code execution using a simple +printk() for debugging purposes may not work, a similar alternative is to use +trace_printk() , which logs the debug messages to the trace file instead of the +kernel log. + +**Copyright** ©2024 : Collabora diff --git a/Documentation/process/debugging/kgdb.rst b/Documentation/process/debugging/kgdb.rst new file mode 100644 index 000000000000..b29b0aac2717 --- /dev/null +++ b/Documentation/process/debugging/kgdb.rst @@ -0,0 +1,937 @@ +================================================= +Using kgdb, kdb and the kernel debugger internals +================================================= + +:Author: Jason Wessel + +Introduction +============ + +The kernel has two different debugger front ends (kdb and kgdb) which +interface to the debug core. It is possible to use either of the +debugger front ends and dynamically transition between them if you +configure the kernel properly at compile and runtime. + +Kdb is simplistic shell-style interface which you can use on a system +console with a keyboard or serial console. You can use it to inspect +memory, registers, process lists, dmesg, and even set breakpoints to +stop in a certain location. Kdb is not a source level debugger, although +you can set breakpoints and execute some basic kernel run control. Kdb +is mainly aimed at doing some analysis to aid in development or +diagnosing kernel problems. You can access some symbols by name in +kernel built-ins or in kernel modules if the code was built with +``CONFIG_KALLSYMS``. + +Kgdb is intended to be used as a source level debugger for the Linux +kernel. It is used along with gdb to debug a Linux kernel. The +expectation is that gdb can be used to "break in" to the kernel to +inspect memory, variables and look through call stack information +similar to the way an application developer would use gdb to debug an +application. It is possible to place breakpoints in kernel code and +perform some limited execution stepping. + +Two machines are required for using kgdb. One of these machines is a +development machine and the other is the target machine. The kernel to +be debugged runs on the target machine. The development machine runs an +instance of gdb against the vmlinux file which contains the symbols (not +a boot image such as bzImage, zImage, uImage...). In gdb the developer +specifies the connection parameters and connects to kgdb. The type of +connection a developer makes with gdb depends on the availability of +kgdb I/O modules compiled as built-ins or loadable kernel modules in the +test machine's kernel. + +Compiling a kernel +================== + +- In order to enable compilation of kdb, you must first enable kgdb. + +- The kgdb test compile options are described in the kgdb test suite + chapter. + +Kernel config options for kgdb +------------------------------ + +To enable ``CONFIG_KGDB`` you should look under +:menuselection:`Kernel hacking --> Kernel debugging` and select +:menuselection:`KGDB: kernel debugger`. + +While it is not a hard requirement that you have symbols in your vmlinux +file, gdb tends not to be very useful without the symbolic data, so you +will want to turn on ``CONFIG_DEBUG_INFO`` which is called +:menuselection:`Compile the kernel with debug info` in the config menu. + +It is advised, but not required, that you turn on the +``CONFIG_FRAME_POINTER`` kernel option which is called :menuselection:`Compile +the kernel with frame pointers` in the config menu. This option inserts code +into the compiled executable which saves the frame information in registers +or on the stack at different points which allows a debugger such as gdb to +more accurately construct stack back traces while debugging the kernel. + +If the architecture that you are using supports the kernel option +``CONFIG_STRICT_KERNEL_RWX``, you should consider turning it off. This +option will prevent the use of software breakpoints because it marks +certain regions of the kernel's memory space as read-only. If kgdb +supports it for the architecture you are using, you can use hardware +breakpoints if you desire to run with the ``CONFIG_STRICT_KERNEL_RWX`` +option turned on, else you need to turn off this option. + +Next you should choose one or more I/O drivers to interconnect the debugging +host and debugged target. Early boot debugging requires a KGDB I/O +driver that supports early debugging and the driver must be built into +the kernel directly. Kgdb I/O driver configuration takes place via +kernel or module parameters which you can learn more about in the +section that describes the parameter kgdboc. + +Here is an example set of ``.config`` symbols to enable or disable for kgdb:: + + # CONFIG_STRICT_KERNEL_RWX is not set + CONFIG_FRAME_POINTER=y + CONFIG_KGDB=y + CONFIG_KGDB_SERIAL_CONSOLE=y + +Kernel config options for kdb +----------------------------- + +Kdb is quite a bit more complex than the simple gdbstub sitting on top +of the kernel's debug core. Kdb must implement a shell, and also adds +some helper functions in other parts of the kernel, responsible for +printing out interesting data such as what you would see if you ran +``lsmod``, or ``ps``. In order to build kdb into the kernel you follow the +same steps as you would for kgdb. + +The main config option for kdb is ``CONFIG_KGDB_KDB`` which is called +:menuselection:`KGDB_KDB: include kdb frontend for kgdb` in the config menu. +In theory you would have already also selected an I/O driver such as the +``CONFIG_KGDB_SERIAL_CONSOLE`` interface if you plan on using kdb on a +serial port, when you were configuring kgdb. + +If you want to use a PS/2-style keyboard with kdb, you would select +``CONFIG_KDB_KEYBOARD`` which is called :menuselection:`KGDB_KDB: keyboard as +input device` in the config menu. The ``CONFIG_KDB_KEYBOARD`` option is not +used for anything in the gdb interface to kgdb. The ``CONFIG_KDB_KEYBOARD`` +option only works with kdb. + +Here is an example set of ``.config`` symbols to enable/disable kdb:: + + # CONFIG_STRICT_KERNEL_RWX is not set + CONFIG_FRAME_POINTER=y + CONFIG_KGDB=y + CONFIG_KGDB_SERIAL_CONSOLE=y + CONFIG_KGDB_KDB=y + CONFIG_KDB_KEYBOARD=y + +Kernel Debugger Boot Arguments +============================== + +This section describes the various runtime kernel parameters that affect +the configuration of the kernel debugger. The following chapter covers +using kdb and kgdb as well as providing some examples of the +configuration parameters. + +Kernel parameter: kgdboc +------------------------ + +The kgdboc driver was originally an abbreviation meant to stand for +"kgdb over console". Today it is the primary mechanism to configure how +to communicate from gdb to kgdb as well as the devices you want to use +to interact with the kdb shell. + +For kgdb/gdb, kgdboc is designed to work with a single serial port. It +is intended to cover the circumstance where you want to use a serial +console as your primary console as well as using it to perform kernel +debugging. It is also possible to use kgdb on a serial port which is not +designated as a system console. Kgdboc may be configured as a kernel +built-in or a kernel loadable module. You can only make use of +``kgdbwait`` and early debugging if you build kgdboc into the kernel as +a built-in. + +Optionally you can elect to activate kms (Kernel Mode Setting) +integration. When you use kms with kgdboc and you have a video driver +that has atomic mode setting hooks, it is possible to enter the debugger +on the graphics console. When the kernel execution is resumed, the +previous graphics mode will be restored. This integration can serve as a +useful tool to aid in diagnosing crashes or doing analysis of memory +with kdb while allowing the full graphics console applications to run. + +kgdboc arguments +~~~~~~~~~~~~~~~~ + +Usage:: + + kgdboc=[kms][[,]kbd][[,]serial_device][,baud] + +The order listed above must be observed if you use any of the optional +configurations together. + +Abbreviations: + +- kms = Kernel Mode Setting + +- kbd = Keyboard + +You can configure kgdboc to use the keyboard, and/or a serial device +depending on if you are using kdb and/or kgdb, in one of the following +scenarios. The order listed above must be observed if you use any of the +optional configurations together. Using kms + only gdb is generally not +a useful combination. + +Using loadable module or built-in +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +1. As a kernel built-in: + + Use the kernel boot argument:: + + kgdboc=<tty-device>,[baud] + +2. As a kernel loadable module: + + Use the command:: + + modprobe kgdboc kgdboc=<tty-device>,[baud] + + Here are two examples of how you might format the kgdboc string. The + first is for an x86 target using the first serial port. The second + example is for the ARM Versatile AB using the second serial port. + + 1. ``kgdboc=ttyS0,115200`` + + 2. ``kgdboc=ttyAMA1,115200`` + +Configure kgdboc at runtime with sysfs +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +At run time you can enable or disable kgdboc by writing parameters +into sysfs. Here are two examples: + +1. Enable kgdboc on ttyS0:: + + echo ttyS0 > /sys/module/kgdboc/parameters/kgdboc + +2. Disable kgdboc:: + + echo "" > /sys/module/kgdboc/parameters/kgdboc + +.. note:: + + You do not need to specify the baud if you are configuring the + console on tty which is already configured or open. + +More examples +^^^^^^^^^^^^^ + +You can configure kgdboc to use the keyboard, and/or a serial device +depending on if you are using kdb and/or kgdb, in one of the following +scenarios. + +1. kdb and kgdb over only a serial port:: + + kgdboc=<serial_device>[,baud] + + Example:: + + kgdboc=ttyS0,115200 + +2. kdb and kgdb with keyboard and a serial port:: + + kgdboc=kbd,<serial_device>[,baud] + + Example:: + + kgdboc=kbd,ttyS0,115200 + +3. kdb with a keyboard:: + + kgdboc=kbd + +4. kdb with kernel mode setting:: + + kgdboc=kms,kbd + +5. kdb with kernel mode setting and kgdb over a serial port:: + + kgdboc=kms,kbd,ttyS0,115200 + +.. note:: + + Kgdboc does not support interrupting the target via the gdb remote + protocol. You must manually send a `SysRq-G` unless you have a proxy + that splits console output to a terminal program. A console proxy has a + separate TCP port for the debugger and a separate TCP port for the + "human" console. The proxy can take care of sending the `SysRq-G` + for you. + +When using kgdboc with no debugger proxy, you can end up connecting the +debugger at one of two entry points. If an exception occurs after you +have loaded kgdboc, a message should print on the console stating it is +waiting for the debugger. In this case you disconnect your terminal +program and then connect the debugger in its place. If you want to +interrupt the target system and forcibly enter a debug session you have +to issue a `Sysrq` sequence and then type the letter `g`. Then you +disconnect the terminal session and connect gdb. Your options if you +don't like this are to hack gdb to send the `SysRq-G` for you as well as +on the initial connect, or to use a debugger proxy that allows an +unmodified gdb to do the debugging. + +Kernel parameter: ``kgdboc_earlycon`` +------------------------------------- + +If you specify the kernel parameter ``kgdboc_earlycon`` and your serial +driver registers a boot console that supports polling (doesn't need +interrupts and implements a nonblocking read() function) kgdb will attempt +to work using the boot console until it can transition to the regular +tty driver specified by the ``kgdboc`` parameter. + +Normally there is only one boot console (especially that implements the +read() function) so just adding ``kgdboc_earlycon`` on its own is +sufficient to make this work. If you have more than one boot console you +can add the boot console's name to differentiate. Note that names that +are registered through the boot console layer and the tty layer are not +the same for the same port. + +For instance, on one board to be explicit you might do:: + + kgdboc_earlycon=qcom_geni kgdboc=ttyMSM0 + +If the only boot console on the device was "qcom_geni", you could simplify:: + + kgdboc_earlycon kgdboc=ttyMSM0 + +Kernel parameter: ``kgdbwait`` +------------------------------ + +The Kernel command line option ``kgdbwait`` makes kgdb wait for a +debugger connection during booting of a kernel. You can only use this +option if you compiled a kgdb I/O driver into the kernel and you +specified the I/O driver configuration as a kernel command line option. +The kgdbwait parameter should always follow the configuration parameter +for the kgdb I/O driver in the kernel command line else the I/O driver +will not be configured prior to asking the kernel to use it to wait. + +The kernel will stop and wait as early as the I/O driver and +architecture allows when you use this option. If you build the kgdb I/O +driver as a loadable kernel module kgdbwait will not do anything. + +Kernel parameter: ``kgdbcon`` +----------------------------- + +The ``kgdbcon`` feature allows you to see printk() messages inside gdb +while gdb is connected to the kernel. Kdb does not make use of the kgdbcon +feature. + +Kgdb supports using the gdb serial protocol to send console messages to +the debugger when the debugger is connected and running. There are two +ways to activate this feature. + +1. Activate with the kernel command line option:: + + kgdbcon + +2. Use sysfs before configuring an I/O driver:: + + echo 1 > /sys/module/debug_core/parameters/kgdb_use_con + +.. note:: + + If you do this after you configure the kgdb I/O driver, the + setting will not take effect until the next point the I/O is + reconfigured. + +.. important:: + + You cannot use kgdboc + kgdbcon on a tty that is an + active system console. An example of incorrect usage is:: + + console=ttyS0,115200 kgdboc=ttyS0 kgdbcon + +It is possible to use this option with kgdboc on a tty that is not a +system console. + +Run time parameter: ``kgdbreboot`` +---------------------------------- + +The kgdbreboot feature allows you to change how the debugger deals with +the reboot notification. You have 3 choices for the behavior. The +default behavior is always set to 0. + +.. tabularcolumns:: |p{0.4cm}|p{11.5cm}|p{5.6cm}| + +.. flat-table:: + :widths: 1 10 8 + + * - 1 + - ``echo -1 > /sys/module/debug_core/parameters/kgdbreboot`` + - Ignore the reboot notification entirely. + + * - 2 + - ``echo 0 > /sys/module/debug_core/parameters/kgdbreboot`` + - Send the detach message to any attached debugger client. + + * - 3 + - ``echo 1 > /sys/module/debug_core/parameters/kgdbreboot`` + - Enter the debugger on reboot notify. + +Kernel parameter: ``nokaslr`` +----------------------------- + +If the architecture that you are using enables KASLR by default, +you should consider turning it off. KASLR randomizes the +virtual address where the kernel image is mapped and confuses +gdb which resolves addresses of kernel symbols from the symbol table +of vmlinux. + +Using kdb +========= + +Quick start for kdb on a serial port +------------------------------------ + +This is a quick example of how to use kdb. + +1. Configure kgdboc at boot using kernel parameters:: + + console=ttyS0,115200 kgdboc=ttyS0,115200 nokaslr + + OR + + Configure kgdboc after the kernel has booted; assuming you are using + a serial port console:: + + echo ttyS0 > /sys/module/kgdboc/parameters/kgdboc + +2. Enter the kernel debugger manually or by waiting for an oops or + fault. There are several ways you can enter the kernel debugger + manually; all involve using the `SysRq-G`, which means you must have + enabled ``CONFIG_MAGIC_SYSRQ=y`` in your kernel config. + + - When logged in as root or with a super user session you can run:: + + echo g > /proc/sysrq-trigger + + - Example using minicom 2.2 + + Press: `CTRL-A` `f` `g` + + - When you have telneted to a terminal server that supports sending + a remote break + + Press: `CTRL-]` + + Type in: ``send break`` + + Press: `Enter` `g` + +3. From the kdb prompt you can run the ``help`` command to see a complete + list of the commands that are available. + + Some useful commands in kdb include: + + =========== ================================================================= + ``lsmod`` Shows where kernel modules are loaded + ``ps`` Displays only the active processes + ``ps A`` Shows all the processes + ``summary`` Shows kernel version info and memory usage + ``bt`` Get a backtrace of the current process using dump_stack() + ``dmesg`` View the kernel syslog buffer + ``go`` Continue the system + =========== ================================================================= + +4. When you are done using kdb you need to consider rebooting the system + or using the ``go`` command to resuming normal kernel execution. If you + have paused the kernel for a lengthy period of time, applications + that rely on timely networking or anything to do with real wall clock + time could be adversely affected, so you should take this into + consideration when using the kernel debugger. + +Quick start for kdb using a keyboard connected console +------------------------------------------------------ + +This is a quick example of how to use kdb with a keyboard. + +1. Configure kgdboc at boot using kernel parameters:: + + kgdboc=kbd + + OR + + Configure kgdboc after the kernel has booted:: + + echo kbd > /sys/module/kgdboc/parameters/kgdboc + +2. Enter the kernel debugger manually or by waiting for an oops or + fault. There are several ways you can enter the kernel debugger + manually; all involve using the `SysRq-G`, which means you must have + enabled ``CONFIG_MAGIC_SYSRQ=y`` in your kernel config. + + - When logged in as root or with a super user session you can run:: + + echo g > /proc/sysrq-trigger + + - Example using a laptop keyboard: + + Press and hold down: `Alt` + + Press and hold down: `Fn` + + Press and release the key with the label: `SysRq` + + Release: `Fn` + + Press and release: `g` + + Release: `Alt` + + - Example using a PS/2 101-key keyboard + + Press and hold down: `Alt` + + Press and release the key with the label: `SysRq` + + Press and release: `g` + + Release: `Alt` + +3. Now type in a kdb command such as ``help``, ``dmesg``, ``bt`` or ``go`` to + continue kernel execution. + +Using kgdb / gdb +================ + +In order to use kgdb you must activate it by passing configuration +information to one of the kgdb I/O drivers. If you do not pass any +configuration information kgdb will not do anything at all. Kgdb will +only actively hook up to the kernel trap hooks if a kgdb I/O driver is +loaded and configured. If you unconfigure a kgdb I/O driver, kgdb will +unregister all the kernel hook points. + +All kgdb I/O drivers can be reconfigured at run time, if +``CONFIG_SYSFS`` and ``CONFIG_MODULES`` are enabled, by echo'ing a new +config string to ``/sys/module/<driver>/parameter/<option>``. The driver +can be unconfigured by passing an empty string. You cannot change the +configuration while the debugger is attached. Make sure to detach the +debugger with the ``detach`` command prior to trying to unconfigure a +kgdb I/O driver. + +Connecting with gdb to a serial port +------------------------------------ + +1. Configure kgdboc + + Configure kgdboc at boot using kernel parameters:: + + kgdboc=ttyS0,115200 + + OR + + Configure kgdboc after the kernel has booted:: + + echo ttyS0 > /sys/module/kgdboc/parameters/kgdboc + +2. Stop kernel execution (break into the debugger) + + In order to connect to gdb via kgdboc, the kernel must first be + stopped. There are several ways to stop the kernel which include + using kgdbwait as a boot argument, via a `SysRq-G`, or running the + kernel until it takes an exception where it waits for the debugger to + attach. + + - When logged in as root or with a super user session you can run:: + + echo g > /proc/sysrq-trigger + + - Example using minicom 2.2 + + Press: `CTRL-A` `f` `g` + + - When you have telneted to a terminal server that supports sending + a remote break + + Press: `CTRL-]` + + Type in: ``send break`` + + Press: `Enter` `g` + +3. Connect from gdb + + Example (using a directly connected port):: + + % gdb ./vmlinux + (gdb) set serial baud 115200 + (gdb) target remote /dev/ttyS0 + + + Example (kgdb to a terminal server on TCP port 2012):: + + % gdb ./vmlinux + (gdb) target remote 192.168.2.2:2012 + + + Once connected, you can debug a kernel the way you would debug an + application program. + + If you are having problems connecting or something is going seriously + wrong while debugging, it will most often be the case that you want + to enable gdb to be verbose about its target communications. You do + this prior to issuing the ``target remote`` command by typing in:: + + set debug remote 1 + +Remember if you continue in gdb, and need to "break in" again, you need +to issue an other `SysRq-G`. It is easy to create a simple entry point by +putting a breakpoint at ``sys_sync`` and then you can run ``sync`` from a +shell or script to break into the debugger. + +kgdb and kdb interoperability +============================= + +It is possible to transition between kdb and kgdb dynamically. The debug +core will remember which you used the last time and automatically start +in the same mode. + +Switching between kdb and kgdb +------------------------------ + +Switching from kgdb to kdb +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +There are two ways to switch from kgdb to kdb: you can use gdb to issue +a maintenance packet, or you can blindly type the command ``$3#33``. +Whenever the kernel debugger stops in kgdb mode it will print the +message ``KGDB or $3#33 for KDB``. It is important to note that you have +to type the sequence correctly in one pass. You cannot type a backspace +or delete because kgdb will interpret that as part of the debug stream. + +1. Change from kgdb to kdb by blindly typing:: + + $3#33 + +2. Change from kgdb to kdb with gdb:: + + maintenance packet 3 + + .. note:: + + Now you must kill gdb. Typically you press `CTRL-Z` and issue + the command:: + + kill -9 % + +Change from kdb to kgdb +~~~~~~~~~~~~~~~~~~~~~~~ + +There are two ways you can change from kdb to kgdb. You can manually +enter kgdb mode by issuing the kgdb command from the kdb shell prompt, +or you can connect gdb while the kdb shell prompt is active. The kdb +shell looks for the typical first commands that gdb would issue with the +gdb remote protocol and if it sees one of those commands it +automatically changes into kgdb mode. + +1. From kdb issue the command:: + + kgdb + +2. At the kdb prompt, disconnect the terminal program and connect gdb in + its place. + +Running kdb commands from gdb +----------------------------- + +It is possible to run a limited set of kdb commands from gdb, using the +gdb monitor command. You don't want to execute any of the run control or +breakpoint operations, because it can disrupt the state of the kernel +debugger. You should be using gdb for breakpoints and run control +operations if you have gdb connected. The more useful commands to run +are things like lsmod, dmesg, ps or possibly some of the memory +information commands. To see all the kdb commands you can run +``monitor help``. + +Example:: + + (gdb) monitor ps + 1 idle process (state I) and + 27 sleeping system daemon (state M) processes suppressed, + use 'ps A' to see all. + Task Addr Pid Parent [*] cpu State Thread Command + + 0xc78291d0 1 0 0 0 S 0xc7829404 init + 0xc7954150 942 1 0 0 S 0xc7954384 dropbear + 0xc78789c0 944 1 0 0 S 0xc7878bf4 sh + (gdb) + +kgdb Test Suite +=============== + +When kgdb is enabled in the kernel config you can also elect to enable +the config parameter ``KGDB_TESTS``. Turning this on will enable a special +kgdb I/O module which is designed to test the kgdb internal functions. + +The kgdb tests are mainly intended for developers to test the kgdb +internals as well as a tool for developing a new kgdb architecture +specific implementation. These tests are not really for end users of the +Linux kernel. The primary source of documentation would be to look in +the ``drivers/misc/kgdbts.c`` file. + +The kgdb test suite can also be configured at compile time to run the +core set of tests by setting the kernel config parameter +``KGDB_TESTS_ON_BOOT``. This particular option is aimed at automated +regression testing and does not require modifying the kernel boot config +arguments. If this is turned on, the kgdb test suite can be disabled by +specifying ``kgdbts=`` as a kernel boot argument. + +Kernel Debugger Internals +========================= + +Architecture Specifics +---------------------- + +The kernel debugger is organized into a number of components: + +1. The debug core + + The debug core is found in ``kernel/debugger/debug_core.c``. It + contains: + + - A generic OS exception handler which includes sync'ing the + processors into a stopped state on an multi-CPU system. + + - The API to talk to the kgdb I/O drivers + + - The API to make calls to the arch-specific kgdb implementation + + - The logic to perform safe memory reads and writes to memory while + using the debugger + + - A full implementation for software breakpoints unless overridden + by the arch + + - The API to invoke either the kdb or kgdb frontend to the debug + core. + + - The structures and callback API for atomic kernel mode setting. + + .. note:: kgdboc is where the kms callbacks are invoked. + +2. kgdb arch-specific implementation + + This implementation is generally found in ``arch/*/kernel/kgdb.c``. As + an example, ``arch/x86/kernel/kgdb.c`` contains the specifics to + implement HW breakpoint as well as the initialization to dynamically + register and unregister for the trap handlers on this architecture. + The arch-specific portion implements: + + - contains an arch-specific trap catcher which invokes + kgdb_handle_exception() to start kgdb about doing its work + + - translation to and from gdb specific packet format to struct pt_regs + + - Registration and unregistration of architecture specific trap + hooks + + - Any special exception handling and cleanup + + - NMI exception handling and cleanup + + - (optional) HW breakpoints + +3. gdbstub frontend (aka kgdb) + + The gdbstub is located in ``kernel/debug/gdbstub.c``. It contains: + + - All the logic to implement the gdb serial protocol + +4. kdb frontend + + The kdb debugger shell is broken down into a number of components. + The kdb core is located in kernel/debug/kdb. There are a number of + helper functions in some of the other kernel components to make it + possible for kdb to examine and report information about the kernel + without taking locks that could cause a kernel deadlock. The kdb core + implements the following functionality. + + - A simple shell + + - The kdb core command set + + - A registration API to register additional kdb shell commands. + + - A good example of a self-contained kdb module is the ``ftdump`` + command for dumping the ftrace buffer. See: + ``kernel/trace/trace_kdb.c`` + + - For an example of how to dynamically register a new kdb command + you can build the kdb_hello.ko kernel module from + ``samples/kdb/kdb_hello.c``. To build this example you can set + ``CONFIG_SAMPLES=y`` and ``CONFIG_SAMPLE_KDB=m`` in your kernel + config. Later run ``modprobe kdb_hello`` and the next time you + enter the kdb shell, you can run the ``hello`` command. + + - The implementation for kdb_printf() which emits messages directly + to I/O drivers, bypassing the kernel log. + + - SW / HW breakpoint management for the kdb shell + +5. kgdb I/O driver + + Each kgdb I/O driver has to provide an implementation for the + following: + + - configuration via built-in or module + + - dynamic configuration and kgdb hook registration calls + + - read and write character interface + + - A cleanup handler for unconfiguring from the kgdb core + + - (optional) Early debug methodology + + Any given kgdb I/O driver has to operate very closely with the + hardware and must do it in such a way that does not enable interrupts + or change other parts of the system context without completely + restoring them. The kgdb core will repeatedly "poll" a kgdb I/O + driver for characters when it needs input. The I/O driver is expected + to return immediately if there is no data available. Doing so allows + for the future possibility to touch watchdog hardware in such a way + as to have a target system not reset when these are enabled. + +If you are intent on adding kgdb architecture specific support for a new +architecture, the architecture should define ``HAVE_ARCH_KGDB`` in the +architecture specific Kconfig file. This will enable kgdb for the +architecture, and at that point you must create an architecture specific +kgdb implementation. + +There are a few flags which must be set on every architecture in their +``asm/kgdb.h`` file. These are: + +- ``NUMREGBYTES``: + The size in bytes of all of the registers, so that we + can ensure they will all fit into a packet. + +- ``BUFMAX``: + The size in bytes of the buffer GDB will read into. This must + be larger than NUMREGBYTES. + +- ``CACHE_FLUSH_IS_SAFE``: + Set to 1 if it is always safe to call + flush_cache_range or flush_icache_range. On some architectures, + these functions may not be safe to call on SMP since we keep other + CPUs in a holding pattern. + +There are also the following functions for the common backend, found in +``kernel/kgdb.c``, that must be supplied by the architecture-specific +backend unless marked as (optional), in which case a default function +maybe used if the architecture does not need to provide a specific +implementation. + +.. kernel-doc:: include/linux/kgdb.h + :internal: + +kgdboc internals +---------------- + +kgdboc and uarts +~~~~~~~~~~~~~~~~ + +The kgdboc driver is actually a very thin driver that relies on the +underlying low level to the hardware driver having "polling hooks" to +which the tty driver is attached. In the initial implementation of +kgdboc the serial_core was changed to expose a low level UART hook for +doing polled mode reading and writing of a single character while in an +atomic context. When kgdb makes an I/O request to the debugger, kgdboc +invokes a callback in the serial core which in turn uses the callback in +the UART driver. + +When using kgdboc with a UART, the UART driver must implement two +callbacks in the struct uart_ops. +Example from ``drivers/8250.c``:: + + + #ifdef CONFIG_CONSOLE_POLL + .poll_get_char = serial8250_get_poll_char, + .poll_put_char = serial8250_put_poll_char, + #endif + + +Any implementation specifics around creating a polling driver use the +``#ifdef CONFIG_CONSOLE_POLL``, as shown above. Keep in mind that +polling hooks have to be implemented in such a way that they can be +called from an atomic context and have to restore the state of the UART +chip on return such that the system can return to normal when the +debugger detaches. You need to be very careful with any kind of lock you +consider, because failing here is most likely going to mean pressing the +reset button. + +kgdboc and keyboards +~~~~~~~~~~~~~~~~~~~~~~~~ + +The kgdboc driver contains logic to configure communications with an +attached keyboard. The keyboard infrastructure is only compiled into the +kernel when ``CONFIG_KDB_KEYBOARD=y`` is set in the kernel configuration. + +The core polled keyboard driver for PS/2 type keyboards is in +``drivers/char/kdb_keyboard.c``. This driver is hooked into the debug core +when kgdboc populates the callback in the array called +:c:expr:`kdb_poll_funcs[]`. The kdb_get_kbd_char() is the top-level +function which polls hardware for single character input. + +kgdboc and kms +~~~~~~~~~~~~~~~~~~ + +The kgdboc driver contains logic to request the graphics display to +switch to a text context when you are using ``kgdboc=kms,kbd``, provided +that you have a video driver which has a frame buffer console and atomic +kernel mode setting support. + +Every time the kernel debugger is entered it calls +kgdboc_pre_exp_handler() which in turn calls con_debug_enter() +in the virtual console layer. On resuming kernel execution, the kernel +debugger calls kgdboc_post_exp_handler() which in turn calls +con_debug_leave(). + +Any video driver that wants to be compatible with the kernel debugger +and the atomic kms callbacks must implement the ``mode_set_base_atomic``, +``fb_debug_enter`` and ``fb_debug_leave operations``. For the +``fb_debug_enter`` and ``fb_debug_leave`` the option exists to use the +generic drm fb helper functions or implement something custom for the +hardware. The following example shows the initialization of the +.mode_set_base_atomic operation in +drivers/gpu/drm/i915/intel_display.c:: + + + static const struct drm_crtc_helper_funcs intel_helper_funcs = { + [...] + .mode_set_base_atomic = intel_pipe_set_base_atomic, + [...] + }; + + +Here is an example of how the i915 driver initializes the +fb_debug_enter and fb_debug_leave functions to use the generic drm +helpers in ``drivers/gpu/drm/i915/intel_fb.c``:: + + + static struct fb_ops intelfb_ops = { + [...] + .fb_debug_enter = drm_fb_helper_debug_enter, + .fb_debug_leave = drm_fb_helper_debug_leave, + [...] + }; + + +Credits +======= + +The following people have contributed to this document: + +1. Amit Kale <amitkale@linsyssoft.com> + +2. Tom Rini <trini@kernel.crashing.org> + +In March 2008 this document was completely rewritten by: + +- Jason Wessel <jason.wessel@windriver.com> + +In Jan 2010 this document was updated to include kdb. + +- Jason Wessel <jason.wessel@windriver.com> diff --git a/Documentation/process/debugging/media_specific_debugging_guide.rst b/Documentation/process/debugging/media_specific_debugging_guide.rst new file mode 100644 index 000000000000..c5a93bafaf67 --- /dev/null +++ b/Documentation/process/debugging/media_specific_debugging_guide.rst @@ -0,0 +1,180 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============================================ +Debugging and tracing in the media subsystem +============================================ + +This document serves as a starting point and lookup for debugging device +drivers in the media subsystem and to debug these drivers from userspace. + +.. contents:: + :depth: 3 + +General debugging advice +------------------------ + +For general advice see the :doc:`general advice document +</process/debugging/index>`. + +The following sections show you some of the available tools. + +dev_debug module parameter +-------------------------- + +Every video device provides a ``dev_debug`` parameter, which allows to get +further insights into the IOCTLs in the background.:: + + # cat /sys/class/video4linux/video3/name + rkvdec + # echo 0xff > /sys/class/video4linux/video3/dev_debug + # dmesg -wH + [...] videodev: v4l2_open: video3: open (0) + [ +0.000036] video3: VIDIOC_QUERYCAP: driver=rkvdec, card=rkvdec, + bus=platform:rkvdec, version=0x00060900, capabilities=0x84204000, + device_caps=0x04204000 + +For the full documentation see :ref:`driver-api/media/v4l2-dev:video device +debugging` + +dev_dbg() / v4l2_dbg() +---------------------- + +Two debug print statements, which are specific for devices and for the v4l2 +subsystem, avoid adding these to your final submission unless they have +long-term value for investigations. + +For a general overview please see the +:ref:`process/debugging/driver_development_debugging_guide:printk() & friends` +guide. + +- Difference between both? + + - v4l2_dbg() utilizes v4l2_printk() under the hood, which further uses + printk() directly, thus it cannot be targeted by dynamic debug + - dev_dbg() can be targeted by dynamic debug + - v4l2_dbg() has a more specific prefix format for the media subsystem, while + dev_dbg only highlights the driver name and the location of the log + +Dynamic debug +------------- + +A method to trim down the debug output to your needs. + +For general advice see the +:ref:`process/debugging/userspace_debugging_guide:dynamic debug` guide. + +Here is one example, that enables all available pr_debug()'s within the file:: + + $ alias ddcmd='echo $* > /proc/dynamic_debug/control' + $ ddcmd '-p; file v4l2-h264.c +p' + $ grep =p /proc/dynamic_debug/control + drivers/media/v4l2-core/v4l2-h264.c:372 [v4l2_h264]print_ref_list_b =p + "ref_pic_list_b%u (cur_poc %u%c) %s" + drivers/media/v4l2-core/v4l2-h264.c:333 [v4l2_h264]print_ref_list_p =p + "ref_pic_list_p (cur_poc %u%c) %s\n" + +Ftrace +------ + +An internal kernel tracer that can trace static predefined events, function +calls, etc. Very useful for debugging problems without changing the kernel and +understanding the behavior of subsystems. + +For general advice see the +:ref:`process/debugging/userspace_debugging_guide:ftrace` guide. + +DebugFS +------- + +This tool allows you to dump or modify internal values of your driver to files +in a custom filesystem. + +For general advice see the +:ref:`process/debugging/driver_development_debugging_guide:debugfs` guide. + +Perf & alternatives +------------------- + +Tools to measure the various stats on a running system to diagnose issues. + +For general advice see the +:ref:`process/debugging/userspace_debugging_guide:perf & alternatives` guide. + +Example for media devices: + +Gather statistics data for a decoding job: (This example is on a RK3399 SoC +with the rkvdec codec driver using the `fluster test suite +<https://github.com/fluendo/fluster>`__):: + + perf stat -d python3 fluster.py run -d GStreamer-H.264-V4L2SL-Gst1.0 -ts + JVT-AVC_V1 -tv AUD_MW_E -j1 + ... + Performance counter stats for 'python3 fluster.py run -d + GStreamer-H.264-V4L2SL-Gst1.0 -ts JVT-AVC_V1 -tv AUD_MW_E -j1 -v': + + 7794.23 msec task-clock:u # 0.697 CPUs utilized + 0 context-switches:u # 0.000 /sec + 0 cpu-migrations:u # 0.000 /sec + 11901 page-faults:u # 1.527 K/sec + 882671556 cycles:u # 0.113 GHz (95.79%) + 711708695 instructions:u # 0.81 insn per cycle (95.79%) + 10581935 branches:u # 1.358 M/sec (15.13%) + 6871144 branch-misses:u # 64.93% of all branches (95.79%) + 281716547 L1-dcache-loads:u # 36.144 M/sec (95.79%) + 9019581 L1-dcache-load-misses:u # 3.20% of all L1-dcache accesses (95.79%) + <not supported> LLC-loads:u + <not supported> LLC-load-misses:u + + 11.180830431 seconds time elapsed + + 1.502318000 seconds user + 6.377221000 seconds sys + +The availability of events and metrics depends on the system you are running. + +Error checking & panic analysis +------------------------------- + +Various Kernel configuration options to enhance error detection of the Linux +Kernel with the cost of lowering performance. + +For general advice see the +:ref:`process/debugging/driver_development_debugging_guide:kasan, ubsan, +lockdep and other error checkers` guide. + +Driver verification with v4l2-compliance +---------------------------------------- + +To verify, that a driver adheres to the v4l2 API, the tool v4l2-compliance is +used, which is part of the `v4l_utils +<https://git.linuxtv.org/v4l-utils.git>`__, a suite of userspace tools to work +with the media subsystem. + +To see the detailed media topology (and check it) use:: + + v4l2-compliance -M /dev/mediaX --verbose + +You can also run a full compliance check for all devices referenced in the +media topology with:: + + v4l2-compliance -m /dev/mediaX + +Debugging problems with receiving video +--------------------------------------- + +Implementing vidioc_log_status in the driver: this can log the current status +to the kernel log. It's called by v4l2-ctl --log-status. Very useful for +debugging problems with receiving video (TV/S-Video/HDMI/etc) since the video +signal is external (so unpredictable). Less useful with camera sensor inputs +since you have control over what the camera sensor does. + +Usually you can just assign the default:: + + .vidioc_log_status = v4l2_ctrl_log_status, + +But you can also create your own callback, to create a custom status log. + +You can find an example in the cobalt driver +(`drivers/media/pci/cobalt/cobalt-v4l2.c <https://elixir.bootlin.com/linux/v6.11.6/source/drivers/media/pci/cobalt/cobalt-v4l2.c#L567>`__). + +**Copyright** ©2024 : Collabora diff --git a/Documentation/process/debugging/userspace_debugging_guide.rst b/Documentation/process/debugging/userspace_debugging_guide.rst new file mode 100644 index 000000000000..db7396261e07 --- /dev/null +++ b/Documentation/process/debugging/userspace_debugging_guide.rst @@ -0,0 +1,280 @@ +.. SPDX-License-Identifier: GPL-2.0 + +========================== +Userspace debugging advice +========================== + +This document provides a brief overview of common tools to debug the Linux +Kernel from userspace. +For debugging advice aimed at driver developers go :doc:`here +</process/debugging/driver_development_debugging_guide>`. +For general debugging advice, see :doc:`general advice document +</process/debugging/index>`. + +.. contents:: + :depth: 3 + +The following sections show you the available tools. + +Dynamic debug +------------- + +Mechanism to filter what ends up in the kernel log by dis-/en-abling log +messages. + +Prerequisite: ``CONFIG_DYNAMIC_DEBUG`` + +Dynamic debug is only able to target: + +- pr_debug() +- dev_dbg() +- print_hex_dump_debug() +- print_hex_dump_bytes() + +Therefore the usability of this tool is, as of now, quite limited as there is +no uniform rule for adding debug prints to the codebase, resulting in a variety +of ways these prints are implemented. + +Also, note that most debug statements are implemented as a variation of +dprintk(), which have to be activated via a parameter in respective module, +dynamic debug is unable to do that step for you. + +Here is one example, that enables all available pr_debug()'s within the file:: + + $ alias ddcmd='echo $* > /proc/dynamic_debug/control' + $ ddcmd '-p; file v4l2-h264.c +p' + $ grep =p /proc/dynamic_debug/control + drivers/media/v4l2-core/v4l2-h264.c:372 [v4l2_h264]print_ref_list_b =p + "ref_pic_list_b%u (cur_poc %u%c) %s" + drivers/media/v4l2-core/v4l2-h264.c:333 [v4l2_h264]print_ref_list_p =p + "ref_pic_list_p (cur_poc %u%c) %s\n" + +**When should you use this over Ftrace ?** + +- When the code contains one of the valid print statements (see above) or when + you have added multiple pr_debug() statements during development +- When timing is not an issue, meaning if multiple pr_debug() statements in + the code won't cause delays +- When you care more about receiving specific log messages than tracing the + pattern of how a function is called + +For the full documentation see :doc:`/admin-guide/dynamic-debug-howto` + +Ftrace +------ + +Prerequisite: ``CONFIG_DYNAMIC_FTRACE`` + +This tool uses the tracefs file system for the control files and output files. +That file system will be mounted as a ``tracing`` directory, which can be found +in either ``/sys/kernel/`` or ``/sys/debug/kernel/``. + +Some of the most important operations for debugging are: + +- You can perform a function trace by adding a function name to the + ``set_ftrace_filter`` file (which accepts any function name found within the + ``available_filter_functions`` file) or you can specifically disable certain + functions by adding their names to the ``set_ftrace_notrace`` file (more info + at: :ref:`trace/ftrace:dynamic ftrace`). +- In order to find out where calls originate from you can activate the + ``func_stack_trace`` option under ``options/func_stack_trace``. +- Tracing the children of a function call and showing the return values are + possible by adding the desired function in the ``set_graph_function`` file + (requires config ``FUNCTION_GRAPH_RETVAL``); more info at + :ref:`trace/ftrace:dynamic ftrace with the function graph tracer`. + +For the full Ftrace documentation see :doc:`/trace/ftrace` + +Or you could also trace for specific events by :ref:`using event tracing +<trace/events:2. using event tracing>`, which can be defined as described here: +:ref:`Creating a custom Ftrace tracepoint +<process/debugging/driver_development_debugging_guide:ftrace>`. + +For the full Ftrace event tracing documentation see :doc:`/trace/events` + +.. _read_ftrace_log: + +Reading the ftrace log +~~~~~~~~~~~~~~~~~~~~~~ + +The ``trace`` file can be read just like any other file (``cat``, ``tail``, +``head``, ``vim``, etc.), the size of the file is limited by the +``buffer_size_kb`` (``echo 1000 > buffer_size_kb``). The +:ref:`trace/ftrace:trace_pipe` will behave similarly to the ``trace`` file, but +whenever you read from the file the content is consumed. + +Kernelshark +~~~~~~~~~~~ + +A GUI interface to visualize the traces as a graph and list view from the +output of the `trace-cmd +<https://git.kernel.org/pub/scm/utils/trace-cmd/trace-cmd.git/>`__ application. + +For the full documentation see `<https://kernelshark.org/Documentation.html>`__ + +Perf & alternatives +------------------- + +The tools mentioned above provide ways to inspect kernel code, results, +variable values, etc. Sometimes you have to find out first where to look and +for those cases, a box of performance tracking tools can help you to frame the +issue. + +Why should you do a performance analysis? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +A performance analysis is a good first step when among other reasons: + +- you cannot define the issue +- you do not know where it occurs +- the running system should not be interrupted or it is a remote system, where + you cannot install a new module/kernel + +How to do a simple analysis with linux tools? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +For the start of a performance analysis, you can start with the usual tools +like: + +- ``top`` / ``htop`` / ``atop`` (*get an overview of the system load, see + spikes on specific processes*) +- ``mpstat -P ALL`` (*look at the load distribution among CPUs*) +- ``iostat -x`` (*observe input and output devices utilization and performance*) +- ``vmstat`` (*overview of memory usage on the system*) +- ``pidstat`` (*similar to* ``vmstat`` *but per process, to dial it down to the + target*) +- ``strace -tp $PID`` (*once you know the process, you can figure out how it + communicates with the Kernel*) + +These should help to narrow down the areas to look at sufficiently. + +Diving deeper with perf +~~~~~~~~~~~~~~~~~~~~~~~ + +The **perf** tool provides a series of metrics and events to further dial down +on issues. + +Prerequisite: build or install perf on your system + +Gather statistics data for finding all files starting with ``gcc`` in ``/usr``:: + + # perf stat -d find /usr -name 'gcc*' | wc -l + + Performance counter stats for 'find /usr -name gcc*': + + 1277.81 msec task-clock # 0.997 CPUs utilized + 9 context-switches # 7.043 /sec + 1 cpu-migrations # 0.783 /sec + 704 page-faults # 550.943 /sec + 766548897 cycles # 0.600 GHz (97.15%) + 798285467 instructions # 1.04 insn per cycle (97.15%) + 57582731 branches # 45.064 M/sec (2.85%) + 3842573 branch-misses # 6.67% of all branches (97.15%) + 281616097 L1-dcache-loads # 220.390 M/sec (97.15%) + 4220975 L1-dcache-load-misses # 1.50% of all L1-dcache accesses (97.15%) + <not supported> LLC-loads + <not supported> LLC-load-misses + + 1.281746009 seconds time elapsed + + 0.508796000 seconds user + 0.773209000 seconds sys + + + 52 + +The availability of events and metrics depends on the system you are running. + +For the full documentation see +`<https://perf.wiki.kernel.org/index.php/Main_Page>`__ + +Perfetto +~~~~~~~~ + +A set of tools to measure and analyze how well applications and systems perform. +You can use it to: + +* identify bottlenecks +* optimize code +* make software run faster and more efficiently. + +**What is the difference between perfetto and perf?** + +* perf is tool as part of and specialized for the Linux Kernel and has CLI user + interface. +* perfetto cross-platform performance analysis stack, has extended + functionality into userspace and provides a WEB user interface. + +For the full documentation see `<https://perfetto.dev/docs/>`__ + +Kernel panic analysis tools +--------------------------- + + To capture the crash dump please use ``Kdump`` & ``Kexec``. Below you can find + some advice for analysing the data. + + For the full documentation see the :doc:`/admin-guide/kdump/kdump` + + In order to find the corresponding line in the code you can use `faddr2line + <https://elixir.bootlin.com/linux/v6.11.6/source/scripts/faddr2line>`__; note + that you need to enable ``CONFIG_DEBUG_INFO`` for that to work. + + An alternative to using ``faddr2line`` is the use of ``objdump`` (and its + derivatives for the different platforms like ``aarch64-linux-gnu-objdump``). + Take this line as an example: + + ``[ +0.000240] rkvdec_device_run+0x50/0x138 [rockchip_vdec]``. + + We can find the corresponding line of code by executing:: + + aarch64-linux-gnu-objdump -dS drivers/staging/media/rkvdec/rockchip-vdec.ko | grep rkvdec_device_run\>: -A 40 + 0000000000000ac8 <rkvdec_device_run>: + ac8: d503201f nop + acc: d503201f nop + { + ad0: d503233f paciasp + ad4: a9bd7bfd stp x29, x30, [sp, #-48]! + ad8: 910003fd mov x29, sp + adc: a90153f3 stp x19, x20, [sp, #16] + ae0: a9025bf5 stp x21, x22, [sp, #32] + const struct rkvdec_coded_fmt_desc *desc = ctx->coded_fmt_desc; + ae4: f9411814 ldr x20, [x0, #560] + struct rkvdec_dev *rkvdec = ctx->dev; + ae8: f9418015 ldr x21, [x0, #768] + if (WARN_ON(!desc)) + aec: b4000654 cbz x20, bb4 <rkvdec_device_run+0xec> + ret = pm_runtime_resume_and_get(rkvdec->dev); + af0: f943d2b6 ldr x22, [x21, #1952] + ret = __pm_runtime_resume(dev, RPM_GET_PUT); + af4: aa0003f3 mov x19, x0 + af8: 52800081 mov w1, #0x4 // #4 + afc: aa1603e0 mov x0, x22 + b00: 94000000 bl 0 <__pm_runtime_resume> + if (ret < 0) { + b04: 37f80340 tbnz w0, #31, b6c <rkvdec_device_run+0xa4> + dev_warn(rkvdec->dev, "Not good\n"); + b08: f943d2a0 ldr x0, [x21, #1952] + b0c: 90000001 adrp x1, 0 <rkvdec_try_ctrl-0x8> + b10: 91000021 add x1, x1, #0x0 + b14: 94000000 bl 0 <_dev_warn> + *bad = 1; + b18: d2800001 mov x1, #0x0 // #0 + ... + + Meaning, in this line from the crash dump:: + + [ +0.000240] rkvdec_device_run+0x50/0x138 [rockchip_vdec] + + I can take the ``0x50`` as offset, which I have to add to the base address + of the corresponding function, which I find in this line:: + + 0000000000000ac8 <rkvdec_device_run>: + + The result of ``0xac8 + 0x50 = 0xb18`` + And when I search for that address within the function I get the + following line:: + + *bad = 1; + b18: d2800001 mov x1, #0x0 + +**Copyright** ©2024 : Collabora |