The document discusses various techniques for debugging Linux kernel modules and device drivers, including:
1) Using printk statements to output debug messages from kernel space.
2) Watching system calls with strace to debug interactions between user and kernel space.
3) Adding /proc file system entries and write functions to dynamically modify driver values at runtime.
4) Enabling source-level debugging with tools like kgdb to debug at the level of C source code.
Troubleshooting Linux Kernel Modules And Device Drivers
1. Troubleshooting Linux Kernel Modules and Device Drivers
Mike Anderson, Chief Scientist, The PTR Group, Inc.
[email_address]
Source: www.crobike.de
2. What We’ll Talk About
- How do errors show up in the kernel?
- Watching kernel/user-space interaction via strace
- Debugging with printk
- Using the /proc file system
- Using the kgdb debugger
- Debugging with hardware, a la LEDs or a JTAG unit
3. Challenges of Kernel Debugging
- Several features of the kernel make it especially difficult to debug:
  - Optimizing compilers can rearrange code, so the instruction pointer seems to jump around
  - The use of the MMU can obfuscate addresses (physical vs. virtual)
- Startup code is particularly difficult to debug because of its closeness to the “metal”
- There is no equivalent of the user-space gdbserver for drivers
- Early kernel debugging may require hardware assistance
4. Device Drivers/Kernel Modules
- Assuming your kernel is otherwise working, most of the problems you’ll encounter are related to device drivers
- Drivers can be either statically linked into the kernel or dynamically loaded
- A dynamically loaded driver takes the form of a kernel module
  - Can be loaded and unloaded at kernel run time
  - Loading is frequently handled by daemons such as udev
6. When Things Go Wrong…
- Problems in device drivers typically manifest themselves in one of three ways:
  - Kernel panic: fatal to the system
  - Kernel oops: near fatal to the system
  - Hardware just doesn’t work correctly: could be fatal to you!
Source: picasaweb.google.com
7. Kernel Panic
- When the Linux kernel determines that a fatal error has occurred and no recovery is possible, it “panics”
  - Frequently an exception in an interrupt context
- Panic outputs a message to the console
  - The output will help you find the source of the bug
- Typically results in a system reboot on an embedded Linux target
  - Or blinking keyboard LEDs on some desktop versions of Linux
Source: regmedia.co.uk
9. Kernel Oops
- An oops message is displayed when a recoverable error has occurred in kernel space:
  - Access to a bad address, e.g., through a NULL pointer
  - Illegal or invalid instruction
  - Etc.
- The calling user process is killed
- The system should be considered unstable at this point
- The oops message displays:
  - The state of the processor at the time of the fault, including registers and the address of the faulting instruction
  - A function call stack traceback
- Addresses are replaced with symbols if the kallsyms kernel configuration option is selected at kernel compile time
13. Module Debugging Techniques
- Examine the interaction with the kernel via strace
- The next line of defense is printk
  - There may be additional output you’re not seeing
- Next, try adding /proc file system entries
  - Instrument the driver for debugging
- Enable source debugging via kgdb
- Use hardware debuggers and “blinky lights”
14. Using strace to Watch System Calls
- When debugging what appears to be a kernel-space error, it can be helpful to watch the system calls made from user space
  - See what events lead up to the error
- strace displays all system calls made by a program
  - Can display per-call timestamp information as well
15. Using strace to Watch System Calls #2
- strace displays each system call’s arguments and return values
  - String arguments are printed, which is very helpful!
  - errno values are displayed symbolically
- The program being traced runs normally, not under the control of a debugger
  - No need to specially compile the user application
- You can attach to a running program
  - And trace forked applications as well…
17. Debugging with printk
- printk debugging is the debug method preferred by Linus
  - At least, according to his email traffic…
- Insert messages to be displayed at points of interest in kernel-space code
  - E.g., printk(KERN_INFO "my_x = %d\n", my_x);
- printk works like printf does in user space, except that printk can only print integers, strings, and addresses (no floating point)
- printk can also be called from within ISRs
- Messages can have “importance” settings that allow filtering
  - Importance is set by prepending a 3-character string “<n>” to the output message
18. Debugging with printk #2
- Messages are placed in a circular buffer that can be retrieved post mortem if needed
- The “importance” strings prepended to printk messages are defined in include/linux/kernel.h:
  #define KERN_EMERG   "<0>"  /* system is unusable */
  #define KERN_ALERT   "<1>"  /* action to be taken immediately */
  #define KERN_CRIT    "<2>"  /* critical conditions */
  #define KERN_ERR     "<3>"  /* error conditions */
  #define KERN_WARNING "<4>"  /* warning conditions */
  #define KERN_NOTICE  "<5>"  /* normal but significant condition */
  #define KERN_INFO    "<6>"  /* informational */
  #define KERN_DEBUG   "<7>"  /* debug-level messages */
19. Debugging with printk #3
- Some control over printk output is available through /proc/sys/kernel/printk
- Consider the following output:
  # cat /proc/sys/kernel/printk
  7       4       1       7
- This indicates:
  - The console_loglevel is 7, so messages with importance 0..6 currently go to the console
  - The default message log level is 4, so messages that do not specify an importance are treated as level 4
  - The minimum console log level is 1, so console_loglevel cannot be set to any value less than 1
  - The default console log level is 7, so console_loglevel starts out set to 7
20. Debugging with printk #4
- You can control console_loglevel by writing to /proc/sys/kernel/printk
  - To enable all printk messages with importance levels 0..7:
    # echo 8 > /proc/sys/kernel/printk
- If the kernel command line contains the word “debug”, console_loglevel starts with a value of 10
21. Using the /proc File System
- Use /proc entries for driver instrumentation
- You can register “write” functions that let you dynamically modify values in the kernel (or in drivers and modules)
  - Write values to the /proc file system entry
- Readable /proc entries let you retrieve information from a running kernel entity
  - The information is provided “live”
- Look at the driver source; there may already be /proc entries that can help you
22. Techniques for Source Debugging
- The two primary ways to get source-level debugging in the Linux kernel are kgdb and a hardware JTAG probe
- Unfortunately, kgdb is not a standard feature of the kernel as of 2.6.25.8
  - You’ll have to patch your kernel to enable it
- Either technique requires a kernel image compiled with debugging symbols
  - Unless you like debugging in assembly language
23. Compiling the Kernel with Debug Info
- Debug info increases the size of the debug kernel image by about 30%
- However, you don’t need to load the debug version of the kernel onto the target
  - Load the non-debug version on the target, but use the debug version with the debugger/JTAG probe
- Save the vmlinux and System.map files; they are used by the debugger, or by you, to find key addresses
- The (b)zImage can be loaded on the target as normal
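As a sketch, these are the kernel configuration options typically involved (names from the 2.6-era Kconfig; verify the exact set against your kernel version):

```
CONFIG_DEBUG_KERNEL=y   # enable the "Kernel hacking" debug options
CONFIG_DEBUG_INFO=y     # compile vmlinux with -g debugging symbols
CONFIG_KALLSYMS=y       # symbolic names in oops output
CONFIG_FRAME_POINTER=y  # more reliable stack tracebacks
```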
25. Kernel gdb (kgdb)
- If you are using a stock kernel, kgdb is not included
  - Linus doesn’t believe in a source debugger in the kernel
  - Many commercial Linux vendors do include it in their distributions, though
- kgdb can be downloaded from http://kgdb.linsyssoft.com/downloads.htm, or from http://sourceforge.net (search for kgdb)
- You’ll need to patch the kernel
- A new “kgdb light” is in the works for 2.6.26
  - kgdb over the system console
27. kgdb Light in 2.6.26-rc8
- Uses the system console for I/O
28. kgdb Lash-up
- kgdb supports debugging via the serial port
- The gdb debugger runs on a second machine, using the vmlinux you compiled with debugging symbols
- You attach to the system being debugged using gdb’s “target remote” command
(Diagram: host and target connected over Ethernet and RS-232)
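A typical host-side session might look like the sketch below. The serial device and baud rate are placeholders for your setup; note that gdb releases of this era spell the baud setting `set remotebaud`, while newer gdb uses `set serial baud`:

```
$ gdb ./vmlinux                 # the debug-info kernel image, on the host
(gdb) set remotebaud 115200     # match the target's serial settings
(gdb) target remote /dev/ttyS0  # attach to the target over the serial link
(gdb) break panic               # e.g., stop if the kernel panics
(gdb) continue
```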
29. Hardware-assisted Debugging
- A number of devices can help with debugging:
  - LEDs, JTAG probes, logic analyzers, oscilloscopes, bus analyzers, and more
- These range from a few cents to several tens of thousands of dollars to implement
- You typically get what you pay for
30. Debugging with LEDs
- Very simple: blink on/off in various code sections under debug
- Blink in sequences
  - Can display multi-bit codes if multiple LEDs are available
- Very fast, with little impact on run-time performance
  - Adding LED debug code will likely not “make the problem go away”
- LEDs may be the only option for debugging early x86 code
31. Debugging with LEDs, Caveats
- The LED(s) must be free for use
  - Not tied in hardware to a network PHY or to displaying power status, for example
- LEDs are not very verbose
  - You must decipher what the blinking means
  - It can be difficult to determine where you are in the code
- For more information, you can also attach an oscilloscope to the GPIO pins found on many processors
32. Hardware Debuggers
- In the past, in-circuit emulators (ICEs) were the debugger of choice
  - You pulled the CPU, plugged the ICE in, and plugged the CPU into the ICE
  - But these were $80K+ each
- Logic analyzers are also good to have
  - But they are $35K+ for an empty mainframe; PC-based versions can be had for under $1K
- IEEE 1149.1 (JTAG) has become the debugger du jour
  - JTAG uses a boundary-scan protocol
  - Units range from $70 to $20K depending on model and features
- At a minimum, a JTAG unit is a “must-have” for firmware and board bring-up
33. Debugging with a JTAG Probe
- Debugging with a JTAG unit is much less involved than using kgdb
  - Compile the kernel with debugging enabled; no need to patch the kernel for kgdb
  - Assumes your platform supports a JTAG interface
- You connect to the JTAG unit using whatever technique your JTAG probe requires
  - The JTAG GUI is vendor-dependent
- For JTAG units that are gdb-aware, use the appropriate “target remote” commands
34. Example JTAG Usage
- Connect the JTAG probe to the target and the host
- Start the host application that controls the JTAG probe
- Reset the target and load the register configuration settings into the JTAG unit
- Load code and enjoy!
- Useful for debugging drivers as well as for bringing up new firmware and BSPs
35. Summary
- “Real developers” use printk, or at least Linus does
- Tools like strace let you see the flow of execution
- The /proc file system gives you a window into the kernel and drivers
- kgdb uses familiar gdb technology, but at the kernel level
- LEDs are a fast and easy way to get information out of the machine
- Hardware JTAG debug tools may be available
  - These can be invaluable if you can get one