Linux Kernel Programming
Benedikt Waldvogel
Why kernel-space?
- understand how Linux works ...
- or even change kernel behaviour (scheduling, networking)
- hardware access (interrupts, DMA)
- network/character/block devices
- root-kits ;-)
Prerequisites
- C know-how
- compiler collection
- virtual environment for testing (VMware, Xen)
- kernel headers and/or sources
- recommended: compile the kernel yourself (with debugging options)
What is a module
- piece of code that can be loaded into and unloaded from the running kernel
- make sure CONFIG_MODULES=y
- modules are installed in /lib/modules/
- to see which modules are loaded: # cat /proc/modules, or look in /sys/module/...
# modprobe <module name>
resolves dependencies (/lib/modules/2.6.xyz/modules.dep)
Example: # modprobe msdos becomes
    insmod /lib/modules/.../fs/fat/fat.ko
    insmod /lib/modules/.../fs/msdos/msdos.ko
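The dependency resolution above can be pictured with a small user-space sketch. It is not how modprobe is implemented: the table is a hard-coded, hypothetical stand-in for modules.dep, and the names dep_table, lookup_dep and resolve_order are made up for illustration.

```c
#include <string.h>

/* Hypothetical stand-in for modules.dep: module -> single dependency. */
struct dep_entry {
    const char *module;
    const char *depends_on;   /* NULL if the module has no dependency */
};

static const struct dep_entry dep_table[] = {
    { "fat",   NULL  },
    { "msdos", "fat" },
};

/* Look up the dependency of a module; NULL if none or unknown. */
static const char *lookup_dep(const char *module)
{
    size_t i;

    for (i = 0; i < sizeof(dep_table) / sizeof(dep_table[0]); i++)
        if (strcmp(dep_table[i].module, module) == 0)
            return dep_table[i].depends_on;
    return NULL;
}

/*
 * Fill 'order' with the modules to insmod, dependencies first.
 * Returns the number of entries written (at most 'max').
 */
static int resolve_order(const char *module, const char **order, int max)
{
    int n = 0;
    const char *dep = lookup_dep(module);

    if (dep)
        n = resolve_order(dep, order, max);
    if (n < max)
        order[n++] = module;
    return n;
}
```

Calling resolve_order("msdos", order, 8) fills order with "fat" followed by "msdos", which matches the insmod sequence shown above.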
Hello World
helloworld.c
#include <linux/module.h>

int init_module(void)
{
    printk("Hello world!\n");
    return 0; /* success */
}

void cleanup_module(void)
{
    printk("Goodbye world.\n");
}
The Makefile
obj-m += helloworld.o
all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
	-rm *.ko
	-rm *.o
Metadata
- modules are usually enhanced with metadata
- you should at least provide the license

    #include <linux/module.h>
    MODULE_AUTHOR("Benedikt Waldvogel");
    MODULE_LICENSE("GPL");
    MODULE_DESCRIPTION("unfug sample module");
    ...
if you forget the cleanup method, the module becomes permanent and cannot be unloaded
# lsmod
Module
helloworld
usb_storage
...
usbcore
Use counter
void foo(void)
{
    /* increase use counter */
    try_module_get(THIS_MODULE);
    ...
}

void bar(void)
{
    ...
    /* decrease use counter */
    module_put(THIS_MODULE);
}

/* unload succeeds if use counter is 0 */
void cleanup_module(void)
{
    ...
}
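The effect of the use counter on unloading can be modelled in plain user-space C. This is only a sketch: struct fake_module and the fake_* functions are hypothetical stand-ins, not the kernel's implementation of try_module_get() and module_put().

```c
#include <stdbool.h>

/* Hypothetical user-space stand-in for a module and its use counter. */
struct fake_module {
    int use_count;
    bool unloading;
};

/* Models try_module_get(): fails once unloading has started. */
static bool fake_try_module_get(struct fake_module *m)
{
    if (m->unloading)
        return false;
    m->use_count++;
    return true;
}

/* Models module_put(): drop one reference. */
static void fake_module_put(struct fake_module *m)
{
    if (m->use_count > 0)
        m->use_count--;
}

/* Models rmmod: only succeeds while nobody holds a reference. */
static bool fake_unload(struct fake_module *m)
{
    if (m->use_count != 0)
        return false;
    m->unloading = true;
    return true;
}
```

While a reference is held, fake_unload() refuses; after the matching fake_module_put() it succeeds, and later attempts to take a reference fail. This is also why the return value of try_module_get() is worth checking in real code.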
Coding style
"First off, I'd suggest printing out a copy of the GNU coding standards, and NOT read it. Burn them, it's a great symbolic gesture."
/usr/src/linux/Documentation/CodingStyle
Naming
- descriptive names for global functions/vars (count_active_users() instead of cntusr())
- short names for local variables
- don't use Hungarian notation (e.g. ui_varname for unsigned int)
Functions, Commenting
- functions: up to ca. 50 lines, maximum
- don't inline functions with more than 3 lines
- max. 5-10 local variables
- explain what your code does, not how it's done
- all comments at function-head
Preemption
kernel-space preemption possible since 2.6 (CONFIG_PREEMPT)
Dynamic memory
- kmalloc/kfree: allocate up to 128k of physically contiguous memory
- kcalloc(...): same as kmalloc, but memory is zeroed
- vmalloc/vfree: allocate memory in virtual address space (more than 128k possible, but no DMA!)
flags declared in <linux/mm.h>
- GFP_USER: associated userspace process sleeps until free memory is available
- GFP_KERNEL: associated kernel function sleeps until free memory is available
- GFP_ATOMIC: doesn't sleep (used in ISRs)
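The "same as kmalloc, but zeroed" behaviour of kcalloc() can be sketched in user space. Here malloc() stands in for kmalloc(), there is no GFP flag argument, and model_kcalloc is a hypothetical name; the overflow check mirrors what kcalloc() does before multiplying the two sizes.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* User-space model of kcalloc(): malloc() stands in for kmalloc(). */
static void *model_kcalloc(size_t n, size_t size)
{
    void *p;

    /* refuse requests whose total size would overflow size_t */
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;

    p = malloc(n * size);
    if (p)
        memset(p, 0, n * size);   /* kcalloc() hands back zeroed memory */
    return p;
}
```

As with the kernel functions, the caller has to check the result for NULL before using it.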
"The term jiffy (or jiffie) is used in different applications for various different short periods of time. In general parlance, the term means any unspecified short period of time, or a moment, and is often used in the sense of the time taken to complete a task. The origin of the word is unknown, but it is believed to have first appeared in 1779."
http://en.wikipedia.org/wiki/Jiffy
# cat /proc/jiffies && sleep 1s && cat /proc/jiffies
jiffies: 4025402
jiffies: 4025654
difference: 252
# cat /proc/jiffies && sleep 1s && cat /proc/jiffies
jiffies: 4034105
jiffies: 4034358
difference: 253
no coincidence! HZ + some overhead ...
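The arithmetic behind these numbers can be sketched in user-space C. The value HZ = 250 is an assumption inferred from the measured differences of about 252 ticks per second; MODEL_HZ, ticks_to_ms and tick_after are hypothetical names. The second helper follows the idea of the kernel's time_after() macro, which compares tick values safely across a counter wrap-around.

```c
/* Assumption: HZ = 250, inferred from the ~252 ticks per second measured above. */
#define MODEL_HZ 250UL

/* Convert a jiffies interval to milliseconds. */
static unsigned long ticks_to_ms(unsigned long start, unsigned long end)
{
    /* unsigned subtraction stays correct across a counter wrap-around */
    return (end - start) * 1000UL / MODEL_HZ;
}

/* True if tick value a lies after b, even if the counter wrapped in between. */
static int tick_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}
```

With the first measurement, ticks_to_ms(4025402, 4025654) gives 1008 ms: one second of sleep plus roughly 8 ms of overhead.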
current
current is a macro which returns a struct task_struct *; it points to the current process.
Example:
    printk("current process pid: %d\n", current->pid);
/usr/include/linux/sched.h

    struct task_struct {
        /* -1 unrunnable, 0 runnable, >0 stopped */
        volatile long state;
        struct thread_info *thread_info;
        int prio, static_prio;
        unsigned long long timestamp, last_ran;
        unsigned long long sched_time; /* time spent running */
        struct task_struct *parent; /* parent process */
        struct list_head children;
        struct list_head sibling;
        ...
    };
Benedikt Waldvogel Linux kernel-space programming
    #include <linux/fs.h>

    struct file_operations fops = {
        .read = device_read,
        .write = device_write,
        .open = device_open,
        .release = device_release
    };

    register_chrdev(240, "mydev", &fops);

# mknod /dev/mydev c 240 0
# cat /proc/devices | grep mydev
240 mydev
Major / Minor
- Major: associates a device with a certain kernel module/driver
- Minor: lets a driver handle multiple devices
Example:
# ls /dev/my*
crw-r--r-- 1 root root 240, 0 26. Nov 11:45 /dev/mydev
crw-r--r-- 1 root root 240, 1 26. Nov 11:54 /dev/mydev2
- mydev and mydev2 are both handled by the driver which registered major 240
- use major numbers 240-254 for your own (local) devices
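How a major/minor pair is packed into one device number can be sketched as follows. The 20-bit minor field is an assumption taken from the in-kernel dev_t layout of 2.6 kernels; the model_* names are hypothetical and only mimic the kernel's MKDEV(), MAJOR() and MINOR() macros.

```c
/* Assumption: 20 minor bits, as in the 2.6 in-kernel dev_t layout. */
#define MODEL_MINORBITS 20
#define MODEL_MINORMASK ((1U << MODEL_MINORBITS) - 1)

typedef unsigned int model_dev_t;

/* Pack major and minor into one device number. */
static model_dev_t model_mkdev(unsigned int major, unsigned int minor)
{
    return (major << MODEL_MINORBITS) | (minor & MODEL_MINORMASK);
}

/* Extract the major number (selects the driver). */
static unsigned int model_major(model_dev_t dev)
{
    return dev >> MODEL_MINORBITS;
}

/* Extract the minor number (selects the device within the driver). */
static unsigned int model_minor(model_dev_t dev)
{
    return dev & MODEL_MINORMASK;
}
```

For /dev/mydev (240, 0) and /dev/mydev2 (240, 1) the major part is identical, so both reach the same driver, which can then tell them apart by the minor part.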
Network devices
- register_netdev(...)
- unregister_netdev(...)
- handle interrupts similar to character devices
- see /usr/src/linux/drivers/net/isa-skeleton.c, which is a network driver outline
Procfs
designed to export process information to userland
create /proc/example:
    entry = create_proc_entry("example", 0644, 0);
assign read/write handlers:
    entry->read_proc = procfile_read;
    entry->write_proc = procfile_write;

    int procfile_read(char *buffer, char **buffer_location, off_t offset,
                      int buffer_length, int *eof, void *data) { ... }

    int procfile_write(struct file *file, const char *buffer,
                       unsigned long count, void *data) { ... }
Simple example
int procfile_read(char *buf, char **loc, off_t off,
                  int buflen, int *eof, void *data)
{
    return scnprintf(buf, buflen - 1, "Hello world\n");
}

int init_module(void)
{
    struct proc_dir_entry *bl_proc =
        create_proc_entry("example", 0444, NULL);

    bl_proc->read_proc = procfile_read;
    bl_proc->write_proc = NULL; /* no writing possible */
    bl_proc->owner = THIS_MODULE;
    bl_proc->size = 40;

    return 0;
}
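The read handler returns whatever scnprintf() reports, so it helps to see what that value is. The following user-space sketch models scnprintf() on top of vsnprintf(); model_scnprintf and model_procfile_read are hypothetical names, and the handler is reduced to the two parameters it actually uses.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * User-space model of scnprintf(): returns the number of characters
 * actually stored (excluding the trailing NUL), never more than size - 1.
 */
static int model_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int n;

    if (size == 0)
        return 0;

    va_start(args, fmt);
    n = vsnprintf(buf, size, fmt, args);
    va_end(args);

    if (n < 0)
        return 0;
    return (size_t)n >= size ? (int)(size - 1) : n;
}

/* Same idea as the read handler above, minus the unused parameters. */
static int model_procfile_read(char *buf, int buflen)
{
    return model_scnprintf(buf, (size_t)buflen, "Hello world\n");
}
```

Unlike snprintf(), which returns the length the output would have had, this never reports more bytes than fit in the buffer, so the caller can hand the return value straight back as the number of bytes read.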
Sysfs
- sysfs represents the kernel device model
- single key/value pairs instead of complex structures
- kobject: object management for your C code (e.g. reference counting)
- exports itself to /sys
- an object is represented by a folder
- its attributes as files in that folder
see /usr/src/linux/Documentation/kobject.txt
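The reference counting that a kobject provides can be illustrated with a small user-space model. struct model_obj and its functions are hypothetical; in the kernel the corresponding calls are kobject_get() and kobject_put(), and the release step is a callback supplied by the object's owner.

```c
/* Hypothetical user-space model of kobject-style reference counting. */
struct model_obj {
    int refcount;
    int released;   /* set once the release step has run */
};

/* The creator holds the first reference. */
static void model_obj_init(struct model_obj *o)
{
    o->refcount = 1;
    o->released = 0;
}

/* Take an additional reference. */
static void model_obj_get(struct model_obj *o)
{
    o->refcount++;
}

/* Drop a reference; the object is released when the last one goes away. */
static void model_obj_put(struct model_obj *o)
{
    if (--o->refcount == 0)
        o->released = 1;    /* a real release() would free the object here */
}
```

The point of the pattern is that nobody frees the object directly: every user pairs a get with a put, and the object goes away only after the last put.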
Module parameters
either specify the parameter while insmodding:
    # insmod helloworld.ko parm1=23
or use sysfs:
    # cat /sys/module/helloworld/parameters/parm1
permission 0644 allows writing for root
Hardware access
    #include <linux/ioport.h>
    request_region(0x378, 3, "unfug");
check:
    # cat /proc/ioports | grep unfug
    0378-037a : unfug
don't forget to release the region:
    release_region(0x378, 3);
read/write data
... then we can read/write data to hardware registers
    char *b = 0x378; *b = 0x23;
not possible on x86; use the special I/O functions instead: inb(), outb(), outw(), outl() etc.
use outb(0x23, 0x378); to write the byte 0x23 (value first, then port)
    char i = inb(0x378);
    irqreturn_t isr(int irq, void *dev_id, struct pt_regs *regs)
    {
        printk("interrupt on irq %d\n", irq);
        return IRQ_HANDLED;
    }

    request_irq(7, isr, SA_INTERRUPT, "unfug", id);

check /proc/interrupts:
    0:     138133   timer
    7:          0   unfug
don't forget to free the irq:
    free_irq(7, id);
note: no cpu regs parameter (pt_regs) since 2.6.19
request_irq() takes flags to set ISR behaviour
- SA_SHIRQ: interrupt is shared between drivers
- SA_INTERRUPT: deactivate other interrupts
- SA_SAMPLE_RANDOM: use the interrupt to feed the entropy pool
example:
    request_irq(7, my_handler, SA_SHIRQ | SA_INTERRUPT | SA_SAMPLE_RANDOM, "unfug", devid);
if SA_INTERRUPT is set
Solution:
- the ISR is divided into upper halves and bottom halves
- the upper half is minimal and initiates a bottom half
- bottom halves (BH) do the complex part
- BHs in 2.4+ replaced by: timers, tasklets, threads
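The split into an upper and a bottom half can be modelled in user space. This is only a sketch of the idea: the queue, its length and all model_* names are hypothetical, and a real driver would defer the work through a timer, tasklet or thread rather than a plain array.

```c
/* User-space model: the upper half only records work, the bottom half does it. */
#define MODEL_QUEUE_LEN 16

static int model_pending[MODEL_QUEUE_LEN];
static int model_pending_count;
static long model_processed_sum;

/* Upper half: must be quick, so it only stores the raw value and returns. */
static int model_upper_half(int raw_value)
{
    if (model_pending_count == MODEL_QUEUE_LEN)
        return -1;                      /* queue full, event dropped */
    model_pending[model_pending_count++] = raw_value;
    return 0;
}

/* Bottom half: runs later and does the expensive processing. */
static int model_bottom_half(void)
{
    int i, handled = model_pending_count;

    for (i = 0; i < model_pending_count; i++)
        model_processed_sum += (long)model_pending[i] * 2;  /* placeholder work */
    model_pending_count = 0;
    return handled;
}
```

Two calls to model_upper_half() return immediately and leave model_processed_sum untouched; the work only happens when model_bottom_half() runs, which in the kernel would be at a point where interrupts are enabled again.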
Timers
delay the execution of a certain function (once)
    #include <linux/timer.h>

    static struct timer_list simple_timer;
    init_timer(&simple_timer);

    void timed_function(unsigned long data) { ... }
    simple_timer.function = timed_function;

    /* delay the execution by 5 seconds */
    simple_timer.expires = jiffies + 5*HZ;
    add_timer(&simple_timer);

    /* if module is unloaded before timer expired, delete it */
    del_timer(&simple_timer);
Tasklets
queued execution - the tasklet is executed as soon as possible
    #include <linux/interrupt.h>

    void do_something(unsigned long data) { ... }
    DECLARE_TASKLET(my_tasklet, do_something, data);

    irqreturn_t interrupt_routine(...)
    {
        /* do something fast */
        ...
        /* queue the more complex part */
        tasklet_schedule(&my_tasklet);
        return IRQ_HANDLED;
    }
Threads
    #include <linux/sched.h>

    static int thread_function(void *data)
    {
        daemonize("unfug_thread");
        allow_signal(SIGTERM);
        /* do something or sleep until SIGTERM */
        ...
        return 0; /* exit thread */
    }

    thread_id = kernel_thread(thread_function, 0, CLONE_KERNEL);
Output to syslog
    #include <linux/kernel.h>
    printk(KERN_INFO "info text\n");
instead of KERN_INFO also possible (ordered by increasing priority): KERN_NOTICE, KERN_WARNING, KERN_ERR, KERN_CRIT, KERN_ALERT
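There is no comma between KERN_INFO and the format string because the level is just a short string prefix that the compiler concatenates with the message. The sketch below models this in user space; the "<n>" values are the ones used by 2.6 kernels (an assumption for any other version), and the MODEL_* macros and model_loglevel are hypothetical names.

```c
/* Assumption: the "<n>" prefix strings used by 2.6 kernels. */
#define MODEL_KERN_ALERT   "<1>"
#define MODEL_KERN_CRIT    "<2>"
#define MODEL_KERN_ERR     "<3>"
#define MODEL_KERN_WARNING "<4>"
#define MODEL_KERN_NOTICE  "<5>"
#define MODEL_KERN_INFO    "<6>"

/* Return the numeric level of a message, or -1 if it carries no prefix. */
static int model_loglevel(const char *msg)
{
    if (msg[0] == '<' && msg[1] >= '0' && msg[1] <= '7' && msg[2] == '>')
        return msg[1] - '0';
    return -1;
}
```

A lower number means a higher priority, so a message tagged MODEL_KERN_ALERT (1) outranks one tagged MODEL_KERN_INFO (6).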
Blinkenlights example
# insmod blinkenlights.ko speed=25
- /proc/blinkenlights
- /sys/module/blinkenlights/parameters/speed
- /proc/ioports
# insmod interrupt.ko
- connect pin 10 with 22 and see /var/log/messages
    #define likely(x)   __builtin_expect(!!(x), 1)
    #define unlikely(x) __builtin_expect(!!(x), 0)
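The two __builtin_expect() expressions above are the bodies of the kernel's likely() and unlikely() macros, which tell the compiler which branch to optimise for. A minimal user-space sketch, assuming GCC or Clang; the model_* names are hypothetical so that they do not clash with any existing definitions.

```c
#include <stddef.h>

/* Requires GCC or Clang: __builtin_expect is a compiler builtin. */
#define model_likely(x)   __builtin_expect(!!(x), 1)
#define model_unlikely(x) __builtin_expect(!!(x), 0)

/* Sum an array; the NULL check is marked as the rare path. */
static long model_sum(const int *values, size_t count)
{
    long total = 0;
    size_t i;

    if (model_unlikely(values == NULL))
        return -1;              /* error path, expected to be rare */

    for (i = 0; i < count; i++)
        total += values[i];
    return total;
}
```

The hint changes only code layout and branch prediction, never the result: the !!(x) normalises any non-zero value to 1 so it can be compared with the expected value.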
export.c
void foo(void)
{
    pr_info("foo called\n");
}
EXPORT_SYMBOL(foo);
import.c
int init_module(void)
{
    foo();
    ...
}
see /proc/kallsyms for kernel symbols
# insmod export.ko
now you can insmod import.ko, which depends on export
# lsmod
Module    Used by
import    0
export    1 import
#include <sys/types.h>
#include <sys/module.h>
#include <sys/systm.h>  /* uprintf */
#include <sys/errno.h>
#include <sys/param.h>  /* defines used in kernel.h */
#include <sys/kernel.h> /* types used in module initialization */

static int skel_loader(struct module *m, int what, void *arg)
{
    switch (what) {
    case MOD_LOAD: /* kldload */
        printf("Skeleton KLD loaded.\n");
        break;
    case MOD_UNLOAD:
        printf("Skeleton KLD unloaded.\n");
        break;
    default:
        return EINVAL;
    }
    return 0;
}

/* Declare this module to the rest of the kernel */
static moduledata_t skel_mod = {
    "skel",
    skel_loader,
    NULL
};

DECLARE_MODULE(skeleton, skel_mod, SI_SUB_KLD, SI_ORDER_ANY);
see http://www.captain.at/programming/freebsd/
Links
- http://lxr.linux.no/ Cross-Referencing Linux
- http://fxr.watson.org/ FreeBSD and Linux Kernel Cross-Reference
- http://www.kernelnewbies.org
- http://www.lwn.net
- /usr/src/linux/Documentation/
- Memory Management in Linux: http://www.cse.psu.edu/anand/spring01/linux/memory.ppt
Sources
- Linux Treiber entwickeln: http://ezs.kr.hsnr.de/TreiberBuch/html/
- Linux-Geraetetreiber (kernel 2.4): http://www.oreilly.de/german/freebooks/linuxdrive2ger/book1.html
- http://www.linux-magazin.de/Artikel/ausgabe/2004/05/094_kerntechnik10/kerntechnik10.html
- The Linux Kernel Module Programming Guide: http://tldp.org/LDP/lkmpg/2.6/html/index.htm