forked from brendangregg/FlameGraph
rousya/FlameGraph
Flame Graphs visualize hot-CPU code-paths.

Using DTrace, see: http://dtrace.org/blogs/brendan/2011/12/16/flame-graphs/
Using perf_events or SystemTap, see: http://dtrace.org/blogs/brendan/2012/03/17/linux-kernel-performance-flame-graphs/
Using Xcode Instruments, see: http://schani.wordpress.com/2012/11/16/flame-graphs-for-instruments/

These can be created in three steps:

1. Capture stacks
2. Fold stacks
3. flamegraph.pl


1. Capture stacks
=================
Stack samples can be captured using DTrace, perf_events or SystemTap.

Using DTrace to capture 60 seconds of kernel stacks at 997 Hertz:

# dtrace -x stackframes=100 -n 'profile-997 /arg0/ { @[stack()] = count(); } tick-60s { exit(0); }' -o out.kern_stacks

Using DTrace to capture 60 seconds of user-level stacks for PID 12345 at 97 Hertz:

# dtrace -x ustackframes=100 -n 'profile-97 /pid == 12345 && arg1/ { @[ustack()] = count(); } tick-60s { exit(0); }' -o out.user_stacks

Using DTrace to capture 60 seconds of user-level stacks, including while time is spent in the kernel, for PID 12345 at 97 Hertz:

# dtrace -x ustackframes=100 -n 'profile-97 /pid == 12345/ { @[ustack()] = count(); } tick-60s { exit(0); }' -o out.user_stacks

Switch ustack() for jstack() if the application has a ustack helper, to include translated frames (e.g., Node.js frames; see: http://dtrace.org/blogs/dap/2012/01/05/where-does-your-node-program-spend-its-time/). The sampling rate for user-level stacks is deliberately lower than for kernel stacks (97 versus 997 Hertz). This is especially important when using jstack(), as it performs additional work to translate frames.


2. Fold stacks
==============
Use the stackcollapse programs to fold stack samples into single lines.
The programs provided are:

- stackcollapse.pl: for DTrace stacks
- stackcollapse-perf.pl: for perf_events "perf script" output
- stackcollapse-stap.pl: for SystemTap stacks
- stackcollapse-instruments.pl: for Xcode Instruments

Usage example:

$ ./stackcollapse.pl out.kern_stacks > out.kern_folded

The output looks like this:

unix`_sys_sysenter_post_swapgs 1401
unix`_sys_sysenter_post_swapgs;genunix`close 5
unix`_sys_sysenter_post_swapgs;genunix`close;genunix`closeandsetf 85
unix`_sys_sysenter_post_swapgs;genunix`close;genunix`closeandsetf;c2audit`audit_closef 26
unix`_sys_sysenter_post_swapgs;genunix`close;genunix`closeandsetf;c2audit`audit_setf 5
unix`_sys_sysenter_post_swapgs;genunix`close;genunix`closeandsetf;genunix`audit_getstate 6
unix`_sys_sysenter_post_swapgs;genunix`close;genunix`closeandsetf;genunix`audit_unfalloc 2
unix`_sys_sysenter_post_swapgs;genunix`close;genunix`closeandsetf;genunix`closef 48
[...]


3. flamegraph.pl
================
Use flamegraph.pl to render an SVG.

$ ./flamegraph.pl out.kern_folded > kernel.svg

An advantage of having the folded input file (and the reason folding is a separate step from flamegraph.pl) is that you can use grep to select functions of interest. E.g.:

$ grep cpuid out.kern_folded | ./flamegraph.pl > cpuid.svg


Provided Example
================
An example output from DTrace is included: both the captured stacks and the resulting Flame Graph. You can generate it yourself using:

$ ./stackcollapse.pl example-stacks.txt | ./flamegraph.pl > example.svg

This was from a particular performance investigation: the Flame Graph identified that CPU time was spent in the lofs module, and quantified that time.
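
To make the folding step from section 2 concrete, here is a minimal Python sketch of the idea. It is not the stackcollapse.pl implementation. It assumes DTrace-style input in which each stack is printed leaf-first, one frame per line, followed by a line holding only the sample count. The function name fold_stacks and the frame offsets in the sample are made up for illustration.

```python
from collections import Counter

def fold_stacks(text):
    """Fold DTrace-style stack samples into 'root;...;leaf count' lines.

    Assumed input layout (illustrative, not a full parser): frames one per
    line with the leaf first, then a line containing only the sample count.
    """
    folded = Counter()
    frames = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.isdigit():
            # A bare number closes the current stack: it is the sample count.
            if frames:
                # Reverse so the root frame comes first, as in the folded format.
                folded[";".join(reversed(frames))] += int(line)
            frames = []
        else:
            # Drop the instruction offset, e.g. "genunix`close+0x1b".
            frames.append(line.split("+0x")[0])
    return [f"{stack} {count}" for stack, count in sorted(folded.items())]

# Hypothetical sample; the +0x offsets are invented for this sketch.
sample = """
              genunix`closeandsetf+0x3f
              genunix`close+0x1b
              unix`_sys_sysenter_post_swapgs+0x149
               85

              genunix`close+0x4
              unix`_sys_sysenter_post_swapgs+0x149
                5
"""

for folded_line in fold_stacks(sample):
    print(folded_line)
```

Each printed line has the same shape as the folded output shown in section 2: semicolon-separated frames from root to leaf, a space, then the count. Because every stack is on one line, line-oriented tools such as grep can select stacks before rendering.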

Options
=======
See the USAGE message (--help) for options:

USAGE: ./flamegraph.pl [options] infile > outfile.svg

    --titletext     # change title text
    --width         # width of image (default 1200)
    --height        # height of each frame (default 16)
    --minwidth      # omit smaller functions (default 0.1 pixels)
    --fonttype      # font type (default "Verdana")
    --fontsize      # font size (default 12)
    --countname     # count type label (default "samples")
    --nametype      # name type label (default "Function:")
    eg,
    ./flamegraph.pl --titletext="Flame Graph: malloc()" trace.txt > graph.svg

As the example suggests, flame graphs can visualize traces of any event, such as malloc() calls, provided stack traces are gathered for that event.
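
The point about tracing any event follows from the folded format itself: each line is a stack plus a number, and the number need not be a CPU sample count. The Python sketch below illustrates this under stated assumptions. The function fold_events, the event records, and the frame names are all hypothetical, and how the stacks are gathered is left out.

```python
from collections import Counter

def fold_events(events):
    """Aggregate (frames, value) records into folded 'a;b;c value' lines.

    frames is a list ordered from root to leaf; value is whatever is being
    counted for that stack (here, a number of malloc() calls).
    """
    totals = Counter()
    for frames, value in events:
        totals[";".join(frames)] += value
    return [f"{stack} {total}" for stack, total in sorted(totals.items())]

# Hypothetical malloc() events: (stack from root to leaf, number of calls).
events = [
    (["main", "parse_config", "malloc"], 3),
    (["main", "handle_request", "malloc"], 10),
    (["main", "parse_config", "malloc"], 2),
]

for folded_line in fold_events(events):
    print(folded_line)
```

Saved to a file such as trace.txt, lines of this shape could be passed to flamegraph.pl as in the USAGE example, with --countname used to relabel the count from the default "samples".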
About
stack trace visualizer