perf-tools
Performance analysis tools based on Linux perf_events (aka perf) and ftrace
Top Related Projects
BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more
Glances an Eye on your system. A top/htop alternative for GNU/Linux, BSD, Mac OS and Windows operating systems.
netdata - Architected for speed. Automated for easy. Monitoring and troubleshooting, transformed!
gtop - System monitoring dashboard for terminal
btop - A monitor of resources
htop - an interactive process viewer
Quick Overview
The perf-tools repository by Brendan Gregg is a collection of performance analysis tools for Linux systems. These tools are designed to help developers and system administrators diagnose and troubleshoot performance issues in Linux environments, utilizing various kernel tracing technologies such as ftrace and perf_events.
Pros
- Comprehensive set of tools for various performance analysis scenarios
- Lightweight and easy to use, with minimal dependencies
- Provides valuable insights into system and application performance
- Targets Linux 3.2 and newer kernels and avoids requiring debuginfo
Cons
- Primarily focused on Linux systems, limiting use on other platforms
- Some tools may require root access, which can be a security concern
- Learning curve for understanding and interpreting the output of various tools
- May not be as feature-rich as some commercial performance analysis solutions
Getting Started
To get started with perf-tools:
- Clone the repository:
git clone https://github.com/brendangregg/perf-tools.git
- Navigate to the perf-tools directory:
cd perf-tools
- Most tools can be run directly from the command line. For example, to use the iosnoop tool:
sudo ./iosnoop
- Some tools may require additional setup or dependencies. Refer to the README.md file in the repository for specific instructions and requirements for each tool.
Note: Many of these tools require root access or special permissions to access kernel tracing facilities. Always use caution when running performance analysis tools on production systems.
Competitor Comparisons
BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more
Pros of BCC
- More powerful and flexible, leveraging eBPF for advanced tracing capabilities
- Supports multiple programming languages (Python, Lua, C++) for writing tracing tools
- Actively maintained with frequent updates and a larger community
Cons of BCC
- Steeper learning curve due to eBPF complexity and multiple language support
- Requires newer kernel versions (4.1+) for full functionality
- Heavier resource usage compared to simpler perf-tools scripts
Code Comparison
perf-tools example (bash):
#!/bin/bash
perf record -e block:block_rq_issue -ag sleep 5
perf script | awk '{ print $6 }' | sort | uniq -c | sort -nr | head -n5
BCC example (Python):
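from bcc import BPF
b = BPF(text="""
BPF_HASH(start, u64);
int trace_start(struct pt_regs *ctx) { u64 k = 0, ts = bpf_ktime_get_ns(); start.update(&k, &ts); return 0; }
""")
if BPF.get_kprobe_functions(b"blk_start_request"):  # legacy entry point, absent on newer kernels
    b.attach_kprobe(event="blk_start_request", fn_name="trace_start")
b.attach_kprobe(event="blk_mq_start_request", fn_name="trace_start")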
Both tools provide valuable insights into system performance, but BCC offers more advanced capabilities at the cost of increased complexity. perf-tools is simpler and works on older systems, while BCC leverages modern eBPF technology for deeper analysis.
Glances an Eye on your system. A top/htop alternative for GNU/Linux, BSD, Mac OS and Windows operating systems.
Pros of Glances
- User-friendly, colorful, and interactive interface for real-time system monitoring
- Cross-platform support (Linux, macOS, Windows) with a web-based interface option
- Extensive plugin system for customization and additional metrics
Cons of Glances
- Less focused on detailed performance analysis and debugging
- May have higher resource overhead due to its comprehensive monitoring approach
- Limited low-level system profiling capabilities compared to perf-tools
Code Comparison
Glances (command line):
glances --stdout cpu.user,mem.used
perf-tools (Shell):
#!/bin/bash
perf stat -e cycles,instructions,cache-misses ./myapp
Glances provides a high-level, user-friendly approach to system monitoring with a focus on real-time visualization and cross-platform support. It's suitable for general system monitoring and quick overviews.
perf-tools offers low-level, detailed performance analysis tools specifically for Linux systems. It's more appropriate for in-depth performance debugging and profiling of specific applications or system components.
While Glances is written in Python and offers a programmable interface, perf-tools consists of shell scripts and leverages Linux kernel features for performance data collection.
netdata - Architected for speed. Automated for easy. Monitoring and troubleshooting, transformed!
Pros of netdata
- Provides a real-time, interactive web dashboard for system monitoring
- Offers a wide range of metrics and visualizations out-of-the-box
- Supports automatic detection and configuration of various system components
Cons of netdata
- Higher resource consumption due to continuous data collection and web interface
- May require more setup and configuration for complex environments
- Less focused on specific performance analysis tasks compared to perf-tools
Code Comparison
netdata configuration example:
[global]
update every = 1
memory mode = ram
history = 3600
perf-tools usage example:
./iolatency -Q 1 10
./execsnoop
While netdata focuses on providing a comprehensive monitoring solution with a web interface, perf-tools offers targeted command-line utilities for specific performance analysis tasks. netdata is better suited for ongoing system monitoring, while perf-tools excels at on-demand, low-overhead performance investigations.
gtop - System monitoring dashboard for terminal
Pros of gtop
- User-friendly graphical interface for system monitoring
- Real-time updates and interactive charts
- Easy to install and use with npm
Cons of gtop
- Limited to system-wide monitoring, lacks detailed per-process analysis
- Fewer advanced performance analysis features compared to perf-tools
- Higher resource consumption due to graphical interface
Code Comparison
gtop (JavaScript):
const si = require('systeminformation');
const blessed = require('blessed');
const contrib = require('blessed-contrib');
// ... (UI setup and data collection logic)
perf-tools (Shell):
#!/bin/bash
# Illustrative argument check in the style of these scripts (not the funccount source)
if [ $# -ne 2 ]; then
echo "USAGE: $0 duration command"
exit 1
fi
# ... (perf command execution and data processing)
gtop focuses on providing a visually appealing, real-time system monitoring interface using Node.js and the blessed library. perf-tools, on the other hand, offers a collection of command-line tools written as shell scripts, leveraging Linux ftrace and the perf utility for in-depth performance analysis.
While gtop excels in ease of use and visual representation, perf-tools provides more comprehensive and low-level performance analysis capabilities, making it more suitable for advanced users and detailed system profiling tasks.
btop - A monitor of resources
Pros of btop
- User-friendly graphical interface with real-time system monitoring
- Customizable themes and layout options
- Cross-platform support (Linux, macOS, FreeBSD)
Cons of btop
- Less focused on detailed performance analysis and debugging
- May consume more system resources due to its graphical nature
- Limited command-line options compared to perf-tools
Code Comparison
btop (C++):
void Cpu::draw(int h, bool force) {
if (force) {
Cpu::collect();
redraw = true;
}
if (not redraw) return;
// ... (drawing logic)
}
perf-tools (Shell):
#!/bin/bash
# funccount - count kernel function calls matching a pattern
# Uses Linux ftrace.
[[ "$1" == "-h" || "$1" == "--help" ]] && exec grep '^#' "$0"
[[ "$1" == "-a" ]] && CALL=1 && shift
[[ "$1" == "" ]] && echo "ERROR: missing pattern" && exit 1
Both repositories offer performance monitoring tools, but with different approaches. btop provides a comprehensive system monitor with a graphical interface, while perf-tools focuses on command-line utilities for specific performance analysis tasks. btop is more user-friendly for general system monitoring, while perf-tools offers more specialized tools for in-depth performance debugging and analysis.
htop - an interactive process viewer
Pros of htop
- Interactive and user-friendly interface for real-time system monitoring
- Customizable display with color-coded metrics and process tree view
- Cross-platform support (Linux, macOS, FreeBSD)
Cons of htop
- Limited to per-process and system-wide metrics, with no kernel or I/O event tracing
- Lacks advanced performance analysis features for specific subsystems
- Not designed for automated performance data collection or scripting
Code Comparison
htop (C):
static void Process_writeCommand(Process* this, int attr) {
int start = RichString_size(this->cmdline);
Process_writeCommandMaybeHighlighted(this, attr);
int end = RichString_size(this->cmdline);
RichString_setAttrn(&this->cmdline, CRT_colors[PROCESS_SHADOW], start, end - start);
}
perf-tools (Shell):
#!/bin/bash
#
# iosnoop - trace block device I/O.
# Written using Linux ftrace.
#
# This traces block I/O events as they are issued from the block I/O interface.
# It can be used to show details about I/O for performance analysis and more.
The htop code snippet demonstrates its C-based implementation for process command display, while perf-tools uses shell scripts for various performance analysis tasks.
README
perf-tools
A miscellaneous collection of in-development and unsupported performance analysis tools for Linux ftrace and perf_events (aka the "perf" command). Both ftrace and perf are core Linux tracing tools, included in the kernel source. Your system probably has ftrace already, and perf is often just a package add (see Prerequisites).
These tools are designed to be easy to install (fewest dependencies), provide advanced performance observability, and be simple to use: do one thing and do it well. This collection was created by Brendan Gregg (author of the DTraceToolkit).
Many of these tools employ workarounds so that functionality is possible on existing Linux kernels. Because of this, many tools have caveats (see man pages), and their implementation should be considered a placeholder until future kernel features, or new tracing subsystems, are added.
These are intended for Linux 3.2 and newer kernels. For Linux 2.6.x, see Warnings.
Presentation
These tools were introduced in the USENIX LISA 2014 presentation: Linux Performance Analysis: New Tools and Old Secrets
- slides: http://www.slideshare.net/brendangregg/linux-performance-analysis-new-tools-and-old-secrets
- video: https://www.usenix.org/conference/lisa14/conference-program/presentation/gregg
Contents
Using ftrace:
- iosnoop: trace disk I/O with details including latency. Examples.
- iolatency: summarize disk I/O latency as a histogram. Examples.
- execsnoop: trace process exec() with command line argument details. Examples.
- opensnoop: trace open() syscalls showing filenames. Examples.
- killsnoop: trace kill() signals showing process and signal details. Examples.
- fs/cachestat: basic cache hit/miss statistics for the Linux page cache. Examples.
- net/tcpretrans: show TCP retransmits, with address and other details. Examples.
- system/tpoint: trace a given tracepoint. Examples.
- kernel/funccount: count kernel function calls, matching a string with wildcards. Examples.
- kernel/functrace: trace kernel function calls, matching a string with wildcards. Examples.
- kernel/funcslower: trace kernel functions slower than a threshold. Examples.
- kernel/funcgraph: trace a graph of kernel function calls, showing children and times. Examples.
- kernel/kprobe: dynamically trace a kernel function call or its return, with variables. Examples.
- user/uprobe: dynamically trace a user-level function call or its return, with variables. Examples.
- tools/reset-ftrace: reset ftrace state if needed. Examples.
Using perf_events:
- misc/perf-stat-hist: power-of aggregations for tracepoint variables. Examples.
- syscount: count syscalls by syscall or process. Examples.
- disk/bitesize: histogram summary of disk I/O size. Examples.
Using eBPF:
- As a preview of things to come, see the bcc tracing Tools section. These use bcc, a front end for using eBPF. bcc+eBPF will allow some of these tools to be rewritten and improved, and additional tools to be created.
Screenshots
Showing new processes and arguments:
# ./execsnoop
Tracing exec()s. Ctrl-C to end.
   PID   PPID ARGS
 22898  22004 man ls
 22905  22898 preconv -e UTF-8
 22908  22898 pager -s
 22907  22898 nroff -mandoc -rLL=164n -rLT=164n -Tutf8
 22906  22898 tbl
 22911  22910 locale charmap
 22912  22907 groff -mtty-char -Tutf8 -mandoc -rLL=164n -rLT=164n
 22913  22912 troff -mtty-char -mandoc -rLL=164n -rLT=164n -Tutf8
 22914  22912 grotty
Measuring block device I/O latency from queue insert to completion:
# ./iolatency -Q
Tracing block I/O. Output every 1 seconds. Ctrl-C to end.
  >=(ms) .. <(ms)   : I/O      |Distribution                          |
       0 -> 1       : 1913     |######################################|
       1 -> 2       : 438      |#########                             |
       2 -> 4       : 100      |##                                    |
       4 -> 8       : 145      |###                                   |
       8 -> 16      : 43       |#                                     |
      16 -> 32      : 43       |#                                     |
      32 -> 64      : 1        |#                                     |
[...]
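The histogram above groups latencies into power-of-two millisecond ranges. A minimal sketch of that bucketing, assuming one latency value in milliseconds per input line; the function name `latency_hist` and the sample values are hypothetical, and this is not the iolatency implementation:

```shell
#!/bin/bash
# Sketch only: bucket latencies (one value in ms per line on stdin) into
# power-of-two ranges, similar in spirit to the iolatency output.
# latency_hist is a hypothetical name, not part of perf-tools.
latency_hist() {
    awk '
    {
        # Find the smallest power-of-two upper bound that exceeds the value.
        hi = 1
        while ($1 >= hi) hi *= 2
        count[hi]++
        if (hi > max) max = hi
    }
    END {
        # Print every range up to the largest one seen, including empty ones.
        lo = 0
        for (hi = 1; hi <= max; hi *= 2) {
            printf "%d -> %d : %d\n", lo, hi, count[hi]
            lo = hi
        }
    }'
}

# Invented sample latencies, in milliseconds.
printf '%s\n' 0.4 0.9 1.5 3 3.9 12 | latency_hist
```

Each range includes its lower bound and excludes its upper bound, matching the ">=(ms) .. <(ms)" column header.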
Tracing the block:block_rq_insert tracepoint, with kernel stack traces, and only for reads:
# ./tpoint -s block:block_rq_insert 'rwbs ~ "*R*"'
cksum-11908 [000] d... 7269839.919098: block_rq_insert: 202,1 R 0 () 736560 + 136 [cksum]
cksum-11908 [000] d... 7269839.919107: <stack trace>
 => __elv_add_request
 => blk_flush_plug_list
 => blk_finish_plug
 => __do_page_cache_readahead
 => ondemand_readahead
 => page_cache_async_readahead
 => generic_file_read_iter
 => new_sync_read
 => vfs_read
 => SyS_read
 => system_call_fastpath
[...]
Count kernel function calls beginning with "bio_", summarize every second:
# ./funccount -i 1 'bio_*'
Tracing "bio_*"... Ctrl-C to end.
FUNC                              COUNT
bio_attempt_back_merge               26
bio_get_nr_vecs                     361
bio_alloc                           536
bio_alloc_bioset                    536
bio_endio                           536
bio_free                            536
bio_fs_destructor                   536
bio_init                            536
bio_integrity_enabled               536
bio_put                             729
bio_add_page                       1004
[...]
There are many more examples in the examples directory. Also see the man pages.
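As a rough illustration of where per-function counts like the funccount output can come from, the sketch below sums hit counts from ftrace function-profiler files. It assumes per-CPU files named trace_stat/function* under the tracing directory, each with two header lines followed by rows of function name and hit count. The helper name `sum_function_hits` is hypothetical, and the actual funccount script may work differently:

```shell
#!/bin/bash
# Sketch only: sum per-CPU hit counts from ftrace function-profiler output.
# Assumes files named trace_stat/function* under the given tracing directory,
# each with two header lines followed by "name hits ..." rows.
# On a real system the profiler would first need to be armed as root, e.g.:
#   echo 'bio_*' > set_ftrace_filter
#   echo 1 > function_profile_enabled; sleep 1; echo 0 > function_profile_enabled
sum_function_hits() {
    local tracing=${1:-/sys/kernel/debug/tracing}
    cat "$tracing"/trace_stat/function* 2>/dev/null |
        awk '$1 != "Function" && $1 != "--------" && $2 ~ /^[0-9]+$/ { hits[$1] += $2 }
             END { for (f in hits) printf "%-30s %8d\n", f, hits[f] }' |
        sort -k2,2n
}

# Prints nothing if the profiler files are absent or unreadable.
sum_function_hits
```

Passing a directory argument makes it possible to try the parsing on saved copies of the profiler files rather than on a live system.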
Prerequisites
The intent is to require as few prerequisites as possible, e.g., a Linux 3.2 server without debuginfo. See the tool man page for specifics.
ftrace
FTRACE configured in the kernel. You may already have this configured and available in your kernel version, as FTRACE was first added in 2.6.27. This requires CONFIG_FTRACE and other FTRACE options depending on the tool. Some tools (eg, funccount) require CONFIG_FUNCTION_PROFILER.
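One way to check these options is to grep the kernel config file. This is a minimal sketch: the helper name `config_enabled` is hypothetical, and the default path /boot/config-$(uname -r) is an assumption, since distributions differ in where (and whether) they ship the config:

```shell
#!/bin/bash
# Hypothetical helper (not part of perf-tools): succeed if a kernel config
# option is built in (=y) in the given config file.
# The default path is an assumption and varies by distribution.
config_enabled() {
    local option=$1 file=${2:-/boot/config-$(uname -r)}
    [ -r "$file" ] && grep -q "^${option}=y" "$file"
}

for opt in CONFIG_FTRACE CONFIG_FUNCTION_PROFILER CONFIG_DEBUG_FS; do
    if config_enabled "$opt"; then
        echo "$opt: enabled"
    else
        echo "$opt: not found as =y (or config file unreadable)"
    fi
done
```

A "not found" result is not conclusive: the config file may simply live elsewhere on your system.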
perf_events
Requires the "perf" command to be installed. This is in the linux-tools-common package. After installing that, perf may tell you to install an additional linux-tools package (linux-tools-kernel_version). perf can also be built under tools/perf in the kernel source. See perf_events Prerequisites for more details about getting perf_events to work fully.
debugfs
Requires a kernel with CONFIG_DEBUG_FS option enabled. As with FTRACE, this may already be enabled (debugfs was added in 2.6.10-rc3). The debugfs also needs to be mounted:
# mount -t debugfs none /sys/kernel/debug
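Before mounting, it can help to check whether debugfs is already mounted. A minimal sketch, assuming a mounts table in the /proc/mounts format; the helper name `debugfs_mountpoint` is hypothetical and not part of perf-tools:

```shell
#!/bin/bash
# Hypothetical helper (not part of perf-tools): print the debugfs mount point
# listed in a mounts table, or return non-zero if there is none.
# The table defaults to /proc/mounts; another file can be passed for testing.
debugfs_mountpoint() {
    local mounts=${1:-/proc/mounts}
    # Fields in /proc/mounts: device mountpoint fstype options dump pass
    awk '$3 == "debugfs" { print $2; found = 1; exit } END { exit !found }' "$mounts"
}

if mp=$(debugfs_mountpoint 2>/dev/null); then
    echo "debugfs is mounted at $mp"
else
    echo "debugfs is not mounted; run: mount -t debugfs none /sys/kernel/debug"
fi
```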
awk
Many of these scripts use awk, and will try to use either mawk or gawk depending on the desired behavior: mawk for buffered output (because of its speed), and gawk for synchronous output (as fflush() works, allowing more efficient grouping of writes).
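The selection logic can be sketched roughly as follows. The helper name `pick_awk` and the fallback to plain awk are assumptions for illustration; the scripts themselves may choose differently:

```shell
#!/bin/bash
# Hypothetical helper: choose an awk implementation by output mode.
#   buffered -> prefer mawk (speed)
#   sync     -> prefer gawk (fflush() support)
# Falls back to plain "awk" when the preferred one is not installed.
pick_awk() {
    local mode=$1 candidate
    case $mode in
        buffered) candidate=mawk ;;
        sync)     candidate=gawk ;;
        *)        echo "usage: pick_awk buffered|sync" >&2; return 2 ;;
    esac
    if command -v "$candidate" >/dev/null 2>&1; then
        echo "$candidate"
    else
        echo awk
    fi
}

# Synchronous mode: flush after every line so output appears immediately.
awk_cmd=$(pick_awk sync)
printf 'a\nb\n' | "$awk_cmd" '{ print NR ": " $0; fflush() }'
```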
Install
These are just scripts. Either grab everything:
git clone --depth 1 https://github.com/brendangregg/perf-tools
Or use the raw links on github to download individual scripts. Eg:
wget https://raw.githubusercontent.com/brendangregg/perf-tools/master/iosnoop
This preserves tabs (which copy-n-paste can mess up).
Warnings
Ftrace was first added to Linux 2.6.27, and perf_events to Linux 2.6.31. These early versions had kernel bugs, and lockups and panics have been reported on 2.6.32 series kernels. This includes CentOS 6.x. If you must analyze older kernels, these tools may only be useful in a fault-tolerant environment, such as a lab with simulated issues. These tools have been primarily developed on Linux 3.2 and later kernels.
Depending on the tool, there may also be overhead incurred. See the next section.
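A quick check against the 3.2 baseline mentioned above could look like this. It is a sketch only: `kernel_at_least` is a hypothetical helper, and it assumes the release string starts with major.minor:

```shell
#!/bin/bash
# Hypothetical helper: succeed if a kernel release string (default: uname -r)
# is at least the given major.minor version.
kernel_at_least() {
    local want_major=$1 want_minor=$2 release=${3:-$(uname -r)}
    local major minor
    major=${release%%.*}          # text before the first dot
    minor=${release#*.}           # text after the first dot
    minor=${minor%%[!0-9]*}       # keep only the leading digits
    (( major > want_major || (major == want_major && minor >= want_minor) ))
}

if kernel_at_least 3 2; then
    echo "kernel $(uname -r): 3.2 or newer"
else
    echo "kernel $(uname -r): older than 3.2, see Warnings"
fi
```

Passing a version does not rule out kernel bugs; it only flags the older kernels that the warnings single out.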
Internals and Overhead
perf_events is evolving. This collection began development circa Linux 3.16, with Linux 3.2 servers as the main target, at a time when perf_events lacked certain programmatic capabilities (eg, custom in-kernel aggregations). It's possible these will be added in a forthcoming kernel release. Until then, many of these tools employ workarounds, tricks, and hacks in order to work. Some of these tools pass event data to user space for post-processing, which costs much more overhead than in-kernel aggregation. The overhead of each tool is described in its man page.
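To make the user-space cost concrete, here is a sketch of that kind of post-processing: every event line has to be written out and parsed before it is counted. The function name `count_by_field` and the sample event lines are invented for illustration and are not taken from any tool in this collection:

```shell
#!/bin/bash
# Sketch of user-space aggregation: each event line crosses from the kernel
# to this pipeline before being counted, which is where the extra overhead
# comes from. count_by_field is a hypothetical name.
count_by_field() {
    local field=$1
    awk -v f="$field" '{ n[$f]++ } END { for (k in n) print n[k], k }' |
        sort -k1,1nr -k2,2
}

# Invented sample events: process name, tracepoint, read/write flag.
cat <<'EOF' | count_by_field 1
cksum  block_rq_insert R
cksum  block_rq_insert R
dd     block_rq_insert W
cksum  block_rq_insert R
EOF
```

An in-kernel aggregation would keep only the counters and hand over the totals, rather than one line per event.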
WARNING: In extreme cases, your target application may run 5x slower when using these tools. Depending on the tool and kernel version, there may also be the risk of kernel panics. Read the program header for warnings, and test before use.
If the overhead is a problem, these tools can be improved. If a tool doesn't already, it could be rewritten in C to use perf_event_open() and mmap() for the trace buffer. It could also implement frequency counts in C, and operate on mmap() directly, rather than using awk/Perl/Python. Additional improvements are possible for ftrace-based tools, such as use of snapshots and per-instance buffers.
Some of these tools are intended as short-term workarounds until more kernel capabilities exist, at which point they can be substantially rewritten. Older versions of these tools will be kept in this repository, for older kernel versions.
As my main target is a fleet of Linux 3.2 servers that do not have debuginfo, these tools try not to require it. At times, this makes the tool more brittle than it needs to be, as I'm employing workarounds (that may be kernel version and platform specific) instead of using debuginfo information (which can be generic). See the man page for detailed prerequisites for each tool.
I've tried to use perf_events ("perf") where possible, since that interface has been developed for multi-user use. For various reasons I've often needed to use ftrace instead. ftrace is surprisingly powerful (thanks Steven Rostedt!), and not all of its features are exposed via perf, or in common usage. This tool collection is in some ways a demonstration of hidden Linux features using ftrace.
Since things are changing, it's very possible you may find some tools don't work on your Linux kernel version. Some expertise and assembly will be required to fix them.
Links
A case study and summary:
- 13 Aug 2014: http://lwn.net/Articles/608497 Ftrace: The hidden light switch
Related articles:
- 28 Jun 2015: http://www.brendangregg.com/blog/2015-06-28/linux-ftrace-uprobe.html
- 31 Dec 2014: http://www.brendangregg.com/blog/2014-12-31/linux-page-cache-hit-ratio.html
- 06 Sep 2014: http://www.brendangregg.com/blog/2014-09-06/linux-ftrace-tcp-retransmit-tracing.html
- 28 Jul 2014: http://www.brendangregg.com/blog/2014-07-28/execsnoop-for-linux.html
- 25 Jul 2014: http://www.brendangregg.com/blog/2014-07-25/opensnoop-for-linux.html
- 23 Jul 2014: http://www.brendangregg.com/blog/2014-07-23/linux-iosnoop-latency-heat-maps.html
- 16 Jul 2014: http://www.brendangregg.com/blog/2014-07-16/iosnoop-for-linux.html
- 10 Jul 2014: http://www.brendangregg.com/blog/2014-07-10/perf-hacktogram.html