Program Profiling with GNU gprof


Optimizing code performance is a fundamental aspect of software development, and profiling stands out as a pivotal technique in this pursuit. A notable tool for this purpose is GNU gprof.

Profiling gives valuable information about what happens inside a program. How long does a function take to execute? How often is it called? This knowledge can be used for targeted optimization.

This article gives a concise introduction: what gprof is, how it works, and a brief example.

Understanding gprof

What is gprof?

Gprof is a profiling tool that ships as part of GNU Binutils and works together with GCC's `-pg` option. It lets developers analyze how much execution time each function in a program takes and how often it is called, helping to pinpoint the functions that consume the most CPU time.

How does it work?

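Gprof relies on instrumentation that the compiler adds to your program. When you compile with `-pg`, GCC inserts a call to a monitoring routine at the entry of each function, which records how often the function is called and by which caller. While the program runs, the program counter is also sampled at regular intervals to estimate the time spent in each function. When the program exits normally, the collected data is written to a file, and gprof turns that data into a detailed report showing how the time is distributed across functions.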

A Simple Example

Let's dive into a simple example to illustrate how gprof can be used.

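#include <stdio.h>

void slow_function(void) {
    for (int i = 0; i < 1000000; i++)
        ; /* Some time-consuming operation */
}

void fast_function(void) {
    printf("This function is fast!\n");
}

int main(void) {
    slow_function();
    slow_function();
    fast_function();
    return 0;
}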
Listing 1: Example code

Compiling with gprof

To compile the code with profiling support, use the following command:

$ gcc -pg -o my_program my_program.c
Command 1: Compilation

The `-pg` option tells GCC to add profiling instrumentation to the binary. If you compile and link in separate steps, pass `-pg` to both.

Run the Program

Execute the compiled program:

$ ./my_program
Command 2: Execution

When the program exits normally, it writes a file named `gmon.out` containing the profiling data to the current working directory.

Viewing the gprof Report

Finally, run gprof to generate and view the report:

$ gprof my_program gmon.out > analysis.txt
Command 3: Analysis

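Open `analysis.txt` to see a breakdown of the time spent in each function and the number of calls. The example program finishes so quickly that no timing samples were recorded, so the report below shows the correct call counts but no accumulated time.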

Flat profile:
Each sample counts as 0.01 seconds.
 no time accumulated

  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
  0.00      0.00     0.00        2     0.00     0.00  slow_function
  0.00      0.00     0.00        1     0.00     0.00  fast_function

                     Call graph (explanation follows)

granularity: each sample hit covers 4 byte(s) no time propagated

index % time    self  children    called     name
                0.00    0.00       2/2           main [8]
[1]      0.0    0.00    0.00       2         slow_function [1]
                0.00    0.00       1/1           main [8]
[2]      0.0    0.00    0.00       1         fast_function [2]

Index by function name

   [2] fast_function           [1] slow_function
Listing 2: Shortened analysis of the example code


In this brief example, I've scratched the surface of GNU gprof's capabilities. Profiling tools like gprof are essential for understanding code behavior and identifying areas for optimization. As you delve deeper into the world of performance tuning, gprof becomes a valuable companion in your toolkit. Experiment with it on your own codebase and uncover opportunities to make your programs faster and more efficient.

Happy profiling!