Memory Profiling
When using a non-standard memory allocator such as jemalloc, the allocator may provide profiling capabilities that can be useful for understanding memory usage patterns. For Chapel programs, this can be used to augment the existing Chapel runtime’s memory tracking capabilities. For the Chapel compiler, this can be used to understand the memory usage of the compiler itself.
Profiling the compiler with jemalloc
To profile the compiler with jemalloc, make sure you build the compiler with the following environment variables set:
export CHPL_HOST_MEM=jemalloc
export CHPL_HOST_JEMALLOC="bundled"
export CHPL_JEMALLOC_ENABLE_PROFILING=1
Then the compiler can be run as follows:
MALLOC_CONF='prof:true,prof_prefix:jeprof.out,prof_final:true' \\
chpl myProgram.chpl
The value of prof_prefix
is the prefix for the output files.
Note
It may be useful to pass --no-compiler-driver
to the compiler. Without this
flag, the compiler functions as a driver and spawns several subprocesses for
different phases of the compiler. This will results in multiple profiling
output files, one for each subprocess. If you want to profile the compiler as a
single process, use --no-compiler-driver
.
Output files can be interpreted using jeprof
(which is built with
jemalloc). For example, the following will generate a simple text report,
ordered by cumulative memory usage:
JEMALLOC_SUBDIR=$(./util/chplenv/chpl_jemalloc.py --host --cfg-path)
./third-party/jemalloc/install/$JEMALLOC_SUBDIR/bin/jeprof \\
--text --cum $(which chpl) path/to/profile/data/jeprof.out
Profiling Chapel programs with jemalloc
Warning
When writing these docs (as of Chapel 2.6), I was unable to get jemalloc’s profiling to give good results for Chapel programs. jemalloc would always report 0 allocations, which is not correct or useful. I believe there are changes to the runtime that need to be made before this will work correctly. However, I am including these docs here for completeness.
Similar steps for profiling the compiler can be used for profiling Chapel programs. The compiler/runtime must be built with the following environment variables set:
export CHPL_HOST_MEM=jemalloc
export CHPL_HOST_JEMALLOC="bundled"
export CHPL_JEMALLOC_ENABLE_PROFILING=1
Then, after compiling your Chapel program, you can run it with the following command:
CHPL_JE_MALLOC_CONF='prof:true,prof_prefix:jeprof.out,prof_final:true' \\
./myProgram
The resulting profiling data can be analyzed using jeprof
in a similar way
as described above for the compiler:
JEMALLOC_SUBDIR=$(./util/chplenv/chpl_jemalloc.py --target --cfg-path)
./third-party/jemalloc/install/$JEMALLOC_SUBDIR/bin/jeprof \\
--text --cum ./myProgram path/to/profile/data/jeprof.out