Chapel Performance Tips
Use the --fast flag
Once you have a Chapel program that you believe to be correct and want
to run performance timings on it, make sure to compile with the --fast
flag. This turns off a number of execution-time checks, turns on
back-end compiler optimizations, and is key to achieving competitive
performance with Chapel. See its entry on the compiler man page for
details.
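As an illustration of a timing run, here is a minimal sketch of a program that times a small kernel. The file name timing.chpl, the procedure name timedSumOfSquares, and the problem size are made up for this example, and it assumes a recent Chapel release in which the Time module provides the stopwatch type.

```chapel
// timing.chpl (hypothetical file name)
// For timing runs, compile with:    chpl --fast timing.chpl
// While still debugging, omit it:   chpl timing.chpl
use Time;

config const n = 1_000_000;

// Fill an array with squares, sum it, and return the sum together
// with the elapsed wall-clock time in seconds.
proc timedSumOfSquares(n: int): (int, real) {
  var t: stopwatch;
  var A: [1..n] int;

  t.start();
  forall i in 1..n do
    A[i] = i * i;
  const total = + reduce A;
  t.stop();

  return (total, t.elapsed());
}

const (total, seconds) = timedSumOfSquares(n);
writeln("sum = ", total, ", elapsed = ", seconds, " s");
```

Building the same file with and without --fast and comparing the reported times gives a rough sense of how much the execution-time checks and disabled optimizations cost for a given kernel.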
Why don't we compile Chapel programs with --fast by default? Because
then most user coding errors, like out-of-bounds indexing or nil
dereferences, would get reported to us as bugs. We believe that having
Chapel catch such errors by default and then requiring users to take
off the safety belt once they're ready to go fast is most productive
for everyone.
Check your Communications
Most bad performance for multi-locale Chapel programs is due to
inadvertently doing too much communication. Though Chapel makes it
trivial to refer to remote values, doing so frequently can kill
performance. You can instrument your program to see where
communication is being introduced using the CommDiagnostics module or
the chplvis tool.
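As a rough sketch of what such instrumentation can look like, the example below brackets a loop with calls from the CommDiagnostics module and prints the per-locale counts. The procedure name measureRemoteReads and the loop itself are invented for illustration. The counts you see depend on the compiler version and its optimizations, and on a single locale no communication is expected at all.

```chapel
use CommDiagnostics;

config const n = 1000;

// Run a loop on the last locale that touches variables stored on the
// calling locale, and return the communication counts gathered meanwhile.
proc measureRemoteReads(n: int) {
  var x = 1;      // stored on the locale where this procedure runs
  var sum = 0;    // likewise

  resetCommDiagnostics();   // clear counts left over from earlier runs
  startCommDiagnostics();
  on Locales[numLocales-1] {
    for i in 1..n do
      sum += x;   // x and sum are remote here when numLocales > 1
  }
  stopCommDiagnostics();

  // One record of counters (gets, puts, on-statements, ...) per locale.
  return getCommDiagnostics();
}

writeln(measureRemoteReads(n));
```

To see nonzero counts, the program has to be built with a multi-locale configuration and launched on more than one locale, for example with -nl 2.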
Once you've found a section of your Chapel program that communicates
more than it should, there are a variety of ways to fix it,
including caching remote values manually or using advanced language
features for asserting locality. If you need help with these...
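To make the manual-caching option concrete, here is a minimal sketch that copies remote values into local variables once, before a loop, rather than referring to them on every iteration. The procedure name scaledSum and the data are hypothetical. Whether the compiler would already optimize the uncached version depends on the release, so treat this as an illustration of the pattern, not a guaranteed speedup.

```chapel
config const n = 1000;

// Compute scale * (sum of A) on the last locale, copying the remote
// data into that locale's memory before the loop starts.
proc scaledSum(n: int): real {
  var scale = 2.5;              // stored on the calling locale
  var A: [1..n] real = 1.0;     // non-distributed array, also stored there
  var result: real;

  on Locales[numLocales-1] {
    // Fetch each remote value once instead of on every iteration.
    const localScale = scale;
    const localA = A;           // whole-array copy into this locale's memory

    var total = 0.0;
    for i in 1..n do
      total += localScale * localA[i];

    result = total;             // a single write back to the calling locale
  }
  return result;
}

writeln(scaledSum(n));
```

The trade-off is extra memory for the copies, which is usually worthwhile when the values are read many times and do not change during the loop.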
Engage the Chapel Team
Though Chapel performance has improved by leaps and bounds in recent
years and can typically be made competitive with C + MPI + OpenMP,
there are still plenty of cases where our implementation doesn't do as
well as it should. Because it can be difficult to tell whether a
performance problem is due to a problem on your side or ours, please
don't hesitate to contact the Chapel development team through the
channels available to users to get help debugging the performance of
your code. In addition to saving
you time and frustration, this can provide us with valuable feedback
about where we should do further tuning and optimization.