The Chapel developer community is excited to announce the release of Chapel version 1.32! To obtain a copy, please refer to the Downloading Chapel page on the Chapel website.

Highlights of Chapel 1.32

Chapel 2.0 Release Candidate

The main highlight of Chapel 1.32 is that it is a release candidate for our forthcoming Chapel 2.0 release! If you’re not familiar with the concept of Chapel 2.0, it is intended to be a release that declares a core subset of the language and library features as ‘stable’. These features are ones that we intend to support in their current form going forward, such that code relying on them will not break across releases. Meanwhile, other features will be considered ‘unstable’, implying that they are ones where we are still learning from user experiences and refining interfaces before considering them to be stabilized. Unstable features may continue to evolve after the 2.0 release: we may either improve them until they too are stable or replace them with other, more stable features.

Chapel 1.32 being a 2.0 release candidate means that this is a key time for Chapel users to give us feedback about aspects of our design that they would like to see change prior to the 2.0 release. Users may also want to compile their programs with the --warn-unstable flag in order to identify any unstable features that they are currently relying upon. Reliance on such features could motivate you to advocate for stabilizing those features sooner, or you could simply view it as an opportunity to be aware that those features may continue to evolve over time. We are generally interested in hearing about which unstable features user code is currently relying upon, to help with our own prioritization efforts.

Users with feedback about 2.0 readiness or the stability of current features are encouraged to share it with us on Chapel’s Discourse user forum or as a GitHub issue.

As part of the team’s push to make this a worthy Chapel 2.0 release candidate, Chapel 1.32 contains a large number of improvements to the language, compiler, and libraries. Some of these changes include:

For more information about these changes, and many others not summarized here, refer to the file, documentation for Chapel 1.32, or forthcoming release note slides.

GPU Improvements

Version 1.32 includes significant improvements to Chapel’s support for vendor-neutral GPU programming, both in terms of performance and capabilities.

Key performance improvements include:

The non-trivial impact of these optimizations can be seen in the following graphs, which show the improvements that have occurred in a Chapel port of the SHOC Sort benchmark on both NVIDIA and AMD GPUs. Note that the second graph includes data transfer times while the first does not.

Chapel’s support for AMD effectively reaches feature parity with NVIDIA in this release, largely due to the addition of a number of math routines that had not been supported for AMD in Chapel 1.31. In addition, the Chapel compiler’s --savec flag can now be used to inspect the assembly code generated when targeting AMD GPUs.

Meanwhile, when targeting NVIDIA GPUs, Chapel 1.32 adds support for generating multi-architecture binaries by setting CHPL_GPU_ARCH to a comma-separated list of target architectures.

See the latest GPU Programming technical note for additional details about these changes and Chapel’s overall support for GPUs in 1.32.
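As a rough illustration of the vendor-neutral style described above, the following is a minimal sketch rather than code from the release itself. It assumes a Chapel installation built with CHPL_LOCALE_MODEL=gpu and at least one GPU on the current locale; the problem size `n` and the architecture names in the comment are illustrative choices.

```chapel
// Minimal sketch, assuming Chapel was built with CHPL_LOCALE_MODEL=gpu
// and that the current locale has at least one GPU.
//
// For NVIDIA multi-architecture binaries, CHPL_GPU_ARCH can be set to a
// comma-separated list before compiling, e.g. sm_70,sm_80 (the specific
// architecture names here are examples only).

config const n = 10;      // hypothetical problem size

on here.gpus[0] {
  var A: [1..n] int;      // array declared while executing on the GPU sublocale

  // Order-independent loops such as this one are candidates for
  // being compiled into GPU kernels.
  foreach i in 1..n do
    A[i] = i * i;

  writeln(A);
}
```

The same source is intended to compile for either NVIDIA or AMD targets; only the build-time configuration changes.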

Support for Co-Locales

Since its inception, Chapel has preferred to represent each compute node as a single top-level locale, using multitasking to implement any intra-node parallelism. This approach has been beneficial in many problem domains where running a process per core could result in larger memory requirements or poor surface-to-volume effects due to the amount of SPMD parallelism. (SPMD = Single Program, Multiple Data, a static and coarse-grained style of parallelism in which multiple copies of the same program are executed, e.g., one per processor core.)

However, as modern compute nodes have begun to support multiple NICs (Network Interface Chips, which permit processes to communicate with remote nodes), this traditional approach has faced challenges. Specifically, it is unduly complicated to have a single locale (UNIX process) leverage multiple NICs effectively, yet using just one NIC leaves potential performance on the table by not exercising the network to its full capacity.

To address this, Chapel 1.32 introduces user-facing support for co-locales, in which multiple locales can be mapped to a single compute node. Using co-locales can lead to performance improvements by making better use of the network and/or reducing the number of memory references that cross between sockets. For example, the following charts show improvements to a pair of benchmarks when run using two locales per node on a dual-NIC HPE Cray EX system using Slingshot 11:

Current support is limited to running a locale per socket on a given compute node, and is also limited to certain platforms and configurations:

To opt in to using co-locales, specify the number of locales for your Chapel program using a product of nodes and locales per node. For example, the following invocation:

$ ./myChapelProgram -nl 8x2

says to run the Chapel program on 8 nodes with 2 locales per node, for a total of 16 locales.
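One way to check how locales were mapped to nodes is to have each locale report its host. The sketch below is illustrative rather than taken from the release; it relies only on the standard `Locales` array, `numLocales`, and the `id` and `hostname` queries on a locale.

```chapel
// Report which host each locale is running on.
// A serial loop is used so the lines appear in locale order.
for loc in Locales do
  on loc do
    writeln("locale ", here.id, " of ", numLocales,
            " runs on ", here.hostname);
```

When launched with `-nl 8x2`, one would expect 16 lines of output, with each hostname appearing twice.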

For more information on using co-locales with Chapel, please refer to the online documentation.

IO Serialization Framework

The IO serialization framework that was prototyped in Chapel 1.31 is now used by default for calls like writeln() and read(), and it is also available for use with types written by end-users.

As an illustration, consider the following example that prints an array in a couple of different formats:

use IO, JSON;

var A = [1, 2, 3, 4];

writeln(A);             // prints '1 2 3 4'  

var jsonWriter = stdout.withSerializer(jsonSerializer);  
jsonWriter.writeln(A);  // prints '[1, 2, 3, 4]'  

Line 5 uses a normal writeln() to print the array of integers to the standard console output (stdout) using Chapel’s traditional format—one element at a time, separated by spaces. Then, in line 7, we create a variant of stdout that uses the JSON serializer for all write()s called on it. The result is that when we write the array to this output stream in line 8, it is printed using standard JSON formatting. Other current serializers support binary, YAML, and Chapel syntax as alternate formats.

The new serialization framework also includes deserializers, which support reading values back in from the given format. And most importantly, users can now define their own methods specifying how their types should be written or read. This can be done in a format-neutral manner for simplicity, or in a way that’s sensitive to the output format when needed. For more information on defining these methods, please refer to their online documentation.
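As a hedged sketch of what a user-defined method might look like, the record below defines its own `serialize` method. The `point` type and its fields are hypothetical, and the method signature and the `writeSerializable` interface name follow our reading of the IO module documentation, so they should be checked against the documentation for your Chapel version. This variant ignores the serializer and always writes the same literal layout.

```chapel
use IO, JSON;

// Hypothetical user-defined type; the name and fields are illustrative.
record point : writeSerializable {
  var x, y: int;

  // Sketch of a custom writer: emits "<x, y>" using literal punctuation,
  // without consulting the serializer for format-specific behavior.
  proc serialize(writer: fileWriter(?), ref serializer: ?st) throws {
    writer.writeLiteral("<");
    writer.write(x);
    writer.writeLiteral(", ");
    writer.write(y);
    writer.writeLiteral(">");
  }
}

var p = new point(3, 4);

writeln(p);                                         // uses the method above
stdout.withSerializer(jsonSerializer).writeln(p);   // same method, JSON writer
```

A format-sensitive implementation would instead inspect or call into `serializer` so that, for example, JSON output remains valid JSON.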

Improved ARM64 Support

Thanks to our colleagues on the Qthreads team at Sandia National Laboratories, support for ARM64 chips is significantly improved in Chapel 1.32. Specifically, this release bundles version 1.19 of Qthreads, in which task creation and switching have been re-implemented using assembly code for ARM64 chips. This can dramatically reduce multitasking overheads when using Chapel’s preferred CHPL_TASKS=qthreads mode.

As a simple illustration, the following table shows the impact of this fast task switching on a 16-node run of Bale Index Gather using various implementation strategies:

Approach                  w/out fast tasks    with fast tasks     improvement
ordered                   70.7 MB/s/node      84.7 MB/s/node      1.20x
ordered, oversubscribed   86.3 MB/s/node      140.4 MB/s/node     1.63x
unordered                 147.5 MB/s/node     152.3 MB/s/node     1.03x
aggregated                1352.0 MB/s/node    1448.5 MB/s/node    1.07x

In addition, Qthreads 1.19 improved portability for ARM64-based platforms. This enables the use of CHPL_TASKS=qthreads on a wider variety of systems, such as M1/M2 Macs, where it is now the default.

And much more…

Beyond the highlights mentioned here, Chapel 1.32 contains numerous other improvements to Chapel’s features and interfaces, such as:

For a more complete list of changes in Chapel 1.32, please refer to its file.

For More Information

For questions about any of the changes in this release, please reach out to the developer community on Discourse.

As always, we’re interested in feedback on how we can help make the Chapel language, libraries, implementation, and tools more useful to you in your work.

And, as ever, thanks to everyone who contributed to the Chapel 1.32 release!