Papers
Featured Publications
This paper uses Chapel in a novel knowledge-sharing setting to support a general parallel framework for calibrating distributed hydrologic models. The approach is unique due to the use of a novel search algorithm as well as its interoperability with C#, fault tolerance, parallelism, and reliability.
This paper presents a compiler optimization that targets irregular memory access patterns in Chapel programs. Specifically, it uses static analysis to identify irregular memory accesses to distributed arrays in parallel loops and employs code transformations to generate an inspector and executor to perform selective data replication at runtime.
This paper revisits the design and implementation of tree search algorithms for multiple GPUs, with attention to scalability and productivity, using Chapel. The proposed algorithm exploits Chapel's distributed iterators by combining a partial search strategy with pre-compiled CUDA kernels for more efficient exploitation of intra-node parallelism.
This paper describes a Computational Fluid Dynamics framework being developed using Chapel by a team at Polytechnique Montreal. The use of Chapel is described, and scaling results are given on up to 9k cores of a Cray XC. Comparisons are made against well-established CFD software packages.
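The inspector-executor scheme mentioned in the compiler-optimization entry above can be illustrated with a small sketch. This is only a rough model of the general technique, written in Python rather than Chapel, and it is not the paper's compiler pass: the names `GLOBAL`, `BLOCK`, `owner_of`, `inspector`, `executor`, and `run_demo` are all hypothetical, and a plain list stands in for a block-distributed array spread over two locales.

```python
# Hypothetical stand-in for a block-distributed array:
# 8 elements split evenly over 2 "locales" (4 elements each).
GLOBAL = [i * 10 for i in range(8)]
BLOCK = 4


def owner_of(i):
    """Return the locale that owns index i under a block distribution."""
    return i // BLOCK


def inspector(indices, here):
    """Scan the access pattern once and record the unique remote indices."""
    return sorted({i for i in indices if owner_of(i) != here})


def executor(indices, here, remote_needed, fetch):
    """Replicate the needed remote elements, then run the loop locally."""
    replica = {i: fetch(i) for i in remote_needed}  # one fetch per unique index
    out = []
    for i in indices:
        if owner_of(i) == here:
            out.append(GLOBAL[i])   # local access
        else:
            out.append(replica[i])  # served from the local replica
    return out


def run_demo():
    fetch_log = []

    def fetch(i):
        # Simulated remote get; the log counts how many are issued.
        fetch_log.append(i)
        return GLOBAL[i]

    pattern = [0, 5, 5, 2, 7, 5]  # irregular accesses, e.g. A[B[i]]
    needed = inspector(pattern, here=0)
    result = executor(pattern, 0, needed, fetch)
    return result, fetch_log


if __name__ == "__main__":
    print(run_demo())
```

In this toy run, four of the six accesses are remote, but only two simulated fetches are issued because repeated indices are served from the replica.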
Recent Publications
This work presents a local search for automatic parameterization of ChapelBB, a distributed tree search application for solving combinatorial optimization problems written in Chapel. The main objective of the proposed heuristic is to overcome the limitation of manual parameterization, which covers a limited feasible space.
This paper describes an implementation of Chapel's arrays that leverages the language's support for user-defined data distributions to implement the array using fabric-attached memory (FAM) rather than simply local DRAM.
This work compares the performance of Chapel-based fractal generation on shared- and distributed-memory platforms with corresponding OpenMP and MPI+X implementations.
Chapel Overviews
This is currently the best introduction to Chapel's history, motivating themes, and features. It also provides a brief summary of current and future activities at the time of writing. An early pre-print of this chapter was made available under the name A Brief Overview of Chapel.
This is an early overview of Chapel's themes and main language concepts.
Chapel Project Updates
This paper describes the progress that has been made with Chapel since the HPCS program wrapped up.
This paper provides a snapshot of the Chapel project at the juncture between the end of the HPCS project and the start of the next phase in Chapel's development. It covers past successes, current status, and future directions.
Chapel Optimizations
This paper describes a pair of recent compiler optimizations that leverage Chapel's high-level abstractions to reduce communication overheads: one that strength-reduces local array accesses, and a second that aggregates communications to amortize overheads.
This paper describes an approach that can efficiently train machine learning models that can be used to improve application execution times and scalability on distributed memory systems. This is achieved by analyzing the fine-grained communication profile of the application with small input data, and then predicting the communication patterns for more realistic inputs and coarsening the communication.
This paper describes how LLVM passes can optimize communication in PGAS languages like Chapel. In particular, by representing potentially remote addresses using a distinct address space, existing LLVM optimization passes can be used to reduce communication.
This paper describes an optimization implemented for Chapel in which the runtime library aggregates puts and gets in accordance with Chapel's memory consistency model in order to reduce the potential overhead of doing fine-grained communications.
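The idea of aggregating fine-grained puts, as described in the last entry above, can be sketched as follows. This is a minimal model under stated assumptions, written in Python rather than Chapel, and it does not reproduce the Chapel runtime's implementation: the class `PutAggregator`, its `capacity` parameter, the `send` callback, and `run_aggregation_demo` are hypothetical names. The `fence` method stands in for whatever points the memory consistency model requires buffered writes to become visible.

```python
from collections import defaultdict


class PutAggregator:
    """Buffer small puts per destination and send them in batches."""

    def __init__(self, send, capacity=4):
        self.send = send              # callback: send(dest, list_of_(addr, value))
        self.capacity = capacity      # flush a buffer once it holds this many puts
        self.buffers = defaultdict(list)

    def put(self, dest, addr, value):
        buf = self.buffers[dest]
        buf.append((addr, value))
        if len(buf) >= self.capacity:
            self.flush(dest)

    def flush(self, dest):
        buf = self.buffers.pop(dest, [])
        if buf:
            self.send(dest, buf)      # one message carries many puts

    def fence(self):
        """Flush everything, e.g. at a synchronization point."""
        for dest in list(self.buffers):
            self.flush(dest)


def run_aggregation_demo():
    messages = []
    agg = PutAggregator(lambda dest, batch: messages.append((dest, batch)))
    for i in range(10):
        agg.put(dest=i % 2, addr=i, value=i * i)  # 10 fine-grained puts
    agg.fence()
    return messages


if __name__ == "__main__":
    for dest, batch in run_aggregation_demo():
        print(dest, batch)
```

In this toy run, ten individual puts are delivered in four messages. The trade-off is that a put is not visible at its destination until its buffer fills or a fence is reached, which is why the flush points have to respect the consistency model.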
Applications of Chapel
This paper compares Chapel with Julia, Python/Numba, and C+OpenMP in terms of performance, scalability and productivity. Two parallel metaheuristics are implemented for solving the 3D Quadratic Assignment Problem (Q3AP), using thread-based parallelism on a multi-core shared-memory computer. The paper also evaluates and compares the performance of the languages for a parallel fitness evaluation loop, using four different test functions with different computational characteristics. The authors provide feedback on the implementation and parallelization process in each language.
This paper applies hypergraph analytics to gigascale DNS data using CHGL, performing compute-intensive calculations for data reduction and segmentation. Identified portions are then sent to HNX for both exploratory analysis and knowledge discovery targeting known tactics, techniques, and procedures.
This paper uses Chapel to study the design and implementation of distributed Branch-and-Bound algorithms for solving large combinatorial optimization problems. Experiments on the proposed algorithms are performed using the Flow-shop scheduling problem as a test case. The Chapel-based application is compared to a state-of-the-art MPI+Pthreads-based counterpart in terms of performance, scalability, and productivity.
This paper compares implementations of Breadth-First Search and Triangle Counting in Chapel and UPC++.
Multiresolution Chapel Features
This paper describes how users can create parallel iterators that support zippered iteration in Chapel, demonstrating them via several examples that partition iteration spaces statically and dynamically.
This paper builds on our HotPAR 2010 paper by describing the programmer's role in implementing user-defined distributions and layouts in Chapel.
This paper describes our approach and software framework for implementing user-defined distributions and memory layouts using Chapel's domain map concept.
Chapel Tools
This paper describes a tool that uses a combination of data-centric and code-centric information to relate performance profiling information back to user-level data structures and source code in Chapel programs.
This paper proposes a high-level, data-centric profiler to analyze how distributed arrays are used by each locale.
Chapel Explorations
This paper describes a high-level, easy-to-use language feature to improve data locality efficiently.
This paper explores the expression of parameterized diamond-shaped time-space tilings in Chapel, demonstrating competitive performance with C+OpenMP along with significant software engineering benefits due to Chapel's support for parallel iterators.
Chapel Historical Papers
This is the original Chapel paper, which lays out some of our motivation and foundations for exploring the language. Note that the language has evolved significantly since this paper was published, but it remains an interesting historical artifact.