Task Parallelism and Synchronization

Chapel supports both task parallelism and data parallelism. This chapter details task parallelism as follows:

Tasks and Task Parallelism introduces tasks and task parallelism.
The Begin Statement describes the begin statement, an unstructured way to introduce concurrency into a program.
Synchronization Variables describes synchronization variables, an unstructured mechanism for synchronizing tasks.
Atomic Variables describes atomic variables, a mechanism for supporting atomic operations.
The Cobegin Statement describes the cobegin statement, a structured way to introduce concurrency into a program.
The Coforall Loop describes the coforall loop, another structured way to introduce concurrency into a program.
Task Intents specifies how variables from outer scopes are handled within begin, cobegin and coforall statements. Task-Private Variables are also available.
The Sync Statement describes the sync statement, a structured way to control parallelism.
The Serial Statement describes the serial statement, a structured way to suppress parallelism.
Atomic Statements describes the atomic statement, a construct to support atomic transactions.

Tasks and Task Parallelism

A Chapel task is a distinct context of execution that may be running concurrently with other tasks. Chapel provides a simple construct, the begin statement, to create tasks, introducing concurrency into a program in an unstructured way. In addition, Chapel introduces two type qualifiers, sync and single, for synchronization between tasks.

Chapel provides two constructs, the cobegin and coforall statements, to introduce concurrency in a more structured way. These constructs create multiple tasks but do not continue until these tasks have completed. In addition, Chapel provides two constructs, the sync and serial statements, to insert synchronization and suppress parallelism. All four of these constructs can be implemented through judicious uses of the unstructured task-parallel constructs described in the previous paragraph.
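As a minimal sketch of the structured constructs, the following uses a cobegin statement and a coforall loop to update a shared atomic counter. The names arrived and numTasks are hypothetical. Because both constructs wait for their tasks to complete, each read of the counter observes every update made by the tasks created before it.

```chapel
const numTasks = 4;        // hypothetical task count
var arrived: atomic int;   // hypothetical shared counter, starts at 0

// cobegin creates one task per statement in the block
// and does not continue until both have completed.
cobegin {
  arrived.add(1);
  arrived.add(1);
}
writeln(arrived.read());   // prints 2

// coforall creates one task per iteration and waits for all of them.
coforall tid in 1..numTasks do
  arrived.add(1);
writeln(arrived.read());   // prints 6
```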

Tasks are considered to be created when execution reaches the start of a begin, cobegin, or coforall statement. When the tasks are actually executed depends on the Chapel implementation and run-time execution state.

A task is represented as a call to a task function, whose body contains the Chapel code for the task. Variables defined in outer scopes are considered to be passed into a task function by default intent, unless a different task intent is specified explicitly by a task-intent-clause.

Accesses to the same variable from different tasks are subject to the Memory Consistency Model. Such accesses can result from aliasing due to ref argument intents or task intents, among others.

The Begin Statement

The begin statement creates a task to execute a statement. The syntax for the begin statement is given by

begin-statement:
  'begin' task-intent-clause[OPT] statement

Control continues concurrently with the statement following the begin statement.

Example (beginUnordered.chpl).

The code
begin writeln("output from spawned task");
writeln("output from main task");
executes two writeln statements that output the strings to the terminal, but the ordering is purposely unspecified. There is no guarantee as to which statement will execute first. When the begin statement is executed, a new task is created that will execute the writeln statement within it. However, execution will continue immediately after task creation with the next statement.

A begin statement creates a single task function, whose body is the body of the begin statement. The handling of the outer variables within the task function and the role of task-intent-clause are defined in Task Intents.
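As a minimal sketch of how a task-intent-clause changes the handling of an outer variable, the following assumes an int variable named counter (a hypothetical name) and gives it a ref intent so that the spawned task can modify it. For an int, the default task intent is const, so the update would be rejected without the clause. The enclosing sync statement (see The Sync Statement) is used here only to wait for the task before printing.

```chapel
var counter = 0;   // hypothetical outer variable

sync {
  // The ref task intent passes counter by reference into the task function.
  begin with (ref counter) counter += 1;
}

// The sync statement has waited for the task created inside it.
writeln(counter);  // prints 1
```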

Yield and return statements are not allowed in begin blocks. Break and continue statements may not be used to exit a begin block.

Synchronization Variables

Synchronization variables have a logical state associated with the value. The state of the variable is either full or empty. Normal reads of a synchronization variable cannot proceed until the variable’s state is full. Normal writes of a synchronization variable cannot proceed until the variable’s state is empty.

Chapel supports two types of synchronization variables: sync and single. Both types behave similarly, except that a single variable may only be written once. Consequently, when a sync variable is read, its state transitions to empty, whereas when a single variable is read, its state does not change. When either type of synchronization variable is written, its state transitions to full.

sync and single are type qualifiers and precede the type of the variable’s value in the declaration. sync and single are supported for the primitive types nothing, bool, int, uint, real, imag, complex, range, bytes, and string (Primitive Types); for enumerated types (Enumerated Types); and for class types (Class Types) and record types (Record Types). For sync variables of class type, the full/empty state applies to the reference to the class object, not to its member fields.

If a task attempts to read or write a synchronization variable that is not in the correct state, the task is suspended. When the variable transitions to the correct state, the task is resumed. If there are multiple tasks blocked waiting for the state transition:

for a sync variable, one task is non-deterministically selected to proceed and the others continue to wait

for a single variable, all tasks are selected to proceed.

A synchronization variable is specified with a sync or single type given by the following syntax:

sync-type:
  'sync' type-expression

single-type:
  'single' type-expression

A default-initialized synchronization variable will be empty. A synchronization variable initialized from another expression will be full and store the value from that expression.
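The two initial states can be observed with the isFull method described under Predefined Single and Sync Methods. This is a minimal sketch with hypothetical variable names, assuming a Chapel release that provides the sync methods listed in this chapter.

```chapel
var a$: sync int;       // default-initialized: starts empty
var b$: sync int = 5;   // initialized from an expression: starts full

writeln(a$.isFull);     // prints false
writeln(b$.isFull);     // prints true

// Reading the full variable returns its value and leaves it empty.
writeln(b$.readFE());   // prints 5
writeln(b$.isFull);     // prints false
```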

Example (beginWithSyncVar.chpl).

In the code
class Tree {
  var isLeaf: bool;
  var left, right: unmanaged Tree?;
  var value: int;

  proc sum():int {
    if (isLeaf) then
       return value;

    var x$: sync int;
    begin x$.writeEF(left!.sum());
    var y = right!.sum();
    return x$.readFE() + y;
  }
}
the sync variable x$ is assigned by an asynchronous task created with the begin statement. The task returning the sum waits on the reading of x$ until it has been assigned. By convention, synchronization variables end in $ to provide a visual cue to the programmer indicating that the task may block.

Example (singleVar.chpl).

The following code implements a simple split-phase barrier using a single variable.
var count$: sync int = n;  // counter which also serves as a lock
var release$: single bool; // barrier release

forall t in 1..n do begin {
  work(t);
  var myc = count$.readFE();  // read the count, set state to empty
  if myc!=1 {
    write(".");
    count$.writeEF(myc-1);   // update the count, set state to full
    // we could also do some work here before blocking
    release$.readFF();
  } else {
    release$.writeEF(true);  // last one here, release everyone
    writeln("done");
  }
}
In each iteration of the forall loop, after the work is completed, the task reads the count$ variable, which is used to tally the number of tasks that have arrived. All tasks except the last task to arrive will block while trying to read the variable release$. The last task to arrive will write to release$, setting its state to full, at which time all the other tasks are unblocked and can run.

If a formal argument with a default intent either has a synchronization type or the formal is generic (Formal Arguments of Generic Type) and the actual has a synchronization type, the actual must be an lvalue and is passed by reference. In these cases the formal itself is an lvalue, too. The actual argument is not read or written during argument passing; its state is not changed or waited on. The qualifier sync or single without the value type can be used to specify a generic formal argument that requires a sync or single actual.
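The by-reference rule for formals of synchronization type can be sketched as follows. The procedure deposit and the variable result$ are hypothetical names. Because the formal box$ is declared with a sync type and default intent, the call neither reads nor writes the actual; the write happens only inside the procedure body.

```chapel
// Hypothetical helper: box$ has a sync type and default intent,
// so the actual is passed by reference and its state is untouched by the call.
proc deposit(box$: sync int, v: int) {
  box$.writeEF(v);   // blocks until empty, then fills the caller's variable
}

var result$: sync int;         // starts empty
begin deposit(result$, 42);    // fill it from another task

// Blocks until the spawned task has written the value.
writeln(result$.readFE());     // prints 42
```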

When the actual argument is a sync or single and the corresponding formal has the actual’s base type or is implicitly converted from that type, a normal read of the actual is performed when the call is made, and the read value is passed to the formal.

Predefined Single and Sync Methods

The following methods are defined for variables of sync and single type.

proc (sync t).readFE(): t

Returns the value of the sync variable. This method blocks until the sync variable is full. The state of the sync variable is set to empty when this method completes. This method implements the normal read of a sync variable.

proc (sync t).readFF(): t
proc (single t).readFF(): t

Returns a copy of the value of the sync or single variable. This method blocks until the sync or single variable is full. The state of the sync or single variable remains full when this method completes. This method implements the normal read of a single variable.

proc (sync t).readXX(): t
proc (single t).readXX(): t

This method does not block and the state of the sync or single variable is unchanged when this method completes.

This function returns:

for a full sync or single, a copy of the value stored

for an empty sync or single, the implementation will return either a new default-initialized value of type t or the last value stored.

proc (sync t).writeEF(v: t)
proc (single t).writeEF(v: t)

Assigns v to the value of the sync or single variable. This method blocks until the sync or single variable is empty. The state of the sync or single variable is set to full when this method completes. This method implements the normal write of a sync or single variable.

proc (sync t).writeFF(v: t)

Assigns v to the value of the sync variable. This method blocks until the sync variable is full. The state of the sync variable remains full when this method completes.

proc (sync t).writeXF(v: t)

Assigns v to the value of the sync variable. This method is non-blocking and the state of the sync variable is set to full when this method completes.

proc (sync t).reset()

Assigns the default value of type t to the value of the sync variable. This method is non-blocking and the state of the sync variable is set to empty when this method completes.

proc (sync t).isFull: bool
proc (single t).isFull: bool

Returns true if the sync or single variable is full and false otherwise. This method is non-blocking and the state of the sync or single variable is unchanged when this method completes.
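The state transitions of these methods can be traced in a single task. This is a minimal sketch with a hypothetical variable s$, assuming a Chapel release that provides all of the methods listed above; every call is made in a state where it does not block.

```chapel
var s$: sync int;        // starts empty

s$.writeEF(1);           // empty -> full
writeln(s$.readFF());    // prints 1, state stays full
s$.writeFF(2);           // requires full, state stays full
writeln(s$.readFE());    // prints 2, full -> empty

s$.writeXF(3);           // non-blocking write, state becomes full
writeln(s$.isFull);      // prints true
s$.reset();              // non-blocking, state becomes empty
writeln(s$.isFull);      // prints false
```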

Atomic Variables

atomic is a type qualifier that precedes the variable’s type in the declaration. An atomic variable is specified with an atomic type given by the following syntax:

atomic-type:
  'atomic' type-expression

For example, the following code declares an atomic variable x that stores an int:

var x: atomic int;

An atomic variable that is declared without an initialization expression stores the default value of the contained type (for example, 0 for an int or false for a bool).

Atomic variables can also be declared with an initial value:

var y: atomic int = 1;

Chapel currently supports atomic operations for bools, all supported sizes of signed and unsigned integers, as well as all supported sizes of reals. Note that not all operations are supported for all atomic types. The supported types are listed for each operation.

Rationale.

The choice of supported atomic variable types as well as the atomic operations was strongly influenced by the C11 standard.

Most atomic methods accept an optional argument named order of type memoryOrder. The order argument is used to specify the ordering constraints of atomic operations. The supported memoryOrder values are:

memoryOrder.relaxed

memoryOrder.acquire

memoryOrder.release

memoryOrder.acqRel

memoryOrder.seqCst

See also Memory Consistency Model and in particular Non-Sequentially Consistent Atomic Operations for more information on the meaning of these memory orders.

Unless specified, the default for the memoryOrder parameter is memoryOrder.seqCst.
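As a minimal sketch of the order argument, the following increments a hypothetical counter named hits from several tasks. The increments request memoryOrder.relaxed explicitly, while the final read omits the argument and therefore uses the default memoryOrder.seqCst. The coforall loop waits for all of its tasks, so the read observes every increment.

```chapel
var hits: atomic int;   // hypothetical counter, starts at 0

coforall tid in 1..4 do
  hits.add(1, memoryOrder.relaxed);   // explicit, weaker ordering

// No order argument: defaults to memoryOrder.seqCst.
writeln(hits.read());   // prints 4
```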

Implementors’ note.

Not all architectures or implementations may support all memoryOrder values. In these cases, the implementation should default to a more conservative ordering than specified.

proc atomicFence(param order: memoryOrder = memoryOrder.seqCst): An atomic fence that establishes an ordering of non-atomic and relaxed atomic operations.

atomic (bool)

proc read(param order: memoryOrder = memoryOrder.seqCst): bool

Returns: The stored value.

proc write(value: bool, param order: memoryOrder = memoryOrder.seqCst): void: Stores value as the new value.

proc exchange(value: bool, param order: memoryOrder = memoryOrder.seqCst): bool: Stores value as the new value and returns the original value.

proc compareExchange(ref expected: bool, desired: bool, param order: memoryOrder = memoryOrder.seqCst): bool: Stores desired as the new value, if and only if the original value is equal to expected. Returns true if desired was stored, otherwise updates expected to the original value.

proc compareExchange(ref expected: bool, desired: bool, param success: memoryOrder, param failure: memoryOrder): bool

proc compareExchangeWeak(ref expected: bool, desired: bool, param order: memoryOrder = memoryOrder.seqCst): bool

Similar to compareExchange, except that this function may return false even if the original value was equal to expected. This may happen if the value could not be updated atomically.

This weak version is allowed to spuriously fail, but when compareExchange is already in a loop, it can offer better performance on some platforms.

proc compareExchangeWeak(ref expected: bool, desired: bool, param success: memoryOrder, param failure: memoryOrder): bool

proc compareAndSwap(expected: bool, desired: bool, param order: memoryOrder = memoryOrder.seqCst): bool: Stores desired as the new value, if and only if the original value is equal to expected. Returns true if desired was stored.

proc testAndSet(param order: memoryOrder = memoryOrder.seqCst): bool: Stores true as the new value and returns the old value.

proc clear(param order: memoryOrder = memoryOrder.seqCst): void: Stores false as the new value.

proc waitFor(value: bool, param order: memoryOrder = memoryOrder.seqCst): void

Arguments: value – Value to compare against.

Waits until the stored value is equal to value. The implementation may yield the running task while waiting.
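One possible use of testAndSet, clear, and waitFor together is a simple spin lock. This is a sketch rather than a recommended locking primitive; the names lock and total are hypothetical. A task acquires the lock when testAndSet returns false (the old value), and waits for the flag to clear before retrying otherwise.

```chapel
var lock: atomic bool;   // hypothetical lock flag: false means unlocked
var total = 0;           // hypothetical shared data protected by the lock

coforall tid in 1..4 with (ref total) {
  // Acquire: retry until testAndSet reports the old value was false.
  while lock.testAndSet() do
    lock.waitFor(false);   // may yield the task while the lock is held

  total += tid;            // critical section

  lock.clear();            // release
}

writeln(total);  // prints 10
```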

atomic (T)

proc read(param order: memoryOrder = memoryOrder.seqCst): T

Returns: The stored value.

proc write(value: T, param order: memoryOrder = memoryOrder.seqCst): void: Stores value as the new value.

proc exchange(value: T, param order: memoryOrder = memoryOrder.seqCst): T: Stores value as the new value and returns the original value.

proc compareExchange(ref expected: T, desired: T, param order: memoryOrder = memoryOrder.seqCst): bool: Stores desired as the new value, if and only if the original value is equal to expected. Returns true if desired was stored, otherwise updates expected to the original value.

proc compareExchange(ref expected: T, desired: T, param success: memoryOrder, param failure: memoryOrder): bool

proc compareExchangeWeak(ref expected: T, desired: T, param order: memoryOrder = memoryOrder.seqCst): bool

Similar to compareExchange, except that this function may return false even if the original value was equal to expected. This may happen if the value could not be updated atomically.

This weak version is allowed to spuriously fail, but when compareExchange is already in a loop, it can offer better performance on some platforms.

proc compareExchangeWeak(ref expected: T, desired: T, param success: memoryOrder, param failure: memoryOrder): bool

proc compareAndSwap(expected: T, desired: T, param order: memoryOrder = memoryOrder.seqCst): bool: Stores desired as the new value, if and only if the original value is equal to expected. Returns true if desired was stored.
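The retry-loop use of compareExchangeWeak mentioned above can be sketched as an atomic maximum. The names maxSeen and recordMax are hypothetical. When the exchange fails, whether spuriously or because another task changed the value, expected is updated to the current value, so the loop condition is re-evaluated against fresh data.

```chapel
var maxSeen: atomic int;   // hypothetical shared maximum, starts at 0

// Hypothetical helper: raise maxSeen to v if v is larger.
proc recordMax(v: int) {
  var expected = maxSeen.read();
  while expected < v {
    // On failure, expected now holds the value currently stored.
    if maxSeen.compareExchangeWeak(expected, v) then
      break;
  }
}

coforall v in 1..9 do
  recordMax(v);

writeln(maxSeen.read());   // prints 9
```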

proc fetchAdd(value: T, param order: memoryOrder = memoryOrder.seqCst): T

Returns: The original value.

Adds value to the original value and stores the result. Defined for integer and real atomic types.

proc add(value: T, param order: memoryOrder = memoryOrder.seqCst): void: Adds value to the original value and stores the result. Defined for integer and real atomic types.

proc fetchSub(value: T, param order: memoryOrder = memoryOrder.seqCst): T

Returns: The original value.

Subtracts value from the original value and stores the result. Defined for integer and real atomic types.

proc sub(value: T, param order: memoryOrder = memoryOrder.seqCst): void: Subtracts value from the original value and stores the result. Defined for integer and real atomic types.

proc fetchOr(value: T, param order: memoryOrder = memoryOrder.seqCst): T

Returns: The original value.

Applies the | operator to value and the original value, then stores the result.
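The difference between the fetch variants, which return the original value, and the non-fetch variants, which return nothing, can be seen in a short single-task sketch. The names nextId and flags are hypothetical.

```chapel
var nextId: atomic int;   // hypothetical counter, starts at 0
var flags: atomic uint;   // hypothetical bit set, starts at 0

const first = nextId.fetchAdd(1);    // returns the original value, 0
const second = nextId.fetchAdd(1);   // returns the original value, 1
nextId.sub(2);                       // no return value; counter is 0 again
writeln(first, " ", second, " ", nextId.read());   // prints 0 1 0

// fetchOr returns the bits stored before the update.
const before = flags.fetchOr(0b0101: uint);
writeln(before, " ", flags.read());  // prints 0 5
```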