Locales

Chapel provides high-level abstractions that allow programmers to exploit locality by controlling the affinity of both data and tasks to abstract units of processing and storage capabilities called locales. The on-statement allows for the migration of tasks to remote locales.

Throughout this section, the term local will be used to describe the locale on which a task is running, the data located on this locale, and any tasks running on this locale. The term remote will be used to describe another locale, the data on another locale, and the tasks running on another locale.

Locales

A locale is a Chapel abstraction for a piece of a target architecture that has processing and storage capabilities. Generally speaking, the tasks running within a locale have roughly uniform access to values stored in the locale’s local memory and longer latencies for accessing the memories of other locales. As an example, a single shared memory machine would be defined as a single locale. In contrast, a cluster of network-connected multicore nodes would have a locale for each node.

Locale Types

The identifier locale is a type that abstracts a locale as described above. Both data and tasks can be associated with a value of locale type.

The default value for a variable with locale type is Locales[0].

Locale Methods

The locale type supports the following methods:

proc locale.hostname : string

Get the hostname of this locale.

Returns:

the hostname of the compute node associated with the locale

Return type:

string

proc locale.name : string

Get the name of this locale.

In general, this method returns the same string as locale.hostname; however, it can differ when the program is executed in an oversubscribed manner.

Note

The locale’s id (from locale.id) will be appended to the hostname when launching in an oversubscribed manner with CHPL_COMM=gasnet and one of the following configurations:

  • CHPL_COMM_SUBSTRATE=udp & GASNET_SPAWNFN=L

  • CHPL_COMM_SUBSTRATE=smp

More information about these environment variables can be found in Multilocale Chapel Execution.

Returns:

the name of this locale

Return type:

string

proc locale.id : int

Get the unique integer identifier for this locale.

Returns:

index of this locale in the range 0..numLocales-1

Return type:

int

proc locale.gpuId : int

Warning

‘locale.gpuId’ is unstable

If this is a GPU locale, return its position in the parent locale’s gpus array.

Returns:

index of this gpu sublocale in the parent locale’s gpus array.

Return type:

int

proc locale.maxTaskPar : int

Get the maximum task concurrency that one can expect to achieve on this locale.

Returns:

the maximum number of tasks that can run in parallel on this locale

Return type:

int

Note that the value is an estimate by the runtime tasking layer. Typically it is the number of physical processor cores available to the program. Executing a data-parallel construct with more tasks than this is unlikely to improve performance.

proc locale.numPUs(logical: bool = false, accessible: bool = true) : int

Warning

‘locale.numPUs’ is unstable

Get the number of processing units available on this locale.

A processing unit or PU is an instance of the processor architecture, basically the thing that executes instructions. locale.numPUs tells how many of these are present on this locale. It can count either physical PUs (commonly known as cores) or hardware threads such as hyperthreads and the like. It can also either take into account any OS limits on which PUs the program has access to or do its best to ignore such limits. By default it returns the number of accessible physical cores.

Arguments:
  • logical : bool – Count logical PUs (hyperthreads and the like), or physical ones (cores)? Defaults to false, for cores.

  • accessible : bool – Count only PUs that can be reached, or all of them? Defaults to true, for accessible PUs.

Returns:

number of PUs

Return type:

int

Note that there are several things that can cause the OS to limit the processor resources available to a Chapel program. On plain Linux systems using the taskset(1) command will do it. On Cray systems the CHPL_LAUNCHER_CORES_PER_LOCALE environment variable may do it, indirectly via the system job launcher. Also on Cray systems, using a system job launcher (aprun or slurm) to run a Chapel program manually may do it, as can running programs within Cray batch jobs that have been set up with limited processor resources.
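Example.

The following is a minimal sketch, not a definitive recipe, that compares the four counts locale.numPUs can report on the current locale. The constant names are illustrative only. Because locale.numPUs is marked unstable, compiling with --warn-unstable is expected to produce a warning.

```chapel
// Query the current locale with each combination of the two arguments.
const physAccessible    = here.numPUs();                  // defaults: cores the program can reach
const logicalAccessible = here.numPUs(logical=true);      // hardware threads the program can reach
const physAll           = here.numPUs(accessible=false);  // all cores, ignoring OS limits where possible
const logicalAll        = here.numPUs(logical=true, accessible=false);

writeln("physical cores   (accessible / all): ", physAccessible, " / ", physAll);
writeln("hardware threads (accessible / all): ", logicalAccessible, " / ", logicalAll);
```

If the accessible counts are smaller than the totals, one of the OS-level limits described above is probably in effect.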

proc locale.runningTasks() : int

Get the number of tasks running on this locale.

This method is intended to guide task creation during a parallel section. If the number of running tasks is greater than or equal to the locale’s maximum task parallelism (queried via locale.maxTaskPar), then creating more tasks is unlikely to decrease walltime.

Returns:

the number of tasks that have begun executing, but have not yet finished

Return type:

int
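Example.

As a rough sketch of how locale.maxTaskPar and locale.runningTasks might be combined to guide task creation, the code below sizes a coforall loop by the remaining headroom on the current locale. The names spare and numTasks and the "at least one task" policy are assumptions made for illustration.

```chapel
// Estimate how many more tasks could run in parallel on this locale.
const spare = here.maxTaskPar - here.runningTasks();

// Hypothetical policy: never create fewer than one task.
const numTasks = max(1, spare);

// Create one task per unit of estimated headroom.
coforall tid in 0..numTasks-1 do
  writeln("task ", tid, " of ", numTasks, " on locale ", here.id);
```

Both values are estimates, and the number of running tasks can change between the query and the loop, so this is a heuristic rather than a guarantee.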

The Predefined Locales Array

Chapel provides a predefined environment that stores information about the locales used during program execution. This execution environment contains definitions for the array of locales on which the program is executing (Locales), a domain for that array (LocaleSpace), and the number of locales (numLocales).

config const numLocales: int;
const LocaleSpace: domain(1) = {0..numLocales-1};
const Locales: [LocaleSpace] locale;

When a Chapel program starts, a single task executes main on Locales(0).
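Example.

A short sketch of how these predefined symbols can be used together: it walks LocaleSpace and prints the id and name of each entry in Locales. The output depends on how the program is launched, so none is shown.

```chapel
writeln("running on ", numLocales, " locale(s)");

// LocaleSpace is the index set of the Locales array.
for i in LocaleSpace do
  writeln("Locales[", i, "]: id = ", Locales[i].id, ", name = ", Locales[i].name);
```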

Note that the Locales array is typically defined such that distinct elements refer to distinct resources on the target parallel architecture. In particular, the Locales array itself should not be used in an oversubscribed manner in which a single processor resource is represented by multiple locale values (except during development). Oversubscription should instead be handled by creating an aggregate of locale values and referring to it in place of the Locales array.

Rationale.

This design choice encourages clarity in the program’s source text and enables more opportunities for optimization.

For development purposes, oversubscription is still very useful and this should be supported by Chapel implementations to allow development on smaller machines.

Example.

The code

const MyLocales: [0..numLocales*4-1] locale
               = [loc in 0..numLocales*4-1] Locales(loc%numLocales);
on MyLocales[i] ...

defines a new array MyLocales that is four times the size of the Locales array. Each locale is added to the MyLocales array four times in a round-robin fashion.

The here Locale

A predefined constant locale here can be used anywhere in a Chapel program. It refers to the locale that the current task is running on.

Example.

The code

on Locales(1) {
  writeln(here.id);
}

results in the output 1 because the writeln statement is executed on locale 1.

The identifier here is not a keyword and can be overridden.

Querying the Locale of an Expression

The locale associated with an expression (where the expression is stored) is queried using the following syntax:

locale-query-expression:
  expression . 'locale'

When the expression is a class, the query returns the locale on which the class object exists rather than the locale on which the reference to it is stored. If the expression is a value, it is considered local. The implementation may warn about this behavior. If the expression is a locale, it is returned directly.

Example.

Given a class C and a record R, the code

on Locales(1) {
  var x: int;
  var c: C;
  var r: R;
  on Locales(2) {
    on Locales(3) {
      c = new C();
      r = new R();
    }
    writeln(x.locale.id);
    writeln(c.locale.id);
    writeln(r.locale.id);
  }
}

results in the output

1
3
1

The variable x is declared and exists on Locales(1). The variable c is a class reference. The reference exists on Locales(1) but the object itself exists on Locales(3). The locale access returns the locale where the object exists. Lastly, the variable r is a record and has value semantics. It exists on Locales(1) even though it is assigned a value on a remote locale.

Module-scope constants that are not distributed in nature are replicated across all locales.

Example.

The code

const c = 10;
for loc in Locales do on loc do
    writeln(c.locale.id);

outputs

0
1
2
3
4

when running on 5 locales.

The On Statement

The on statement controls on which locale a block of code should be executed or data should be placed. The syntax of the on statement is given by

on-statement:
  'on' expression 'do' statement
  'on' expression block-statement

The locale of the expression is automatically queried as described in Querying the Locale of an Expression. Execution of the statement occurs on this specified locale and then continues after the on-statement.
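Example.

Because the locale of the expression is queried automatically, the expression does not have to be a locale value. The sketch below, in which the variable x is introduced purely for illustration, uses a variable as the on expression so that the body runs wherever that variable is stored.

```chapel
var x = 10;   // x is stored on the locale where this declaration executes

// Move to the last locale (in a single-locale run this is the same locale).
on Locales[numLocales-1] {
  writeln("now on locale ", here.id);

  // The on expression is a variable, so its locale is queried and the
  // body runs on the locale that stores x.
  on x do
    writeln("body of 'on x' runs on locale ", here.id,
            "; x.locale.id = ", x.locale.id);
}
```

The two ids printed by the inner statement should match, since the body executes on the locale that holds x.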

Return statements may not be lexically enclosed in on statements. Yield statements may be lexically enclosed in on statements only within parallel iterators (see Parallel Iterators).

One common code idiom in Chapel is the following, which spreads parallel tasks across the network-connected locales upon which the program is running:

coforall loc in Locales { on loc { ... } }
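Example.

A concrete instance of this idiom, given as a minimal sketch with an illustrative message as the body, is:

```chapel
// Create one task per locale and run each task on its own locale.
coforall loc in Locales {
  on loc {
    writeln("hello from locale ", here.id, " of ", numLocales);
  }
}
```

The tasks run concurrently, so the order of the output lines is not deterministic. Execution continues after the coforall only once every task has finished.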

Remote Variable Declarations

By default, when new variables and data objects are created, they are created in the locale where the task is running. Variables can be defined within an on-statement to define them on a particular locale such that the scope of the variables is outside the on-statement. This is accomplished using a similar syntax but omitting the do keyword and braces. The syntax is given by:

remote-variable-declaration-statement:
  'on' expression variable-declaration-statement

As with regular on-statements, the locale of the expression is queried as described in Querying the Locale of an Expression. The initialization expression of the variable, if any, is executed on the target locale.

Example.

The code

proc computeInitialValue() {
  writeln(here.id);
  return 42;
}

on Locales(1) var x: int = computeInitialValue();
writeln(x.locale.id);

prints 1 twice: once because the writeln statement inside the computeInitialValue procedure is executed on locale 1, and once at the end of the program because variable x is remote and resides on locale 1.