Locales
Chapel provides high-level abstractions that allow programmers to exploit locality by controlling the affinity of both data and tasks to abstract units of processing and storage capabilities called locales. The on-statement allows for the migration of tasks to remote locales.
Throughout this section, the term local will be used to describe the locale on which a task is running, the data located on this locale, and any tasks running on this locale. The term remote will be used to describe another locale, the data on another locale, and the tasks running on another locale.
Locales
A locale is a Chapel abstraction for a piece of a target architecture that has processing and storage capabilities. Generally speaking, the tasks running within a locale have roughly uniform access to values stored in the locale’s local memory and longer latencies for accessing the memories of other locales. As an example, a single shared memory machine would be defined as a single locale. In contrast, a cluster of network-connected multicore nodes would have a locale for each node.
Locale Types
The identifier locale is a type that abstracts a locale as described above.
Both data and tasks can be associated with a value of locale type.
The default value for a variable with locale type is Locales[0].
Locale Methods
The locale type supports the following methods:
- proc locale.hostname : string
  Get the hostname of this locale.
  - Returns: the hostname of the compute node associated with the locale
  - Return type: string
 
- proc locale.name : string
  Get the name of this locale. In general, this method returns the same string as locale.hostname; however, it can differ when the program is executed in an oversubscribed manner.
  Note: The locale’s id (from locale.id) will be appended to the hostname when launching in an oversubscribed manner with CHPL_COMM=gasnet and one of the following configurations:
  - CHPL_COMM_SUBSTRATE=udp & GASNET_SPAWNFN=L
  - CHPL_COMM_SUBSTRATE=smp
  More information about these environment variables can be found in Multilocale Chapel Execution.
  - Returns: the name of this locale
  - Return type: string
 
- proc locale.id : int
  Get the unique integer identifier for this locale.
  - Returns: index of this locale in the range 0..numLocales-1
  - Return type: int
 
- proc locale.gpuId : int
  Warning: ‘locale.gpuId’ is unstable.
  If using a gpu locale, return its position in the parent locale’s gpus array.
  - Returns: index of this gpu sublocale in the parent locale’s gpus array
  - Return type: int
 
- proc locale.maxTaskPar : int
  Get the maximum task concurrency that one can expect to achieve on this locale.
  - Returns: the maximum number of tasks that can run in parallel on this locale
  - Return type: int
  Note that the value is an estimate by the runtime tasking layer. Typically it is the number of physical processor cores available to the program. Executing a data-parallel construct with more tasks than this is unlikely to improve performance.
- proc locale.numColocales : int
  Warning: ‘locale.numColocales’ is unstable.
  Get the number of co-locales on the locale’s node, inclusive.
  Note that this may not be equal to the number of locales on the node due to oversubscription. The value is one in the typical case in which there is only one locale per node. For example, if a job is launched with -nl 2 then numColocales will be one, and if it is launched with -nl 1x2 then numColocales will be two.
  More information about co-locales can be found in Co-locales.
  - Returns: the number of co-locales on the locale’s node
  - Return type: int
 
- proc locale.numPUs(logical: bool = false, accessible: bool = true) : int
  Warning: ‘locale.numPUs’ is unstable.
  Get the number of processing units available on this locale.
  A processing unit or PU is an instance of the processor architecture, basically the thing that executes instructions. locale.numPUs tells how many of these are present on this locale. It can count either physical PUs (commonly known as cores) or hardware threads such as hyperthreads and the like. It can also either take into account any OS limits on which PUs the program has access to or do its best to ignore such limits. By default it returns the number of accessible physical cores.
  - Arguments:
    - logical : bool. Count logical PUs (hyperthreads and the like), or physical ones (cores)? Defaults to false, for cores.
    - accessible : bool. Count only PUs that can be reached, or all of them? Defaults to true, for accessible PUs.
  - Returns: number of PUs
  - Return type: int
  Note that there are several things that can cause the OS to limit the processor resources available to a Chapel program. On plain Linux systems using the taskset(1) command will do it. On Cray systems the CHPL_LAUNCHER_CORES_PER_LOCALE environment variable may do it, indirectly via the system job launcher. Also on Cray systems, using a system job launcher (aprun or slurm) to run a Chapel program manually may do it, as can running programs within Cray batch jobs that have been set up with limited processor resources.
- proc locale.runningTasks() : int
  Get the number of tasks running on this locale.
  This method is intended to guide task creation during a parallel section. If the number of running tasks is greater than or equal to the locale’s maximum task parallelism (queried via locale.maxTaskPar), then creating more tasks is unlikely to decrease walltime.
  - Returns: the number of tasks that have begun executing, but have not yet finished
  - Return type: int
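The stable query methods listed above can be exercised with a short loop. The following is a minimal sketch rather than part of the reference itself; the printed values depend entirely on the machine and launch configuration, so no output is shown.

```chapel
// Print a short summary of each locale the program is running on.
for loc in Locales {
  writeln("locale ", loc.id,
          ": name=", loc.name,
          ", hostname=", loc.hostname,
          ", maxTaskPar=", loc.maxTaskPar);
}

// runningTasks() is declared with parentheses, so it is called like a
// regular method; here it is queried on the locale running this task.
writeln("tasks running on this locale: ", here.runningTasks());
```

The unstable methods (gpuId, numColocales, numPUs) are left out of the sketch on purpose, since their interfaces may change.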
 
The Predefined Locales Array
Chapel provides a predefined environment that stores information about
the locales used during program execution. This execution environment
contains definitions for the array of locales on which the program is
executing (Locales), a domain for that array (LocaleSpace), and
the number of locales (numLocales).
config const numLocales: int;
const LocaleSpace: domain(1) = {0..numLocales-1};
const Locales: [LocaleSpace] locale;
When a Chapel program starts, a single task executes main on
Locales(0).
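As an illustration of how these three predefined symbols relate to each other, the sketch below walks LocaleSpace and indexes Locales with it. It assumes nothing beyond the declarations shown above.

```chapel
// numLocales is set at launch time (for example with -nl).
writeln("running on ", numLocales, " locale(s)");

// Locales is an ordinary array whose index set is LocaleSpace.
for i in LocaleSpace do
  writeln("Locales[", i, "] has id ", Locales[i].id);

// The initial task runs on the first element of the array.
writeln("main started on locale ", here.id);
```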
Note that the Locales array is typically defined such that distinct elements refer to distinct resources on the target parallel architecture. In particular, the Locales array itself should not be used in an oversubscribed manner in which a single processor resource is represented by multiple locale values (except during development). Oversubscription should instead be handled by creating an aggregate of locale values and referring to it in place of the Locales array.
Rationale.
This design choice encourages clarity in the program’s source text and enables more opportunities for optimization.
For development purposes, oversubscription is still very useful and this should be supported by Chapel implementations to allow development on smaller machines.
Example.
The code
const MyLocales: [0..numLocales*4-1] locale
               = [loc in 0..numLocales*4-1] Locales(loc%numLocales);
on MyLocales[i] ...
defines a new array MyLocales that is four times the size of the Locales array. Each locale is added to the MyLocales array four times in a round-robin fashion.
The here Locale
A predefined constant locale here can be used anywhere in a Chapel
program. It refers to the locale that the current task is running on.
Example.
The code
on Locales(1) {
  writeln(here.id);
}
results in the output
1
because the writeln statement is executed on locale 1.
The identifier here is not a keyword and can be overridden.
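Because here is resolved by the task that evaluates it, the same procedure can report different locales depending on where it is called. The sketch below illustrates this; whereAmI is a hypothetical helper name introduced only for this example.

```chapel
// Hypothetical helper: 'here' is evaluated by whichever task calls it.
proc whereAmI(): int {
  return here.id;
}

// Called from the initial task, which starts on locale 0.
writeln(whereAmI());

// Called from inside an on-statement, so it reports the target locale
// (the last locale in the Locales array).
on Locales[numLocales-1] do
  writeln(whereAmI());
```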
Querying the Locale of an Expression
The locale associated with an expression (where the expression is stored) is queried using the following syntax:
locale-query-expression:
  expression . 'locale'
When the expression is a class, the access returns the locale on which the class object exists rather than the reference to the class. If the expression is a value, it is considered local. The implementation may warn about this behavior. If the expression is a locale, it is returned directly.
Example.
Given a class C and a record R, the code
on Locales(1) {
  var x: int;
  var c: C;
  var r: R;
  on Locales(2) {
    on Locales(3) {
      c = new C();
      r = new R();
    }
    writeln(x.locale.id);
    writeln(c.locale.id);
    writeln(r.locale.id);
  }
}
results in the output
1
3
1
The variable x is declared and exists on Locales(1). The variable c is a class reference. The reference exists on Locales(1) but the object itself exists on Locales(3). The locale access returns the locale where the object exists. Lastly, the variable r is a record and has value semantics. It exists on Locales(1) even though it is assigned a value on a remote locale.
Module-scope constants that are not distributed in nature are replicated across all locales.
Example.
The code
const c = 10;
for loc in Locales do on loc do
  writeln(c.locale.id);
outputs
0
1
2
3
4
when running on 5 locales.
The On Statement
The on statement controls on which locale a block of code should be executed or data should be placed. The syntax of the on statement is given by
on-statement:
  'on' expression 'do' statement
  'on' expression block-statement
The locale of the expression is automatically queried as described
in Querying the Locale of an Expression. Execution of the
statement occurs on this specified locale and then continues after the
on-statement.
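Since the expression of an on-statement need not be a locale value, a variable can be used to move execution to wherever that variable is stored. The following is a minimal sketch of this data-driven form, assuming a multilocale run; on a single locale both on-statements simply stay on locale 0.

```chapel
// x is stored on the locale running the initial task.
var x = 10;

on Locales[numLocales-1] {
  // Execution is now on the last locale; x is remote from here
  // whenever numLocales > 1.
  on x {
    // The locale of x is queried implicitly, so this body runs
    // on the locale that stores x.
    writeln("executing on locale ", here.id,
            ", x is stored on locale ", x.locale.id);
  }
}
```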
Return statements may not be lexically enclosed in on statements. Yield statements may only be lexically enclosed in on statements in parallel iterators (see Parallel Iterators).
One common code idiom in Chapel is the following, which spreads parallel tasks across the network-connected locales upon which the program is running:
coforall loc in Locales {
  on loc {
    ...
  }
}
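A concrete instance of this idiom might look like the sketch below, in which each locale reports one value back to an array owned by the starting locale. The array name taskPar is an assumption made for illustration.

```chapel
// One slot per locale; the array is stored on the locale that declares it.
var taskPar: [LocaleSpace] int;

coforall loc in Locales {
  on loc {
    // This body executes on 'loc'. The assignment writes to an array
    // element stored on the original locale, which may involve
    // communication.
    taskPar[loc.id] = here.maxTaskPar;
  }
}

// After the coforall, all tasks have completed and every slot is filled.
writeln(taskPar);
```

The coforall creates one task per locale and waits for all of them, so the final writeln only runs once every locale has written its entry.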
Remote Variable Declarations
By default, when new variables and data objects are created, they are
created in the locale where the task is running. Variables can be
defined within an on-statement to define them on a particular locale
such that the scope of the variables is outside the on-statement.
This is accomplished using a similar syntax but omitting the do
keyword and braces. The syntax is given by:
remote-variable-declaration-statement:
  'on' expression variable-declaration-statement
As with regular on-statements, the locale of the expression is queried as described in Querying the Locale of an Expression. The initialization expression of the variable, if any, is executed on the target locale.
Example.
The code
proc computeInitialValue() {
  writeln(here.id);
  return 42;
}
on Locales(1) var x: int = computeInitialValue();
writeln(x.locale.id);
prints
1
twice: once because the writeln statement inside the computeInitialValue procedure is executed on locale 1, and once at the end of the program because variable x is remote and resides on locale 1.