Variables

A variable is a symbol that represents memory. Chapel is a statically typed, type-safe language, so every variable has a type that is known at compile time, and the compiler ensures that any value assigned to a variable can be stored in it as specified by its type.

Variable Declarations

Variables are declared with the following syntax:

variable-declaration-statement:
  privacy-specifier[OPT] config-extern-or-export[OPT] variable-kind variable-declaration-list ;

config-extern-or-export: one of
  'config' 'extern' 'export'

variable-kind:
  'param'
  'const'
  'var'
  'ref'
  'const ref'

variable-declaration-list:
  variable-declaration
  variable-declaration , variable-declaration-list

variable-declaration:
  identifier-list type-part[OPT] initialization-part[OPT]

type-part:
  : type-expression

initialization-part:
  = expression

identifier-list:
  identifier
  identifier , identifier-list
  tuple-grouped-identifier-list
  tuple-grouped-identifier-list , identifier-list

tuple-grouped-identifier-list:
  ( identifier-list )

A variable-declaration-statement is used to define one or more variables. If the statement is a top-level module statement, the variables are module level; otherwise they are local. Module level variables are discussed in Module Level Variables. Local variables are discussed in Local Variables.

The optional privacy-specifier keywords indicate the visibility of module level variables to outside modules. By default, variables are publicly visible. More details on visibility can be found in Visibility Of A Module’s Symbols.

The optional keyword config specifies that the variables are configuration variables, described in Section Configuration Variables. The optional keyword extern indicates that the variable is externally defined. Its name and type are used within the Chapel program for resolution, but no space is allocated for it and no initialization code emitted. See Shared Data for further details.

The variable-kind specifies whether the variables are parameters (param), constants (const), ref variables (ref), or regular variables (var). Parameters are compile-time constants whereas constants are runtime constants. Both levels of constants are discussed in Constants. Ref variables are discussed in Ref Variables.

The type-part of a variable declaration specifies the type of the variable. It is optional.

The initialization-part of a variable declaration specifies an initialization expression for the variable. It is optional. When present, the initialization expression will be stored into the variable as its initial value.

If the initialization-part is omitted, the compiler will consider if split initialization can be applied to this variable as described in Split Initialization. If split initialization can be applied, the compiler will identify one or more later assignment statements and the right-hand-side of such statements will form the initialization expression. If the initialization-part is omitted and split initialization cannot be applied, then the variable will need to be initialized to a default value. Only var, const, and param declarations can be initialized to a default value - ref and const ref declarations cannot. Not all types have a default value. Default values are described in Default Initialization.

If the type-part is omitted or refers to a generic type, an initialization expression as described above is required. Note that such initialization expressions can be in later statements if Split Initialization is used. When the type-part is omitted or generic, the type of the variable is inferred from the initialization expression using local type inference described in Local Type Inference. If the type-part is present, the initialization expression must be coercible to the specified type or, if the type-part is generic, to its instantiation.

Multiple variables can be defined in the same variable-declaration-list. The semantics of declaring multiple variables that share an initialization-part and/or type-part is defined in Multiple Variable Declarations.

Multiple variables can be grouped together using a tuple notation as described in Splitting a Tuple in a Declaration.
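The declaration forms described above can be sketched as follows. This is a minimal illustration rather than an exhaustive one, and every identifier in it (width, limit, count, ratio, first, second) is a hypothetical name chosen for the example.

```chapel
// All names below are hypothetical and exist only for illustration.
param width = 8;                    // compile-time constant; type inferred as int
const limit: int = 100;             // runtime constant with an explicit type-part
var count: int;                     // no initialization-part: default value is used
var ratio = 0.5;                    // no type-part: type inferred as real
var (first, second) = (1, "two");   // tuple-grouped identifier list

writeln(width, " ", limit, " ", count, " ", ratio, " ", first, " ", second);
```

Here count receives its default value because it is mentioned by writeln before any assignment, so split initialization cannot apply to it.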

Split Initialization

Split initialization is a feature that allows an initialization expression for a variable to be in a statement after the variable declaration statement.

If the initialization-part of a local variable declaration is omitted, the compiler will search forward in that scope for the earliest assignment statement(s) setting that variable that occurs before the variable is otherwise mentioned. It will consider the variable passed to an out intent argument as an assignment statement for this purpose. It will search only within block statements { }, local blocks, serial blocks, sync blocks, try blocks, try! blocks, and conditionals. These assignment statements and calls to functions with out intent are called applicable assignment statements. They perform initialization, not assignment, of that variable.

Example (simple-split-init.chpl)

The combination of the statements const x; and x = 5; in the example below is equivalent to the declaration const x = 5;.

proc main() {
  const x;
  x = 5;
  writeln(x);
}

Example (no-split-init.chpl)

In the following code, the variable x is used before it is assigned to, and so split initialization cannot apply to that variable.

proc main() {
  const x;
  writeln(x);
  x = 5;
}

Example (split-cond-blocks-init.chpl)

Split initialization can find the applicable assignment statement within a nested block or conditional. When conditionals are involved, there might be multiple applicable assignment statements representing different branches.

config const option = false;
proc main() {
  const x;
  if option {
    x = 6;
  } else {
    {
      x = 4;
    }
  }
  writeln(x);
}

A function call passing a variable to an out intent serves as an applicable assignment statement, provided that the variable was declared with a type. For example:

Example (split-init-out.chpl)

proc setArgToFive(out arg: int) {
  arg = 5;
}
proc main() {
  var x:int;
  setArgToFive(x); // initializes x
  writeln(x);
}

Split initialization does not apply:

  • when the variable is a field, config variable, or extern variable.

  • when an applicable assignment statement setting the variable could not be identified

  • when an applicable assignment statement is in one branch of a conditional but not in the other, unless:

    • the variable is not an out intent formal, and

    • the other branch always returns or throws.

    This rule prevents split-initialization when the applicable assignment statement is in a conditional that has no else branch and the if branch does not return or throw.

  • when an applicable assignment statement is in a try or try! block which has catch clauses that mention the variable

  • when an applicable assignment statement is in a try or try! with catch clauses unless:

    • the variable is not an out intent formal, and

    • all catch clauses return or throw

In the case that the variable is declared with no type-part or with a generic declared type, and where multiple applicable assignment statements are identified, all of the assignment statements need to contain an initialization expression of the same type.

Any variables declared in a particular scope that are initialized with split init in both the then and else branches of a conditional must be initialized in the same order in the then and else branches.
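The conditional rule above can be sketched with a small function in which only one branch assigns the variable while the other branch always returns. The function name describe and its strings are assumptions made for illustration.

```chapel
// Hypothetical helper; the name and the strings are illustrative only.
proc describe(n: int): string {
  const text;                  // no type-part and no initialization-part
  if n < 0 {
    return "negative";         // this branch always returns
  } else {
    text = "non-negative";     // applicable assignment: initializes 'text'
  }
  return text;
}

writeln(describe(-1));
writeln(describe(3));
```

Because text is not an out intent formal and the branch that does not assign it always returns, the assignment in the else branch can serve as the initialization, and the type of text is inferred from it.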

Default Initialization

If a variable declaration has no initialization expression and split initialization does not apply, the variable is initialized to the default value of its type. The default values are as follows:

Type        Default Value
bool        false
int(*)      0
uint(*)     0
real(*)     0.0
imag(*)     0.0i
complex(*)  0.0 + 0.0i
string      ""
bytes       b""
enums       first enum constant
classes     nil
records     default constructed record
ranges      1..0 (empty range)
arrays      elements are default values
tuples      components are default values
sync        base default value and empty status
atomic      base default value
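A few of these defaults can be observed directly with a short sketch. The enum Color and the variable names are hypothetical and chosen only for the example.

```chapel
enum Color { red, green, blue }   // hypothetical enum for illustration

var flag: bool;          // default: false
var n: int;              // default: 0
var x: real;             // default: 0.0
var s: string;           // default: empty string
var c: Color;            // default: first enum constant
var arr: [1..3] int;     // each element takes the default value of int
var t: (int, bool);      // each component takes its type's default value

writeln(flag, " ", n, " ", x, " ", c);
writeln(s.isEmpty(), " ", arr, " ", t);
```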

Local Type Inference

If the type is omitted from a variable declaration, the type of the variable is defined to be the type of the initialization expression.

var v = e;

is equivalent to

var v: e.type = e;

for an arbitrary expression e.
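As a small sketch of this equivalence, the inferred type of one variable can be queried with .type and reused in another declaration. The names v and w are arbitrary.

```chapel
var v = 2 * 3.5;      // the expression has type real, so v is real
var w: v.type = 1;    // v.type names the inferred type; 1 is coerced to it

writeln(v.type:string, " ", w.type:string);
```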

Multiple Variable Declarations

All variables defined in the same identifier-list are defined such that they have the same type and value, and so that the type and initialization expression are evaluated only once.

Example (multiple.chpl).

In the declaration

proc g() { writeln("side effect"); return "a string"; }
var a, b = 1.0, c, d:int, e, f = g();

variables a and b are of type real with value 1.0. Variables c and d are of type int and are initialized to the default value of 0. Variables e and f are of type string with value "a string". The string "side effect" has been written to the display once. It is not evaluated twice.

The exact way that multiple variables are declared is defined as follows:

  • If the variables in the identifier-list are declared with a type, but without an initialization expression as in

    var v1, v2, v3: t;
    

    for an arbitrary type expression t, then the declarations are rewritten so that the first variable is declared to be of type t and each later variable is declared to be of the type of the first variable as in

    var v1: t; var v2: v1.type; var v3: v1.type;
    
  • If the variables in the identifier-list are declared without a type, but with an initialization expression as in

    var v1, v2, v3 = e;
    

    for an arbitrary expression e, then the declarations are rewritten so that the first variable is initialized by expression e and each later variable is initialized by the first variable as in

    var v1 = e; var v2 = v1; var v3 = v1;
    
  • If the variables in the identifier-list are declared with both a type and an initialization expression as in

    var v1, v2, v3: t = e;
    

    for an arbitrary type expression t and an arbitrary expression e, then the declarations are rewritten so that the first variable is declared to be of type t and initialized by expression e, and each later variable is declared to be of the type of the first variable and initialized by the result of calling the function readXX on the first variable as in

    var v1: t = e; var v2: v1.type = readXX(v1); var v3: v1.type = readXX(v1);
    

    where the function readXX is defined as follows:

    proc readXX(x: sync) do return x.readXX();
    proc readXX(x) do return x;
    

    Note that the use of the helper function readXX() in this code fragment is solely for the purposes of illustration. It is not actually a part of Chapel’s semantics or implementation.
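The single evaluation of a shared initialization expression can be checked with a sketch like the following, which counts how often the initializer runs. The helper nextValue and the counter calls are hypothetical names introduced for the example.

```chapel
var calls = 0;                     // counts evaluations of the initializer

proc nextValue(): int {            // hypothetical helper with a side effect
  calls += 1;
  return calls * 10;
}

var a, b, c: int = nextValue();    // shared type-part and initialization-part

writeln(a, " ", b, " ", c, " ", calls);
```

All three variables hold the same value, and the counter shows that the initialization expression was evaluated once rather than once per variable.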

Rationale.

This algorithm is complicated by the existence of sync variables. If these did not exist, we could rewrite any multi-variable declaration such that later variables were simply initialized by the first variable and the first variable was defined as if it appeared alone in the identifier-list. However, sync variables require careful handling to avoid unintentional changes to their full/empty state.

Module Level Variables

Variables declared in statements that are in a module but not in a function or block within that module are module level variables. Module level variables can be accessed anywhere within that module after the initialization of that variable. If they are public, they can also be accessed in other modules that use that module.
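A minimal sketch of module level visibility follows. The module name Settings and its variables are assumptions made for illustration; the surrounding file-scope code is placed in an implicit module by the compiler.

```chapel
module Settings {              // hypothetical module for illustration
  var level = 3;               // module level variable, public by default
  private var secret = 42;     // module level variable hidden from other modules
}

use Settings;
writeln(level);                // visible through the 'use'
// writeln(secret);            // would be an error: 'secret' is private to Settings
```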

Local Variables

Local variables are declared within block statements. They can only be accessed within the scope of that block statement (including all inner nested block statements and functions).

A local variable only exists during its lifetime. The lifetime of a local variable will end when its deinitialization point, or deinit point, is reached. At that time, the local variable and the storage representing it is removed. See Deinit Points for more details.

Note that unlike most types, variables of unmanaged class type do not automatically reclaim the storage that they refer to. Such storage can be reclaimed as described in Deleting Unmanaged Class Instances.

Constants

Constants are divided into two categories: parameters, specified with the keyword param, are compile-time constants; constants, specified with the keyword const, are runtime constants.

Compile-Time Constants

A compile-time constant, or “parameter”, must have a single value that is known statically by the compiler. Parameters are restricted to primitive and enumerated types.

Parameters can be assigned expressions that are parameter expressions. Parameter expressions are restricted to the following constructs:

  • Literals of primitive or enumerated type.

  • Parenthesized parameter expressions.

  • Casts of parameter expressions to primitive or enumerated types.

  • Applications of the unary operators +, -, !, and ~ on operands that are bool or integral parameter expressions.

  • Applications of the unary operators + and - on operands that are real, imaginary or complex parameter expressions.

  • Applications of the binary operators +, -, *, /, %, **, &&, ||, &, |, ^, <<, >>, ==, !=, <=, >=, <, and > on operands that are bool or integral parameter expressions.

  • Applications of the binary operators +, -, *, /, **, ==, !=, <=, >=, <, and > on operands that are real, imaginary or complex parameter expressions.

  • Applications of the string concatenation operator +, string comparison operators ==, !=, <=, >=, <, >, and the string length and byte methods on parameter string expressions.

  • The conditional expression where the condition is a parameter and the then- and else-expressions are parameters.

  • Call expressions of parameter functions. See The Param Return Intent.
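Several of the constructs listed above can be combined in a short sketch. Each declaration below compiles only if its initializer is a parameter expression; the names (bits, mask, name, wide, twice, doubled) are hypothetical.

```chapel
param bits = 4;                                   // integer literal
param mask = (1 << bits) - 1;                     // binary operators on integral params
param name = "rank" + "2";                        // param string concatenation
param wide = if bits > 8 then true else false;    // conditional expression on params

proc twice(param p: int) param do return 2 * p;   // hypothetical param function
param doubled = twice(bits);                      // call of a parameter function

writeln(mask, " ", name, " ", wide, " ", doubled);
```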

Runtime Constants

Runtime constants, or simply “constants”, do not have the restrictions that are associated with parameters. Constants can be of any type. Whether initialized explicitly or via its type’s default value, a constant stores the same value throughout its lifetime.

A variable of a class type that is a constant is a constant reference. That is, the variable always points to the object that it was initialized to reference. However, the fields of that object are allowed to be modified.
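The distinction between the constant variable and the object it refers to can be sketched as follows, assuming a hypothetical class Counter.

```chapel
class Counter {               // hypothetical class for illustration
  var value: int;
}

const c = new Counter(1);     // 'c' always refers to this Counter instance
c.value = 2;                  // allowed: modifies a field of the object
// c = new Counter(3);        // not allowed: 'c' itself is a constant
writeln(c.value);
```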

Configuration Variables

If the keyword config precedes the keyword var, const, or param, the variable, constant, or parameter is called a configuration variable, configuration constant, or configuration parameter respectively. Such variables, constants, and parameters must be declared at the module level.

The default initialization of such variables can be overridden via implementation-dependent means, such as command-line switches or configuration files. When overridden in this manner, the initialization expression in the program is ignored.

Configuration parameters are set at compilation time via compilation flags or other implementation-defined means. The value passed via these means can be an arbitrary Chapel expression as long as the expression can be evaluated at compile-time. If present, the value thus supplied overrides the default value appearing in the Chapel code.

Example (config-param.chpl).

For example,

config param rank = 2;

sets an integer parameter rank to 2. At compile-time, this default value of rank can be overridden with another parameter expression, such as 3 or 2*n, provided n itself is a parameter. The rank configuration variable can be used to write rank-independent code.
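A sketch of rank-independent code driven by configuration variables might look like the following. The configuration constant n and the variable shape are hypothetical names added for the example.

```chapel
config const n = 10;          // hypothetical runtime configuration constant
config param rank = 2;        // compile-time configuration parameter

var shape: rank*int;          // homogeneous tuple whose size depends on 'rank'
for i in 0..<rank do shape[i] = n;

writeln("rank = ", rank, ", shape = ", shape);
```

With the chpl compiler, and assuming the code is saved in a file named config-demo.chpl, the parameter can be overridden at compile time with chpl -s rank=3 config-demo.chpl, and the constant at execution time with ./config-demo --n=20. Other implementations may provide different means.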

Ref Variables

A ref variable is a variable declared using the ref keyword. A ref variable serves as an alias to another variable, field, tuple component, or array element. The declaration of a ref variable must contain an initialization-part, which specifies what is to be aliased. If the declaration contains a type-part, this type must equal the type of the initialization-part. If the type-part is a generic type, the type of the initialization-part must be an instantiation of it. If the initialization-part is itself a ref variable or a call to a function with a ref return intent, the declared ref variable is an alias to the variable being aliased by the initialization-part.

Access or update to a ref variable is equivalent to access or update to the variable being aliased. For example, an update to a ref variable is visible via the original variable, and vice versa.

A ref variable can also be declared using const ref. Updates to const ref variables are disallowed.

The ref variable must be declared using const ref if its initialization-part disallows updates, for example if initialization-part is a const or const ref variable, a formal argument with a const ref intent (The Const Ref Intent), or a call to a function with a const ref or out return intent (The Const Ref Return Intent, The Out Return Intent). Parameter constants and expressions cannot be aliased.

Open issue.

The behavior of a const ref alias to a non-const variable is an open issue. The options include disallowing such an alias, disallowing changes to the variable while it can be accessed via a const ref alias, making changes visible through the alias, and making the behavior undefined.

Open issue.

The behavior of a ref alias to a domain or array variable when type-part is present is an open issue. In particular, should the runtime types of type-part and the variable being aliased be equal, or is the compile-time type equality sufficient?

Example (refVariables.chpl).

For example, the following code:

var myInt = 51;
ref refInt = myInt;                   // alias of the previous variable
myInt = 62;
writeln("refInt = ", refInt);
refInt = 73;
writeln("myInt = ", myInt);

var myArr: [1..3] int = 51;
proc arrayElement(i) ref do  return myArr[i];
ref refToExpr = arrayElement(3);      // alias to lvalue returned by a function
myArr[3] = 62;
writeln("refToExpr = ", refToExpr);
refToExpr = 73;
writeln("myArr[3] = ", myArr[3]);

const constArr: [1..3] int = 51..53;
const ref myConstRef = constArr[2];   // would be an error without 'const'
writeln("myConstRef = ", myConstRef);

prints out:

refInt = 62
myInt = 73
refToExpr = 62
myArr[3] = 73
myConstRef = 52

Variable Conflicts

If multiple variables defined in the same scope share a name, then a compilation error will occur when that name is used.

An error will not occur if the would-be conflicting symbols are defined within different scopes contained by the same outer scope. For example, the following code will not encounter a conflict when writing the symbol x:

Example (conflict1.chpl).

var x: int;
writeln(x);
{
  var x = 3; // Does not conflict with the earlier `x`
  writeln(x);
}

A variable will also conflict with other symbols defined in the same scope that share a name with it. While functions may share the same name (see Function and Operator Overloading), a function sharing a name with a variable in the same scope will lead to conflicts.

Variable Lifetimes

A variable only exists during its lifetime. The lifetime of a variable begins when the variable is initialized (whether at the declaration or at a later point with Split Initialization).

A variable’s lifetime ends:

  • after copy elision, if it occurred, that is, after the last mention of the variable is used to copy-initialize another variable or an in intent argument (see Copy Elision)

  • otherwise, at the variable’s deinit point (see Deinit Points)

Deinit Points

The compiler will add a deinitialization for each variable that is not the source of copy elision. The deinitialization point is particularly relevant for records and managed classes. For a record, the compiler will call the record deinit method at the deinitialization point. See Record Deinitializer for more details on this method.

Module-scope variables are destroyed at program tear-down as described in Module Deinitialization.

Fields are deinitialized when the containing class instance or record is deinitialized.

Regular local variables are destroyed at the end of the containing block. Temporary local variables have a different rule as described below.

The compiler adds temporary local variables to contain the result of nested call expressions. For example, g() in the statement f(g()); is a nested call expression. If the containing statement is an initialization expression for a ref or const ref, such as const ref x = f(g());, then the temporary local variables for that statement are deinitialized at the end of the containing block. Otherwise, the temporary local variables are deinitialized at the end of the containing statement.

Example (temporary-deinit-point.chpl)

record R { } // minimal record definition so that the example is self-contained

proc makeRecord() {
  return new R(); // creates a new R record
}
proc f(const ref arg) {
  return new R(); // ignores argument, returns new record
}

proc temporaryInDeclaration() {
  const x = f(makeRecord());
  // the temporary result of 'makeRecord()' is deinited here
  writeln("block ending");
  // 'x' is deinited here
}

proc temporaryInConstRefDeclaration() {
  const ref x = f(makeRecord());
  writeln("block ending");
  // 'x' and the temporary result of 'makeRecord()' are deinited here
}

proc temporaryInStatement() {
  f(makeRecord());
  // the temporary results of 'f()' and 'makeRecord()' are deinited here
  writeln("block ending");
}

Copy and Move Initialization

This section uses the terminology copy and move. These terms describe how a Chapel program initializes a variable based upon an existing variable. Both copy and move initialize a new variable from an initial variable. The compiler may change copy initialization to move initialization with Copy Elision.

Since records can use init= and deinit methods to adjust the behavior of copy initialization, this section is particularly relevant for records. It is also relevant for non-nilable owned class types, since copies of those types will not be allowed by the compiler, and for strings, arrays, and domains, which have record-like behavior in this regard. For records and other types that behave like “plain old data”, copy and move are indistinguishable.

After a copy, both the new variable and the initial variable exist separately. Generally speaking, they refer to different storage and can be modified independently. For example, changing a field in the new record variable should not change the corresponding field in the initial record variable.

A move is when the value changes its storage location from the initial variable to the new variable. It is similar to a copy initialization, but it represents a transfer rather than a duplication. In particular, the initial record variable is no longer available after the move. A move can be thought of as an optimized form of a copy followed by destruction of the initial record. After a move there is only one record variable, whereas after a copy there are two.

When a record is copied, it will run its copy initializer, otherwise known as proc init=.

The compiler will choose whether to add copy or move initialization based upon the pattern of variable mentions.

Here is an example of when copy initialization occurs:

var x:R = ...;
var y:R = x;    // copy initialization occurs here
... uses of both x and y ...;

Here is an example of when the compiler uses move initialization:

record R { ... }
proc makeR() {
  return new R(...);
}
var x = makeR();    // move initialization occurs here

The remainder of this section describes situations in which a copy or a move is added by the compiler to implement some kind of initialization.

When one variable is initialized from another variable or from a call expression, the compiler must choose whether to perform copy initialization or move initialization.

The following table shows in which situations a copy or move initialization is added. Each row in this table corresponds to a particular use of an expression <expr>. Each column indicates the kind of the expression <expr>.

operation                value call   local var last mention   local var mentioned again   outer/ref
variable initialization  move         move                     copy                        copy
value return             move         move                     impossible                  copy

Here are definitions of the rows and columns:

  • variable initialization means when a new variable is initialized in a variable declaration, in a field initialization, or by the in argument intent

  • value return means that an expression is returned from a function by value

  • value call means a function call that does not return with ref or const ref return intent

  • local var last mention means a use of a function-local variable which is not mentioned again - see Copy Elision for further details

  • local var mentioned again means a use of a function-local variable which is mentioned again later

  • outer/ref means a use of a module-scope variable, a variable in an outer function, or reference variable or argument
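The difference between the "local var last mention" and "local var mentioned again" columns can be observed with a sketch that counts copy-initializer calls. The record R defined here, the counter copies, and the function names are all hypothetical and exist only for this illustration.

```chapel
var copies = 0;                     // counts calls to the copy initializer

record R {                          // hypothetical record for illustration
  var id: int;
  proc init(id: int) { this.id = id; }
  proc init=(other: R) {            // copy initializer
    this.id = other.id;
    copies += 1;
  }
}
operator =(ref lhs: R, rhs: R) { lhs.id = rhs.id; }

proc makeR() do return new R(1);

proc mentionedAgain() {
  var x = makeR();                  // value call: move
  var y = x;                        // 'x' is mentioned again below: copy
  writeln(x.id, " ", y.id);
}

proc lastMention() {
  var x = makeR();                  // value call: move
  var y = x;                        // last mention of 'x': move
  writeln(y.id);
}

mentionedAgain();
lastMention();
writeln("copies = ", copies);
```

Under the rules in the table, only the initialization of y in mentionedAgain is expected to invoke init=, so the counter should end at one.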

Copy Elision

The compiler elides a copy initialization from a local var or const variable when the source variable is not mentioned again. When a copy is elided, the copy initialization is changed into move initialization and the source variable is considered dead. Compile-time analysis provides compilation errors when a variable is used after it is dead in common cases.

Like split initialization, copy elision looks forward from variable declaration points and considers mentions of variables to determine whether or not a copy can be elided. After a copy, if the source variable is not mentioned again, the copy will be elided. Since a return or throw exits a function, a copy can be elided if it is followed immediately by a return or throw. When searching forward from variable declarations, copy elision considers eliding copies only within block statements { }, local blocks, serial blocks, sync blocks, try blocks, try! blocks, and conditionals.

Example (copy-elision.chpl)

config const option = true;

record R { } // minimal record definition so that the example is self-contained

proc makeRecord() {
  return new R(); // creates a new R record
}

proc elideCopy() {
  var x = makeRecord();
  var y = x; // copy elided because 'x' is not used again
  writeln("block ending");
}
elideCopy();

proc noElideCopy() {
  var x = makeRecord();
  var y = x;  // copy is not elided because 'x' is used again
  writeln(x); // 'x' used here
  writeln("block ending");
}
noElideCopy();

proc elideCopyInReturningConditional() {
  var x = makeRecord();
  if option {
    var y = x; // copy elided because 'x' is not used again
    writeln("returning");
    return;    // because this branch of conditional returns
  }
  writeln(x);  // mention of 'x' here not relevant
  writeln("block ending");
}
elideCopyInReturningConditional();

proc elideCopyBothConditional() {
  var x = makeRecord();
  var y; // split initialization below
  if option {
    y = x;
  } else {
    y = x;
  }
  // copy is elided because 'x' is not used after the copy
  // (in either branch of the conditional or after it)
  writeln("block ending");
}
elideCopyBothConditional();

Copy elision does not apply:

  • when the source variable is a reference, field, or module-level variable

  • when the copy statement is in one branch of a conditional but not in the other, or when the other branch does not always return or throw.

  • when the copy statement is in a try or try! block which has catch clauses that mention the variable or which has catch clauses that do not always throw or return.