chapel-py¶
Python bindings for Chapel’s frontend library, Dyno.
The purpose of this library is to allow programmers to easily create tools for the Chapel programming language that would be more difficult to write using C++ (the native language in which the Chapel frontend library is written).
For example, here’s a toy program that prints the names of all things declared
anywhere in myfile.chpl
.
from chapel import *
ctx = Context()
ast = ctx.parse("myfile.chpl")
all_names = set()
for node in postorder(ast):
if isinstance(node, NamedDecl):
all_names.add(node.name())
for name in all_names:
print(name)
If myfile.chpl
contains the following:
module myfile {
var x = 1+1;
var y = x+1;
record R {
}
}
Then the program will print out something like the following:
R
y
x
myfile
Note
The Chapel Python bindings are not yet fully mature. The API is subject to change and is not considered stable. Users of this library should be prepared to update their code as the library evolves.
Installation¶
Make sure that you have a from-source build of Chapel available in your
CHPL_HOME
. Currently, the build script also requires having LLVM available
in your path. The build script also requires that the development package of
Python be installed (for many package managers this is called
python3-devel
). With those constraints met, you can just run make
chapel-py-venv
:
This will allow you to use the Python bindings from a Python script run with
$CHPL_HOME/util/config/run-in-venv-with-python-bindings.bash python3
,
instead of just python3
. If you wish to install the Python bindings in your
system Python, run python3 -m pip install $CHPL_HOME/tools/chapel-py
.
Usage¶
The library is split into several major components:
The
chapel
module provides the AST node class hierarchy and theContext
object. It also provides some higher-level, pure Python implementations of certain utility functions. For instance, it provides thepreorder
andpostorder
traversal iterators and amatch
function to perform AST pattern matching.The
chapel.replace
module provides the “replacer API”, which is intended to perform transformations on existing Chapel files for various reasons. For instance, deprecations and syntax changes can be performed automatically using the replacer API, by finding AST patterns and performing string substitution.The
chapel.lsp
module provides a few helpers to transform Dyno types to LSP (Language Server Protocol) types. This module requires the lsprotocol package.
chapel
¶
The main entry point to the Chapel Python API is the Context
object. This
is a wrapper around the C++ construct of the same name. The Context in Dyno
is responsible for memoizing computations, interning strings, and more. The
Python wrapper around this object has methods that can be used to get started
with processing Chapel programs; one such method is parse
, which accepts
a file path and returns a list of Chapel AST nodes that represent this file.
At compile-time, chapel-py
generates a class hierarchy of Python objects,
with base class AstNode
. A class is created for each of the AST nodes understood
by Dyno. Thus, one might write:
from chapel import *
ctx = Context()
asts = ctx.parse("myfile.chpl")
print("it's a module" if isinstance(asts[0], Module) else "it's not a module")
Here, the Python isinstance
builtin is used to determine if the first
AST node returned by parse
is a module (it should be!)
Because the AST nodes form a class hierarchy, it’s possible to perform a more
general check; the following example prints “It’s a loop!” for all loops
(do
- while
, for
, foreach
, etc.), and “It’s not a loop…” otherwise.
def check(node):
if isinstance(node, Loop):
print("It's a loop!")
else:
print("It's not a loop...")
The AST nodes from the Python API are iterable; iterating over a node yields its children. Thus, one might print all the declarations in a module as follows:
def print_decls(mod):
for child in mod:
if isinstance(child, NamedDecl):
print(child.name())
Full AST traversals can be built on top of this functionality. Below
is the definition for postorder
from the chapel
module:
def postorder(node):
"""
Recursively visit the given AST node, going in post-order (children-then-parent)
"""
for child in node:
yield from postorder(child)
yield node
To find the list of available methods for each AST node class, the current best
practice is to consult the generated
$CHPL_HOME/tools/chapel-py/src/chapel/core/__init__.pyi
file. This is
generated by default when the Python bindings are built and is the most
up-to-date source of information on the API.
The chapel
module also provides convenience functions for working with the library.
For instance, it provides the postorder
and preorder
iterators (the
implementation of the former is included above). It also provides a couple
of more advanced helpers for dealing with Chapel ASTs.
chapel.parse_attribute
¶
The parse_attribute
function, given a “description” of an attribute (its
name and formal list), tries to parse an Attribute
AST node. This function
accounts for named and unnamed actuals, reordering, etc. For instance, given
an “attribute schema”:
attr = ("doXYZ", ["x", "y", "z"])
And the following Chapel attribute:
@doXYZ("hello", z = "world", y = "!")
The function returns:
{"x": "hello", "y": "!", "z": "world"}
This function is used in the Chapel linter to handle silencing warnings:
def ignores_rule(node, rulename):
ag = node.attribute_group()
if ag is None: return False
for attr in ag:
attr_call = chapel.parse_attribute(attr, IgnoreAttr)
if attr_call is None: continue
ignored_rule = attr_call["rule"]
if ignored_rule is not None and ignored_rule.value() == rulename:
return True
return False
chapel.match_pattern
¶
This function provides general pattern matching functionality to enable users of the Python API to rapidly find “interesting” locations in the AST. It supports arbitrary levels of nesting, as well as “named variables” to easily retrieve deeply nested AST nodes after a pattern has matched.
Generally, a call to match_pattern
takes an AST node and a pattern (what
constitutes a pattern is explained below). If the pattern matches, the function
returns a Python dict
that maps pattern variables to their corresponding
AST nodes; an empty dict
is returned if no pattern variables were used. If
the pattern does not match, the function returns None
.
To explain the possible patterns, it’s helpful to have an example AST.
We will use the following Chapel function, and its AST myFn
, in the following
examples:
proc f(x: int, y: int) {
var z = 1+2;
return x+y+z;
}
The simplest example of pattern is an AST class. For instance:
print(chapel.match_pattern(myFn, Function)) # Prints "{}"
When a function AST node myFn
is matched against the Function
pattern,
the match is successful; since no pattern variables were used, this results
in an empty dictionary {}
.
A list can be used to represent patterns of AST nodes with children. To match a function with exactly two arguments:
print(chapel.match_pattern(myFn, [Function, Formal, Formal, Block])) # Prints "{}"
Since we still haven’t used any pattern variables, the result of
match_pattern
still returns {}
. However, having matched the pattern, it
would be cumbersome to attempt to retrieve each of the two formals from the
list: we’d need to access the function’s children again, by their index. Since
Formal
nodes are direct children of the Function
, this isn’t too
inconvenient; however, conceivably, the nodes of interest could be further down
in the AST from the parent. This is where pattern variables come in. Instead of
writing Formal
, we can write ("?f1", Formal)
. This will cause the
pattern matcher to save the AST node matching Formal
into the variable
f1
.
pat = [Function, ("?f1", Formal), ("?f2", Formal), Block]
print(chapel.match_pattern(myFn, pat)) # Prints "{ 'f1': ..., 'f2': ... }"
The dictionary returned by match_pattern
can then be queried by the variables
to retrieve the formals.
We can further restrict our pattern to require formals with ‘simple’ type expressions (identifiers).
pat = [Function, [Formal, Identifier], [Formal, Identifier], Block]
print(chapel.match_pattern(myFn, pat)) # Prints "{}"
Pattern variables can still be used with list patterns like [Formal,
Identifier]
. This is done by adding the pattern variable string to the front
of the list.
pat = [Function, ["?f1", Formal, Identifier], ["f2", Formal, Identifier], Block]
print(chapel.match_pattern(myFn, pat)) # Prints "{ 'f1': ..., 'f2': ... }"
Some other useful features of the pattern library:
The
rest
pattern can be used as part of a list to indicate that you don’t care about the remaining AST children, however many there are.A variable without an AST node, (
"?x"
) can be used by itself to match any AST node and store it inx
.A
set
can be used to represent a disjunctive or pattern. For example,set([Begin, Cobegin])
will match either aBegin
or aCobegin
node.
chapel.each_matching
¶
This function combines the operation of preorder
and match_pattern
to iterate
an AST and yield all nodes matching the given pattern. Concretely, given
a pattern, it yields the node and the variables resulting from the match.
The following snippet will print one line for each binary operation, listing
the operation itself and the operation’s two operands.
for (op, variables) in chapel.each_matching(myFn, [OpCall, "?l", "?r"]):
print("Found an operation ", op.op(), "with operands: ", variables["l"], variables["r"])
chapel.replace
¶
The chapel.replace
module is used for writing “replacer scripts”. The motivation
for this module is evolving the Chapel language. As the language develops, we
tend to shift the patterns we prefer, and modify unstable language syntax. When
modifying existing code to match the new standards, we either have to perform
the changes manually, or resort to “general” tools like sed
or awk
. However,
the trouble with these tools is that they are suited for modifying streams of
text. When modifying programs in a language, it is more natural to work with
a structured representation – the AST. Thus, the chapel.replacer
module
provides utilities to modify a file by traversing the ASTs contained within,
and generating a list of substitutions.
Although the AST is used to find places in the code where changes must be made, the changes themselves are performed using string operation. This is motivated by many reasons:
Dyno does not have good support for modifying ASTs (they are immutable).
If an AST is modified, it should be printed back to the file with only those modifications; however, this would require the Chapel syntax printer to be able to preserve most of the original format of the text.
Writing plain code, such as
1+1
, is often easier than creating an AST node (PlusNode(IntNode(1), IntNode(1))
).
The chapel.replace
module provides a driver function run
, which, when given
a source of changes (described below), takes over the execution of the program.
It registers and parses command line arguments, and handles file modifications.
Thus, given an AST traversal, you end up with a fully-featured command-line
script ready to be pointed at files.
usage: replace [-h] [--suffix SUFFIX] [--in-place] [filenames ...]
A tool to search-and-replace Chapel expressions with others
positional arguments:
filenames
options:
-h, --help show this help message and exit
--suffix SUFFIX
--in-place`
The ‘source of changes’ is a Python iterator that should accept two arguments:
rc
(for ‘replacement context’) and root
(for the AST node at which the
traversal begins). A very simple (albeit completely pointless and semantically
incorrect) example is the following script, which replaces all occurrences of
the number 42
with meaningOfLife
:
def replace_all_42(rc, root):
for (num, _) in each_matching(root, IntLiteral):
if num.text() == '42': yield (num, 'meaningOfLife')
run(replace_all_42)
A more practical example is renaming methods. Note that this is not
amenable to naive substitution: we don’t want to accidentally rename non-methods
that happen to have the same name, or even locally declared functions inside
other functions in a record or class. Consider renaming the enterThis
method to enterContext
(this is a real change that was performed in
the Chapel language in 1.32, though it was not done automatically with the
help of this tool). In the following snippet, only the proc
labeled (1)
should be renamed. This is because enterThis
is actually a method. On
the other hand, (2)
, even though it’s declared within R
’s curly braces,
is actually a locally defined function, and not a method on R
.
(3)
is a freestanding function, and thus should also not be renamed.
record R {
proc enterThis() {} // (1)
proc f() {
proc enterThis() {} // (2)
}
}
proc enterThis() {} // (3)
This can be implemented using the following iterator:
def replace_enter_this(rc, root):
for (fn, _) in each_matching(root, Function):
if fn.name() == "enterThis" and fn.is_method():
yield (fn, lambda txt: txt.replace("enterThis", "enterContext", 1))
In this case, instead of yielding a pair of the node-to-replace and the
replacement text, the iterator produces a lambda
, which will be fed the
node’s current string representation (proc enterThis() {...
). Using
this lambda, we perform a simple substitution, renaming enterThis
to
enterContext
with Python’s regular replace
method.
Finally, the following script was used to automatically insert interfaces into records and classes that provided the corresponding functionality. It was used to update about 150 files (including Chapel modules and tests). This script implements the following features:
Detecting particular signatures of the
init
method specific to deserializationAlways printing the supported interfaces in a specific order
Supporting both primary and secondary methods, including a mix of both
Adding a new interface list or modifying an existing one
def tag_aggregates_with_io_interfaces(rc, root):
aggrs_to_change = defaultdict(lambda: set())
names_to_tag = defaultdict(lambda: set())
for (fn, _) in chapel.each_matching(root, Function):
if not fn.is_method(): continue
name = fn.name()
if name == "serialize":
tag = "writeSerializable"
elif name == "deserialize":
tag = "readDeserializable"
elif name == "init":
formal_names = []
for child in fn:
if not isinstance(child, Formal): continue
if child.name() == "this": continue
formal_names.append(child.name())
if len(formal_names) >=2 and formal_names[-1] == "deserializer" and formal_names[-2] == "reader":
tag = "initDeserializable"
else:
continue
else:
continue
if fn.is_primary_method():
aggrs_to_change[fn.parent().unique_id()].add(tag)
continue
this_receiver = fn.this_formal()
names_to_tag[rc.node_exact_string(this_receiver)].add(tag)
def build_tag_str(tags):
if len(tags) == 3: return "serializable"
# tags have a preferred order, so just use an if-else chain to make that work
the_order = ["writeSerializable", "readDeserializable", "initDeserializable"]
return ", ".join(t for t in the_order if t in tags)
for (record, _) in chapel.each_matching(root, AggregateDecl):
tags = set()
if record.unique_id() in aggrs_to_change:
tags |= aggrs_to_change[record.unique_id()]
if record.name() in names_to_tag:
tags |= names_to_tag[record.name()]
if len(tags) == 0: continue
tag_str = build_tag_str(tags)
record_text = rc.node_exact_string(record)
curlypos = record_text.find("{")
colonpos = record_text.find(":")
if colonpos >= 0 and colonpos < curlypos:
new_text = record_text.replace(" {" , ", " + tag_str + " {" , 1)
else:
new_text = record_text.replace(record.name(), record.name() + " : " + tag_str, 1)
yield (record, new_text)