Vectorizing Iterator
Data parallel constructs (such as forall loops) are implicitly vectorizable. If the --vectorize compiler flag is thrown, the Chapel compiler will emit vectorization hints to the backend compiler, though the effects will vary based on the target compiler.
To let users request vectorization explicitly, this prototype vectorizing iterator is provided. Loops that invoke this iterator will be marked with vectorization hints, provided the --vectorize flag is thrown.
This iterator is currently available for all Chapel programs and does not require a use statement to make it available. In future releases it will be moved to a standard module and will likely require a use statement to make it available.
proc iteratorToArrayElementType(type t: _iteratorRecord) type
iter vectorizeOnly(iterables ...)
Vectorize only “wrapper” iterator:
This iterator wraps and vectorizes other iterators. It takes one or more iterables (an iterator or class/record with a these() iterator) and yields the same elements as the wrapped iterables.
This iterator provides a way to vectorize data parallel loops without invoking a parallel iterator. The goal is to avoid task creation for loops with small trip counts, or in other cases where task creation isn’t desirable.
Data parallel operations in Chapel such as forall loops are order-independent. However, a forall is implemented in terms of either leader/follower or standalone iterators which typically create tasks. This iterator exists to allow vectorization of order-independent loops without requiring task creation. By using this wrapper iterator you are asserting that the loop is order-independent (and thus a candidate for vectorization) just as you are when using a forall loop.
When invoked from a serial for loop, this iterator will simply mark your iterator(s) as order-independent. When invoked from a parallel forall loop this iterator will implicitly be order-independent because of the semantics of a forall, and additionally it will invoke the serial iterator instead of the parallel iterators. For instance:
forall i in vectorizeOnly(1..10) do;
for i in vectorizeOnly(1..10) do;
will both effectively generate:
CHPL_PRAGMA_IVDEP
for (i=1; i<=10; i+=1) {}
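As a minimal sketch of what an order-independent loop body looks like in practice, the following wraps a range in vectorizeOnly from a serial for loop. The names n and A are hypothetical and chosen only for illustration. Whether the backend compiler actually vectorizes the loop depends on the --vectorize flag and on the target compiler.

```chapel
// Hypothetical problem size; override with --n=<value> at run time.
config const n = 8;

var A: [1..n] real;

// Each iteration writes only A[i] and reads no other element, so the
// iterations can run in any order. Using vectorizeOnly asserts this.
for i in vectorizeOnly(1..n) do
  A[i] = 2.0 * (i: real);

// A body such as "A[i] = A[i-1] + 1.0" would carry a dependence between
// iterations and would NOT be a valid use of vectorizeOnly.

writeln(A);
```

Because the loop is a serial for loop, no tasks are created; the wrapper only marks the loop as a vectorization candidate.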
The vectorizeOnly iterator automatically handles zippering, so the zip keyword is not needed. For instance, to vectorize:
for (i, j) in zip(1..10, 1..10) do;
simply write:
for (i, j) in vectorizeOnly(1..10, 1..10) do;
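A slightly fuller sketch of the zippered form is shown here, assuming two hypothetical arrays A and B whose index ranges differ but have the same length. The two ranges passed to vectorizeOnly are traversed in lockstep, as they would be with zip.

```chapel
// Hypothetical problem size; override with --n=<value> at run time.
config const n = 8;

var A: [1..n] real;
var B: [0..#n] real = 1.0;

// i walks 1..n while j walks 0..n-1; no zip keyword is needed because
// vectorizeOnly zippers its arguments itself.
for (i, j) in vectorizeOnly(1..n, 0..#n) do
  A[i] = 2.0 * B[j];

writeln(A);
```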
Note that the use of zip is not explicitly prevented, but all iterators being zipped must be wrapped by a vectorizeOnly iterator. Future releases may explicitly prevent the use of zip with this iterator.
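If zip is used explicitly anyway, the rule above means each zipped iterand gets its own vectorizeOnly wrapper. The following is a minimal sketch under that reading; the names n and C are hypothetical.

```chapel
// Hypothetical problem size; override with --n=<value> at run time.
config const n = 8;

var C: [1..n] int;

// Explicit zip: every iterand is individually wrapped in vectorizeOnly.
for (i, j) in zip(vectorizeOnly(1..n), vectorizeOnly(1..n)) do
  C[i] = i + j;

writeln(C);
```

The form without zip, passing both ranges to a single vectorizeOnly call, expresses the same loop and does not depend on zip remaining permitted.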