Welcome to the final day of our ‘Twelve Days of Chapel AoC’ series! For some
background on this series, check out the introductory article.
You can also click the Advent of Code 2022 tag above to see all the other
articles from this series.
The Task at Hand and My Approach
In today’s challenge, our protagonist is about to embark on a hiking journey through the jungle to rendezvous with the elves, but first we need to plan an efficient route!
The trusty handheld device given to us on day 6
provides a topographic map of the surrounding landscape in the form of a grid
of lowercase letters. In this map, a represents the lowest elevation, and z
represents the highest. From the starting position, marked by an S, we are
tasked with finding the length of the shortest path to the top of a nearby
hill — marked with an E. Additionally, the possible paths through the terrain
are limited by our character’s climbing abilities. We can only follow paths
where the elevation increases by one step, or decreases by any number of steps,
between adjacent grid points.
To solve this problem, I split it into two major parts:
- reading the height-map into a numerical 2D array where each letter is mapped to an integer (a -> 0 and z -> 25) that represents the elevation at that point
- applying a recursive task-parallel search algorithm to the map to find the shortest path from S to E
The following sections will cover the implementation of both parts in detail. If you’ve been following this series so far, you’ve probably seen ample discussion of the concepts shown in the IO section. If that’s the case, please feel free to skim or skip ahead.
In the second step, I’ll first describe a serial implementation of the search algorithm to give a clear sense of how it works. I’ll then explain how we can use Chapel’s task-parallel features and atomic variables to easily create a parallel implementation of the same algorithm.
For those who like to watch the movie before reading the book, here is the full code:
aoc2022-day12-summit.chpl
Reading the Elevation Map
To handle the IO portion of the puzzle, I write a procedure called
readElevations that parses the raw-text input, as in the following sample
case:
Sabqponm
abcryxxl
accszExk
acctuvwj
abdefghi
It returns a three-tuple containing the following items:
- a 2D array with the elevation of each grid point represented as a numerical value from 0–25
- a two-tuple with the coordinates of the starting position
- a two-tuple with the coordinates of the ending position
To start out, I define the header of the procedure and use the IO module
just inside:
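// a sketch of how the procedure starts out, based on the description here;
// the exact original listing may differ
proc readElevations() {
  use IO;

  // ... (parsing code discussed throughout this section)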
As a reminder from the 8th day
in this series, placing a use statement inside the body of a procedure
like this will make the symbols from that module available within the
procedure’s scope only. The same is true for some other scopes, such as
iterators or plain old curly-braces (i.e., { use IO; ... }).
For this program, I’ve confined all the IO operations to the
readElevations procedure, so it makes sense that IO’s symbols should only be accessible from within it.
Next, I define some params to represent the numerical ASCII values
of a few important characters. See day 3’s article
for more on params.
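A sketch of what these declarations might look like (assuming toByte is used to compute the values at compile time):
param a = "a".toByte(),
      S = "S".toByte(),
      E = "E".toByte();
// (if these values can't be computed at compile time in your Chapel version,
// declaring them as 'const' works just as well)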
The value of lowercase a will be used to map the input characters to
their numerical representations. Uppercase S and E will be used to
locate the starting and ending positions in the 2D array.
Note that I also could have looked at an ASCII table, and hard-coded the values instead:
param a: uint(8) = 97,
      S: uint(8) = 83,
      E: uint(8) = 69;
However, the original approach tends to be less error-prone and more self-documenting.
With those params set up, I’ll read the lines of the input into an
array of strings, and then compute the size of the grid:
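// a sketch based on the description below; the exact listing may differ
const elevLines = stdin.lines().strip(),
      grid = {0..<elevLines.size, 0..<elevLines[0].size};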
This code makes use of the lines iterator, which yields one
line of text from the input at a time in the form of a string. I then call
the strip method on each of those strings to remove their trailing
newline characters. Because I’ve assigned the iterator call directly to
a variable, Chapel will implicitly create a 0-indexed array with one entry
for each iteration. As such, elevLines will be an array of strings, one for each line of the input.
This is very similar to the approach taken to parse the input in day 10’s article.
On the second line, I query the size of the array to get the height of the map
and the size of the first line to get the width. Both values are then used
to define a 2D domain called grid. This domain represents all the
pairs of indices that comprise the elevation map (for more information
about domains, check out the article from day 8
of this series).
And now I use grid to define a numerical elevation array called elevs
(not to be confused with elves):
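// a sketch matching the description below
var elevs = [(i, j) in grid] elevLines[i][j].toByte() - a;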
This array initialization syntax is composed of three major parts:
The leftmost portion, var elevs = , tells Chapel that I want to
store the result of the expression on the right in a variable named
elevs.
In the middle of this line, [(i, j) in grid] indicates that I want to
initialize an array whose elements are defined by the indices in grid.
In other words, the array will use grid as its domain, and each of its
elements will be defined in terms of some expression (to the right) that
can use the values i and j.
To the right, I use these indices to pull out individual characters
from elevLines. Specifically, this code takes the jth character of
the ith line, converts it to a byte, and then subtracts the special
a value from that byte.
This has the effect of mapping a = 0, b = 1, and so on, up to
z = 25 (note that the ASCII values of the lowercase letters are
all consecutive integers). The only characters that won’t be represented
correctly in elevs are the S and E.
Those values are located using the following maximum-location (‘maxloc’) reductions:
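// a sketch based on the description below
const (_, start) = maxloc reduce zip(elevs == (S - a), grid),
      (_, end) = maxloc reduce zip(elevs == (E - a), grid);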
The maxloc reduction takes a zippered pair of iterable expressions with
compatible size and shape. Here, the first argument to zip is the set
of values over which we want to find a maximum, and the second argument
is the set of indices we’ll use to define the location of that maximum value.
Both the maximum value and its location are returned in a two-tuple. Here,
I don’t actually need the value itself, only its location in grid, so I
choose not to store it in a variable by putting an underscore (_) in its
place.
More on maxloc can be found in day 6’s post
or in the documentation.
In both of these lines, I am applying the reduction to a promoted array
expression. The first expression, elevs == (S - a), produces an array of
boolean values where the only true entry should be the location of S
[note: Chapel is smart enough to avoid actually creating this array. Instead,
it will provide it to the reduction as an iterator in order to save memory.].
Note that we are checking against S - a, rather than S, because we already
subtracted the ASCII value of a from all entries in the elevation array.
Chapel defines true to be greater than false, so maxloc will find
the location of S, storing it in start as a two-tuple of coordinates. The
reduction to find the ending position works in a similar manner. With these
locations, I can set the proper elevations for the starting and ending
positions, as defined by the problem:
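// set the elevations defined by the puzzle: 'S' sits at elevation 'a' (0),
// and 'E' sits at elevation 'z' (25)
elevs[start] = 0;
elevs[end] = 25;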
And now we have everything we need from the input text, so the relevant values are returned from the procedure in a three-tuple:
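// cast to signed 8-bit integers on the way out (a promoted cast over the array)
return (elevs: int(8), start, end);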
In the process we cast elevs to a signed array of 8-bit integers using a
promoted cast operation (: int(8)). When initialized, elevs was assigned
with the type: [grid] uint(8), meaning that it contained unsigned 8-bit
integers (this is because toByte() returns a uint(8)); however, signed
integers will be more convenient for subtractions later on, so I apply a cast
here (note that this involves an extra array allocation, so defining elevs
as an array of int(8) to begin with may be a more efficient strategy for
larger problems).
Next, I’ll discuss how to use the data extracted from the input text to find the shortest path from start to end!
Searching for the Shortest Path
As a reminder, the goal of this step is to find the length of the shortest
possible path from S to E, where the set of possible paths is
constrained by the elevation changes in the landscape. A path can only go
from one space to another if the elevation of the destination is at most
one step higher than the elevation of the current space.
To facilitate a search over the possible paths, I’ll define an explore
procedure which starts at one space in the map, and attempts to explore the
four surrounding spaces. If an adjacent space is too high (or sits outside
the bounds of the map), it will be ignored. Otherwise, explore will be
called on that neighboring space, and the search will continue
until explore is called on the end space.
This approach falls under the category of a recursive tree search
algorithm. It’s recursive
because the principal function repeatedly calls itself until some terminating
condition is met (in this case, explore stops calling itself when the search
reaches the end space). It’s a tree search, because at each node (or grid
space) there are multiple “branches” that the path could take next (here:
up, down, left, and right are the possible options).
Very roughly, explore will look something like this (many details are
omitted here):
proc explore(pos, end, elevations, pathLength): int {
  if pos == end then return pathLength;
  var shortest = max(int);
  for nextPos in nextPositions(pos, elevations) do
    shortest = min(shortest, explore(nextPos, end, elevations, pathLength + 1));
  return shortest;
}
where nextPositions is an iterator I’ll define later that provides all the
possible next steps taking into account the elevation constraint and the
borders of the map.
At a high level, this simplified version of explore does two things:
- If pos is at the end, it returns the total length of the path that got us here. Notice that each call to explore increments pathLength by 1, so by this point its value will be the total number of explorations required to traverse the path.
- Otherwise, starting from pos, it explores all the neighboring spaces that can be explored, and returns the shortest resulting path.
The net result is that calling explore with start as the first argument
should eventually return the shortest path to end.
However, there are a few challenges with this approach that are not addressed in the dummy implementation above:
- We need some mechanism to keep track of paths that have already been tried so that our search doesn’t end up going in circles. For example, starting from some space, explore could be called on the space to its right, which could then immediately call explore back to its left, bringing the search back to the original space without making any progress towards the end.
- The number of possible paths is huge, so we’ll want to terminate branches of the search early whenever we know that they aren’t going to beat the record for the shortest path (a technique called pruning).
- We should be able to process search paths in parallel across multiple threads to speed up the search process — this is a good opportunity to make use of some of Chapel’s parallel features that have not been explored in this series so far.
In the following section, I’ll describe how the first two challenges are addressed in a complete serial implementation. After that, I’ll show a parallel code that addresses all three.
The Serial Search Algorithm
I use a single mechanism to solve the first two problems described above: instead of directly returning the minimum of the search branches, I’ll allocate a 2D array that keeps track of the shortest known path-length to each location in the map.
This array will be queried and updated over the course of the search. For
convenience, I create the following findShortestPath procedure that sets
this up:
proc findShortestPath(const ref elevs: [?d] int(8), start, end) {
  var minDistanceTo: [d] int = max(int);
  explore(start, end, elevs, minDistanceTo, 0);
  return minDistanceTo[end];
}
(what does const ref mean?)
This is an example of an argument intent.
It indicates that the formal elevs must be taken by reference (hence
ref) — meaning that the variable won’t be copied or moved into the
procedure when findShortestPath is called — and that it cannot be
modified by findShortestPath (hence const). Since elevs is an array,
the default argument intent is ref, meaning that I also could have simply
written const instead; however, I decided to write the full const ref
for documentation purposes.
(what does : [?d] int(8) mean?)
This is a type specifier — and a fairly advanced one at that.
It indicates to the compiler (and to anyone reading the code), that
this procedure will only accept an array of int(8) as its first
argument. That’s the : [] int(8) portion.
The ?d is called a type-query. In this case, it queries the
array’s domain and stores its value in a symbol called d. This
makes it easy to reuse d in the body of the procedure to define
minDistanceTo over the same set of indices as the elevs argument.
An alternative would have been to omit the ?d and query the domain
directly in the procedure’s body:
proc findShortestPath(const ref elevs: [] int(8), start, end) {
  var minDistanceTo: [elevs.domain] int = max(int);
  // ...
The array minDistanceTo is initialized to have the maximum integer value
for all elements. The rationale behind this is the same as in the dummy
implementation of explore above: the starting minimum value is initialized
to max(int) so that essentially any value we compare with it becomes
the new working minimum.
After exploration is complete, the minDistanceTo array will be populated
with the length of the shortest path from start to each location in the map
(more on how I accomplish this shortly). The value of this array at the end
how I accomplish this shortly). The value of this array at the end
location is the solution to our problem, so the procedure returns that
value.
As you might have noticed, we are now passing five arguments to explore,
whereas the dummy version I defined above only took four arguments. Here is
the actual implementation of explore that makes use of minDistanceTo
(shortened to minTo within the procedure’s scope):
proc explore(
  pos: 2*int,
  end: 2*int,
  const ref elevs: [?d] int(8),
  ref minTo: [d] int,
  pathLen: int
) {
  // stop searching if we've reached 'end'
  if pos == end then return;
  // stop searching if another path has reached 'end' in fewer steps
  // than we've taken so far
  if pathLen >= minTo[end] then return;
  // otherwise, explore the next positions
  for nextPos in nextPositions(pos, elevs, minTo, pathLen + 1) do
    explore(nextPos, end, elevs, minTo, pathLen + 1);
}
Like the simplified implementation above, the terminating condition:
if pos == end then return is used to stop searching when a path has
reached the end; however, the path length is not returned directly.
Instead, minTo[end] will be modified by reference in the nextPositions
iterator — more on this momentarily.
The next conditional: if pathLen >= minTo[end] then return;, takes
care of the pruning concern mentioned above. The logic behind this check is
as follows: suppose that this particular branch of the search has made 40
steps so far (i.e., pathLen=40); however, some other branch has already
reached end in 35 steps (i.e., minTo[end]=35). In this case, we know
that it’s impossible for the current path to be the shortest, thus explore
returns early so that the computer can use its resources for other paths.
Additionally, notice that the nextPositions iterator is used in a similar
manner as before, except I am not using a temporary variable to keep track
of the shortest path. Again, this is because minTo will be updated with
the shortest paths to each location as exploration progresses.
Let’s take a look at how nextPositions is implemented to see how that
works:
iter nextPositions(pos, const ref elevs, ref minTo, nextPathLen) {
  // try moving in each direction
  for move in ((-1, 0), (1, 0), (0, -1), (0, 1)) {
    const next = pos + move;
    // is this move on the map and valid?
    if elevs.domain.contains(next) &&
       elevs[next] - elevs[pos] <= 1 {
      // does this path beat the shortest record to 'next'?
      if nextPathLen < minTo[next] {
        minTo[next] = nextPathLen;
        yield next;
      }
    }
  }
}
I start out by iterating over the possible moves: up, down, left,
and right — each represented as a two-tuple. The next position will be
the sum of the two-tuple and the current position. I store this value
in next. Notice that I also passed pathLen + 1 to the nextPathLen
argument, meaning that nextPathLen represents the path length to next,
not to pos.
To check if the path to next will exceed the boundaries of the map, I
simply query the domain of elevs and use the contains procedure. This
will return true if next is in the domain and false if it isn’t.
I also check if the elevation constraint is met by subtracting the
elevation at pos from the elevation at next. If the difference is small
enough, then we know that our protagonist can make the climb. If either of
these conditions is not met, the iterator will continue on to the next
move without yielding anything.
Lastly, I check whether the path to next is shorter than the shortest known
path to that location (i.e., if nextPathLen < minTo[next]). If it is,
I update minTo with the new shortest path length and then yield next.
Then, going forward, when other branches of the search read minTo[next],
they’ll get the new shortest path length: nextPathLen.
Notice that next is only yielded if this path beats the shortest-known
path length. This solves the first challenge with my naive solution from
the previous section because it prevents explore from starting searches
down paths that are not an improvement on previous paths that have explored
the same spaces.
To summarize, this iterator does a few things: it yields the coordinates of
the next locations that are valid moves—only if they result in an
improvement on the best-known path lengths to those positions—and updates
the best-known path lengths in the process. When called by the recursive
explore procedure, it will eventually set the value of minTo[end] to
the length of the shortest path.
With the above procedures and iterator defined, we could solve the problem in serial as follows:
const (elevations, start, end) = readElevations();
writeln(findShortestPath(elevations, start, end));
Parallelizing the Search Algorithm
As the problem size grows, the serial solution above will continue to work; however, the time it takes to find the shortest path will grow rapidly. Thus, in practice, problems like this are often solved in parallel by spawning a new task to handle each branch of the search tree. As tasks are created, different threads can take responsibility for each task, and work on separate portions of the search concurrently.
Correctly implementing this form of concurrency will motivate the
introduction of a new concept, namely atomic variables; however, to
show why they are needed, let’s first discuss what would happen if we
altered the explore procedure to spawn new tasks for each branch without
making any other changes to the code.
Specifically, what would happen if we changed explore’s for-loop to spawn
a new task for each of the next positions?
Task Concurrency
In Chapel, this is as simple as replacing for with coforall:
// explore the next positions in parallel
coforall nextPos in nextPositions(pos, elevs, minTo, pathLen + 1) do
  explore(nextPos, end, elevs, minTo, pathLen + 1);
A coforall loop is a task-parallel loop construct that spawns precisely
one new task for each iteration of the loop. This is distinct from
Chapel’s forall loop which typically spawns one task per physical
hardware thread and then breaks the loop’s work up across each task.
A much more detailed description of the coforall loop can be found in
yesterday’s AoC article;
however, the important point for our purposes is that the above code will
execute each call to explore on its own task, allowing multiple threads
to work on the search simultaneously.
Due to the recursive nature of our approach, the number of spawned tasks will rapidly exceed the number of physical threads available to execute them concurrently. As such, Chapel’s runtime will manage the execution of those tasks in the background. Whenever a thread finishes executing a task, it will be provided with the next available task in the queue.
Although this parallel code would execute faster than the serial version,
it would not actually produce the correct answer (or would at least be
very unlikely to do so). This is because we’ve failed to introduce any
coordination between threads. Each will behave as if it has exclusive
access to the minTo array even though this isn’t actually true. This
will cause threads to overwrite each other’s work in a very haphazard
manner, likely resulting in an incorrect solution.
(an example of why coordination is necessary…)
Simply invoking a coforall in this situation will result in incorrect results
because each thread will have the ability to read and modify the
minTo array without coordinating with other threads. This problem comes
up in a few places, but let’s look at one in particular to understand what’s
going on.
Consider the following code from the nextPositions iterator:
if nextPathLen < minTo[next] {
  minTo[next] = nextPathLen;
  yield next;
}
Imagine that there are two threads that arrive at this conditional at
roughly the same time. Thread 1 has taken 31 steps to get here, thread
2 has taken 33 steps, and minTo[next] = 37. Now, the following events
transpire in order:
- thread 1 reads the value of minTo[next] and compares it with its local copy of nextPathLen (the result is true)
- thread 2 does the same, and the result is also true
- thread 1 writes the value 32 to minTo[next] (notice nextPathLen = pathLen + 1)
- thread 2 then overwrites minTo[next] with the value 34
This is a problem because the correct minimum path length at next is the
smaller of the two values: 32. However, because the two threads did not
coordinate with each other, the value is now 34. The correctness of the
algorithm relies on minTo always holding the best-known minimum at each
location, so now we can’t trust the results going forward.
As such, we’ll need to introduce a mechanism to prevent separate tasks from
interfering with each other when they are reading and writing to the same
locations in memory (in this case, the minTo array).
Atomic Variables
This class of coordination problem is so fundamental in parallel computing that essentially all modern hardware exposes a set of mechanisms that allow threads to safely read and write to the same locations in memory at roughly the same time.
One such class of mechanisms is referred to as atomic operations, or just atomics. The idea behind the name is that the operation is not divisible into sub-components, and thus the memory it operates on cannot be manipulated by another thread while the operation is in progress.
In Chapel specifically, atomic operations are exposed in a nice abstract
manner. Any variable of a primitive type can be declared as an
atomic variable
by prepending the keyword atomic to its type declaration. For example,
we can create an atomic int as follows:
var x: atomic int = 1;
And now, the variable x will have access to a wide range of atomic operations.
The important ones for our purposes are: read
and compareExchange.
I’ll provide a more detailed explanation of each as they come up.
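As a quick, contrived preview of how these two operations behave (this snippet is purely illustrative and isn’t part of today’s solution):
var x: atomic int = 1;

var expected = x.read();                  // snapshot x's current value
if x.compareExchange(expected, 5) {
  // x still held 'expected', so it was atomically replaced with 5
} else {
  // another task changed x first; 'expected' now holds x's current value
}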
The essential takeaway for this program is that replacing some of our
variables with their atomic counterparts will allow us to safely keep
track of minimum path lengths across multiple threads simultaneously.
I’ll also note that Chapel provides another primitive to facilitate coordination across tasks. Synchronization variables, or sync variables, expose a similar interface that could also have been used to solve today’s challenge with task-parallelism.
(Some notes on when to use atomics vs. syncs…)
In our solution to day 11, we used Chapel’s synchronization variables where today we used its atomic variables. You might be wondering how to decide between these options when writing your own task-parallel programs given that, in most cases, either variable type can be used with just a bit of effort.
In practice, we tend to think of atomics as being best-suited for what we
might call “optimistic” synchronization situations: cases where the chances
of interference with other tasks are low; or where, even if there is
interference, it will be brief and generally not block a task’s ability
to proceed. This is a good characterization of today’s problem, because
as we’ll see in the coming sections, two or more threads might attempt
to update minTo[next] simultaneously; however, this interference is
somewhat unlikely to begin with, and when it does occur, the conflict can
be resolved rapidly.
By contrast, sync can be thought of as more of a “pessimistic”
synchronization concept since it can result in tasks blocking (if a
variable’s full/empty state is not as expected) or yielding to other
tasks to permit forward progress and avoid deadlock or livelock.
In yesterday’s simulation of monkeys, only one of the troop of monkeys was going to be able to proceed based on the given synchronization variable’s value, so — on average across the group of monkeys — there was no expectation that reading the synchronized value would return the current monkey’s ID. As a result, it would make sense for the current monkey to yield and let other monkeys process their items. Yielding in this scenario was especially important when running a simulation with more monkeys than processor cores.
One other difference between the two is that atomics are typically implemented in hardware, which generally gives them a performance advantage over syncs, which are not.
On the other hand, the hardware implementation of atomics means that they
are limited to working with a fixed number of simple scalar types, whereas
syncs are supported for most types including user-defined records and
classes.
All that said, atomics and syncs can both be made to work in most situations, sometimes using methods or routines that we haven’t covered in this series. We tend to reach for atomics in most cases where they apply, due to their implementation in hardware and consequent performance advantages; but syncs provide a reasonable solution when you need to synchronize on non-scalar types, or in producer-consumer patterns where it may be best for tasks to block in order for the program to make forward progress.
Parallelizing Search with atomics
Now I’ll briefly go over the code used in my parallel solution to today’s puzzle. Most of this code will be the same as the serial solution in the previous section, so I’ll focus my explanation primarily on the differences between the two.
First, I update the minDistanceTo array in the findShortestPath procedure
to store atomic ints rather than traditional ints. This is as simple as
changing the type in the array’s declaration and returning the value we read
from the atomic:
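// a sketch of the updated procedure; the original listing may differ slightly
proc findShortestPath(const ref elevs: [?d] int(8), start, end) {
  var minDistanceTo: [d] atomic int = max(int);
  explore(start, end, elevs, minDistanceTo, 0);
  return minDistanceTo[end].read();
}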
Now, values in this array have access to the read and compareExchange
methods, which will be used later on. Note that the final line now calls
read on the atomic element, so the procedure still returns a plain int
rather than an atomic int.
Next, the explore procedure is also updated slightly. First, its header is
changed to accept minTo as an array of atomic ints rather than ints:
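// a sketch of the updated header, based on the serial version above
proc explore(
  pos: 2*int,
  end: 2*int,
  const ref elevs: [?d] int(8),
  ref minTo: [d] atomic int,
  pathLen: int
) {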
Note that I didn’t alter elevs to be an atomic array because it is
never modified after it’s created. The concurrent tasks spawned for the
search will only ever read values from elevs, so we don’t have to worry
about one thread modifying its state while another is reading it.
This fact is further denoted by the const ref intent which indicates
that explore cannot modify elevs.
The end condition is left unchanged; however, I do update the
early-termination condition to call the read method on the minimum path length
at end:
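// stop searching if another path has reached 'end' in fewer steps
// than we've taken so far
if pathLen >= minTo[end].read() then return;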
This is necessary because numerical comparison operators like >= are
not available on atomic ints; therefore I first need to create a new int
with the same value as minTo[end] by calling read on it.
You may be wondering whether this check is valid since some other thread
could come along and update minTo[end] between the time that this thread
reads the value and compares it with pathLen. This is a valid concern;
however, we know that the values in minTo only ever get smaller.
As such, the worst-case scenario here is that this early-termination
condition is not met, but would have been met only an instant later. In
such a case, some time is wasted on spawning more tasks that will
ultimately be unfruitful; however, the correctness of the algorithm is
not compromised. In other words, we can’t be certain that some other task
won’t find and register a shorter path after our check, but we also can’t
spend all our time waiting to see whether another task will do so because
if all of the tasks are waiting for each other, none of them will make
progress on the actual search.
Next, I spawn new tasks for each subsequent call to explore, using the
coforall loop discussed earlier:
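// explore the next positions in parallel (the same loop shown in the
// Task Concurrency section above)
coforall nextPos in nextPositions(pos, elevs, minTo, pathLen + 1) do
  explore(nextPos, end, elevs, minTo, pathLen + 1);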
And finally, I’ll make some changes to the nextPositions iterator to
properly handle coordination between tasks. Its header remains the same as
the sequential implementation; however, I do change the for-loop over the
four directions. Here, I add a label called checkingMoves to the loop:
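// a sketch based on the serial loop, now with a label attached
label checkingMoves for move in ((-1, 0), (1, 0), (0, -1), (0, 1)) {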
A label is a special annotation that allows control-flow operations
like break and continue to refer to a specific loop rather than
the nearest surrounding loop. More details about labels can be found
in the documentation,
and the reason for this particular addition will be discussed below.
The validity bounds and elevation checks remain unchanged:
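// same checks as in the serial version
const next = pos + move;
// is this move on the map and valid?
if elevs.domain.contains(next) &&
   elevs[next] - elevs[pos] <= 1 {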
This last section of code is modified pretty significantly. At first
glance, it looks like it could be doing something completely different than
the simple if nextPathLen < minTo[next] check from the serial version:
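// a sketch reconstructed from the description below; the original listing may differ
var minToNext = minTo[next].read();
do {
  // a path at least this short has already reached 'next'; try the next move
  if nextPathLen >= minToNext then continue checkingMoves;
} while !minTo[next].compareExchange(minToNext, nextPathLen);
yield next;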
However, this code has exactly the same effect, only it’s safe for concurrent use. Let’s break it down:
- First, I define a temporary variable, minToNext, with the current value of minTo[next] by calling read. This value has type int, which allows us to compare it with other ints.
- Next, I’ll use minToNext to check if this path length beats the record. I can’t simply check whether nextPathLen is smaller than minToNext because its value could be changed by another thread while I’m checking. So, I’ll have to do something a little fancier.
  Initially, I’d like to rule out the case where nextPathLen is actually larger than the minimum path length at next. In that case, I just want to continue on to the next move instead of yielding next. This is what the body of the do-while loop is designed to do.
  If the value is too large, then I continue checkingMoves. I can’t just say continue here because that would only continue to the next iteration of the do-while loop, having no effect. Thus, I use the aforementioned label on the outer for-loop to explicitly continue there.
- Now, looking at the terminating condition for the while loop itself: I use a compare and exchange operation, which can do one of two things in this case:
  - if minTo[next] == minToNext: update the value of minTo[next] to match nextPathLen and return true
  - if minTo[next] != minToNext: update the value of minToNext to match minTo[next] and return false
  In the first case, I know that no other thread has updated the value of minTo[next] because compareExchange has confirmed that minTo[next] == minToNext is still true. As such, the new minimum value is put in its place; hence the exchange portion of compareExchange. The function also returns true, so the program leaves the do-while loop and moves on to yield next.
  In the second case, the values don’t match, so I know that another thread updated the value while I was executing the body of the do-while loop. As such, compareExchange kindly replaces minToNext with the updated value, and I use it to run the check again. This loop keeps running until the first case is met (minTo[next] is updated with this task’s smaller value), or until it continues on to the next iteration of the outer for-loop.
In summary, this compare-and-exchange loop has the same effect as the
serial code. It either updates minTo[next] with a smaller path
length and yields next, or it continues checking the subsequent move.
And that’s the end of the parallel implementation! Now we can call our two primary procedures to find the shortest path in parallel:
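// the same two calls used for the serial version
const (elevations, start, end) = readElevations();
writeln(findShortestPath(elevations, start, end));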
Conclusion and Tips for Part 2
In summary, this post discussed a serial and parallel implementation
of a recursive tree search algorithm used to find the shortest path
through a topographic map. Along the way we reviewed some IO concepts
from previous posts, discussed recursion and task concurrency, and
introduced a new concept: atomic variables.
The full parallel code can be downloaded from the top of the article or found on GitHub.
In part two, you are asked to find the shortest path from any location
with an elevation of a to E. This might sound like it would add a
layer to the problem (i.e., run a separate search for each a in
the map, and minimize over the shortest path from each); however, the
algorithm shown above can be adjusted to simply set minTo[pos] to zero
whenever pos has an elevation of 0. This will have the effect of
moving S to whichever a is closest to E. You could also run a
reverse search, starting from E and ending whenever any a is
encountered.
With that we’ve concluded the 12th and final entry in our ‘Twelve Days of Chapel AoC’ series! Thanks for reading, and I hope you’ll check out the other 11 posts from this series if you haven’t already. As always, please feel free to leave questions or comments about this post in the Blog Category on Chapel’s Discourse Page.
Updates to this article
| Date | Change |
|---|---|
| Feb 5, 2023 | Updated findShortestPath() to return the value stored in the atomic |