Welcome to day 10 of Chapel’s Advent of Code 2022 series. Having taken a little break for the weekend, we’re ready to dive into the remaining three problems in our Twelve Days of Chapel. Check out that linked introductory post if you would like more context on the series!
The Task at Hand and My Approach
It wouldn’t be Advent of Code without a problem
involving some
virtual machine. In today’s challenge,
we are given a virtual computer with a single register (memory cell) X, and two operations:
addx and noop. The addx operation is used to add a number to the
X register (subtractions can be achieved by adding negative numbers to X),
and noop does nothing. Each instruction takes some time to execute: the
noop instruction takes one step, while addx takes two steps. We are tasked
with determining the values of the register X at particular times, which
realistically means we have to figure out what X is equal to after every
step.
My approach for this problem uses concepts that we’ve already covered
in previous articles, with the notable exceptions of scan expressions
and array reshaping. By expressing our computation as a scan, we can elegantly solve the first
part of today’s puzzle. Chapel’s scan expressions, just like its reduce
expressions, are executed in parallel, so our solution immediately benefits from Chapel’s
multitasking capabilities. Array reshaping leads to some very clean code
when it comes to drawing the output of the CRT monitor for part two.
If you are a fan of palindromes, here is the complete solution for the day:
aoc2022-day10-crt.chpl
|
|
As usual, before we begin, let’s bring in our old friend, the IO module.
|
|
And now, onward to our first task — parsing!
An Iterator to Represent Operations
The first order of business is to read in the puzzle input, and turn it into a representation that is convenient for simulating the tiny virtual computer. The instructions we’re reading look like this:
noop
addx 3
addx -5
In the world of our little processor, only one thing can change at any
given step: the value of X can go up or down by a certain amount. We can therefore represent
the effects of the noop and addx instructions as a
[note:I think there’s a curious analogy between the integers we create and
CPU micro-operations.
Just like a CPU might break down complex instructions into smaller ones, we
break individual operations like noop and addx into
smaller pieces (changes to the register).] each integer will indicate that one step took place, and that
during that step, the register X changed by the amount stored in
the integer. Let’s call such a change to X a delta. My solution defines
an iterator that reads input from the console and yields deltas.
|
|
The first statement in the iterator is a yield that produces
a fixed value of 1. As indicated by the comment, this encodes the initial
state of the CPU. This first value will end up — unchanged — in the very beginning of our history of
values of X. If we wanted the CPU to start at 100 instead of 1, we
could simply change this statement to yield 100;.
Next, my solution iterates over every line in the input. Here I use
stdin.lines(), which is an iterator that yields a single string for each
line from the program’s input stream. These strings contain the newline
character \n, which I don’t want; I get rid of it by calling the strip()
method on each string. Note that it looks like I’m calling strip() on the iterator,
stdin.lines(), itself — this is another example of promotion in Chapel. We
first covered promotion in our day 2 article.
For each line of input, what’s yielded depends on the instruction; we can identify the instruction by its first four characters, which I retrieve by slicing the line (see the day 3 article for an introduction to slicing). From there, the two options are:
- In the
noopcase, a single step transpires. During this step, the value ofXdoes not change, and so a single integer0is yielded. - In the
addx ncase, two steps transpire. During the first step, theaddxinstruction is still running, so the state doesn’t change, and the iterator yields0. During the second step, however, the instruction completes, and the register is changed by the amount listed in the operation,n. I use more slicing and an integer cast to retrieve the part of the string after theaddxand convert it to an integer.
That completes the iterator of deltas, ops. For a set of instructions
like the ones in the problem statement:
noop
addx 3
addx -5
the following integers would be yielded by ops:
1 0 0 3 0 -5
Using Scans to Compute Values of X at Every State
How do we go from a series of deltas to a series of register values? It’s
pretty easy: to get the value of X after a certain number of steps,
we just have to sum up all the changes that have happened so far. Thus,
for the initial state, we just take the first element yielded by ops;
for the second state we sum the first two elements; for the third, we sum
up the first three. The words “summing” might evoke memories of + reduce,
and these memories would be on the right track. However, there’s a difference
between what we can get with + reduce and what we need: reductions compute
a single value from an iterable, whereas we want to have a whole array of items,
each representing reductions over some prefix of a sequence!
This is where Chapel’s scan expressions come in. They do exactly what
I just described above: given an iterable and a binary operation, they compute
and yield partial reductions up to and including each element. For example,
take the following array:
var A = [1, 2, 3, 4];
We can apply a + scan to compute partial sums of its elements:
writeln(+ scan A); // Prints 1 3 6 10
We can also use a * scan to compute the factorials of the numbers 1 through
4:
writeln(* scan A); // Prints 1 2 6 24
Note that although the answer is the same, the way scan performs the
factorial computation above is not the same as writing something like
the following:
writeln([i in A.domain] * reduce A[..i]); // Prints 1 2 6 24
In this last snippet, for each index of A, we use a * reduce to compute
the partial product of the elements up to that index.
The difference is that scans don’t do extra work; the sum for the first three elements,
for instance, can be computed just by adding the third element to the sum
of the first two. Thus, there’s no need to re-run a reduction for each prefix:
the result can be computed incrementally.
It gets even better: just as reduce expressions are parallel operations in
Chapel, so too are scan expressions. Even though it seems like
computing partial sums is a serial algorithm (“add 1, then add 2, then add 3…”),
the following code is also not how scan operates:
for i in A.indices[..<A.size-1] do
A[i+1] *= A[i];
Whereas that last snippet of code is serial, there are ways to break scans into concurrent pieces (see the Parallel Algorithms section on Wikipedia’s page on prefix sums), and Chapel applies such techniques. Thus, we get cleaner code and better performance — nice!
We can apply a + scan to our ops() iterator to create a sequence of
partial sums. These partial sums, as we have discussed, work out to be the
values of X at each step. I perform the computation as follows:
|
|
In this snippet, I first collect the output of the iterator into an array,
deltas. I then retrieve the array’s size into another variable, cycles.
Finally, I use a + scan on deltas to compute the partial sums, storing
the result into a new array Xs. By default, this new array Xs would be
0-indexed just like the deltas array it’s computed from. However, the
problem statement starts counting CPU steps at 1 (i.e., it’s 1-indexed).
I therefore explicitly specify the domain of Xs to be 1..cycles, which
makes it use 1-indexing and helps me write cleaner code down the line.
Slicing and Operator Promotion to Compute Signal Strengths
The next thing we need to do is to compute the signal strengths at six different indices. The problem asks for strengths at 20, 60, 100, 140, 180, and 220. The first step we can take is notice that the numbers go up by 40 each time, and express them as a range:
|
|
This way of representing the interesting indices is both more concise and less error-prone:
we only have to keep track of three constants instead of six. Also, because
interesting is a range, we can use slicing to get the values of Xs at each
of the required time steps. This would look like:
writeln(Xs[interesting]); // Prints 21 19 18 21 16 18 for sample input
In the above statement, we took advantage of the fact that we declared Xs
to be 1-indexed instead of 0-indexed. The interesting range is built
based on the problem description, which counts from one, making its indices
a perfect match for the 1-based Xs array.
The values of the register X are not signal strengths, though! To compute
the signal strengths, we must multiply the value of X at a particular step
by the number of that step. Fortunately, we already have both ingredients
for computing signal strengths: the six values of X
(in Xs[interesting]) as well as the indices of these six values (in
interesting itself). We can multiply these two lists of values element-by-element
— thus computing the desired signal strengths – by using Chapel’s
operator promotion (first seen in our day 8 article).
writeln(Xs[interesting] * interesting);
// Prints 420 1140 1800 2940 2880 3960 for sample input
All that’s left is to sum up the signal strengths, which we can do using a reduction.
|
|
That’s our answer to part one!
Displaying the CRT Output
In part two, we discover that the value of the register is actually moving
a three-pixel sprite. A CRT monitor, which is drawing on a screen pixel-by-pixel,
draws a # if the current column overlaps with the sprite’s position, and a . if the
current column does not.
The first thing I do is encode the information about the CRT into a few helper
variables. I made the variables config consts (first seen in our day 6 article)
so that it’s possible to change the CRT’s size from the command line.
|
|
I then use the information to create a new two-dimensional domain representing the possible pixel positions on the CRT’s screen.
|
|
Iterating over Screen would give us the positions (row and column) of each
pixel that the CRT draws, in order. It would be nice to associate the
corresponding value of the register X (i.e., the location of the sprite)
with each of these scan positions. To do so, we can make use of Chapel’s
reshape function.
Reshaping lets us “reorganize” the elements of an array:
for instance, we could turn an 8-element one-dimensional array into a
two-dimensional 2-by-4 array, or a three-dimensional 2-by-2-by-2 array.
The elements of the resulting array are the same as those of the original.
In the case of this problem, we want to arrange the sprite positions from
Xs (a one-dimensional array) into the same shape as our CRT monitor,
to match them up with each row and column.
|
|
In the above snippet, I use slicing to retrieve only the first Screen.size
positions from Xs, since the puzzle input leaves us with slightly more steps
than we need to draw the screen. Then, I call reshape with the resulting
slice, and the Screen domain. This tells Chapel to make spritePos
a two-dimensional array, with crtRows rows and crtCols columns, whose
elements are taken from the first crtRows * crtCols sprite positions in Xs.
Now: how do we know when the CRT’s current column overlaps with the sprite?
The sprite is three pixels wide, and centered at the value of the register
X at a particular time step. Thus, a column overlaps with the sprite if
it’s no more than one pixel away from its center. We can express this in
terms of an absolute value:
abs(col - X) <= 1
The above expression produces a boolean. We can get the correct symbol from
this boolean (# when it’s true, or . when it’s false) by using
an if-expression:
if abs(col - X) <= 1 then '#' else '.'
Since we have just reshaped our Xs into a two-dimensional array matching
our Screen, we can figure out the value of X simply by accessing that
array at a particular row and column:
if abs(col - spritePos[row, col]) <= 1 then '#' else '.'
That last piece of code gives us the value of a particular pixel in the CRT monitor. All that’s left is to compute the value of every pixel in the monitor. We can accomplish this using a parallel loop expression (which we covered in our day 6 article).
|
|
Since Chapel knows the pixels variable is in the shape of Screen, it knows
to print it line-by-line, and we get our desired output.
Summary
That’s it for both parts of today’s solution! This time, we used scan
expressions to elegantly compute partial sums of an array (though they can
be used to compute any partial reduction). We also made use of reshape
to rearrange a one-dimensional array into a more desirable form — one that
matched up directly with our two-dimensional CRT screen.
Our solution is automatically parallel because of the scan and
the loop expression we used to define our pixels variable.
This once again highlights Chapel’s power: we expressed our solution using
natural, high-level patterns, and the language took care of making them
parallel.
Thanks for reading! Please feel free to ask any questions or post any comments you have in the new Blog Category of Chapel’s Discourse Page.