This module represents work in progress. The API is unstable and likely to change over time.
This module provides unordered versions of non-fetching atomic operations for
int, uint, and real types. Unordered versions of add(), sub(), or(), and(), and
xor() are provided. The results
of these functions are not visible until task or forall termination or a call to
unorderedAtomicTaskFence(), but they can provide a
significant speedup for bulk atomic operations that do not require ordering:
    use UnorderedAtomics;

    const numTasksPerLocale = here.maxTaskPar,
          iters = 10000;

    var a: atomic int;

    coforall loc in Locales do on loc do
      coforall 1..numTasksPerLocale do
        for i in 1..iters do
          a.unorderedAdd(i); // unordered atomic add

    // no fence required, fenced at task termination

    const itersSum = iters*(iters+1)/2, // sum from 1..iters
          numTasks = numLocales * numTasksPerLocale;

    assert(a.read() == numTasks * itersSum);
It's important to be aware that unordered atomic operations are not
consistent with regular atomic operations, and updates may not be visible
until the task or forall that issued them terminates or they are explicitly
fenced with unorderedAtomicTaskFence():
    var a: atomic int;
    a.unorderedAdd(1);
    writeln(a); // can print 0 or 1
    unorderedAtomicTaskFence();
    writeln(a); // prints 1
Generally speaking, unordered atomics are useful when you have a large batch of atomic updates to perform and the order of those updates doesn't matter.
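As an illustration of such a batch, the following minimal sketch builds a histogram with unorderedAdd(). The names hist, numBuckets, and numSamples are hypothetical and chosen only for this example; it relies on the implicit fence at forall termination described above.

```chapel
use UnorderedAtomics;

// Hypothetical sizes, chosen only for illustration.
config const numBuckets = 8,
             numSamples = 100000;

var hist: [0..#numBuckets] atomic int;

// Each iteration bumps one bucket; the order of the increments is irrelevant.
forall i in 0..#numSamples do
  hist[i % numBuckets].unorderedAdd(1);

// The forall has terminated, so all pending updates are now visible.
var total = 0;
for b in hist do
  total += b.read();

assert(total == numSamples);
```

Because no iteration reads a bucket while the loop is running, the delayed visibility of the updates does not affect the result.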
Currently, these are only optimized for ugni network atomics
(CHPL_NETWORK_ATOMICS=ugni). Processor atomics and all other implementations
fall back to ordered operations. Under ugni these operations are internally
buffered. When the buffers are flushed, the operations are performed all at
once. Cray Linux Environment (CLE) 5.2.UP04 or newer is required for best
performance. In our experience, unordered atomics can achieve up to a 5X
performance improvement over ordered atomics on CLE 5.2.UP04 or newer.
unorderedAdd(value: T): void
Unordered atomic add.
unorderedSub(value: T): void
Unordered atomic sub.
unorderedOr(value: T): void
Unordered atomic or.
unorderedAnd(value: T): void
Unordered atomic and.
unorderedXor(value: T): void
Unordered atomic xor.
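The bitwise operations can be sketched in the same way. In this minimal example the variables flags, parity, and cleared and the task count numTasks are hypothetical, and the results are read only after the coforall has terminated, when the updates are implicitly fenced.

```chapel
use UnorderedAtomics;

const numTasks = 8;                       // hypothetical task count
const allBits = ((1:uint) << numTasks) - (1:uint);

var flags: atomic uint;                   // starts at 0; each task sets one bit
var parity: atomic uint;                  // starts at 0; each task toggles a bit twice
var cleared: atomic uint;                 // starts with all bits set; each task clears one
cleared.write(allBits);

coforall tid in 0..#numTasks {
  const bit = (1:uint) << tid;
  flags.unorderedOr(bit);                 // set this task's bit
  parity.unorderedXor(bit);               // toggle the bit on ...
  parity.unorderedXor(bit);               // ... and off again
  cleared.unorderedAnd(~bit);             // clear this task's bit
}

// All tasks have terminated, so their updates are visible here.
assert(flags.read() == allBits);
assert(parity.read() == 0:uint);
assert(cleared.read() == 0:uint);
```

Each of these updates is commutative, so the final values do not depend on the order in which the operations are applied.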
unorderedAtomicTaskFence(): void
Fence any pending unordered atomics issued by the current task.
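An explicit fence is useful when a task must observe its own unordered updates before it terminates. The following is a minimal sketch; counter and batch are hypothetical names used only for illustration.

```chapel
use UnorderedAtomics;

const batch = 1000;            // hypothetical batch size
var counter: atomic int;

// The current task issues a batch of unordered updates ...
for i in 1..batch do
  counter.unorderedAdd(1);

// ... and fences them before reading, since it has not yet terminated.
unorderedAtomicTaskFence();

assert(counter.read() == batch);
```

Without the fence, the read could observe any value from 0 to batch.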
The previous global fence function has been deprecated; please use
unorderedAtomicTaskFence() instead. Note that it was deprecated without a full release of support because the previous global fence semantics imposed expensive implementation requirements, and a global fence is not expected to be needed now that operations are implicitly fenced at task/forall termination.