LockFreeQueue

Usage

use LockFreeQueue;

or

import LockFreeQueue;

A lock-free queue using the Michael and Scott algorithm.

Warning

This module relies on the AtomicObjects package module, which has several platform restrictions in its current state:

  • It relies on Chapel extern code blocks and so requires that the Chapel compiler is built with LLVM enabled.

  • Currently only CHPL_TARGET_ARCH=x86_64 is supported as it uses the x86-64 instruction: CMPXCHG16B.

  • The implementation relies on GCC style inline assembly, and so is restricted to a CHPL_TARGET_COMPILER value of gnu, clang, or llvm.

An implementation of the Michael & Scott 1, a lock-free queue. Concurrent safe memory reclamation is handled by an internal EpochManager. Usage of the queue can be seen below.

var lfq = new LockFreeQueue(int);
forall i in 1..N do lfq.enqueue(i);
var total : int;
coforall tid in 1..here.maxTaskPar with (+ reduce total) {
  var (hasElt, elt) = lfq.dequeue();
  while hasElt {
    total += elt;
    (hasElt, elt) = lfq.dequeue();
  }
}

As an optimization, the user can register to receive a TokenWrapper, and pass this to the stack. This can provide significant improvement in performance by up to an order of magnitude by avoiding the overhead of registering and unregistering for each operation.

var lfq = new LockFreeQueue(int);
forall i in 1..N with (var tok = lfq.getToken()) do lfq.enqueue(i,tok);
var total : int;
coforall tid in 1..here.maxTaskPar with (+ reduce total) {
  var tok = lfq.getToken();
  var (hasElt, elt) = lfq.dequeue(tok);
  while hasElt {
    total += elt;
    (hasElt, elt) = lfq.dequeue(tok);
  }
}

Lastly, to safely reclaim memory, the user must explicitly invoke tryReclaim, or else there will be a memory leak. This must be explicitly invoked so that the user may tune how often reclamation will be attempted. Reclamation is concurrent-safe, but if called too frequently, it can add unnecessary overhead. A complete example of what would be considered ‘optimal’ usage of this lock-free stack.

var lfq = new LockFreeQueue(int);
forall i in 1..N with (var tok = lfq.getToken()) do lfq.push(i,tok);
var total : int;
coforall tid in 1..here.maxTaskPar with (+ reduce total) {
  var tok = lfq.getToken();
  var (hasElt, elt) = lfq.dequeue(tok);
  var n : int;
  while hasElt {
    total += elt;
    (hasElt, elt) = lfq.dequeue(tok);
    n += 1;
    if n % GC_THRESHOLD == 0 then lfq.tryReclaim();
  }
}

Also provided, is a utility method for draining the stack of all elements, called drain. This iterator will implicitly call tryReclaim at the end and will optimally create one token per task.

var lfq = new LockFreeQueue(int);
forall i in 1..N with (var tok = lfq.getToken()) do lfq.enqueue(i,tok);
var total = + reduce lfq.drain();
1

Michael, Maged M., and Michael L. Scott. Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms. No. TR-600. ROCHESTER UNIV NY DEPT OF COMPUTER SCIENCE, 1995.

class Node
type eltType
var val: toNilableIfClassType(eltType)
var next: AtomicObject(unmanaged Node(eltType)?, hasGlobalSupport = true, hasABASupport = false)
proc init(val: ?eltType)
proc init(type eltType)
class LockFreeQueue
type objType
var _head: AtomicObject(unmanaged Node(objType), hasGlobalSupport = true, hasABASupport = false)
var _tail: AtomicObject(unmanaged Node(objType), hasGlobalSupport = true, hasABASupport = false)
var _manager = new owned LocalEpochManager()
proc objTypeOpt type
proc init(type objType)
proc getToken(): owned TokenWrapper
proc enqueue(newObj: objType, tok: owned TokenWrapper = getToken())
proc dequeue(tok: owned TokenWrapper = getToken()): (bool, objTypeOpt)
iter drain(): objTypeOpt
iter drain(param tag: iterKind): objTypeOpt  where tag == iterKind.standalone
proc tryReclaim()