Zarr

Usage

use Zarr;

or

import Zarr;

Support for distributed reading and writing of Zarr stores. Support is limited to v2 Zarr arrays stored on local filesystems. NFS is not supported. The module uses c-blosc to compress and decompress chunks. Zarr specification: https://zarr-specs.readthedocs.io/en/latest/v2/v2.0.html

config param zarrProfiling = false

Enables or disables profiling of Zarr I/O. As a config param, it is set at compile time.

iter zarrProfilingResults() throws

Returns a map of profiling results for Zarr IO operations. The keys are the names of the operations and the values are the total time spent in each operation across all threads. Requires that zarrProfiling be set to true.

record zarrMetadataV2

var zarr_format : int
var chunks : list(int)
var dtype : string
var shape : list(int)

record zarrMetadataV3

Unused until support for v3.0 stores is added.

var zarr_format : int
var node_type : string
var shape : list(int)
var data_type : string
var dimension_names : list(string)

proc getLocalChunks(D: domain(?), localD: domain(?), chunkShape: ?dimCount*int) : domain(dimCount)

Returns the domain of chunks that the calling locale is responsible for.
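
The idea of mapping a locale's subdomain to a set of chunks can be sketched with plain integer division. The helper below, overlappingChunks, is hypothetical and is not the module's implementation. It assumes 0-based, non-negative array indices and a non-empty local subdomain, and it returns the indices of every chunk that overlaps that subdomain.

```chapel
// Hypothetical helper (not part of the Zarr module): returns the domain of
// chunk indices whose chunks overlap localD.
// Assumes 0-based, non-negative indices and a non-empty localD.
proc overlappingChunks(localD: domain(?), chunkShape: ?n*int): domain(n) {
  var ranges: n*range(int);
  for param i in 0..n-1 {
    // The chunk index along dimension i is the array index divided by the
    // chunk extent in that dimension.
    const lo = localD.dim(i).low / chunkShape[i];
    const hi = localD.dim(i).high / chunkShape[i];
    ranges[i] = lo..hi;
  }
  // Expand the tuple of ranges into a rectangular domain.
  return {(...ranges)};
}

// A locale holding rows 50..99 of a 100x100 array split into 30x30 chunks.
writeln(overlappingChunks({50..99, 0..99}, (30, 30)));
```

In this example the locale's rows fall in chunk rows 1 through 3 and its columns in chunk columns 0 through 3. Some of those chunks are shared with other locales, which is why partial contributions matter when writing.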

proc readChunk(param dimCount: int, chunkPath: string, chunkDomain: domain(dimCount), ref arraySlice: [] ?t) throws

Reads a chunk from storage and fills arraySlice with its corresponding values.

Arguments:
  • dimCount – Dimensionality of the array being read.

  • chunkPath – Relative or absolute path to the chunk being read.

  • chunkDomain – Array subdomain the chunk contains.

  • arraySlice – Reference to the portion of the array the calling locale stores.

Throws:

Error – If decompression fails.

proc writeChunk(param dimCount, chunkPath: string, chunkDomain: domain(dimCount), ref arraySlice: [] ?t, bloscLevel: int(32) = 9) throws

Updates a chunk in storage with a locale’s contribution to that chunk. The caller is expected to manage synchronization among locales. If the locale contributes the entire chunk, the chunk’s data is compressed and written immediately. If the contribution is partial, the existing chunk is decompressed, the affected values are updated, and the chunk is then recompressed and written back to storage.

Arguments:
  • dimCount – Dimensionality of the array being written.

  • chunkPath – Relative or absolute path to the chunk being written.

  • chunkDomain – Array subdomain that the chunk contains.

  • arraySlice – The portion of the array that the calling locale contributes to this chunk.

  • bloscLevel – Compression level to use. 0 indicates no compression; 9 (default) indicates maximum compression. Values outside this range are clamped to it.

Throws:

Error – If compression fails.
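
The distinction between a full and a partial contribution can be illustrated with a domain intersection. The helper coversWholeChunk below is hypothetical, and the check the module performs internally may differ. It only shows one way to decide whether a locale's subdomain contains every index of a chunk.

```chapel
// Hypothetical helper (not part of the Zarr module): true when the locale's
// subdomain localD contains every index of chunkDomain.
proc coversWholeChunk(chunkDomain: domain(?), localD: domain(?)): bool {
  // Slicing one rectangular domain by another yields their intersection.
  const overlap = chunkDomain[localD];
  return overlap == chunkDomain;
}

const chunk = {0..29, 0..29};

// The locale owns rows 0..49, so it holds the whole chunk.
writeln(coversWholeChunk(chunk, {0..49, 0..99}));   // true

// The locale owns rows 20..49, so it holds only rows 20..29 of the chunk.
writeln(coversWholeChunk(chunk, {20..49, 0..99}));  // false
```

In the second case another locale owns the remaining rows of the chunk, so the two writers must be synchronized by the caller.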

proc readZarrArray(directoryPath: string, type dtype, param dimCount: int, bloscThreads: int(32) = 1) throws

Reads a v2.0 Zarr store from storage and returns a block-distributed array. Each locale reads and decompresses the chunks that contain elements in its subdomain. This method assumes a shared filesystem where all nodes can access the store directory.

Arguments:
  • directoryPath – Relative or absolute path to the root of the Zarr store. The store is expected to contain a ‘.zarray’ metadata file.

  • dtype – Chapel type of the store’s data.

  • dimCount – Dimensionality of the Zarr array.

  • bloscThreads – The number of threads to use during decompression (default=1).
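
A minimal usage sketch for readZarrArray follows. The store path is hypothetical, and the example assumes the store holds a two-dimensional array of 64-bit reals. `try!` is used so that any thrown error halts the program.

```chapel
use Zarr;

// Hypothetical store location; it must contain a '.zarray' metadata file.
config const storePath = "data/temperature.zarr";

// Read a 2D store of 64-bit reals, using 4 blosc threads for decompression.
var A = try! readZarrArray(storePath, real(64), 2, bloscThreads=4);

writeln("domain: ", A.domain);
writeln("sum: ", + reduce A);
```

The dtype and dimCount arguments must match what the store actually contains, since both are fixed at compile time.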

proc writeZarrArray(directoryPath: string, ref A: [?domainType] ?dtype, chunkShape: ?dimCount*int, bloscThreads: int(32) = 1, bloscLevel: int(32) = 9) throws

Writes an array to storage as a v2.0 Zarr store. The array metadata and chunks are stored within the directoryPath directory, which is created if it does not yet exist. The chunks have the dimensions given in the chunkShape argument. This function writes chunks in parallel and supports distributed execution. It assumes a shared filesystem where all nodes can access the store directory.

Arguments:
  • directoryPath – Relative or absolute path to the root of the Zarr store. The directory and any necessary parent directories are created if they do not exist.

  • A – The array to write to storage.

  • chunkShape – The dimension extents to use when breaking A into chunks.

  • bloscThreads – The number of threads to use during compression (default=1).

  • bloscLevel – Compression level to use. 0 indicates no compression, 9 (default) indicates maximum compression.
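
A minimal usage sketch for writeZarrArray follows. The store path, array size, chunk shape, and compression settings are all illustrative assumptions. The array is block-distributed over a 0-based domain built with BlockDist.

```chapel
use Zarr;
use BlockDist;

// Hypothetical output location; created if it does not exist.
config const storePath = "example_store.zarr";

// A 100x100 block-distributed array of 64-bit reals.
const D = blockDist.createDomain({0..<100, 0..<100});
var A: [D] real(64);
forall (i, j) in D do A[i, j] = i * 100 + j;

// Write A using 25x25 chunks, 2 blosc threads, and a mid-range compression
// level. try! halts the program if the write throws.
try! writeZarrArray(storePath, A, (25, 25), bloscThreads=2, bloscLevel=5);
```

A chunk shape that divides the array evenly, as here, avoids partially filled edge chunks. Lower bloscLevel values generally trade a larger store for less compression work.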