Zarr

Usage

use Zarr;

or

import Zarr;

Support for reading and writing of Zarr stores.

Support is limited to v2 Zarr arrays stored on local filesystems. NFS is not supported. The module uses c-blosc to compress and decompress chunks. Zarr specification: https://zarr-specs.readthedocs.io/en/latest/v2/v2.0.html

config param zarrProfiling = false

Enables or disables profiling of Zarr IO. As a config param, it is set at compile time.

iter zarrProfilingResults() throws

Returns a map of profiling results for Zarr IO operations. The keys are the names of the operations and the values are the total time spent in each operation across all threads. Requires that zarrProfiling be set to true.

proc resetZarrProfiling()

Resets the profiling timer for Zarr IO operations. Should only be used when compiled with zarrProfiling set to true.
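A minimal sketch of how the profiling hooks above might be used together. The file name in the compile command, the store path, and the array contents are all hypothetical, and because the element type yielded by zarrProfilingResults is not spelled out here, the sketch simply prints each result.

```chapel
// Compile with profiling enabled, for example (hypothetical file name):
//   chpl -szarrProfiling=true profile_zarr.chpl
use Zarr;

var A: [0..<8, 0..<8] real;
A = 1.0;

// Reset the timers so that only the write below is measured.
resetZarrProfiling();

// "example_profiled.zarr" is a hypothetical store path.
writeZarrArrayLocal("example_profiled.zarr", A, (4, 4));

// Print whatever the iterator yields for each profiled operation.
for result in zarrProfilingResults() do
  writeln(result);
```

The calls are made at file scope, where Chapel halts on an uncaught error; inside an explicit module or a non-throwing procedure they would need try or try!.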

record zarrMetadataV2

  var zarr_format : int
  var chunks : list(int)
  var dtype : string
  var shape : list(int)
  var compressor : string

record zarrMetadataV2Required

  var zarr_format : int
  var chunks : list(int)
  var dtype : string
  var shape : list(int)

record zarrMetadataV2Optional

  var compressor : string

record zarrMetadataV3

Unused until support is added for v3.0 stores.

  var zarr_format : int
  var node_type : string
  var shape : list(int)
  var data_type : string
  var dimension_names : list(string)

proc getLocalChunks(D: domain(?), localD: domain(?), chunkShape: ?dimCount*int) : domain(dimCount)

Returns the indices of the chunks that contain elements present in a subdomain of the array.
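The mapping from a subdomain to chunk indices can be sketched with integer division. The helper chunkIndicesFor below is hypothetical and is not the module's implementation: it assumes zero-based array and chunk indices, and it omits the clamping against the full array domain that the D argument of getLocalChunks would allow.

```chapel
// Hypothetical helper: the chunk indices overlapped by localD, assuming the
// array and its chunk grid are both zero-based.
proc chunkIndicesFor(localD: domain(?), chunkShape: localD.rank*int) {
  var ranges: localD.rank*range(int);
  for param i in 0..<localD.rank {
    // Integer division maps an element index to the chunk that holds it.
    const lo = localD.dim(i).low / chunkShape(i);
    const hi = localD.dim(i).high / chunkShape(i);
    ranges(i) = lo..hi;
  }
  return {(...ranges)};
}

// Rows 0..4 and columns 6..9 of an array stored in 4x4 chunks.
writeln(chunkIndicesFor({0..4, 6..9}, (4, 4)));  // {0..1, 1..2}
```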

proc getChunkDomain(chunkShape: ?dimCount*int, chunkIndex: dimCount*int)

Returns the domain of a chunk for a store with a given chunk shape.

Arguments:
  • chunkShape – A tuple of the extents of the dimensions of each chunk in the store.

  • chunkIndex – A tuple of the indices of the chunk to get the domain for.

Returns:

The domain of the chunk.
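The arithmetic behind a chunk's domain can be sketched as follows. The helper chunkDomainFor is hypothetical and assumes zero-based chunk indices; it is meant to illustrate the idea rather than reproduce the module's code. The last lines show how a boundary chunk can extend past the array and how intersecting with the array's domain recovers the part that holds real data.

```chapel
// Hypothetical helper: chunk k along a dimension of extent s covers the
// indices k*s through (k+1)*s - 1.
proc chunkDomainFor(chunkShape: ?n*int, chunkIndex: n*int) {
  var ranges: n*range(int);
  for param i in 0..<n {
    const start = chunkIndex(i) * chunkShape(i);
    ranges(i) = start..<start + chunkShape(i);
  }
  return {(...ranges)};
}

writeln(chunkDomainFor((4, 4), (1, 2)));  // {4..7, 8..11}

// A 10x10 array in 4x4 chunks: chunk (2, 2) reaches past the array's edge.
const arrayDom = {0..<10, 0..<10};
const lastChunk = chunkDomainFor((4, 4), (2, 2));
writeln(lastChunk);            // {8..11, 8..11}
writeln(arrayDom[lastChunk]);  // {8..9, 8..9}
```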

proc readChunk(param dimCount: int, chunkPath: string, chunkDomain: domain(dimCount), ref arraySlice: [] ?t) throws

Reads a chunk from storage and fills arraySlice with its corresponding values.

Arguments:
  • dimCount – Dimensionality of the array being read.

  • chunkPath – Relative or absolute path to the chunk being read.

  • chunkDomain – Domain of the chunk being read. Because boundary chunks are padded with zeros, the chunk’s domain may be larger in some dimensions than the array’s.

  • arraySlice – Reference to the portion of the calling locale’s section of the array that this chunk will update. The domain of this slice should be a subset of the chunk’s.

Throws:

Error – If the decompression fails

proc writeChunk(param dimCount, chunkPath: string, chunkDomain: domain(dimCount), ref arraySlice: [] ?t, bloscLevel: int(32) = 9, compressor: string = "blosclz") throws

Updates a chunk in storage with a locale’s contribution to that chunk. The calling function is expected to manage synchronization among locales. If the locale contributes the entire chunk, the chunk’s data is compressed and written immediately. If the contribution is partial, the existing chunk is read and decompressed, the necessary values are updated, and the chunk is then compressed and written back to storage.

Arguments:
  • dimCount – Dimensionality of the array being written.

  • chunkPath – Relative or absolute path to the chunk being written.

  • chunkDomain – Domain of the chunk being updated. Because boundary chunks are padded with zeros, the chunk’s domain may be larger in some dimensions than the array’s.

  • arraySlice – The portion of the array that the calling locale contributes to this chunk.

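  • bloscLevel – Compression level to use. 0 indicates no compression, 9 (default) indicates maximum compression. Values outside this range are clamped to the range 0 to 9.

  • compressor – Compression algorithm to use. Supported values are “blosclz” (default), “lz4”, “lz4hc”, “zlib”, and “zstd”.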

Throws:

Error – If the compression fails

proc readZarrArray(directoryPath: string, type dtype, param dimCount: int, bloscThreads: int(32) = 1, targetLocales: [] locale = Locales) throws

Reads a v2.0 Zarr store from storage using all locales, returning a block-distributed array. Each locale reads and decompresses the chunks that contain elements in its subdomain. This method assumes a shared filesystem where all nodes can access the store directory.

Arguments:
  • directoryPath – Relative or absolute path to the root of the Zarr store. The store is expected to contain a ‘.zarray’ metadata file.

  • dtype – Chapel type of the store’s data.

  • dimCount – Dimensionality of the Zarr array.

  • bloscThreads – The number of threads to use during decompression (default=1).

  • targetLocales – The locales to use for reading the array, arranged in the shape over which the array will be distributed.

proc writeZarrArray(directoryPath: string, const ref A: [?domainType] ?dtype, chunkShape: ?dimCount*int, bloscLevel: int(32) = 9, compressor = "blosclz") throws

Writes an array to storage as a v2.0 Zarr store. The array metadata and chunks are stored within the directoryPath directory, which is created if it does not yet exist. The chunks have the dimensions given in the chunkShape argument. This function writes chunks in parallel and supports distributed execution. It assumes a shared filesystem where all nodes can access the store directory.

Arguments:
  • directoryPath – Relative or absolute path to the root of the Zarr store. The directory and any necessary parent directories will be created if they do not exist.

  • A – The array to write to storage.

  • chunkShape – The dimension extents to use when breaking A into chunks.

  • bloscLevel – Compression level to use. 0 indicates no compression, 9 (default) indicates maximum compression.

  • compressor – Compression algorithm to use. Supported values are “blosclz” (default), “lz4”, “lz4hc”, “zlib”, and “zstd”.
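
A hedged sketch of a distributed round trip using writeZarrArray together with readZarrArray, following the signatures documented above. The store path, array size, and fill pattern are illustrative assumptions, and the final comparison assumes the array read back has the same zero-based index set as the one written.

```chapel
use Zarr, BlockDist;

// Hypothetical store path; it must be on a filesystem every locale can reach.
const storePath = "example_dist.zarr";

// An 8x8 block-distributed array filled with a simple pattern.
const D = blockDist.createDomain({0..<8, 0..<8});
var A: [D] real(32);
forall (i, j) in D do A[i, j] = (i * 8 + j): real(32);

// Write A as 4x4 chunks, compressed with lz4 at level 5.
writeZarrArray(storePath, A, (4, 4), bloscLevel=5, compressor="lz4");

// Read the store back; the element type and rank must match what was written.
const B = readZarrArray(storePath, real(32), 2);

// True if every element survived the round trip.
writeln(&& reduce (A == B));
```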

proc readZarrArrayPartial(directoryPath: string, type dtype, param dimCount: int, partialDomain, bloscThreads: int(32) = 1, targetLocales: [] locale = Locales) throws

Reads part of a v2.0 Zarr store from storage using all locales, returning a block-distributed array. Each locale reads and decompresses the chunks that contain elements in its subdomain. This method assumes a shared filesystem where all nodes can access the store directory.

Arguments:
  • directoryPath – Relative or absolute path to the root of the Zarr store. The store is expected to contain a ‘.zarray’ metadata file.

  • dtype – Chapel type of the store’s data.

  • dimCount – Dimensionality of the Zarr array.

  • partialDomain – The domain of the elements of the array that should be read.

  • bloscThreads – The number of threads to use during decompression (default=1).

  • targetLocales – The locales to use for reading the array, arranged in the shape over which the array will be distributed.
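
A minimal sketch of a partial read, under the assumption that partialDomain is expressed in the same zero-based index space as the stored array. The store path and contents are hypothetical; the store is created first with writeZarrArrayLocal so that the example is self-contained.

```chapel
use Zarr;

// Hypothetical store path.
const storePath = "example_partial.zarr";

// Create an 8x8 store in 4x4 chunks to read from.
var A: [0..<8, 0..<8] real;
forall (i, j) in A.domain do A[i, j] = i * 8 + j;
writeZarrArrayLocal(storePath, A, (4, 4));

// Read only the central 4x4 block, which straddles all four chunks.
const B = readZarrArrayPartial(storePath, real, 2, {2..5, 2..5});
writeln(B);
```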

proc readZarrArrayLocal(directoryPath: string, type dtype, param dimCount: int) throws

Reads a v2.0 Zarr store from storage using a single locale, returning a locally allocated array. This method assumes a shared filesystem where the current locale can access the store directory.

Arguments:
  • directoryPath – Relative or absolute path to the root of the Zarr store. The store is expected to contain a ‘.zarray’ metadata file.

  • dtype – Chapel type of the store’s data.

  • dimCount – Dimensionality of the Zarr array.

proc writeZarrArrayLocal(directoryPath: string, ref A: [?domainType] ?dtype, chunkShape: ?dimCount*int, bloscLevel: int(32) = 9, compressor = "blosclz") throws

Writes an array to storage as a v2.0 Zarr store using a single locale. The array metadata and chunks are stored within the directoryPath directory, which is created if it does not yet exist. The chunks have the dimensions given in the chunkShape argument.

Arguments:
  • directoryPath – Relative or absolute path to the root of the Zarr store. The directory and any necessary parent directories will be created if they do not exist.

  • A – The array to write to storage.

  • chunkShape – The dimension extents to use when breaking A into chunks.

  • bloscLevel – Compression level to use. 0 indicates no compression, 9 (default) indicates maximum compression.

  • compressor – Compression algorithm to use. Supported values are “blosclz” (default), “lz4”, “lz4hc”, “zlib”, and “zstd”.
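
A hedged sketch of a single-locale round trip with writeZarrArrayLocal and readZarrArrayLocal. The store path and array shape are illustrative assumptions; the shape is chosen so that the chunk shape does not evenly divide the array, which exercises the padded boundary chunks described earlier.

```chapel
use Zarr;

// Hypothetical store path.
const storePath = "example_local.zarr";

var A: [0..<6, 0..<10] int;
forall (i, j) in A.domain do A[i, j] = i * 10 + j;

// 4x4 chunks do not evenly divide a 6x10 array, so the chunks along the
// upper edges of both dimensions are only partly filled with array data.
writeZarrArrayLocal(storePath, A, (4, 4));

// Read the store back into a locally allocated array.
const B = readZarrArrayLocal(storePath, int, 2);
writeln(B.domain);
```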

proc updateZarrChunk(directoryPath: string, ref A: [?domainType] ?dtype, chunkIndex: ?dimCount*int) throws

Updates a single chunk within a Zarr store with the data in A. The Zarr store and the associated metadata file must already exist.

Arguments:
  • directoryPath – Relative or absolute path to the root of the Zarr store. This directory should exist and contain a ‘.zarray’ metadata file.

  • A – The array to update the chunk with.

  • chunkIndex – The index of the chunk to update.

  • bloscThreads – The number of threads to use during compression (default=1)

proc updateZarrChunk(directoryPath: string, ref A: [?domainType] ?dtype, chunkIndex: int) throws

Overload of updateZarrChunk that takes the chunk index as a single integer rather than a tuple.