IO Serializers and Deserializers¶
Overview¶
Historically, Chapel’s IO module supported formatting options for reading and
writing values in non-standard formats via the readf and writef methods
(e.g., %jt for JSON). Chapel 1.32 introduced a new API that allows for
user-defined formatting with fileReader and fileWriter, rather than
relying solely on built-in support in the standard library. This new API allows
for the configuration of fileReader and fileWriter with user-defined
types that can define the format used by methods like read and write.
For example, if a user wishes to write a record in JSON format they can now
use the JSON standard module in Chapel 1.32:
use IO, JSON;
record Person {
var name : string;
var age : int;
}
var f = open("output.txt", ioMode.cw);
// configure 'w' to always write in JSON format
var w = f.writer(serializer=new jsonSerializer());
// writes:
// {"name":"Sam", "age":20}
var p = new Person("Sam", 20);
w.write(p);
Serializers and Deserializers interact with user-defined types like Person
by invoking particular methods whose API will be discussed in more detail
later in this technote. By default, the compiler will generate methods on
user-defined types capable of interacting with Serializers and Deserializers
such that most types will simply work out of the box. For more complicated
cases, users can implement their own methods on their types to customize
serialization and deserialization.
In Chapel 1.32, Serializers and Deserializers are enabled by default. Users
wishing to opt-out of this capability can recompile their programs with
the config param useIOSerializers set to false. This config param
will be available through the Chapel 1.34 release at minimum.
API Changes to Standard IO¶
Before diving into the API that Serializers and Deserializers must implement, there are additions to the API of standard IO types. For the purposes of this document, “Serializer” or “Deserializer” refer to types that implement the appropriate API that the standard IO types will invoke.
Creating fileReaders and fileWriters¶
The fileReader and fileWriter types can now be created with a specified
Serializer or Deserializer. The following methods now contain new optional
serializer or deserializer arguments that accept a record by the
in intent. The copy of the record will be stored inside of the
fileReader/Writer. The default value for these arguments when
-suseIOSerializers is used will be an instance of DefaultSerializer or
DefaultDeserializer.
proc openWriter(path:string,
param kind=iokind.dynamic, param locking=true,
hints = ioHintSet.empty,
in serializer: ?st = new DefaultSerializer())
proc file.writer(param kind=iokind.dynamic, param locking=true,
region: range(?) = 0.., hints = ioHintSet.empty,
in serializer: ?st = new DefaultSerializer())
proc openReader(path:string,
param kind=iokind.dynamic, param locking=true,
region: range(?) = 0.., hints=ioHintSet.empty,
in deserializer: ?dt = new DefaultDeserializer())
proc file.reader(param kind=iokind.dynamic, param locking=true,
region: range(?) = 0.., hints = ioHintSet.empty,
in deserializer: ?dt = new DefaultDeserializer())
New Fields on fileReader and fileWriter¶
The fileReader and fileWriter types each have a new type field
named deserializerType and serializerType respectively. These fields
can be used to constrain arguments to better separate code dedicated to
particular serialization formats:
proc readData(data: [],
reader: fileReader(deserializerType=jsonDeserializer, ?)) {
}
proc readData(data: [],
reader: fileReader(deserializerType=binaryDeserializer, ?)) {
}
Accessing Serializers and Deserializers¶
The instance of a Serializer or Deserializer can be accessed with new methods
on fileReader and fileWriter, which will return the stored instance
by ref:
proc fileReader.deserializer ref : deserializerType
proc fileWriter.serializer ref : serializerType
These instances are returned by ref in case complex implementations require
modification of some internal state.
Switching Formats In-Place¶
The IO library now supports the ability to create an alias of a fileReader
or fileWriter with a new Deserializer or Serializer. This new alias will
point to the same place in the file as the original, but will use the newly
specified format when reading or writing. These methods accept either a record
by in intent, or a type.
proc fileWriter.withSerializer(type serializerType) :
fileWriter(this.kind, this.locking, serializerType)
proc fileWriter.withSerializer(in serializer: ?st) :
fileWriter(this.kind, this.locking, st)
proc fileReader.withDeserializer(type deserializerType) :
fileReader(this.kind, this.locking, deserializerType)
proc fileReader.withDeserializer(in deserializer: ?dt) :
fileReader(this.kind, this.locking, dt)
With these methods, mixing serialization formats within the same file is a simple process:
// An imaginary 'Connection' object that wishes to log the data it sends
// as JSON in the form "[INFO] {...}"
proc Connection.sendData(data: [] Info, log: fileWriter) {
log.writeln("[DEBUG] Sending Info data...");
for d in data {
log.write("[INFO] ");
log.withSerializer(new jsonSerializer()).writeln(d);
this.sendInfo(d);
}
log.writeln("[DEBUG] Done sending Info data.");
}
The type versions of these methods exist for convenience in the case that
the user wishes for the fileReader or fileWriter to create the instance
itself. The Serializer or Deserializer in such cases must support
initialization without any arguments.
// Replacing the line from the previous example
log.withSerializer(jsonSerializer).writeln(d);
Methods That Invoke Serializers and Deserializers¶
The current methods on fileReader and fileWriter that will invoke
Serializers or Deserializers are:
fileWriter.write
fileWriter.writeln
fileReader.read
fileReader.readln
Reading Generic Types and Borrowed Classes¶
As of Chapel 1.31 generic types and borrowed classes are no longer valid
arguments to the versions of read and readln that accept a type
argument. Note that fully-instantiated generic types are still allowed.
Serializer API¶
The API for a Serializer can be split into a few parts:
The interface invoked by a
fileWriterto serialize a valueThe user-defined
serializemethod describing how values of the type should be serializedThe serializer interface used to implement
serializemethods
The fileWriter-Facing Serializer API¶
A Serializer must implement the serializeValue method, which will be used
to serialize values passed to fileWriter.write and fileWriter.writeln.
The signature of the serializeValue method is:
proc Serializer.serializeValue(writer: fileWriter, const val: ?) throws
The serializeValue method returns nothing, and once invoked has complete
control over how the provided value is serialized. The given fileWriter is
guaranteed to have a serializerType identical to the type whose
serializeValue method was called. The fileWriter is also defined to be
non-locking.
By convention Serializers will invoke a serialize method on records and
classes, but notably may choose not to do so if the class instance is nil.
The implementation of serializeValue is expected to handle primitive types
directly. Those primitive types are:
- numeric types (e.g., integers, reals, complex numbers)
- bool types
- string and bytes types
- nil and none values
- enum types
The argument val is defined to be a “primitive” type or a type that
implements either the writeSerializable or serializable interfaces,
both of which define a serialize method that a Serializer may invoke to
allow for user-defined serialization of a type.
The ‘serialize’ Method¶
The serialize method has the following signature, whose API includes the
named arguments “writer” and “serializer”:
proc T.serialize(writer: fileWriter(?), ref serializer: ?st) throws
Types implementing this method must also indicate that they satisfy the
writeSerializable interface in the type declaration. For example:
record R : writeSerializable {
// ...
}
Please refer to the interfaces technote for more information on interfaces and how they can be used.
For classes, the serialize method signature must include override to
account for the serialize method on the RootClass type.
The writer and serializer are passed separately to help distinguish the
method signature from other possible implementations named “serialize”, as well
as to make it slightly more convenient to call methods on the Serializer.
The serializer argument does not necessarily need to be of the same type as
writer.serializerType. Instead, the argument simply needs to implement the
Serializer API and must serialize in a compatible format with
writer.serializerType. This constraint exists to allow for child classes to
pass helper objects created by Serializers to parent class serialize
methods. See the serializer inheritance section
for more information.
The User-Facing Serializer API¶
The user-facing part of the Serializer API is intended to allow users to serialize their types in a format-agnostic way. This is done by invoking a variety of API methods, instead of printing specific characters for a specific format.
The user-facing part of the Serializer API is much larger, and is designed to
support serializing various “kinds” of types. In particular, the API currently
supports serializing Classes, Records, Tuples, Arrays, Lists, and Maps. A given
implementation of a Serializer determines how to represent each kind of type
in its format. For example, JSON lacks a native representation of tuples, and
so the JSON Serializer represents both “list” and “tuple” type-kinds as
JSON lists (e.g. [1, 2, 3]).
To begin serializing a kind of type, users will invoke one of six available
“start” methods on a Serializer, each of which return a “helper” object that
implements an API specific to that kind of type. Note that any of the “start”
methods may return the same “helper” type as another method, in the case that
it is useful for the helper to share logic among certain type kinds. For
example, in the Chapel 1.32 release the defaultSerializer type returned
the same helper object type for both Class and Record type kinds.
Note
In each of these methods, unless otherwise stated, it is entirely up to the
author of the Serializer to define their behavior. For example, name
arguments for classes and records may not apply to a particular format, and
might be ignored.
Note
In each of these groups of methods, it should be noted that the name of each helper object is purely illustrative, and does not indicate the name of a stable interface to be implemented in the future.
The Record Helper¶
Users may begin serializing a Record type kind by invoking the startRecord
method on a Serializer. This method takes a name argument that represents
the name of the record type, and a size argument that represents the
number of fields to be serialized.
proc Serializer.startRecord(writer: fileWriter(false, this.type), name: string, size: int) : RecordHelper throws;
The returned object must implement the following API:
// Serialize a field named 'name'
proc RecordHelper.writeField(name: string, const field: ?) throws;
// End the record according to the serialization format.
proc RecordHelper.endRecord() throws;
The Tuple Helper¶
Users may begin serializing a Tuple type kind by invoking the startTuple
method on a Serializer. This method takes a size argument that represents
the number of elements to be serialized.
proc Serializer.startTuple(writer: fileWriter(false, this.type), size: int) : TupleHelper throws;
The returned object must implement the following API:
// Serialize an element of the tuple.
proc TupleHelper.writeElement(const element: ?) throws;
// End the tuple according to the serialization format.
proc TupleHelper.endTuple() throws;
The Array Helper¶
Users may begin serializing an Array type kind by invoking the startArray
method on a Serializer. This method takes a size argument that represents
the number of array elements to be serialized.
proc Serializer.startArray(writer: fileWriter(false, this.type), size: int) : ArrayHelper throws;
The returned object must implement the following API:
// Serialize the start of a new dimension of size ``size``
proc ArrayHelper.startDim(size: int) throws;
// Serialize the end of the current dimension
proc ArrayHelper.endDim() throws;
// Serializer an element of the array.
proc ArrayHelper.writeElement(const element: ?) throws;
// End the array according to the serialization format.
proc ArrayHelper.endArray() throws;
ArrayHelpers may also optionally implement a writeBulkElements method for
performance:
// If the format permits, write 'numElements' of 'data' in bulk.
proc ArrayHelper.writeBulkElements(data: c_ptr(?eltType), numElements: int) throws;
Note
Currently users can only test for writeBulkElements support by using
Reflection. Improvements to interfaces may provide a more elegant
approach to the ‘optional’ aspect of this method in the future.
The List Helper¶
Users may begin serializing a List type kind by invoking the startList
method on a Serializer. This method takes a size argument that represents
the number of list elements to be serialized.
proc Serializer.startList(writer: fileWriter(false, this.type), size: int) : ListHelper throws;
The returned object must implement the following API:
// Serialize the list element.
proc ListHelper.writeElement(const element: ?) throws;
// End the list according to the serialization format.
proc ListHelper.endList() throws;
The Map Helper¶
Users may begin serializing a Map type kind by invoking the startMap
method on a Serializer. This method takes a size argument that represents
the number of map entries to be serialized.
proc Serializer.startMap(writer: fileWriter(false, this.type), size: int) : MapHelper throws;
The returned object must implement the following API:
// Serialize a map key.
proc MapHelper.writeKey(const key: ?) throws;
// Serialize a map value.
proc MapHelper.writeValue(const val: ?) throws;
// End the map according to the serialization format.
proc MapHelper.endMap() throws;
The Class Helper, Serializers, and Inheritance¶
Users may begin serializing a Class type kind by invoking the startClass
method on a Serializer. The writer argument is passed in and will be used
by the returned ClassHelper to write serialized output. The name argument
is expected to be the name of the class type being serialized. The size
argument is the number of fields being serialized in the current class,
excluding any parent fields. Parent fields are not included to preserve
encapsulation of class implementations and to avoid the inextricable coupling
of parent and child classes.
proc Serializer.startClass(writer: fileWriter(false, this.type), name: string, size: int) : ClassHelper throws;
The returned object must implement the following API:
// Serialize a field named 'name'
proc ClassHelper.writeField(name: string, const field: ?) throws;
// End the class according to the serialization format
proc ClassHelper.endClass() throws;
ClassHelpers are also required to implement the rest of the Serializer API
since they may be passed to parent serialize methods in the
compiler-generated default implementation of serialize methods on classes.
This may be achieved without too much extra effort by using
forwarding on the stored fileWriter’s
.serializer accessor. By allowing ClassHelpers to be passed to parent
serialize methods, formats may capture an inheritance hierarchy if such is
relevant to their format.
The following code snippet is an example of writing serialize methods for
a parent and child class:
class Parent : writeSerializable {
var x : int;
}
class Child : Parent, writeSerializable {
var y : int;
}
// When serializing an instance of 'Parent', 'serializer' could be the same
// type as 'writer.serializerType'.
//
// When serializing an instance of 'Child', 'serializer' could be a
// ClassHelper type, and so the ClassHelper must satisfy the Serializer API.
override proc Parent.serialize(writer: fileWriter(?), ref serializer) {
var ser = serializer.startClass(writer, "Parent", 1);
ser.writeField("x", x);
ser.endClass();
}
override proc Child.serialize(writer: fileWriter(?), ref serializer) {
var ser = serializer.startClass(writer, "Child", 1);
// pass the ClassHelper 'ser' to the parent 'serialize' method
super.serialize(writer, ser);
ser.writeField("y", y);
ser.endClass();
}
User Facing API Notes¶
Note
This document does not define what errors these methods may or may not throw.
Deserializer API¶
The API for a Deserializer can be split into a few parts:
The interface invoked by a
fileReaderto deserialize a valueThe user-defined
deserializemethod and initializer describing how values of the type should be deserializedThe deserializer interface used to implement
deserializemethods and deserializing initializers
The fileReader-Facing Serializer API¶
A Deserializer must implement the following methods, corresponding to the
versions of fileReader.read that accept either a type or a value:
proc Deserializer.deserializeType(reader: fileReader,
type readType) : readType throws
proc Deserializer.deserializeValue(reader: fileReader,
ref val: ?readType) : void throws
The deserializeType method is responsible for creating a new instance of
the given type, and returning that new instance. By convention
deserializeType will invoke a initializer by passing in the reader and
a Deserializer. This technote will refer to such initializers with the desired
signature as “deserializing initializers”, which can be generated by the
compiler. If a suitable initializer is not available, this method may attempt
to invoke a deserialize method on a default-initialized value.
The deserializeValue method must modify an existing value, which can be
useful for types that are not cheap to allocate and benefit from re-use (e.g.
arrays). By convention deserializeValue will invoke a deserialize
method on records and classes. If a suitable deserialize method is not
available, this method may attempt to invoke a suitable initializer and assign
the result into the value.
For classes, the deserializeValue method has the freedom to potentially
free the given class and/or reassign it, depending on the needs of the
Deserializer.
The arguments val or readType are defined to be a “primitive” type or a
type that implements at least one of the following interfaces:
readDeserializableinitDeserializableserializable(combineswriteSerializablewith the two above)
In both methods, the given fileReader is also defined to be non-locking.
Note that while both methods may invoke initializers or methods that pass
control back to the user, Deserializers may ignore those options in the case
that a class is nilable and can be read as nil.
The Deserializing Initializer¶
An initializer invoked by a Deserializer must have the following signature, including the argument names “reader” and “deserializer”:
proc T.init(reader: fileReader(?),
ref deserializer: ?dt) throws
Types implementing this method must also indicate that they satisfy the
initDeserializable interface. Please refer to the
interfaces technote for more information on
interfaces and how they can be used.
By default, the compiler will generate a suitable initializer with this signature provided that no other user-defined initializers exist.
The reader and deserializer are passed separately to help distinguish
the method signature from other possible initializers, as well as to make it
slightly more convenient to call methods on the Deserializer.
The deserializer argument must implement the Deserializer API and must
deserialize in a compatible format with reader.deserializerType. This
constraint exists to allow for child classes to pass helper objects created
by Deserializers to parent initializers. See the previous section on
serializer inheritance for more information.
Generic types have a slightly more complex initializer signature, in that there
must be a type or param argument for each type or param field.
For example:
record G : initDeserializable {
type A;
type B;
var x : A;
var y : B;
}
proc G.init(type A, type B,
reader: fileReader, ref deserializer) throws {
/* ... */
}
// With a reader 'r'
var x = r.read(G(int, real));
// becomes something like...
// new G(A=int, B=real, reader=r, deserializer=r.deserializer)
Warning
Generic types with typeless fields, like “var x;”, cannot yet be deserialized using an initializer.
Warning
Throwing inside an initializer before the type is fully initialized is not yet allowed in Chapel.
The ‘deserialize’ Method¶
The deserialize method has the following signature, and also requires
its arguments to have the names “reader” and “deserializer”:
proc ref T.deserialize(reader: fileReader(?),
ref deserializer: ?dt) throws
For classes, this signature is slightly different in that it requires the
override keyword and a blank this-intent:
override proc T.deserialize(reader: fileReader(?),
ref deserializer: ?dt) throws
By default, the compiler will generate a suitable deserialize method with
this signature provided.
Types implementing this method must also indicate that they satisfy the
readDeserializable interface. Please refer to the
interfaces technote for more information on
interfaces and how they can be used.
The User-Facing Deserializer API¶
Like the Serializer API, the user-facing part of the Deserializer API is relatively large and supports the same set of type kinds as a Serializer. Also like the Serializer API, the Deserializer API works through the creation and use of helper objects returned by various “start” methods.
The Deserializer API is also slightly larger due to the need for “type” and
“by reference” versions of methods like readElement, to match the desired
behavior of the originating fileReader.read call. The List and Map type
kinds also support a hasMore method to help users know when they can stop
reading.
Note
In each of these methods, unless otherwise stated, it is entirely up to the
author of the Deserializer to define their behavior. For example, name
arguments for classes and records may not apply to a particular format, and
might be ignored.
Note
In each of these groups of methods, it should be noted that the name of each helper object is purely illustrative, and does not indicate the name of a stable interface to be implemented in the future.
The Record Helper¶
Users may begin deserializing a Record type kind by invoking the
startRecord method on a Deserializer. This method takes a name argument
that represents the name of the record type.
proc Deserializer.startRecord(reader: fileReader(false, this.type), name: string) : RecordHelper throws;
The returned object must implement the following API:
// Deserialize a field named 'name', returns a value of type 'fieldType'
proc RecordHelper.readField(name: string, type fieldType) : fieldType throws;
// Deserialize a field named 'name' in-place.
proc RecordHelper.readField(name: string, ref field :?) throws;
// End the record according to the deserialization format.
proc RecordHelper.endRecord() throws;
The Tuple Helper¶
Users may begin deserializing a Tuple type kind by invoking the startTuple
method on a Deserializer.
proc Deserializer.startTuple(reader: fileReader(false, this.type)) : TupleHelper throws;
The returned object must implement the following API:
// Deserialize an element of the tuple, return a value of type 'eltType'
proc TupleHelper.readElement(type eltType) : eltType throws;
// Deserialize 'element' as a tuple element in-place.
proc TupleHelper.readElement(ref element: ?) throws;
// End the tuple according to the deserialization format.
proc TupleHelper.endTuple() throws;
The Array Helper¶
Users may begin deserializing an Array type kind by invoking the startArray
method on a Deserializer.
proc Deserializer.startArray(reader: fileReader(false, this.type)) : ArrayHelper throws;
The returned object must implement the following API:
// Deserialize an element of the array, return a value of type 'eltType'
proc ArrayHelper.readElement(type eltType) : eltType throws;
// Deserialize 'element' as an array element in-place.
proc ArrayHelper.readElement(ref element: ?) throws;
// Start deserializing a new dimension
proc ArrayHelper.startDim() throws;
// End the array dimension according to the deserialization format.
proc ArrayHelper.endDim() throws;
// End the array according to the deserialization format.
proc ArrayHelper.endArray() throws;
ArrayHelpers may also optionally implement a readBulkElements method for
performance:
// If the format permits, write 'numElements' of 'data' in bulk.
proc ArrayHelper.readBulkElements(data: c_ptr(?eltType), n: int) throws;
Note
Currently users can only test for readBulkElements support by using
Reflection. Improvements to interfaces may provide a more elegant
approach to the ‘optional’ aspect of this method in the future.
The List Helper¶
Users may begin deserializing a List type kind by invoking the startList
method on a Deserializer.
proc Deserializer.startList(reader: fileReader(false, this.type)) : ListHelper throws;
The returned object must implement the following API:
// Deserialize an element of the list, return a value of type 'eltType'
proc ListHelper.readElement(type eltType) : eltType throws;
// Deserialize 'element' as a list element in-place.
proc ListHelper.readElement(ref element: ?) throws;
// Returns 'true' if there are more elements to deserialize
proc ListHelper.hasMore() : bool throws;
// End the list according to the deserialization format.
proc ListHelper.endList() throws;
The Map Helper¶
Users may begin deserializing a Map type kind by invoking the startMap
method on a Deserializer.
proc Deserializer.startMap(reader: fileReader(false, this.type)) : MapHelper throws;
The returned object must implement the following API:
// Deserialize a key of the map, return a value of type 'keyType'
proc MapHelper.readKey(type keyType) : keyType throws;
// Deserialize 'key' as a map key in-place.
proc MapHelper.readKey(ref key: ?) throws;
// Deserialize a value of the map, return a value of type 'valType'
proc MapHelper.readValue(type valType) : valType throws;
// Deserialize 'value' as a map value in-place.
proc MapHelper.readValue(ref value: ?) throws;
// Returns 'true' if there are more map entries to deserialize
proc MapHelper.hasMore() : bool throws;
// End the map according to the deserialization format.
proc MapHelper.endMap() throws;
The Class Helper¶
Users may begin deserializing a Class type kind by invoking the startClass
method on a Deserializer. This method takes a name argument that represents
the name of the class type.
proc Deserializer.startClass(reader: fileReader(false, this.type), name: string) : ClassHelper throws;
The returned object must implement the following API:
// Deserialize a field named 'name', returns a value of type 'fieldType'
proc ClassHelper.readField(name: string, type fieldType) : fieldType throws;
// Deserialize a field named 'name' in-place.
proc ClassHelper.readField(name: string, ref field :?) throws;
// End the class according to the deserialization format.
proc ClassHelper.endClass() throws;
Like in the Serializer API, the ClassHelper must implement the rest of the
Deserializer API to allow for the ClassHelper to be passed to parent
initializers and parent deserialize methods.
The ‘serializable’ Interface¶
The serializable interface mentioned on this document is intended to be
an interface that requires implementation of all three kinds of user-defined
methods: serialize, deserialize, and a deserializing initializer.
A formal definition of this interface is pending, following the standardization of interfaces in the language.