IO Serializers and Deserializers
Overview
Historically, Chapel’s IO module supported formatting options for reading and
writing values in non-standard formats via the readf and writef methods
(e.g., %jt for JSON). Chapel 1.31 introduces a new API that allows for
user-defined formatting with fileReader and fileWriter, rather than relying
solely on built-in support in the standard library. This new API allows for
the configuration of fileReader and fileWriter with user-defined types that
can define the format used by methods like read and write.
For example, if a user wishes to write a record in JSON format they can now
use the Json package module in Chapel 1.31:
use IO, Json;
record Person {
  var name : string;
  var age : int;
}
var f = open("output.txt", ioMode.cw);
// configure 'w' to always write in JSON format
var w = f.writer(serializer=new JsonSerializer());
// writes:
// {"name":"Sam", "age":20}
var p = new Person("Sam", 20);
w.write(p);
Serializers and Deserializers interact with user-defined types like Person
by invoking a particular set of methods which will be discussed in more detail
later in this technote. By default, the compiler will generate methods on
user-defined types capable of interacting with Serializers and Deserializers
such that most types will simply work out of the box. For more complicated
cases, users can implement their own methods on their types to customize
serialization and deserialization.
In the interest of supporting gradual updating of code, in Chapel 1.31 the
standard IO library will not use Serializers or Deserializers by default.
Users can opt in to particular formats by creating fileReader and fileWriter
instances with new Serializer and Deserializer types. Users wishing to
experiment with exclusively using this new feature can set the boolean
config param useIOSerializers:
chpl foo.chpl -suseIOSerializers
This config param configures fileReader and fileWriter to use the
DefaultDeserializer and DefaultSerializer types by default, which
implement the exact same formatting as past releases.
API Changes to Standard IO
Before diving into the API that Serializers and Deserializers must implement, this section covers the additions to the API of the standard IO types. For the purposes of this document, “Serializer” or “Deserializer” refers to a type that implements the appropriate API that the standard IO types will invoke.
Creating fileReaders and fileWriters
The fileReader and fileWriter types can now be created with a specified
Serializer or Deserializer. The following methods now contain new optional
serializer or deserializer arguments that accept a record by the in intent.
The copy of the record will be stored inside of the fileReader or fileWriter.
The default value for these arguments when -suseIOSerializers is used will be
an instance of DefaultSerializer or DefaultDeserializer.
proc openWriter(path:string,
                param kind=iokind.dynamic, param locking=true,
                hints = ioHintSet.empty,
                in serializer: ?st = new DefaultSerializer())
proc file.writer(param kind=iokind.dynamic, param locking=true,
                 region: range(?) = 0.., hints = ioHintSet.empty,
                 in serializer: ?st = new DefaultSerializer())
proc openReader(path:string,
                param kind=iokind.dynamic, param locking=true,
                region: range(?) = 0.., hints=ioHintSet.empty,
                in deserializer: ?dt = new DefaultDeserializer())
proc file.reader(param kind=iokind.dynamic, param locking=true,
                 region: range(?) = 0.., hints = ioHintSet.empty,
                 in deserializer: ?dt = new DefaultDeserializer())
New Fields on fileReader and fileWriter
The fileReader and fileWriter types each have a new type field named
deserializerType and serializerType respectively. These fields can be used to
constrain arguments to better separate code dedicated to particular
serialization formats:
proc readData(data: [],
              reader: fileReader(deserializerType=JsonDeserializer, ?)) {
}
proc readData(data: [],
              reader: fileReader(deserializerType=BinaryDeserializer, ?)) {
}
Accessing Serializers and Deserializers
The instance of a Serializer or Deserializer can be accessed with new methods
on fileReader and fileWriter, which will return the stored instance by ref:
proc fileReader.deserializer ref : deserializerType
proc fileWriter.serializer ref : serializerType
These instances are returned by ref in case complex implementations require
modification of some internal state.
Switching Formats In-Place
The IO library now supports the ability to create an alias of a fileReader
or fileWriter with a new Deserializer or Serializer. This new alias will
point to the same place in the file as the original, but will use the newly
specified format when reading or writing. These methods accept either a record
by in intent, or a type.
proc fileWriter.withSerializer(type serializerType) :
  fileWriter(this.kind, this.locking, serializerType)
proc fileWriter.withSerializer(in serializer: ?st) :
  fileWriter(this.kind, this.locking, st)
proc fileReader.withDeserializer(type deserializerType) :
  fileReader(this.kind, this.locking, deserializerType)
proc fileReader.withDeserializer(in deserializer: ?dt) :
  fileReader(this.kind, this.locking, dt)
With these methods, mixing serialization formats within the same file is a simple process:
// An imaginary 'Connection' object that wishes to log the data it sends
// as JSON in the form "[INFO] {...}"
proc Connection.sendData(data: [] Info, log: fileWriter) {
  log.writeln("[DEBUG] Sending Info data...");
  for d in data {
    log.write("[INFO] ");
    log.withSerializer(new JsonSerializer()).writeln(d);
    this.sendInfo(d);
  }
  log.writeln("[DEBUG] Done sending Info data.");
}
The type versions of these methods exist for convenience in the case that
the user wishes for the fileReader or fileWriter to create the instance
itself. The Serializer or Deserializer in such cases must support
initialization without any arguments.
// Replacing the line from the previous example
log.withSerializer(JsonSerializer).writeln(d);
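The reading side can be sketched in the same way. The following is a minimal illustration, not a definitive pattern: it writes one log line that mixes a plain-text prefix with a JSON record, then reads it back by switching to a JsonDeserializer alias for the record only. The file name "people.log" and the Person record are assumptions made for this sketch, and it assumes the readLiteral method on fileReader behaves as in the standard IO module.

```chapel
use IO, Json;

// Hypothetical record used only for this illustration.
record Person {
  var name : string;
  var age : int;
}

proc main() throws {
  // "people.log" is an assumed file name.
  var f = open("people.log", ioMode.cwr);

  // Write a plain-text prefix, then the record through a JSON alias.
  var w = f.writer();
  w.write("[INFO] ");
  w.withSerializer(JsonSerializer).writeln(new Person("Sam", 20));
  w.close();

  // Read the prefix with the reader's own format, then switch to a
  // JSON alias (pointing at the same place in the file) for the record.
  var r = f.reader();
  r.readLiteral("[INFO] ");
  var p = r.withDeserializer(JsonDeserializer).read(Person);
  r.close();

  writeln(p.name, " ", p.age);
}
```

Because the alias shares its position with the original fileReader, the original can continue reading in its own format once the JSON value has been consumed.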
Methods That Invoke Serializers and Deserializers
The current methods on fileReader and fileWriter that will invoke
Serializers or Deserializers are:
- fileWriter.write
- fileWriter.writeln
- fileReader.read
- fileReader.readln
Reading Generic Types and Borrowed Classes
In Chapel 1.31 generic types and borrowed classes are no longer valid arguments
to the versions of read and readln that accept a type argument.
Note that fully-instantiated generic types are still allowed.
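As a rough illustration of the distinction, the sketch below reads a fully-instantiated generic record and leaves the generic form commented out. The Pair record and the file name "pairs.txt" are hypothetical and exist only for this example.

```chapel
use IO;

// Hypothetical generic record used only for this illustration.
record Pair {
  type T;
  var a : T;
  var b : T;
}

proc main() throws {
  // "pairs.txt" is an assumed file name.
  var f = open("pairs.txt", ioMode.cwr);

  // Write a value first so there is something to read back.
  var w = f.writer();
  w.write(new Pair(int, 1, 2));
  w.close();

  var r = f.reader();

  // Allowed: 'Pair(int)' is a fully-instantiated type.
  var p = r.read(Pair(int));

  // No longer valid in Chapel 1.31: 'Pair' on its own is generic.
  // var q = r.read(Pair);

  r.close();
  writeln(p);
}
```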
Serializer API
A Serializer must implement the serializeValue method:
proc Serializer.serializeValue(writer: fileWriter, const val: ?) throws
The serializeValue method returns nothing, and once invoked has complete
control over how the provided value is serialized. The given fileWriter is
guaranteed to have a serializerType identical to the type whose
serializeValue method was called. The fileWriter is also defined to be
non-locking.
By convention Serializers will invoke a serialize method on records and
classes, but notably may choose not to do so if the class instance is nil.
The ‘serialize’ Method
The serialize method has the following signature, whose API includes the
named arguments “writer” and “serializer”:
proc T.serialize(writer: fileWriter(?),
                 ref serializer: writer.serializerType) throws
The writer and serializer are passed separately to help distinguish the
method signature from other possible implementations named “serialize”, as well
as to make it slightly more convenient to call methods on the Serializer. A
future release will standardize other methods on a Serializer that provide ways
to serialize into common types, like lists or maps.
It is an error for writer.serializer to refer to a different Serializer
instance than the serializer argument. The Serializer is responsible for
either passing itself to the ‘serializer’ argument, or if applicable can create
a new instance of itself to pass. The appropriate choice here depends on the
degree to which the Serializer relies on internal state, and how that internal
state must be managed. If a copy must be made, then the withSerializer
method may be used to provide an alias.
Note
The set of standard builtin types (e.g. ranges and domains) on which this method may be invoked is currently unstable.
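A user-defined serialize method with the signature above might look like the following sketch. The Point record, the "<x, y>" layout, and the file name "point.txt" are assumptions chosen for illustration. The sketch also assumes that DefaultSerializer follows the convention of invoking serialize on records, and that writeLiteral is available on fileWriter as in the standard IO module.

```chapel
use IO;

// Hypothetical record used only for this illustration.
record Point {
  var x : int;
  var y : int;
}

// Custom layout of the form "<x, y>" instead of the compiler-generated one.
proc Point.serialize(writer: fileWriter(?),
                     ref serializer: writer.serializerType) throws {
  writer.writeLiteral("<");
  writer.write(this.x);
  writer.writeLiteral(", ");
  writer.write(this.y);
  writer.writeLiteral(">");
}

proc main() throws {
  // "point.txt" is an assumed file name.
  var f = open("point.txt", ioMode.cw);

  // Request a Serializer explicitly, since Chapel 1.31 does not use
  // Serializers unless asked to.
  var w = f.writer(serializer=new DefaultSerializer());
  w.writeln(new Point(3, 4));
  w.close();
}
```

The fields are written through the given writer, so they pick up whatever format the active Serializer defines for integers, while the punctuation is written literally.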
Deserializer API
A Deserializer must implement the following methods, corresponding to the
versions of fileReader.read that accept either a type or a value:
proc Deserializer.deserializeType(reader: fileReader,
                                  type readType) : readType throws
proc Deserializer.deserializeValue(reader: fileReader,
                                   ref val: ?readType) : void throws
The deserializeType method is responsible for creating a new instance of
the given type, and returning that new instance. By convention
deserializeType will invoke an initializer by passing in the reader and
a Deserializer. This technote will refer to such initializers with the desired
signature as “deserializing initializers”, which can be generated by the
compiler. If a suitable initializer is not available, this method may attempt
to invoke a deserialize method on a default-initialized value.
The deserializeValue method must modify an existing value, which can be
useful for types that are not cheap to allocate and benefit from re-use (e.g.
arrays). By convention deserializeValue will invoke a deserialize
method on records and classes. If a suitable deserialize method is not
available, this method may attempt to invoke a suitable initializer and assign
the result into the value.
In both methods, the given fileReader is guaranteed to have a
deserializerType identical to the type whose method was called. The
fileReader is also defined to be non-locking.
Note that while both methods may invoke initializers or methods that pass
control back to the user, Deserializers may ignore those options in the case
that a class is nilable and can be read as nil.
The Deserializing Initializer
An initializer invoked by a Deserializer must have the following signature, including the argument names “reader” and “deserializer”:
proc T.init(reader: fileReader(?),
            ref deserializer: reader.deserializerType) throws
By default, the compiler will generate a suitable initializer with this signature provided that no other user-defined initializers exist.
The reader and deserializer are passed separately to help distinguish
the method signature from other possible initializers, as well as to make it
slightly more convenient to call methods on the Deserializer. A future release
will standardize other methods on a Deserializer that provide ways to
deserialize into common types, like lists or maps.
As with the serialize method, it is an error for reader.deserializer to
refer to a Deserializer other than the deserializer argument.
Generic types have a slightly more complex initializer signature, in that there
must be a type or param argument for each type or param field.
For example:
record G {
  type A;
  type B;
  var x : A;
  var y : B;
}
proc G.init(type A, type B,
            reader: fileReader, ref deserializer) throws {
  /* ... */
}
// With a reader 'r'
var x = r.read(G(int, real));
// becomes something like...
// new G(A=int, B=real, reader=r, deserializer=r.deserializer)
Warning
Generic types with typeless fields, like “var x;”, cannot yet be deserialized using an initializer.
Warning
Throwing inside an initializer before the type is fully initialized is not yet allowed in Chapel.
The ‘deserialize’ Method
The deserialize method has the following signature, and also requires
its arguments to have the names “reader” and “deserializer”:
proc ref T.deserialize(reader: fileReader(?),
                       ref deserializer: reader.deserializerType) throws
By default, the compiler will generate a suitable deserialize method with
this signature.
As with the serialize method, it is an error for reader.deserializer to
refer to a Deserializer other than the deserializer argument.
Note
The set of standard builtin types (e.g. ranges and domains) on which this method may be invoked is currently unstable.
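A user-defined deserialize method with the signature above might be sketched as follows. The Point record, its "<x, y>" text layout, and the file name "point.txt" are assumptions chosen for illustration. The sketch also assumes that DefaultDeserializer follows the convention of invoking deserialize from the value-accepting version of read, and that readLiteral is available on fileReader as in the standard IO module.

```chapel
use IO;

// Hypothetical record used only for this illustration.
record Point {
  var x : int;
  var y : int;
}

// Reads text of the form "<x, y>" into an existing Point.
proc ref Point.deserialize(reader: fileReader(?),
                           ref deserializer: reader.deserializerType) throws {
  reader.readLiteral("<");
  this.x = reader.read(int);
  reader.readLiteral(",");
  this.y = reader.read(int);
  reader.readLiteral(">");
}

proc main() throws {
  // "point.txt" is an assumed file name.
  var f = open("point.txt", ioMode.cwr);

  // Write sample text in the layout that 'deserialize' expects.
  var w = f.writer();
  w.write("<3, 4>");
  w.close();

  // Request a Deserializer explicitly, then read into an existing value.
  var r = f.reader(deserializer=new DefaultDeserializer());
  var pt : Point;
  r.read(pt);
  r.close();

  writeln(pt.x, " ", pt.y);
}
```

Reading into an existing value avoids the need for a deserializing initializer here, which matters because the compiler may not generate one once a user-defined deserialize method is present.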
Compiler-Generated Methods
Generation of the deserializing initializer and of the serialize and
deserialize methods can be disabled with the flag
--no-io-gen-serialization.
If the compiler sees a user-defined implementation of the serialize method,
the deserialize method, or the deserializing initializer, then the compiler
may choose to not automatically generate any of the other unimplemented
methods. This is out of concern that the user has intentionally deviated from
the compiler’s default implementation of serialization and deserialization.
Until it is determined that readThis and writeThis will be deprecated,
the compiler-generated versions of the serialize and deserialize methods
will call any user-defined readThis or writeThis methods available on
the same type. If this behavior is undesirable, users may implement their own
serialize and deserialize methods, or they may use the following
compiler flags:
- --no-io-serialize-writeThis
- --no-io-deserialize-readThis