Bytes
The bytes type is similar to the string type but allows
arbitrary data to be stored in it. Methods on bytes that interpret the
data as characters assume that the bytes are ASCII characters.
Bytes Instantiation and Casting
A bytes instance can be created using a literal similar to a string literal,
prefixed with a b character:
var b = b"my bytes";
The factory functions shown below, such as bytes.createBorrowingBuffer,
allow you to create a bytes from a specific buffer (i.e., data in another
bytes, a c_ptr, or a c_ptrConst).
bytes and string
As bytes can store arbitrary data, any string can
be cast to bytes. In that event, the bytes will store UTF-8 encoded
character data. However, in general, a bytes can contain non-UTF-8 bytes
and needs to be decoded to be converted to a string.
var s = "my string";
var b = s:bytes; // this is legal
/*
The reverse is not. The following is a compiler error:
var s2 = b:string;
*/
var s2 = b.decode(); // you need to decode a bytes to convert it to a string
See the decode method below for details.
Similarly, a bytes can be initialized using a string:
var s = "my string";
var b: bytes = s;
Casts from bytes to a Numeric Type
Chapel supports casts from bytes to numeric types. Such casts interpret
the bytes as ASCII characters and convert them to the numeric type,
throwing an error if the bytes does not match the expected format of a
number. For example:
var b = b"a";
var number = b:int;
throws an error when it is executed, but
var b = b"1";
var number = b:int;
stores the value 1 in number.
To learn more about handling these errors, see the Language-Specification page on Error Handling.
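When the input may not be numeric, the cast can be wrapped in a try block. The following is a minimal sketch; the helper name parseOrDefault and its fallback argument are illustrative and not part of the bytes API.

```chapel
// Hypothetical helper: cast a bytes to int, returning a fallback on failure.
proc parseOrDefault(b: bytes, fallback: int): int {
  var result = fallback;
  try {
    result = b:int;   // throws if b does not look like an integer
  } catch {
    // malformed input: keep the fallback value
  }
  return result;
}

const good = parseOrDefault(b"1", -1);
const bad  = parseOrDefault(b"a", -1);
writeln(good); // 1
writeln(bad);  // -1
```

A catch-all clause is used here for brevity; real code may prefer to catch a specific error type and report it.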
Predefined Routines on Bytes
The bytes type:
- type bytes
Supports the following methods:
- proc type bytes.createBorrowingBuffer(x: bytes) : bytes
Warning
‘createBorrowingBuffer’ is unstable and may change in the future
Creates a new bytes which borrows the internal buffer of another bytes. If the buffer is freed before the bytes returned from this function, accessing it is undefined behavior.
- proc type bytes.createBorrowingBuffer(x: c_ptr(?t), length = strLen(x)) : bytes
Warning
‘createBorrowingBuffer’ is unstable and may change in the future
Creates a new bytes which borrows the memory allocated for a c_ptr. If the buffer is freed before the bytes returned from this function, accessing it is undefined behavior.
- Arguments:
x : c_ptr(int(8)) or c_ptr(uint(8)) – c_ptr to borrow as a buffer
length : int – Length of x, excluding the optional terminating null byte. Defaults to the number of bytes in x before the terminating null byte.
- Returns:
A new bytes
- proc type bytes.createBorrowingBuffer(x: c_ptrConst(?t), length = strLen(x)) : bytes
Warning
‘createBorrowingBuffer’ is unstable and may change in the future
Creates a new bytes which borrows the memory allocated for a c_ptrConst. If the buffer is freed before the bytes returned from this function, accessing it is undefined behavior.
- Arguments:
x : c_ptrConst(uint(8)) or c_ptrConst(int(8)) – c_ptrConst to borrow as a buffer
length : int – Length of x, excluding the optional terminating null byte. Defaults to the number of bytes in x before the terminating null byte.
- Returns:
A new bytes
- proc type bytes.createBorrowingBuffer(x: c_ptr(?t), length: int, size: int) : bytes
Warning
‘createBorrowingBuffer’ is unstable and may change in the future
Creates a new bytes which borrows the memory allocated for a c_ptr. If the buffer is freed before the bytes returned from this function, accessing it is undefined behavior.
- Arguments:
x : c_ptr(uint(8)) or c_ptr(int(8)) – Buffer to borrow
length – Length of the buffer x, excluding the optional terminating null byte.
size – Size of memory allocated for x in bytes
- Returns:
A new bytes
- proc type bytes.createAdoptingBuffer(x: c_ptr(?t), length = strLen(x)) : bytes
Creates a new bytes which takes ownership of the memory allocated for a c_ptr. The buffer will be freed when the bytes is deinitialized.
- Arguments:
x : c_ptr(uint(8)) or c_ptr(int(8)) – The c_ptr to take ownership of
length : int – Length of buffer x, excluding the optional terminating null byte. Defaults to the number of bytes in x before the terminating null byte.
- Returns:
A new bytes
- proc type bytes.createAdoptingBuffer(x: c_ptrConst(?t), length = strLen(x)) : bytes
Creates a new bytes which takes ownership of the memory allocated for a c_ptrConst. The buffer will be freed when the bytes is deinitialized.
- Arguments:
x : c_ptrConst(uint(8)) or c_ptrConst(int(8)) – The c_ptrConst to take ownership of
length : int – Length of x's buffer, excluding the optional terminating null byte. Defaults to the number of bytes in x before the terminating null byte.
- Returns:
A new bytes
- proc type bytes.createAdoptingBuffer(x: c_ptr(?t), length: int, size: int) : bytes
Creates a new bytes which takes ownership of the memory allocated for a c_ptr. The buffer will be freed when the bytes is deinitialized.
- Arguments:
x : c_ptr(uint(8)) or c_ptr(int(8)) – The buffer to take ownership of
length – Length of the buffer x, excluding the optional terminating null byte.
size – Size of memory allocated for x in bytes
- Returns:
A new bytes
- proc type bytes.createCopyingBuffer(x: c_ptrConst(?t), length = strLen(x)) : bytes
Creates a new bytes by creating a copy of a buffer.
- Arguments:
x : c_ptrConst(uint(8)) or c_ptrConst(int(8)) – The c_ptrConst to copy
length : int – Length of buffer x, excluding the optional terminating null byte. Defaults to the number of bytes in x before the terminating null byte.
- Returns:
A new bytes
- proc type bytes.createCopyingBuffer(x: c_ptr(?t), length = strLen(x), size = length + 1) : bytes
Creates a new bytes by creating a copy of a buffer.
- Arguments:
x : c_ptr(uint(8)) or c_ptr(int(8)) – The buffer to copy
length – Length of buffer x, excluding the optional terminating null byte. Defaults to the number of bytes in x before the terminating null byte.
size – Size of memory allocated for x in bytes
- Returns:
A new bytes
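As a rough illustration of the factory functions, the sketch below copies the buffer of an existing bytes through a C pointer. It assumes a recent Chapel release in which c_str() on a bytes returns a c_ptrConst(c_char); that method is not described on this page, so treat its use here as an assumption.

```chapel
use CTypes;

var src = b"hello";

// c_str() is assumed to expose the internal buffer as a c_ptrConst(c_char).
// createCopyingBuffer copies the data, so `copy` stays valid on its own.
var copy = bytes.createCopyingBuffer(src.c_str(), length=src.size);

writeln(copy);   // hello
```

Copying is the safest of the three variants because the result does not depend on the lifetime of the source buffer. The borrowing and adopting variants avoid the copy, but put lifetime or ownership obligations on the caller.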
- proc bytes.indices : range
- Returns:
The indices that can be used to index into the bytes (i.e., the range
0..<this.size)
- proc bytes.localize() : bytes
Warning
bytes.localize() is unstable and may change in a future release
Gets a version of the bytes that is on the currently executing locale.
- Returns:
A shallow copy if the bytes is already on the current locale, otherwise a deep copy is performed.
- proc bytes.item(i: int) : bytes
Gets an ASCII character from the bytes.
- Arguments:
i – The index
- Returns:
A 1-length bytes
- proc bytes.this(i: int) : uint(8)
Gets a byte from the bytes.
- Arguments:
i – The index
- Returns:
uint(8)
- proc bytes.byte(i: int) : uint(8)
Gets a byte from the bytes.
- Arguments:
i – The index
- Returns:
The value of the i-th byte as an integer.
- iter bytes.items() : bytes
Iterates over the bytes, yielding ASCII characters.
- Yields:
1-length bytes
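The accessors above differ mainly in what they return: indexing and byte() give a uint(8), while item() and items() give a 1-length bytes. A small sketch (the variable name data is illustrative):

```chapel
var data = b"abc";

writeln(data.size);      // 3
writeln(data[0]);        // 97, the uint(8) value of 'a'
writeln(data.byte(1));   // 98
writeln(data.item(2));   // c, as a 1-length bytes

// items() yields each byte as a 1-length bytes
for ch in data.items() do
  writeln(ch);

// indices is the range 0..<data.size
for i in data.indices do
  writeln(i, " -> ", data[i]);
```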
- proc bytes.this(r: range(?)) : bytes
Slices the bytes. Halts if r is non-empty and not completely inside the range this.indices when compiled with --checks. --fast disables this check.
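For instance, slicing with bounded and unbounded ranges might look like the following sketch (variable names are illustrative):

```chapel
var msg = b"hello world";

var first = msg[0..4];   // bytes at indices 0 through 4
var rest  = msg[6..];    // from index 6 to the end
var none  = msg[3..2];   // an empty range yields an empty bytes

writeln(first);          // hello
writeln(rest);           // world
writeln(none.isEmpty()); // true
```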
- proc bytes.isEmpty() : bool
Checks if the bytes is empty.
- Returns:
true – when empty
false – otherwise
- proc bytes.startsWith(patterns: bytes ...) : bool
Checks if the bytes starts with any of the given arguments.
- proc bytes.endsWith(patterns: bytes ...) : bool
Checks if the bytes ends with any of the given arguments.
- proc bytes.find(pattern: bytes, indices: range(?) = this.indices) : int
Finds the argument in the bytes.
- Arguments:
pattern – The bytes to search for
indices – An optional range defining the indices to search within; the default is the whole bytes. Halts if the range is not within this.indices
- Returns:
The index of the first occurrence from the left of pattern within the bytes, or -1 if the pattern is not in the bytes.
- proc bytes.rfind(pattern: bytes, indices: range(?) = this.indices) : int
Finds the argument in the bytes.
- Arguments:
pattern – The bytes to search for
indices – An optional range defining the indices to search within; the default is the whole bytes. Halts if the range is not within this.indices
- Returns:
The index of the first occurrence from the right of pattern within the bytes, or -1 if the pattern is not in the bytes.
- proc bytes.count(pattern: bytes, indices: range(?) = this.indices) : int
Counts the number of occurrences of the argument in the bytes.
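The search routines above can be combined as in this short sketch (the variable name fruit is illustrative):

```chapel
var fruit = b"banana";

writeln(fruit.startsWith(b"ba", b"xy")); // true, matches the first pattern
writeln(fruit.endsWith(b"nan"));         // false
writeln(fruit.find(b"an"));              // 1
writeln(fruit.rfind(b"an"));             // 3
writeln(fruit.find(b"zz"));              // -1
writeln(fruit.count(b"an"));             // 2
```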
- proc bytes.replace(pattern: bytes, replacement: bytes, count: int = -1) : bytes
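Replaces occurrences of a bytes with another.
- Arguments:
pattern – The bytes to search for
replacement – The bytes to replace pattern with
count – An optional number of replacements to make. Values less than zero replace all occurrences.
- Returns:
A copy of the bytes where replacement replaces pattern up to count times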
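A brief sketch of replace with and without a limit (variable names are illustrative):

```chapel
var path = b"a-b-c";

var allReplaced = path.replace(b"-", b"+");          // default count replaces all
var firstOnly   = path.replace(b"-", b"+", count=1); // at most one replacement

writeln(allReplaced); // a+b+c
writeln(firstOnly);   // a+b-c
writeln(path);        // a-b-c, the original is unchanged
```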
- iter bytes.split(sep: bytes, maxsplit: int = -1, ignoreEmpty: bool = false) : bytes
Splits the bytes on sep, yielding the bytes between each occurrence, up to maxsplit times.
- iter bytes.split(maxsplit: int = -1) : bytes
Works as above, but uses runs of whitespace as the delimiter.
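The two split overloads might be used as in the following sketch; the counters nonEmpty and words are illustrative.

```chapel
// Explicit separator: empty fields are kept unless ignoreEmpty is true.
for part in b"a,b,,c".split(b",") do
  writeln("[", part, "]");          // [a], [b], [], [c] on separate lines

var nonEmpty = 0;
for part in b"a,b,,c".split(b",", ignoreEmpty=true) do
  nonEmpty += 1;
writeln(nonEmpty);                  // 3

// maxsplit limits the number of splits, so the remainder stays intact.
for part in b"a,b,c".split(b",", maxsplit=1) do
  writeln(part);                    // a, then b,c

// No separator: runs of whitespace delimit the pieces.
var words = 0;
for w in b"one  two\tthree".split() do
  words += 1;
writeln(words);                     // 3
```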
- proc bytes.join(const ref x: bytes ...) : bytes
Returns a new bytes, which is the concatenation of all of the bytes passed in with the contents of the method receiver inserted between them.
var myBytes = b"|".join(b"a",b"10",b"d");
writeln(myBytes); // prints: "a|10|d"
- proc bytes.join(const ref x) : bytes
Returns a new bytes, which is the concatenation of all of the bytes passed in with the contents of the method receiver inserted between them.
var tup = (b"a",b"10",b"d");
var myJoinedTuple = b"|".join(tup);
writeln(myJoinedTuple); // prints: "a|10|d"
var myJoinedArray = b"|".join([b"a",b"10",b"d"]);
writeln(myJoinedArray); // prints: "a|10|d"
- proc bytes.strip(chars = b" \t\r\n", leading = true, trailing = true) : bytes
Strips the given set of leading and/or trailing characters.
- Arguments:
chars – Characters to remove. Defaults to b" \t\r\n".
leading – Indicates if leading occurrences should be removed. Defaults to true.
trailing – Indicates if trailing occurrences should be removed. Defaults to true.
- Returns:
A new bytes with leading and/or trailing occurrences of characters in chars removed as appropriate.
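As an illustration, the sketch below strips default whitespace and then a custom character from one side only (variable names are illustrative):

```chapel
var padded = b"  hello \n";

var trimmed  = padded.strip();                         // default whitespace set
var leftOnly = b"xxhixx".strip(b"x", trailing=false);  // leading 'x' only

writeln(trimmed);   // hello
writeln(leftOnly);  // hixx
```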
- proc bytes.dedent(columns = 0, ignoreFirst = true) : bytes
Warning
bytes.dedent is subject to change in the future.
Remove indentation from each line of bytes.
This can be useful when applied to multi-line bytes that are indented in the source code, but should not be indented in the output.
When columns == 0, determine the level of indentation to remove from all lines by finding the common leading whitespace across all non-empty lines. Empty lines are lines containing only whitespace. Tabs and spaces are the only whitespaces that are considered, but are not treated as the same characters when determining common whitespace.
When columns > 0, remove columns leading whitespace characters from each line. Tabs are not considered whitespace when columns > 0, so only leading spaces are removed.
- Arguments:
columns – The number of columns of indentation to remove. Infer common leading whitespace if columns == 0.
ignoreFirst – When true, ignore first line when determining the common leading whitespace, and make no changes to the first line.
- Returns:
A new bytes with indentation removed.
- proc bytes.decode(policy = decodePolicy.strict) : string throws
Returns a UTF-8 string from the given bytes. If the data is malformed for UTF-8, the policy argument determines the action.
- Arguments:
policy –
decodePolicy.strict raises an error
decodePolicy.replace replaces the malformed character with the UTF-8 replacement character
decodePolicy.drop drops the data silently
decodePolicy.escape escapes each illegal byte with private use codepoints
- Throws:
Throws a DecodeError if decodePolicy.strict is passed to the policy argument and the bytes contains non-UTF-8 characters.
- Returns:
A UTF-8 string.
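The policies can be compared on a small malformed input, as in this sketch. The byte values are chosen for illustration, and a catch-all clause stands in for more specific error handling.

```chapel
var valid = b"caf\xc3\xa9";          // valid UTF-8 (5 bytes, 4 codepoints)
var text = try! valid.decode();
writeln(text.numCodepoints);         // 4

var invalid = b"abc\xff";            // 0xff never appears in valid UTF-8

// The default policy is strict, so decoding throws.
var strictFailed = false;
try {
  var unused = invalid.decode();
} catch {
  strictFailed = true;
}
writeln(strictFailed);               // true

// drop silently discards the malformed byte.
var dropped = try! invalid.decode(decodePolicy.drop);
writeln(dropped);                    // abc

// replace substitutes the UTF-8 replacement character instead.
var replaced = try! invalid.decode(decodePolicy.replace);
writeln(replaced);
```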
- proc bytes.isUpper() : bool
Checks if all the characters in the bytes are uppercase (A-Z) in ASCII. Ignores uncased (not a letter) and extended ASCII characters (decimal value larger than 127).
- Returns:
true – there is at least one uppercase and no lowercase characters
false – otherwise
- proc bytes.isLower() : bool
Checks if all the characters in the bytes are lowercase (a-z) in ASCII. Ignores uncased (not a letter) and extended ASCII characters (decimal value larger than 127).
- Returns:
true – there is at least one lowercase and no uppercase characters
false – otherwise
- proc bytes.isSpace() : bool
Checks if all the characters in the bytes are whitespace (' ', '\t', '\n', '\v', '\f', '\r') in ASCII.
- Returns:
true – when all the characters are whitespace.
false – otherwise
- proc bytes.isAlpha() : bool
Checks if all the characters in the bytes are alphabetic (a-zA-Z) in ASCII.
- Returns:
true – when the characters are alphabetic.
false – otherwise
- proc bytes.isDigit() : bool
Checks if all the characters in the bytes are digits (0-9) in ASCII.
- Returns:
true – when the characters are digits.
false – otherwise
- proc bytes.isAlnum() : bool
Checks if all the characters in the bytes are alphanumeric (a-zA-Z0-9) in ASCII.
- Returns:
true – when the characters are alphanumeric.
false – otherwise
- proc bytes.isPrintable() : bool
Checks if all the characters in the bytes are printable in ASCII.
- Returns:
true – when the characters are printable.
false – otherwise
- proc bytes.isTitle() : bool
Checks if all uppercase characters are preceded by uncased characters, and if all lowercase characters are preceded by cased characters in ASCII.
- Returns:
true – when the condition described above is met.
false – otherwise
- proc bytes.toLower() : bytes
Creates a new bytes with all applicable characters converted to lowercase.
- Returns:
A new bytes with all uppercase characters (A-Z) replaced with their lowercase counterpart in ASCII. Other characters remain untouched.
- proc bytes.toUpper() : bytes
Creates a new bytes with all applicable characters converted to uppercase.
- Returns:
A new bytes with all lowercase characters (a-z) replaced with their uppercase counterpart in ASCII. Other characters remain untouched.
- proc bytes.toTitle() : bytes
Creates a new bytes with all applicable characters converted to title capitalization.
- Returns:
A new bytes with all cased characters (a-zA-Z) following an uncased character converted to uppercase, and all cased characters following another cased character converted to lowercase.
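The ASCII predicates and case conversions above can be exercised as in this sketch (the variable name title is illustrative):

```chapel
var title = b"Hello World";

writeln(title.isTitle());            // true
writeln(title.toUpper());            // HELLO WORLD
writeln(title.toLower());            // hello world
writeln(b"hello world".toTitle());   // Hello World

writeln(b"ABC 123".isUpper());       // true, uncased characters are ignored
writeln(b"123".isDigit());           // true
writeln(b"abc1".isAlpha());          // false, '1' is not alphabetic
writeln(b"abc1".isAlnum());          // true
```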
- operator bytes.+=(ref lhs: bytes, const ref rhs: bytes) : void
- proc ref bytes.appendByteValues(x: uint(8) ...) : void
Warning
‘bytes.appendByteValues’ is unstable and may change in the future
Appends the one or more byte values passed as arguments to this bytes.
- proc bytes.toHexadecimal(uppercase: bool = false, type resultType = bytes) : resultType
Warning
‘bytes.toHexadecimal’ is unstable and may change in the future
Computes a hexadecimal representation for a bytes and returns it as a bytes.
- operator bytes.+(s0: bytes, s1: bytes) : bytes
- Returns:
A new bytes which is the result of concatenating s0 and s1
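A short sketch of in-place and out-of-place concatenation (variable names are illustrative):

```chapel
var greeting = b"Hello";
greeting += b", ";                 // appends in place

var full = greeting + b"world";    // returns a new bytes
writeln(full);                     // Hello, world
writeln(full.size);                // 12
```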
- operator *(s: bytes, n: integral) : bytes
- Returns:
A new bytes which is the result of repeating s n times. If n is less than or equal to 0, an empty bytes is returned.
The operation is commutative. For example:
writeln(b"Hello! "*3);
// Or...
writeln(3*b"Hello! ");
Results in:
Hello! Hello! Hello!
Hello! Hello! Hello!