Bytes
The following document shows functions and methods used to manipulate and process Chapel bytes variables. bytes is similar to a string but allows arbitrary data to be stored in it. Methods on bytes that interpret the data as characters assume that the bytes are ASCII characters.
Creating bytes
- A bytes can be created using literals similar to strings:
var b = b"my bytes";
- If you need to create bytes using a specific buffer (i.e. data in another bytes, a c_string or a C pointer) you can use the factory functions shown below, such as createBytesWithNewBuffer.
bytes and string
As bytes can store arbitrary data, any string can be cast to bytes. In that event, the bytes will store UTF-8 encoded character data. However, a bytes can contain non-UTF-8 bytes, so it must be decoded to be converted to a string.
var s = "my string";
var b = s:bytes; // this is legal
/*
The reverse is not. The following is a compiler error:
var s2 = b:string;
*/
var s2 = b.decode(); // you need to decode a bytes to convert it to a string
See the documentation for the decode method for details.
Similarly, a bytes can be initialized using a string:
var s = "my string";
var b: bytes = s;
Casts from bytes to a Numeric Type
This module supports casts from bytes to numeric types. Such casts interpret the bytes as ASCII characters and convert them to the numeric type. They throw an error if the bytes does not match the expected format of a number. For example:
var b = b"a";
var number = b:int;
throws an error when it is executed, but
var b = b"1";
var number = b:int;
stores the value 1 in number.
To learn more about handling these errors, see the Error Handling technical note.
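As a minimal sketch of the behavior described above, a throwing cast can be wrapped in a try/catch so that malformed input falls back to a default value. The helper name parseOrFallback and its fallback parameter are hypothetical, introduced only for illustration.

```chapel
// Hypothetical helper: convert a bytes to int, or return `fallback`
// when the cast throws because the bytes is not a valid number.
proc parseOrFallback(b: bytes, fallback: int): int {
  var result = fallback;
  try {
    result = b:int;   // throws if b does not look like a number
  } catch {
    // keep the fallback value
  }
  return result;
}

writeln(parseOrFallback(b"1", -1));
writeln(parseOrFallback(b"a", -1));
```

The first call succeeds with the parsed value; the second takes the catch branch and returns the fallback.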
- proc createBytesWithBorrowedBuffer(x: bytes)
  Creates a new bytes which borrows the internal buffer of another bytes. If the buffer is freed before the bytes returned from this function, accessing it is undefined behavior.
  Arguments: x – The bytes to borrow the buffer from
  Returns: A new bytes
- proc createBytesWithBorrowedBuffer(x: c_string, length = x.size)
  Creates a new bytes which borrows the internal buffer of a c_string. If the buffer is freed before the bytes returned from this function, accessing it is undefined behavior.
  Arguments:
  - x – c_string to borrow the buffer from
  - length : int – Length of x's buffer, excluding the terminating null byte.
  Returns: A new bytes
- proc createBytesWithBorrowedBuffer(x: bufferType, length: int, size: int)
  Creates a new bytes which borrows the memory allocated for a c_ptr(uint(8)). If the buffer is freed before the bytes returned from this function, accessing it is undefined behavior.
  Arguments:
  - x : bufferType (i.e. c_ptr(uint(8))) – Buffer to borrow
  - length – Length of the buffer x, excluding the terminating null byte.
  - size – Size of memory allocated for x in bytes
  Returns: A new bytes
- proc createBytesWithOwnedBuffer(x: c_string, length = x.size)
  Creates a new bytes which takes ownership of the internal buffer of a c_string. The buffer will be freed when the bytes is deinitialized.
  Arguments:
  - x – The c_string to take ownership of the buffer from
  - length : int – Length of x's buffer, excluding the terminating null byte.
  Returns: A new bytes
- proc createBytesWithOwnedBuffer(x: bufferType, length: int, size: int)
  Creates a new bytes which takes ownership of the memory allocated for a c_ptr(uint(8)). The buffer will be freed when the bytes is deinitialized.
  Arguments:
  - x : bufferType (i.e. c_ptr(uint(8))) – The buffer to take ownership of
  - length – Length of the buffer x, excluding the terminating null byte.
  - size – Size of memory allocated for x in bytes
  Returns: A new bytes
- proc createBytesWithNewBuffer(x: bytes)
  Creates a new bytes by creating a copy of the buffer of another bytes.
  Arguments: x – The bytes to copy the buffer from
  Returns: A new bytes
- proc createBytesWithNewBuffer(x: c_string, length = x.size)
  Creates a new bytes by creating a copy of the buffer of a c_string.
  Arguments:
  - x – The c_string to copy the buffer from
  - length : int – Length of x's buffer, excluding the terminating null byte.
  Returns: A new bytes
- proc createBytesWithNewBuffer(x: bufferType, length: int, size = length+1)
  Creates a new bytes by creating a copy of a buffer.
  Arguments:
  - x : bufferType (i.e. c_ptr(uint(8))) – The buffer to copy
  - length – Length of buffer x, excluding the terminating null byte.
  - size – Size of memory allocated for x in bytes
  Returns: A new bytes
- record bytes
- proc length
  Deprecated - please use bytes.size.
- proc indices
  Returns: The indices that can be used to index into the bytes (i.e., the range 0..<this.size)
- proc localize(): bytes
  Gets a version of the bytes that is on the currently executing locale.
  Returns: A shallow copy if the bytes is already on the current locale, otherwise a deep copy is performed.
- proc c_str(): c_string
  Gets a c_string from a bytes. The returned c_string shares the buffer with the bytes.
  Returns: A c_string
- proc item(i: int): bytes
  Gets an ASCII character from the bytes.
  Arguments: i – The index
  Returns: A 1-length bytes
- proc byte(i: int): byteType
  Gets a byte from the bytes.
  Arguments: i – The index
  Returns: The value of the i-th byte as an integer.
- proc this(r: range(?)): bytes
  Slices the bytes. Halts if r is non-empty and not completely inside the range this.indices when compiled with --checks. --fast disables this check.
  Arguments: r – The range of indices the new bytes should be made from
  Returns: a new bytes that is a slice within this.indices. If the length of r is zero, an empty bytes is returned.
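The indexing and slicing methods above can be sketched together as follows. The example assumes the 0-based indexing stated for indices (0..<this.size); the variable names are illustrative only.

```chapel
const data = b"hello";

// Slice out the bytes at indices 1, 2 and 3.
const middle = data[1..3];
writeln(middle);

// An empty range yields an empty bytes.
const nothing = data[1..0];
writeln(nothing.size);

// byte(i) returns the numeric value of a single byte.
const first = data.byte(0);
writeln(first);
```

Under the 0-based assumption, middle holds b"ell" and first is 104, the ASCII code of 'h'.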
- proc startsWith(needles: bytes ...): bool
  Checks if the bytes starts with any of the given arguments.
  Arguments: needles – bytes(s) to match against.
  Returns:
  - true – when the bytes begins with one or more of the needles
  - false – otherwise
- proc endsWith(needles: bytes ...): bool
  Checks if the bytes ends with any of the given arguments.
  Arguments: needles – bytes(s) to match against.
  Returns:
  - true – when the bytes ends with one or more of the needles
  - false – otherwise
- proc find(needle: bytes, region: range(?) = 0: idxType..): idxType
  Finds the argument in the bytes.
  Arguments:
  - needle – bytes to search for
  - region – an optional range defining the indices to search within, default is the whole. Halts if the range is not within this.indices
  Returns: the index of the first occurrence from the left of needle within the bytes, or -1 if the needle is not in the bytes.
- proc rfind(needle: bytes, region: range(?) = 0: idxType..): idxType
  Finds the argument in the bytes.
  Arguments:
  - needle – The bytes to search for
  - region – an optional range defining the indices to search within, default is the whole. Halts if the range is not within this.indices
  Returns: the index of the first occurrence from the right of needle within the bytes, or -1 if the needle is not in the bytes.
- proc count(needle: bytes, region: range(?) = this.indices): int
  Counts the number of occurrences of the argument in the bytes.
  Arguments:
  - needle – The bytes to search for
  - region – an optional range defining the substring to search within, default is the whole. Halts if the range is not within this.indices
  Returns: the number of times needle occurs in the bytes
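A short sketch of the three search methods above, assuming 0-based indices as described for indices. The sample data and variable names are illustrative only.

```chapel
const csv = b"one,two,one";

const firstHit = csv.find(b"one");    // search from the left
const lastHit  = csv.rfind(b"one");   // search from the right
const missing  = csv.find(b"six");    // needle that does not occur
const hits     = csv.count(b"one");   // number of occurrences

writeln(firstHit, " ", lastHit, " ", missing, " ", hits);
```

Checking the result of find or rfind against -1 is how an absent needle is detected.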
- proc replace(needle: bytes, replacement: bytes, count: int = -1): bytes
  Replaces occurrences of a bytes with another.
  Returns: a copy of the bytes where replacement replaces needle up to count times
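As an illustrative sketch, the following replaces a separator everywhere and then only once. It assumes that the default count of -1 places no limit on the number of replacements.

```chapel
const path = b"a-b-c";

// Default count: assumed to replace every occurrence.
const allReplaced = path.replace(b"-", b"+");

// count = 1: only the first occurrence is replaced.
const oneReplaced = path.replace(b"-", b"+", 1);

writeln(allReplaced);
writeln(oneReplaced);
```

The original bytes is left unchanged; replace returns a modified copy.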
- iter split(sep: bytes, maxsplit: int = -1, ignoreEmpty: bool = false): bytes
  Splits the bytes on sep yielding the bytes between each occurrence, up to maxsplit times.
- iter split(maxsplit: int = -1): bytes
  Works as above, but uses runs of whitespace as the delimiter.
  Arguments: maxsplit – The maximum number of times to split the bytes, negative values indicate no limit.
  Yields: bytes
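Both split iterators can be sketched by counting the pieces they yield. The sample inputs and counter names are illustrative; the sketch assumes that ignoreEmpty = true suppresses the empty piece between adjacent separators.

```chapel
// Split on an explicit separator; the empty piece between ",," is yielded.
var withEmpty = 0;
for piece in b"a,b,,c".split(b",") {
  writeln(piece);
  withEmpty += 1;
}

// Same input, skipping empty pieces (assumed meaning of ignoreEmpty).
var withoutEmpty = 0;
for piece in b"a,b,,c".split(b",", ignoreEmpty=true) do
  withoutEmpty += 1;

// No separator given: runs of whitespace act as the delimiter.
var words = 0;
for piece in b"a  b\tc".split() do
  words += 1;
```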
- proc join(const ref S: bytes ...): bytes
  Returns a new bytes, which is the concatenation of all of the bytes passed in with the contents of the method receiver inserted between them.
  var x = b"|".join(b"a",b"10",b"d");
  writeln(x); // prints: "a|10|d"
  Arguments: S – bytes values to be joined
  Returns: A bytes
- proc join(const ref x): bytes
  Returns a new bytes, which is the concatenation of all of the bytes passed in with the contents of the method receiver inserted between them.
  var tup = (b"a",b"10",b"d");
  var x = b"|".join(tup);
  writeln(x); // prints: "a|10|d"
  Arguments: x : tuple or array of bytes – bytes values to be joined
  Returns: A bytes
- proc strip(chars = b" \t\r\n", leading = true, trailing = true): bytes
  Strips given set of leading and/or trailing characters.
  Arguments:
  - chars – Characters to remove. Defaults to b" \t\r\n".
  - leading – Indicates if leading occurrences should be removed. Defaults to true.
  - trailing – Indicates if trailing occurrences should be removed. Defaults to true.
  Returns: A new bytes with leading and/or trailing occurrences of characters in chars removed as appropriate.
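A brief sketch of strip with its default arguments and with a custom character set. The inputs are illustrative only.

```chapel
// Default chars: spaces, tabs, carriage returns and newlines.
const trimmed = b"  hi \n".strip();

// Custom chars, removing only leading occurrences.
const leadingOnly = b"xxhixx".strip(b"x", trailing=false);

writeln(trimmed);
writeln(leadingOnly);
```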
- proc partition(sep: bytes): 3*(bytes)
  Splits the bytes on a given separator.
  Arguments: sep – The separator
  Returns: a 3*bytes consisting of the section before sep, sep, and the section after sep. If sep is not found, the tuple will contain the whole bytes, and then two empty bytes.
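The two cases described above, separator found and separator missing, can be sketched with tuple destructuring. The key=value input and the variable names are illustrative only.

```chapel
// Separator present: before, separator, after.
const (key, eq, value) = b"key=value".partition(b"=");
writeln(key, " ", eq, " ", value);

// Separator absent: the whole bytes, then two empty bytes.
const (whole, noSep, rest) = b"key".partition(b"=");
writeln(whole.size, " ", noSep.size, " ", rest.size);
```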
- proc decode(policy = decodePolicy.strict): string throws
  Returns a UTF-8 string from the given bytes. If the data is malformed for UTF-8, policy argument determines the action.
  Arguments: policy –
  - decodePolicy.strict raises an error
  - decodePolicy.replace replaces the malformed character with UTF-8 replacement character
  - decodePolicy.drop drops the data silently
  - decodePolicy.escape escapes each illegal byte with private use codepoints
  Throws: DecodeError if decodePolicy.strict is passed to the policy argument and the bytes contains non-UTF-8 characters.
  Returns: A UTF-8 string.
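The difference between the strict and drop policies can be sketched as follows. The input is illustrative and assumes that the \xff escape in a bytes literal produces a single byte that is not valid UTF-8.

```chapel
// "abc", one byte that is invalid in UTF-8, then "def".
const raw = b"abc\xffdef";

// Strict decoding (the default policy) throws on the malformed byte.
var strictFailed = false;
try {
  const s = raw.decode();
  writeln(s);
} catch {
  strictFailed = true;
}

// The drop policy silently discards the malformed data instead.
const dropped = try! raw.decode(decodePolicy.drop);

writeln(strictFailed);
writeln(dropped);
```

A catch-all clause is used here for brevity; code that needs to distinguish failure modes could catch the DecodeError named above.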
- proc isUpper(): bool
  Checks if all the characters in the bytes are uppercase (A-Z) in ASCII. Ignores uncased (not a letter) and extended ASCII characters (decimal value larger than 127).
  Returns:
  - true – there is at least one uppercase and no lowercase characters
  - false – otherwise
- proc isLower(): bool
  Checks if all the characters in the bytes are lowercase (a-z) in ASCII. Ignores uncased (not a letter) and extended ASCII characters (decimal value larger than 127).
  Returns:
  - true – there is at least one lowercase and no uppercase characters
  - false – otherwise
- proc isSpace(): bool
  Checks if all the characters in the bytes are whitespace (' ', '\t', '\n', '\v', '\f', '\r') in ASCII.
  Returns:
  - true – when all the characters are whitespace.
  - false – otherwise
- proc isAlpha(): bool
  Checks if all the characters in the bytes are alphabetic (a-zA-Z) in ASCII.
  Returns:
  - true – when the characters are alphabetic.
  - false – otherwise
- proc isDigit(): bool
  Checks if all the characters in the bytes are digits (0-9) in ASCII.
  Returns:
  - true – when the characters are digits.
  - false – otherwise
- proc isAlnum(): bool
  Checks if all the characters in the bytes are alphanumeric (a-zA-Z0-9) in ASCII.
  Returns:
  - true – when the characters are alphanumeric.
  - false – otherwise
- proc isPrintable(): bool
  Checks if all the characters in the bytes are printable in ASCII.
  Returns:
  - true – when the characters are printable.
  - false – otherwise
- proc isTitle(): bool
  Checks if all uppercase characters are preceded by uncased characters, and if all lowercase characters are preceded by cased characters in ASCII.
  Returns:
  - true – when the condition described above is met.
  - false – otherwise
- proc toLower(): bytes
  Creates a new bytes with all applicable characters converted to lowercase.
  Returns: A new bytes with all uppercase characters (A-Z) replaced with their lowercase counterpart in ASCII. Other characters remain untouched.
- proc toUpper(): bytes
  Creates a new bytes with all applicable characters converted to uppercase.
  Returns: A new bytes with all lowercase characters (a-z) replaced with their uppercase counterpart in ASCII. Other characters remain untouched.
- proc toTitle(): bytes
  Creates a new bytes with all applicable characters converted to title capitalization.
  Returns: A new bytes with all cased characters (a-zA-Z) following an uncased character converted to uppercase, and all cased characters following another cased character converted to lowercase.
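The ASCII predicates and case conversions above can be sketched together. The inputs are illustrative; the toTitle result assumes the first character is treated as following an uncased character.

```chapel
const greeting = b"Hello World";

const upper = greeting.toUpper();
const lower = greeting.toLower();
const title = b"hello world".toTitle();

writeln(upper, " ", lower, " ", title);

// Uncased characters such as digits and spaces are ignored by isUpper.
const shouting = b"ABC 123".isUpper();
const numeric  = b"123".isDigit();
const titled   = greeting.isTitle();

writeln(shouting, " ", numeric, " ", titled);
```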
-
proc
- proc =(ref lhs: bytes, rhs_c: c_string)
  Copies the c_string rhs_c into the bytes lhs. Halts if lhs is a remote bytes.