Bytes

The bytes type is similar to the string type but allows arbitrary data to be stored in it. Methods on bytes that interpret the data as characters assume that the bytes are ASCII characters.

Bytes Instantiation and Casting

A bytes instance can be created using a literal similar to a string literal, prefixed with a b character:

var b = b"my bytes";

The factory functions shown below, such as bytes.createBorrowingBuffer, allow you to create a bytes using a specific buffer (i.e. data in another bytes, a c_ptr or a c_ptrConst).

bytes and string

As bytes can store arbitrary data, any string can be cast to bytes. In that event, the bytes will store UTF-8 encoded character data. However, in general, a bytes can contain non-UTF-8 bytes and needs to be decoded to be converted to a string.

var s = "my string";
var b = s:bytes;  // this is legal

/*
 The reverse is not. The following is a compiler error:

 var s2 = b:string;
*/

var s2 = b.decode(); // you need to decode a bytes to convert it to a string

See the decode method below for details.

Similarly, a bytes can be initialized using a string:

var s = "my string";
var b: bytes = s;

Casts from bytes to a Numeric Type

Chapel supports casts from bytes to numeric types. Such casts interpret the bytes as ASCII characters and convert them to the numeric type, throwing an error if the bytes does not match the expected format of a number. For example:

var b = b"a";
var number = b:int;

throws an error when it is executed, but

var b = b"1";
var number = b:int;

stores the value 1 in number.

To learn more about handling these errors, see the Language-Specification page on Error Handling.
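
As a minimal sketch of handling a failed cast, the cast can be wrapped in a try/catch block. The helper name parseOrFallback is hypothetical and not part of the bytes API; it only illustrates the pattern.

```chapel
// Hypothetical helper: parse a bytes as an int, returning `fallback`
// when the cast throws.
proc parseOrFallback(b: bytes, fallback: int): int {
  var result = fallback;
  try {
    result = b:int;
  } catch {
    // The bytes did not look like a number; keep the fallback value.
  }
  return result;
}

writeln(parseOrFallback(b"1", -1));  // 1
writeln(parseOrFallback(b"a", -1));  // -1
```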

Predefined Routines on Bytes

The bytes type:

type bytes

Supports the following methods:

proc type bytes.createBorrowingBuffer(x: bytes): bytes

Warning

‘createBorrowingBuffer’ is unstable and may change in the future

Creates a new bytes which borrows the internal buffer of another bytes. If the buffer is freed before the bytes returned from this function, accessing it is undefined behavior.

Arguments: x – The bytes to borrow the buffer from
Returns: A new bytes
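
A short sketch of borrowing from another bytes. The variable names are illustrative; the point is that the source bytes must outlive the borrowing one.

```chapel
var original = b"shared data";

// `view` reuses the buffer of `original` rather than copying it,
// so `original` must stay alive for as long as `view` is used.
var view = bytes.createBorrowingBuffer(original);

writeln(view == original);  // true
```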

proc type bytes.createBorrowingBuffer(x: c_ptr(?t), length = strLen(x)): bytes

Warning

‘createBorrowingBuffer’ is unstable and may change in the future

Creates a new bytes which borrows the memory allocated for a c_ptr. If the buffer is freed before the bytes returned from this function, accessing it is undefined behavior.

Arguments

x : c_ptr(int(8)) or c_ptr(uint(8)) – c_ptr to borrow as a buffer
length : int – Length of x, excluding the terminating null byte. Defaults to the number of bytes in x before the terminating null byte.

Returns

A new bytes

proc type bytes.createBorrowingBuffer(x: c_ptrConst(?t), length = strLen(x)): bytes

Warning

‘createBorrowingBuffer’ is unstable and may change in the future

Creates a new bytes which borrows the memory allocated for a c_ptrConst. If the buffer is freed before the bytes returned from this function, accessing it is undefined behavior.

Arguments

x : c_ptrConst(uint(8)) or c_ptrConst(int(8)) – c_ptrConst to borrow as a buffer
length : int – Length of x, excluding the terminating null byte. Defaults to the number of bytes in x before the terminating null byte.

Returns

A new bytes

proc type bytes.createBorrowingBuffer(x: c_ptr(?t), length: int, size: int): bytes

Warning

‘createBorrowingBuffer’ is unstable and may change in the future

Creates a new bytes which borrows the memory allocated for a c_ptr. If the buffer is freed before the bytes returned from this function, accessing it is undefined behavior.

Arguments

x : c_ptr(uint(8)) or c_ptr(int(8)) – Buffer to borrow
length – Length of the buffer x, excluding the terminating null byte.
size – Size of memory allocated for x in bytes

Returns

A new bytes

proc type bytes.createAdoptingBuffer(x: c_ptr(?t), length = strLen(x)): bytes

Creates a new bytes which takes ownership of the memory allocated for a c_ptr. The buffer will be freed when the bytes is deinitialized.

Arguments

x : c_ptr(uint(8)) or c_ptr(int(8)) – The c_ptr to take ownership of
length : int – Length of buffer x, excluding the terminating null byte. Defaults to the number of bytes in x before the terminating null byte.

Returns

A new bytes

proc type bytes.createAdoptingBuffer(x: c_ptrConst(?t), length = strLen(x)): bytes

Creates a new bytes which takes ownership of the memory allocated for a c_ptrConst. The buffer will be freed when the bytes is deinitialized.

Arguments

x : c_ptrConst(uint(8)) or c_ptrConst(int(8)) – The c_ptrConst to take ownership of
length : int – Length of x’s buffer, excluding the terminating null byte. Defaults to the number of bytes in x before the terminating null byte.

Returns

A new bytes

proc type bytes.createAdoptingBuffer(x: c_ptr(?t), length: int, size: int): bytes

Creates a new bytes which takes ownership of the memory allocated for a c_ptr. The buffer will be freed when the bytes is deinitialized.

Arguments

x : c_ptr(uint(8)) or c_ptr(int(8)) – The buffer to take ownership of
length – Length of the buffer x, excluding the terminating null byte.
size – Size of memory allocated for x in bytes

Returns

A new bytes
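
The following sketch shows one way to hand a buffer over to a bytes. It assumes the buffer comes from allocate in the CTypes module, so that the bytes can free it on deinitialization; the variable names are illustrative.

```chapel
use CTypes;

// Allocate 4 bytes: 3 data bytes plus a terminating null byte.
var buf = allocate(uint(8), 4);
buf[0] = 0x61;  // 'a'
buf[1] = 0x62;  // 'b'
buf[2] = 0x63;  // 'c'
buf[3] = 0;

// `owned` now owns the buffer; do not deallocate `buf` afterwards.
var owned = bytes.createAdoptingBuffer(buf, length=3, size=4);

writeln(owned.size);  // 3
writeln(owned);       // abc
```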

proc type bytes.createCopyingBuffer(x: c_ptrConst(?t), length = strLen(x)): bytes

Creates a new bytes by creating a copy of a buffer.

Arguments

x : c_ptrConst(uint(8)) or c_ptrConst(int(8)) – The c_ptrConst to copy
length : int – Length of buffer x, excluding the terminating null byte. Defaults to the number of bytes in x before the terminating null byte.

Returns

A new bytes

proc type bytes.createCopyingBuffer(x: c_ptr(?t), length = strLen(x), size = length + 1): bytes

Creates a new bytes by creating a copy of a buffer.

Arguments

x : c_ptr(uint(8)) or c_ptr(int(8)) – The buffer to copy
length – Length of buffer x, excluding the terminating null byte. Defaults to the number of bytes in x before the terminating null byte.
size – Size of memory allocated for x in bytes

Returns

A new bytes
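
Because the copying variant duplicates the data, the original buffer stays under the caller's control. A minimal sketch, assuming the buffer is allocated and released with allocate and deallocate from CTypes:

```chapel
use CTypes;

// Build a small null-terminated buffer holding "hi".
var buf = allocate(uint(8), 3);
buf[0] = 0x68;  // 'h'
buf[1] = 0x69;  // 'i'
buf[2] = 0;

var copied = bytes.createCopyingBuffer(buf, length=2);

// The bytes has its own copy, so the original can be released.
deallocate(buf);

writeln(copied);  // hi
```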

proc bytes.size: int

Returns: The number of bytes in the bytes.

proc bytes.indices: range

Returns: The indices that can be used to index into the bytes (i.e., the range 0..<this.size)

proc bytes.numBytes: int

Returns: The number of bytes in the bytes.
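
For illustration, the size queries and the index range can be used together like this (variable names are arbitrary):

```chapel
var b = b"abc";

writeln(b.size);      // 3
writeln(b.numBytes);  // 3

// b.indices is 0..<b.size, so this visits indices 0, 1 and 2.
for i in b.indices do
  writeln(i, ": ", b[i]);
```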

proc bytes.localize(): bytes

Warning

bytes.localize() is unstable and may change in a future release

Gets a version of the bytes that is on the currently executing locale.

Returns: A shallow copy if the bytes is already on the current locale; otherwise, a deep copy.

proc bytes.c_str(): c_ptrConst(c_char)

Warning

‘bytes.c_str()’ has moved to ‘CTypes’. Please ‘use CTypes’ to access ‘c_str’

Gets a c_ptrConst(c_char) from a bytes. The returned c_ptrConst shares the buffer with the bytes.

Warning

This can only be called safely on a bytes whose home is the current locale. This property can be enforced by calling bytes.localize() before c_str(). If the bytes is remote, the program will halt.

For example:

var myBytes = b"Hello!";
on differentLocale {
  printf("%s", myBytes.localize().c_str());
}

Returns: A c_ptrConst(c_char) that points to the underlying buffer used by this bytes. The returned c_ptrConst(c_char) is only valid when used on the same locale as the bytes.

proc bytes.item(i: int): bytes

Gets an ASCII character from the bytes

Arguments: i – The index
Returns: A 1-length bytes

proc bytes.this(i: int): uint(8)

Gets a byte from the bytes

Arguments: i – The index
Returns: uint(8)

proc bytes.toByte(): uint(8)

Returns: The value of a single-byte bytes as an integer.

proc bytes.byte(i: int): uint(8)

Gets a byte from the bytes

Arguments: i – The index
Returns: The value of the i-th byte as an integer.

iter bytes.items(): bytes

Iterates over the bytes, yielding ASCII characters.

Yields: 1-length bytes

iter bytes.these(): uint(8)

Iterates over the bytes

Yields: uint(8)

iter bytes.bytes(): uint(8)

Iterates over the bytes byte by byte.

Yields: uint(8)
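
The accessors above differ mainly in what they return: a numeric uint(8) or a 1-length bytes. A small sketch contrasting them (variable names are illustrative):

```chapel
var b = b"Az";

writeln(b[0]);           // 65, a uint(8)
writeln(b.byte(1));      // 122, a uint(8)
writeln(b.item(0));      // A, a 1-length bytes
writeln(b"A".toByte());  // 65

// Default iteration yields uint(8) values.
var total = 0;
for x in b do
  total += x;
writeln(total);          // 187

// items() yields 1-length bytes instead.
for c in b.items() do
  writeln(c.size);       // 1, printed twice
```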

proc bytes.this(r: range(?)): bytes

Slices the bytes. Halts if r is non-empty and not completely inside the range this.indices when compiled with --checks. --fast disables this check.

Arguments: r – The range of indices the new bytes should be made from
Returns: a new bytes that is a slice within this.indices. If the length of r is zero, an empty bytes is returned.
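
A brief slicing sketch, using only bounded ranges that lie inside this.indices:

```chapel
var b = b"hello world";

writeln(b[0..4]);            // hello
writeln(b[6..10]);           // world

// A zero-length range produces an empty bytes.
writeln(b[1..0].isEmpty());  // true
```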

proc bytes.isEmpty(): bool

Checks if the bytes is empty.

Returns

true – when empty
false – otherwise

proc bytes.startsWith(patterns: bytes ...): bool

Checks if the bytes starts with any of the given arguments.

Arguments

patterns – One or more bytes to match against.

Returns

true – when the bytes begins with one or more of the patterns
false – otherwise

proc bytes.endsWith(patterns: bytes ...): bool

Checks if the bytes ends with any of the given arguments.

Arguments

patterns – One or more bytes to match against.

Returns

true – when the bytes ends with one or more of the patterns
false – otherwise
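
Since both methods accept several patterns, a single call can test for any of a set of prefixes or suffixes. An illustrative sketch:

```chapel
var name = b"report.csv";

// True if the bytes ends with any one of the given patterns.
writeln(name.endsWith(b".csv", b".tsv"));  // true
writeln(name.startsWith(b"draft"));        // false
```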

proc bytes.find(pattern: bytes, indices: range(?) = this.indices): int

Finds the argument in the bytes

Arguments

pattern – The bytes to search for
indices – an optional range defining the indices to search within; the default is the whole bytes. Halts if the range is not within this.indices

Returns

the index of the first occurrence from the left of pattern within the bytes, or -1 if the pattern is not in the bytes.

proc bytes.rfind(pattern: bytes, indices: range(?) = this.indices): int

Finds the argument in the bytes, searching from the right

Arguments

pattern – The bytes to search for
indices – an optional range defining the indices to search within; the default is the whole bytes. Halts if the range is not within this.indices

Returns

the index of the first occurrence from the right of pattern within the bytes, or -1 if the pattern is not in the bytes.

proc bytes.count(pattern: bytes, indices: range(?) = this.indices): int

Counts the number of occurrences of the argument in the bytes

Arguments

pattern – The bytes to search for
indices – an optional range defining the indices to search within; the default is the whole bytes. Halts if the range is not within this.indices

Returns

the number of times pattern occurs in the bytes
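
The three search methods can be compared on one input. This sketch uses the default indices argument throughout:

```chapel
var b = b"abcabc";

writeln(b.find(b"bc"));   // 1, first occurrence from the left
writeln(b.rfind(b"bc"));  // 4, first occurrence from the right
writeln(b.count(b"bc"));  // 2
writeln(b.find(b"zz"));   // -1, pattern not present
```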

proc bytes.replace(pattern: bytes, replacement: bytes, count: int = -1): bytes

Replaces occurrences of a bytes with another.

Arguments

pattern – The bytes to search for
replacement – The bytes to replace pattern with
count – an optional argument specifying the number of replacements to make; values less than zero will replace all occurrences

Returns

a copy of the bytes where replacement replaces pattern up to count times
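
An illustrative sketch of the count argument, first with its default and then limited to one replacement:

```chapel
var b = b"a-b-c";

writeln(b.replace(b"-", b"+"));     // a+b+c
writeln(b.replace(b"-", b"+", 1));  // a+b-c

// The original bytes is unchanged; replace returns a copy.
writeln(b);                         // a-b-c
```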

iter bytes.split(sep: bytes, maxsplit: int = -1, ignoreEmpty: bool = false): bytes

Splits the bytes on sep yielding the bytes between each occurrence, up to maxsplit times.

Arguments

sep – The delimiter used to break the bytes into chunks.
maxsplit – The maximum number of times to split the bytes; negative values indicate no limit.
ignoreEmpty –
- true – Empty bytes will not be yielded
- false – Empty bytes will be yielded

Yields

bytes

iter bytes.split(maxsplit: int = -1): bytes

Works as above, but uses runs of whitespace as the delimiter.

Arguments: maxsplit – The maximum number of times to split the bytes; negative values indicate no limit.
Yields: bytes
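
The following sketch exercises both split overloads on small, made-up inputs to show the effect of ignoreEmpty and maxsplit:

```chapel
var data = b"a,b,,c";

// Default: the empty chunk between the two commas is yielded too.
var all = 0;
for part in data.split(b",") do all += 1;
writeln(all);   // 4

// ignoreEmpty=true skips the empty chunk.
var kept = 0;
for part in data.split(b",", ignoreEmpty=true) do kept += 1;
writeln(kept);  // 3

// maxsplit=1 splits once, yielding "a" and then "b,,c".
for part in data.split(b",", maxsplit=1) do
  writeln(part);

// With no separator, runs of whitespace act as the delimiter.
for word in b"one  two\tthree".split() do
  writeln(word);
```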

proc bytes.join(const ref x: bytes ...): bytes

Returns a new bytes, which is the concatenation of all of the bytes passed in with the contents of the method receiver inserted between them.

var myBytes = b"|".join(b"a",b"10",b"d");
writeln(myBytes); // prints: "a|10|d"

Arguments: x – bytes values to be joined
Returns: A bytes

proc bytes.join(const ref x): bytes

Returns a new bytes, which is the concatenation of all of the bytes passed in with the contents of the method receiver inserted between them.

var tup = (b"a",b"10",b"d");
var myJoinedTuple = b"|".join(tup);
writeln(myJoinedTuple); // prints: "a|10|d"

var myJoinedArray = b"|".join([b"a",b"10",b"d"]);
writeln(myJoinedArray); // prints: "a|10|d"

Arguments: x – An array or tuple of bytes values to be joined
Returns: A bytes

proc bytes.strip(chars = b" \t\r\n", leading = true, trailing = true): bytes

Strips given set of leading and/or trailing characters.

Arguments

chars – Characters to remove. Defaults to b" \t\r\n".
leading – Indicates if leading occurrences should be removed. Defaults to true.
trailing – Indicates if trailing occurrences should be removed. Defaults to true.

Returns

A new bytes with leading and/or trailing occurrences of characters in chars removed as appropriate.
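
A short sketch of strip with its defaults and with a custom character set on one side only (inputs are illustrative):

```chapel
// Default: remove whitespace from both ends.
writeln(b"  padded \n".strip() == b"padded");                 // true

// Remove only leading 'x' characters.
writeln(b"xxhixx".strip(b"x", trailing=false) == b"hixx");    // true
```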

proc bytes.partition(sep: bytes): 3*(bytes)

Splits the bytes on a given separator

Arguments: sep – The separator
Returns: a 3*bytes consisting of the section before sep, sep, and the section after sep. If sep is not found, the tuple will contain the whole bytes, and then two empty bytes.
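
Because partition returns a tuple, its result can be destructured directly. A minimal sketch with illustrative names, covering both the found and not-found cases:

```chapel
var (head, sep, tail) = b"key=value".partition(b"=");
writeln(head);  // key
writeln(tail);  // value

// When the separator is absent, the whole bytes comes first,
// followed by two empty bytes.
var (whole, noSep, rest) = b"novalue".partition(b"=");
writeln(whole == b"novalue" && noSep.isEmpty() && rest.isEmpty());  // true
```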

proc bytes.dedent(columns = 0, ignoreFirst = true): bytes

Warning

bytes.dedent is subject to change in the future.

Remove indentation from each line of bytes.

This can be useful when applied to multi-line bytes that are indented in the source code, but should not be indented in the output.

When columns == 0, determine the level of indentation to remove from all lines by finding the common leading whitespace across all non-empty lines. Empty lines are lines containing only whitespace. Tabs and spaces are the only whitespaces that are considered, but are not treated as the same characters when determining common whitespace.

When columns > 0, remove columns leading whitespace characters from each line. Tabs are not considered whitespace when columns > 0, so only leading spaces are removed.

Arguments

columns – The number of columns of indentation to remove. Infer common leading whitespace if columns == 0.
ignoreFirst – When true, ignore first line when determining the common leading whitespace, and make no changes to the first line.

Returns

A new bytes with indentation removed.

proc bytes.decode(policy = decodePolicy.strict): string throws

Returns a UTF-8 string from the given bytes. If the data is malformed for UTF-8, the policy argument determines the action.

Arguments

policy –
- decodePolicy.strict – throws an error
- decodePolicy.replace – replaces the malformed character with the UTF-8 replacement character
- decodePolicy.drop – drops the data silently
- decodePolicy.escape – escapes each illegal byte with private use codepoints

Throws

Throws a DecodeError if decodePolicy.strict is passed to the policy argument and the bytes contains non-UTF-8 characters.

Returns

A UTF-8 string.
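
As a sketch of how the policies differ in practice, the example below decodes one valid and one invalid input. The inputs are made up for illustration, and a generic catch clause is used rather than naming the error type.

```chapel
var valid = b"caf\xc3\xa9";  // UTF-8 encoding of "café"
var invalid = b"abc\xff";    // 0xff never appears in valid UTF-8

try {
  writeln(valid.decode());    // default policy is decodePolicy.strict
  writeln(invalid.decode());  // throws under the strict policy
} catch e {
  writeln("decode failed: ", e.message());
}

// Drop the malformed byte instead of throwing.
var dropped = try! invalid.decode(decodePolicy.drop);
writeln(dropped);             // abc
```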

proc bytes.isUpper(): bool

Checks if all the characters in the bytes are uppercase (A-Z) in ASCII. Ignores uncased (not a letter) and extended ASCII characters (decimal value larger than 127).

Returns

true – there is at least one uppercase and no lowercase characters
false – otherwise

proc bytes.isLower(): bool

Checks if all the characters in the bytes are lowercase (a-z) in ASCII. Ignores uncased (not a letter) and extended ASCII characters (decimal value larger than 127).

Returns

true – there is at least one lowercase and no uppercase characters
false – otherwise

proc bytes.isSpace(): bool

Checks if all the characters in the bytes are whitespace (' ', '\t', '\n', '\v', '\f', '\r') in ASCII.

Returns

true – when all the characters are whitespace.
false – otherwise

proc bytes.isAlpha(): bool

Checks if all the characters in the bytes are alphabetic (a-zA-Z) in ASCII.

Returns

true – when the characters are alphabetic.
false – otherwise

proc bytes.isDigit(): bool

Checks if all the characters in the bytes are digits (0-9) in ASCII.

Returns

true – when the characters are digits.
false – otherwise

proc bytes.isAlnum(): bool

Checks if all the characters in the bytes are alphanumeric (a-zA-Z0-9) in ASCII.

Returns

true – when the characters are alphanumeric.
false – otherwise

proc bytes.isPrintable(): bool

Checks if all the characters in the bytes are printable in ASCII.

Returns

true – when the characters are printable.
false – otherwise

proc bytes.isTitle(): bool

Checks if all uppercase characters are preceded by uncased characters, and if all lowercase characters are preceded by cased characters in ASCII.

Returns

true – when the condition described above is met.
false – otherwise
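
A few illustrative calls to the ASCII classification methods described above, with inputs chosen to show how uncased characters are treated:

```chapel
// Digits and spaces are uncased, so isUpper ignores them.
writeln(b"HELLO 123".isUpper());    // true
writeln(b"Hello".isUpper());        // false

writeln(b"123".isDigit());          // true
writeln(b"abc1".isAlpha());         // false
writeln(b"abc1".isAlnum());         // true
writeln(b" \t\n".isSpace());        // true

// Each word starts with an uppercase letter followed by lowercase ones.
writeln(b"Hello World".isTitle());  // true
writeln(b"hello World".isTitle());  // false
```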

proc bytes.toLower(): bytes

Creates a new bytes with all applicable characters converted to lowercase.

Returns: A new bytes with all uppercase characters (A-Z) replaced with their lowercase counterpart in ASCII. Other characters remain untouched.

proc bytes.toUpper(): bytes

Creates a new bytes with all applicable characters converted to uppercase.

Returns: A new bytes with all lowercase characters (a-z) replaced with their uppercase counterpart in ASCII. Other characters remain untouched.

proc bytes.toTitle(): bytes

Creates a new bytes with all applicable characters converted to title capitalization.

Returns: A new bytes with all cased characters (a-zA-Z) following an uncased character converted to uppercase, and all cased characters following another cased character converted to lowercase.
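
The three case conversions can be compared side by side. This sketch uses an arbitrary mixed-case input:

```chapel
var b = b"hello wORLD 42";

writeln(b.toUpper());  // HELLO WORLD 42
writeln(b.toLower());  // hello world 42
writeln(b.toTitle());  // Hello World 42
```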

operator bytes.+=(ref lhs: bytes, const ref rhs: bytes): void

Appends the bytes rhs to the bytes lhs.

operator bytes.=(ref lhs: bytes, rhs: bytes): void

Copies the bytes rhs into the bytes lhs.

operator bytes.=(ref lhs: bytes, rhs_c: c_string): void

Warning

the type ‘c_string’ is deprecated; please use one of the ‘bytes.create*ingBuffer’ methods that takes a ‘c_ptrConst(c_char)’ instead

Copies the c_string rhs_c into the bytes lhs.

Halts if lhs is a remote bytes.

operator bytes.+(s0: bytes, s1: bytes): bytes

Returns: A new bytes which is the result of concatenating s0 and s1
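
A small sketch of in-place appending versus concatenation into a new bytes (names are illustrative):

```chapel
var msg = b"ab";

// += modifies msg in place.
msg += b"cd";

// + leaves both operands unchanged and returns a new bytes.
var full = msg + b"!";

writeln(msg);   // abcd
writeln(full);  // abcd!
```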

operator *(s: bytes, n: integral): bytes

Returns: A new bytes which is the result of repeating s n times. If n is less than or equal to 0, an empty bytes is returned.

The operation is commutative. For example:

writeln(b"Hello! "*3);

or

writeln(3*b"Hello! ");

Either form results in:

Hello! Hello! Hello!