Bytes

Usage

use Bytes;

The following document shows functions and methods used to manipulate and process Chapel bytes variables. bytes is similar to a string but allows arbitrary data to be stored in it. Methods on bytes that interpret the data as characters assume that the bytes are ASCII characters.

Creating bytes

  • A bytes can be created using a literal, much like a string:
var b = b"my bytes";
  • If you need to create bytes using a specific buffer (i.e. data in another bytes, a c_string or a C pointer) you can use the factory functions shown below, such as createBytesWithNewBuffer.

bytes and string

As bytes can store arbitrary data, any string can be cast to bytes. In that event, the bytes will store UTF-8 encoded character data. However, a bytes can contain non-UTF-8 bytes and needs to be decoded to be converted to string.

var s = "my string";
var b = s:bytes;  // this is legal

/*
 The reverse is not. The following is a compiler error:

 var s2 = b:string;
*/

var s2 = b.decode(); // you need to decode a bytes to convert it to a string

See the documentation for the decode method for details.

Similarly, a bytes can be initialized using a string:

var s = "my string";
var b: bytes = s;

Casts from bytes to a Numeric Type

This module supports casts from bytes to numeric types. Such a cast interprets the bytes as ASCII characters and converts them to the numeric type. It throws an error if the bytes does not match the expected format of a number. For example:

var b = b"a";
var number = b:int;

throws an error when it is executed, but

var b = b"1";
var number = b:int;

stores the value 1 in number.

To learn more about handling these errors, see the Error Handling technical note.
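
As a minimal sketch of handling the error described above, the cast can be wrapped in a try/catch block. The helper name parseOrDefault is hypothetical and not part of this module; the example only assumes that the cast throws for malformed input, as stated above.

```chapel
use Bytes;

// parseOrDefault is a hypothetical helper, not part of the Bytes module.
// It returns the parsed integer, or `fallback` if the cast throws.
proc parseOrDefault(b: bytes, fallback: int): int {
  var result = fallback;
  try {
    result = b:int;
  } catch {
    // The bytes did not look like a number; keep the fallback value.
  }
  return result;
}

writeln(parseOrDefault(b"1", -1));  // prints: 1
writeln(parseOrDefault(b"a", -1));  // prints: -1
```

Catching the error locally keeps the caller free of a throws declaration; whether a fallback value is appropriate depends on the application.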

proc createBytesWithBorrowedBuffer(x: bytes)

Creates a new bytes which borrows the internal buffer of another bytes. If the buffer is freed before the bytes returned from this function, accessing it is undefined behavior.

Arguments: x – The bytes to borrow the buffer from
Returns: A new bytes
proc createBytesWithBorrowedBuffer(x: c_string, length = x.size)

Creates a new bytes which borrows the internal buffer of a c_string. If the buffer is freed before the bytes returned from this function, accessing it is undefined behavior.

Arguments:
  • x – The c_string to borrow the buffer from
  • length : int – Length of x’s buffer, excluding the terminating null byte.
Returns:

A new bytes

proc createBytesWithBorrowedBuffer(x: bufferType, length: int, size: int)

Creates a new bytes which borrows the memory allocated for a c_ptr(uint(8)). If the buffer is freed before the bytes returned from this function, accessing it is undefined behavior.

Arguments:
  • x : bufferType (i.e. c_ptr(uint(8))) – Buffer to borrow
  • length – Length of the buffer x, excluding the terminating null byte.
  • size – Size of memory allocated for x in bytes
Returns:

A new bytes

proc createBytesWithOwnedBuffer(x: c_string, length = x.size)

Creates a new bytes which takes ownership of the internal buffer of a c_string. The buffer will be freed when the bytes is deinitialized.

Arguments:
  • x – The c_string to take ownership of the buffer from
  • length : int – Length of x’s buffer, excluding the terminating null byte.
Returns:

A new bytes

proc createBytesWithOwnedBuffer(x: bufferType, length: int, size: int)

Creates a new bytes which takes ownership of the memory allocated for a c_ptr(uint(8)). The buffer will be freed when the bytes is deinitialized.

Arguments:
  • x : bufferType (i.e. c_ptr(uint(8))) – The buffer to take ownership of
  • length – Length of the buffer x, excluding the terminating null byte.
  • size – Size of memory allocated for x in bytes
Returns:

A new bytes

proc createBytesWithNewBuffer(x: bytes)

Creates a new bytes by creating a copy of the buffer of another bytes.

Arguments: x – The bytes to copy the buffer from
Returns: A new bytes
proc createBytesWithNewBuffer(x: c_string, length = x.size)

Creates a new bytes by creating a copy of the buffer of a c_string.

Arguments:
  • x – The c_string to copy the buffer from
  • length : int – Length of x’s buffer, excluding the terminating null byte.
Returns:

A new bytes

proc createBytesWithNewBuffer(x: bufferType, length: int, size = length+1)

Creates a new bytes by creating a copy of a buffer.

Arguments:
  • x : bufferType (i.e. c_ptr(uint(8))) – The buffer to copy
  • length – Length of buffer x, excluding the terminating null byte.
  • size – Size of memory allocated for x in bytes
Returns:

A new bytes
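
The following sketch illustrates the copying factory function applied to another bytes. It assumes only the createBytesWithNewBuffer(x: bytes) overload listed above; the variable names are illustrative.

```chapel
use Bytes;

var original = b"my bytes";

// The result gets its own copy of the buffer, so it does not depend on
// the lifetime of `original` (unlike the borrowing variants).
var dup = createBytesWithNewBuffer(original);

writeln(dup.size);         // prints: 8
writeln(dup == original);  // prints: true
```

The borrowing and owning variants avoid the copy, at the cost of the lifetime rules described in their entries.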

record bytes
proc length

Deprecated - please use bytes.size.

proc size
Returns: The number of bytes in the bytes.
proc indices
Returns: The indices that can be used to index into the bytes (i.e., the range 1..this.size)
proc numBytes
Returns: The number of bytes in the bytes.
proc localize(): bytes

Gets a version of the bytes that is on the currently executing locale.

Returns: A shallow copy if the bytes is already on the current locale; otherwise, a deep copy.
proc c_str(): c_string

Gets a c_string from a bytes. The returned c_string shares the buffer with the bytes.

Returns: A c_string
proc item(i: int): bytes

Gets an ASCII character from the bytes

Arguments: i – The index
Returns: A 1-length bytes
proc this(i: int): byteType

Gets a byte from the bytes

Arguments: i – The index
Returns: uint(8)
proc toByte(): uint(8)
Returns: The value of a single-byte bytes as an integer.
proc byte(i: int): byteType

Gets a byte from the bytes

Arguments: i – The index
Returns: The value of the i-th byte as an integer.
iter items(): bytes

Iterates over the bytes, yielding ASCII characters.

Yields: 1-length bytes
iter these(): byteType

Iterates over the bytes

Yields: uint(8)
iter chpl_bytes(): byteType

Iterates over the bytes byte by byte.

Yields: uint(8)
proc this(r: range(?)): bytes

Slices the bytes. When compiled with --checks, halts if r is non-empty and not completely inside the range this.indices. --fast disables this check.

Arguments: r – The range of indices the new bytes should be made from
Returns: A new bytes that is a slice within this.indices. If the length of r is zero, an empty bytes is returned.
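
The accessors above can be compared side by side in a short sketch. It assumes the 1-based indexing documented on this page (indices is 1..this.size); the variable name data is illustrative.

```chapel
use Bytes;

var data = b"chapel";

writeln(data.size);     // prints: 6
writeln(data.indices);  // prints: 1..6
writeln(data[1]);       // prints: 99  (the uint(8) value of ASCII 'c')
writeln(data.item(1));  // prints: c   (a 1-length bytes)
writeln(data[1..3]);    // prints: cha (a slice, also a bytes)
```

Indexing with an integer yields a numeric byte value, whereas item and range slicing yield bytes.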
proc isEmpty(): bool

Checks if the bytes is empty.

Returns:
  • true – when empty
  • false – otherwise
proc startsWith(needles: bytes ...): bool

Checks if the bytes starts with any of the given arguments.

Arguments: needles – bytes(s) to match against.
Returns:
  • true – when the bytes begins with one or more of the needles
  • false – otherwise
proc endsWith(needles: bytes ...): bool

Checks if the bytes ends with any of the given arguments.

Arguments: needles – bytes(s) to match against.
Returns:
  • true – when the bytes ends with one or more of the needles
  • false – otherwise
proc find(needle: bytes, region: range(?) = 1: idxType..): idxType

Finds the argument in the bytes

Arguments:
  • needle – The bytes to search for
  • region – an optional range defining the indices to search within; defaults to the whole bytes. Halts if the range is not within this.indices
Returns:

the index of the first occurrence from the left of needle within the bytes, or 0 if the needle is not in the bytes.

proc rfind(needle: bytes, region: range(?) = 1: idxType..): idxType

Finds the argument in the bytes

Arguments:
  • needle – The bytes to search for
  • region – an optional range defining the indices to search within; defaults to the whole bytes. Halts if the range is not within this.indices
Returns:

the index of the first occurrence from the right of needle within the bytes, or 0 if the needle is not in the bytes.

proc count(needle: bytes, region: range(?) = this.indices): int

Counts the number of occurrences of the argument in the bytes

Arguments:
  • needle – The bytes to search for
  • region – an optional range defining the portion of the bytes to search within; defaults to the whole bytes. Halts if the range is not within this.indices
Returns:

the number of times needle occurs in the bytes
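
A short sketch of the three search methods follows. It assumes the 1-based indices and the 0 "not found" result described above; the sample data is illustrative.

```chapel
use Bytes;

var word = b"banana";

writeln(word.find(b"an"));   // prints: 2  (first occurrence from the left)
writeln(word.rfind(b"an"));  // prints: 4  (first occurrence from the right)
writeln(word.count(b"an"));  // prints: 2
writeln(word.find(b"xyz"));  // prints: 0  (needle not present)
```

Because 0 is never a valid index here, comparing the result against 0 is enough to test for presence.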

proc replace(needle: bytes, replacement: bytes, count: int = -1): bytes
iter split(sep: bytes, maxsplit: int = -1, ignoreEmpty: bool = false): bytes

Splits the bytes on sep yielding the bytes between each occurrence, up to maxsplit times.

Arguments:
  • sep – The delimiter used to break the bytes into chunks.
  • maxsplit – The maximum number of times to split the bytes; negative values indicate no limit.
  • ignoreEmpty
    • true – Empty bytes will not be yielded
    • false – Empty bytes will be yielded
Yields:

bytes

iter split(maxsplit: int = -1): bytes

Works as above, but uses runs of whitespace as the delimiter.

Arguments: maxsplit – The maximum number of times to split the bytes; negative values indicate no limit.
Yields: bytes
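
The effect of ignoreEmpty can be sketched by counting the yielded pieces. The sample input and counter names are illustrative; the behavior shown follows the argument descriptions above.

```chapel
use Bytes;

var line = b"a,b,,c";

// Default: the empty bytes between the two adjacent commas is yielded too.
var nAll = 0;
for part in line.split(b",") do
  nAll += 1;

// With ignoreEmpty=true, empty pieces are skipped.
var nNonEmpty = 0;
for part in line.split(b",", ignoreEmpty=true) do
  nNonEmpty += 1;

writeln(nAll);       // prints: 4
writeln(nNonEmpty);  // prints: 3
```

Since split is an iterator, the pieces are produced one at a time rather than collected into an array first.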
proc join(const ref S: bytes ...): bytes

Returns a new bytes, which is the concatenation of all of the bytes passed in with the contents of the method receiver inserted between them.

var x = b"|".join(b"a",b"10",b"d");
writeln(x); // prints: "a|10|d"
Arguments: S – bytes values to be joined
Returns: A bytes
proc join(const ref x): bytes

Returns a new bytes, which is the concatenation of all of the bytes passed in with the contents of the method receiver inserted between them.

var tup = (b"a",b"10",b"d");
var x = b"|".join(tup);
writeln(x); // prints: "a|10|d"
Arguments: x : tuple or array of bytes – bytes values to be joined
Returns: A bytes
proc strip(chars = b" \t\r\n", leading = true, trailing = true): bytes

Strips the given set of leading and/or trailing characters.

Arguments:
  • chars – Characters to remove. Defaults to b" \t\r\n".
  • leading – Indicates if leading occurrences should be removed. Defaults to true.
  • trailing – Indicates if trailing occurrences should be removed. Defaults to true.
Returns:

A new bytes with leading and/or trailing occurrences of characters in chars removed as appropriate.
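
The sketch below shows the default whitespace stripping and a one-sided strip with a custom character set. The inputs are illustrative; the argument names are those of the signature above.

```chapel
use Bytes;

// Default: remove leading and trailing spaces, tabs, CRs and newlines.
var trimmed = b"  hello \n".strip();

// Remove only leading '-' characters; trailing ones are kept.
var leftOnly = b"--hi--".strip(b"-", trailing=false);

writeln(trimmed);   // prints: hello
writeln(leftOnly);  // prints: hi--
```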

proc partition(sep: bytes): 3*(bytes)

Splits the bytes on a given separator

Arguments: sep – The separator
Returns: A 3*bytes consisting of the section before sep, sep, and the section after sep. If sep is not found, the tuple will contain the whole bytes, and then two empty bytes.
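
A minimal sketch of partition follows, using tuple destructuring so that it does not depend on how tuple elements are indexed. The variable names are illustrative.

```chapel
use Bytes;

// Separator present: the tuple holds (before, separator, after).
var (key, sep, value) = b"key=value".partition(b"=");
writeln(key);    // prints: key
writeln(value);  // prints: value

// Separator absent: the whole bytes, followed by two empty bytes.
var (whole, noSep, rest) = b"novalue".partition(b"=");
writeln(whole);                              // prints: novalue
writeln(noSep.isEmpty() && rest.isEmpty());  // prints: true
```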
proc decode(policy = decodePolicy.strict): string throws

Returns a UTF-8 string from the given bytes. If the data is malformed for UTF-8, the policy argument determines the action.

Arguments: policy
  • decodePolicy.strict raises an error
  • decodePolicy.replace replaces the malformed character with the UTF-8 replacement character
  • decodePolicy.drop drops the data silently
  • decodePolicy.escape escapes each illegal byte with private use codepoints
Throws: DecodeError if decodePolicy.strict is passed to the policy argument and the bytes contains non-UTF-8 characters.
Returns: A UTF-8 string.
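
As a sketch of strict decoding with error handling, the call can be wrapped so that a DecodeError is reported instead of propagating. The helper name decodeOrEmpty is hypothetical, and the example assumes that a bytes literal accepts the \xff hexadecimal escape to produce a byte that is not valid UTF-8.

```chapel
use Bytes;

// decodeOrEmpty is a hypothetical helper, not part of the Bytes module.
// It returns the decoded string, or an empty string if decoding fails.
proc decodeOrEmpty(b: bytes): string {
  var result = "";
  try {
    result = b.decode();  // default policy is decodePolicy.strict
  } catch e: DecodeError {
    writeln("not valid UTF-8");
  } catch {
    writeln("unexpected error");
  }
  return result;
}

writeln(decodeOrEmpty(b"my bytes"));  // prints: my bytes

// Assumption: \xff yields a single byte 0xFF, which is never valid UTF-8.
var failed = decodeOrEmpty(b"\xff");
writeln(failed.isEmpty());
```

Passing one of the other policies instead of relying on the default trades the error for a lossy or escaped result, as listed in the argument description.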
proc isUpper(): bool

Checks if all the characters in the bytes are uppercase (A-Z) in ASCII. Ignores uncased (not a letter) and extended ASCII characters (decimal value larger than 127).

Returns:
  • true – there is at least one uppercase and no lowercase characters
  • false – otherwise
proc isLower(): bool

Checks if all the characters in the bytes are lowercase (a-z) in ASCII. Ignores uncased (not a letter) and extended ASCII characters (decimal value larger than 127).

Returns:
  • true – there is at least one lowercase and no uppercase characters
  • false – otherwise
proc isSpace(): bool

Checks if all the characters in the bytes are whitespace (' ', '\t', '\n', '\v', '\f', '\r') in ASCII.

Returns:
  • true – when all the characters are whitespace.
  • false – otherwise
proc isAlpha(): bool

Checks if all the characters in the bytes are alphabetic (a-zA-Z) in ASCII.

Returns:
  • true – when all the characters are alphabetic.
  • false – otherwise
proc isDigit(): bool

Checks if all the characters in the bytes are digits (0-9) in ASCII.

Returns:
  • true – when all the characters are digits.
  • false – otherwise
proc isAlnum(): bool

Checks if all the characters in the bytes are alphanumeric (a-zA-Z0-9) in ASCII.

Returns:
  • true – when all the characters are alphanumeric.
  • false – otherwise
proc isPrintable(): bool

Checks if all the characters in the bytes are printable in ASCII.

Returns:
  • true – when all the characters are printable.
  • false – otherwise
proc isTitle(): bool

Checks if all uppercase characters are preceded by uncased characters, and if all lowercase characters are preceded by cased characters in ASCII.

Returns:
  • true – when the condition described above is met.
  • false – otherwise
proc toLower(): bytes

Creates a new bytes with all applicable characters converted to lowercase.

Returns: A new bytes with all uppercase characters (A-Z) replaced with their lowercase counterpart in ASCII. Other characters remain untouched.
proc toUpper(): bytes

Creates a new bytes with all applicable characters converted to uppercase.

Returns: A new bytes with all lowercase characters (a-z) replaced with their uppercase counterpart in ASCII. Other characters remain untouched.
proc toTitle(): bytes

Creates a new bytes with all applicable characters converted to title capitalization.

Returns: A new bytes with all cased characters (a-zA-Z) following an uncased character converted to uppercase, and all cased characters following another cased character converted to lowercase.
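
The case conversions and the matching predicates can be sketched together. The inputs are illustrative, and only ASCII letters are affected, as described above.

```chapel
use Bytes;

var phrase = b"hello wORLD";

writeln(phrase.toUpper());  // prints: HELLO WORLD
writeln(phrase.toLower());  // prints: hello world
writeln(phrase.toTitle());  // prints: Hello World

// Uncased characters (the space and digits) are ignored by isUpper.
writeln(b"ABC 123".isUpper());            // prints: true
writeln(phrase.toTitle().isTitle());      // prints: true
```

Each conversion returns a new bytes; phrase itself is left unchanged.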
proc +=(ref lhs: bytes, const ref rhs: bytes): void

Appends the bytes rhs to the bytes lhs.

proc =(ref lhs: bytes, rhs: bytes)

Copies the bytes rhs into the bytes lhs.

proc =(ref lhs: bytes, rhs_c: c_string)

Copies the c_string rhs_c into the bytes lhs.

Halts if lhs is a remote bytes.

proc +(s0: bytes, s1: bytes)
Returns: A new bytes which is the result of concatenating s0 and s1
proc *(s: bytes, n: integral)
Returns: A new bytes which is the result of repeating s n times. If n is less than or equal to 0, an empty bytes is returned.
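
A brief sketch of the operators above, with illustrative values:

```chapel
use Bytes;

var joined = b"ab" + b"cd";  // concatenation
var rule = b"-" * 3;         // repetition
var none = b"x" * 0;         // n <= 0 gives an empty bytes

joined += b"!";              // append in place

writeln(joined);          // prints: abcd!
writeln(rule);            // prints: ---
writeln(none.isEmpty());  // prints: true
```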
proc comparisonDeprWarn()