Strings
The following documentation shows functions and methods used to manipulate and process Chapel strings.
Besides the functions below, some other modules provide routines that are useful for working with strings. The IO module provides IO.string.format, which creates a string that is the result of formatting. It also includes functions for reading and writing strings. The Regexp module also provides some routines for searching within strings.
Warning
Casts from string to the following types will throw an error if they are invalid:
- int
- uint
- real
- imag
- complex
- enum
To learn more about handling these errors, see the Error Handling technical note.
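As a minimal sketch of handling an invalid cast, assuming the Chapel 1.17 behavior described in the warning above. The helper name parseOrDefault is hypothetical and not part of the library:

```chapel
// Hypothetical helper: cast a string to int, falling back when the cast throws.
proc parseOrDefault(s: string, fallback: int): int {
  var result = fallback;
  try {
    result = s: int;   // throws if s does not hold a valid integer
  } catch {
    // Catch-all handler: keep the fallback value.
  }
  return result;
}

var good = parseOrDefault("42", 0);
var bad = parseOrDefault("forty-two", -1);
writeln(good);  // 42
writeln(bad);   // -1
```

Because the catch-all clause handles every error, the helper itself does not need to be marked as throwing.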
Note
While string is intended to be a Unicode string, there is much left to do. As of Chapel 1.17, only ASCII strings can be expected to work correctly with all functions.
Future work involves support for both ASCII and Unicode strings, and allowing users to specify the encoding for individual strings.
-
record string
-
proc string(s: string, owned: bool = true)
Construct a new string from s. If owned is set to true then s will be fully copied into the new instance. If it is false a shallow copy will be made such that any in-place modifications to the new string may appear in s. It is the responsibility of the user to ensure that the underlying buffer is not freed while being used as part of a shallow copy.
-
proc string(cs: c_string, length: int = cs.length, owned: bool = true, needToCopy: bool = true)
Construct a new string from the c_string cs. If owned is set to true, the backing buffer will be freed when the new record is destroyed. If needToCopy is set to true, the c_string will be copied into the record, otherwise it will be used directly. It is the responsibility of the user to ensure that the underlying buffer is not freed if the c_string is not copied in.
-
proc string(buff: bufferType, length: int, size: int, owned: bool = true, needToCopy: bool = true)
Construct a new string from buff ( c_ptr(uint(8)) ). size indicates the total size of the buffer available, while length indicates the current length of the string in the buffer (the common case would be size-1 for a C-style string). If owned is set to true, the backing buffer will be freed when the new record is destroyed. If needToCopy is set to true, the buffer will be copied into the record, otherwise it will be used directly. It is the responsibility of the user to ensure that the underlying buffer is not freed if the buffer is not copied in.
-
proc length
Returns: The number of characters in the string.
-
proc size
Returns: The number of characters in the string.
-
proc localize(): string
Gets a version of the string that is on the currently executing locale.
Returns: A shallow copy if the string is already on the current locale, otherwise a deep copy is performed.
-
proc c_str(): c_string
Get a c_string from a string.
Warning
This can only be called safely on a string whose home is the current locale. This property can be enforced by calling string.localize() before c_str(). If the string is remote, the program will halt.
For example:
    var my_string = "Hello!";
    on different_locale {
      printf("%s", my_string.localize().c_str());
    }
Returns: A c_string that points to the underlying buffer used by this string. The returned c_string is only valid when used on the same locale as the string.
-
iter these(): string
Iterates over the string character by character.
For example:
    var str = "abcd";
    for c in str {
      writeln(c);
    }
Output:
    a
    b
    c
    d
-
proc this(i: int): string
Index into a string.
Returns: A string with the character at the specified index from 1..string.length
-
proc this(r: range(?)): string
Slice a string. Halts if r is not completely inside the range 1..string.length.
Arguments: r -- range of the indices the new string should be made from
Returns: a new string that is a substring within 1..string.length. If the length of r is zero, an empty string is returned.
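A short sketch of indexing and slicing with the 1-based indices described above. The variable names are illustrative only:

```chapel
var word = "chapel";
var first = word[1];      // indexing starts at 1, not 0
var middle = word[2..4];  // characters 2 through 4, inclusive

writeln(first);        // c
writeln(middle);       // hap
writeln(word.length);  // 6
```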
-
proc isEmptyString(): bool
Returns:
- true -- when the string is empty
- false -- otherwise
-
proc startsWith(needles: string ...): bool
Arguments: needles -- A varargs list of strings to match against.
Returns:
- true -- when the string begins with one or more of the needles
- false -- otherwise
-
proc endsWith(needles: string ...): bool
Arguments: needles -- A varargs list of strings to match against.
Returns:
- true -- when the string ends with one or more of the needles
- false -- otherwise
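As an illustration of the varargs form, the following sketch checks a string against several candidate prefixes and suffixes at once. The file name is made up for the example:

```chapel
var fileName = "report.txt";

// True if the string ends with any one of the listed needles.
var isText = fileName.endsWith(".txt", ".md");

// False: the string begins with neither "draft" nor "tmp".
var isDraft = fileName.startsWith("draft", "tmp");

writeln(isText);   // true
writeln(isDraft);  // false
```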
-
proc find(needle: string, region: range(?) = 1..): int
Arguments:
- needle -- the string to search for
- region -- an optional range defining the substring to search within, default is the whole string. Halts if the range is not within 1..string.length
Returns: the index of the first occurrence of needle within a string, or 0 if the needle is not in the string.
-
proc rfind(needle: string, region: range(?) = 1..): int
Arguments:
- needle -- the string to search for
- region -- an optional range defining the substring to search within, default is the whole string. Halts if the range is not within 1..string.length
Returns: the index of the first occurrence from the right of needle within a string, or 0 if the needle is not in the string.
-
proc count(needle: string, region: range(?) = 1..): int
Arguments:
- needle -- the string to search for
- region -- an optional range defining the substring to search within, default is the whole string. Halts if the range is not within 1..string.length
Returns: the number of times needle occurs in the string
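The three search routines above can be compared side by side. This is a minimal sketch using the default region, with illustrative variable names:

```chapel
var text = "banana";

var firstAt = text.find("an");   // first occurrence, counted from 1
var lastAt = text.rfind("an");   // first occurrence from the right
var howMany = text.count("an");  // number of occurrences
var missing = text.find("xyz");  // 0 signals "not found"

writeln(firstAt);  // 2
writeln(lastAt);   // 4
writeln(howMany);  // 2
writeln(missing);  // 0
```

Since indices start at 1, a result of 0 can never be a valid position, which is why it serves as the not-found value.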
-
proc replace(needle: string, replacement: string, count: int = -1): string
Arguments:
- needle -- the string to search for
- replacement -- the string to replace needle with
- count -- an optional integer specifying the number of replacements to make, values less than zero will replace all occurrences
Returns: a copy of the string where replacement replaces needle up to count times
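A brief sketch of the effect of the count argument, using made-up input:

```chapel
var path = "a-b-c";

var all = path.replace("-", "+");      // default count of -1 replaces every match
var once = path.replace("-", "+", 1);  // only the first match is replaced

writeln(all);   // a+b+c
writeln(once);  // a+b-c
writeln(path);  // a-b-c (the original string is unchanged)
```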
-
iter split(sep: string, maxsplit: int = -1, ignoreEmpty: bool = false)
Splits the string on sep yielding the substring between each occurrence, up to maxsplit times.
Arguments:
- sep -- The delimiter used to break the string into chunks.
- maxsplit -- The number of times to split the string, negative values indicate no limit.
- ignoreEmpty --
  - When true -- Empty strings will not be yielded, and will not count towards maxsplit
  - When false -- Empty strings will be yielded when sep occurs multiple times in a row.
-
iter split(maxsplit: int = -1)
Works as above, but uses runs of whitespace as the delimiter.
Arguments: maxsplit -- The number of times to split the string, negative values indicate no limit.
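The difference between the two split iterators, and the effect of ignoreEmpty, can be sketched as follows. Each yielded piece is wrapped in brackets so that empty strings are visible; the inputs are illustrative:

```chapel
var csv = "a,b,,c";

// Default: the empty string between the two adjacent commas is yielded.
var withEmpty = "";
for piece in csv.split(",") do
  withEmpty += "[" + piece + "]";

// ignoreEmpty=true: empty pieces are skipped.
var withoutEmpty = "";
for piece in csv.split(",", ignoreEmpty=true) do
  withoutEmpty += "[" + piece + "]";

// No separator argument: runs of whitespace act as one delimiter.
var words = "";
for w in "hello   world".split() do
  words += "[" + w + "]";

writeln(withEmpty);     // [a][b][][c]
writeln(withoutEmpty);  // [a][b][c]
writeln(words);         // [hello][world]
```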
-
proc join(const ref S: string ...): string
Returns a new string, which is the concatenation of all of the strings passed in with the receiving string inserted between them.
    var x = "|".join("a","10","d");
    writeln(x); // prints: "a|10|d"
-
proc join(const ref S): string
Same as the varargs version, but with a homogeneous tuple of strings.
    var x = "|".join(("a","10","d"));
    writeln(x); // prints: "a|10|d"
-
proc join(const ref S: [] string): string
Same as the varargs version, but with all the strings in an array.
    var x = "|".join(["a","10","d"]);
    writeln(x); // prints: "a|10|d"
-
proc _join(const ref S): string
-
proc strip(chars: string = " \t\r\n", leading = true, trailing = true): string
Arguments:
- chars -- A string containing each character to remove. Defaults to " \t\r\n".
- leading -- Indicates if leading occurrences should be removed. Defaults to true.
- trailing -- Indicates if trailing occurrences should be removed. Defaults to true.
Returns: A new string with leading and/or trailing occurrences of characters in chars removed as appropriate.
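A minimal sketch of the default whitespace stripping and of restricting it to one side. The inputs are invented for the example:

```chapel
var padded = "  hello \n";
var trimmed = padded.strip();  // default chars: space, tab, CR, newline

// Remove only leading 'x' characters; trailing ones are kept.
var leftOnly = "xxhixx".strip("x", trailing=false);

writeln(trimmed);   // hello
writeln(leftOnly);  // hixx
```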
-
proc partition(sep: string): 3*(string)
Splits the string on sep into a 3*string consisting of the section before sep, sep, and the section after sep. If sep is not found, the tuple will contain the whole string, and then two empty strings.
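As a sketch of both outcomes described above, the returned tuple can be unpacked directly into three variables. The names and inputs are illustrative:

```chapel
// Separator present: before, separator, after.
var (key, sep, value) = "key=value".partition("=");

// Separator absent: whole string, then two empty strings.
var (whole, noSep, rest) = "novalue".partition("=");

writeln(key, " ", sep, " ", value);  // key = value
writeln(whole);                      // novalue
writeln(noSep.isEmptyString());      // true
writeln(rest.isEmptyString());       // true
```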
-
proc isUpper(): bool
Checks if all the characters in the string are either uppercase (A-Z) or uncased (not a letter).
Returns:
- true -- if the string contains at least one uppercase character and no lowercase characters, ignoring uncased characters.
- false -- otherwise
-
proc isLower(): bool
Checks if all the characters in the string are either lowercase (a-z) or uncased (not a letter).
Returns:
- true -- when there are no uppercase characters in the string.
- false -- otherwise
-
proc isSpace(): bool
Checks if all the characters in the string are whitespace (' ', '\t', '\n', '\v', '\f', '\r').
Returns:
- true -- when all the characters are whitespace.
- false -- otherwise
-
proc isAlpha(): bool
Checks if all the characters in the string are alphabetic (a-zA-Z).
Returns:
- true -- when the characters are alphabetic.
- false -- otherwise
-
proc isDigit(): bool
Checks if all the characters in the string are digits (0-9).
Returns:
- true -- when the characters are digits.
- false -- otherwise
-
proc isAlnum(): bool
Checks if all the characters in the string are alphanumeric (a-zA-Z0-9).
Returns:
- true -- when the characters are alphanumeric.
- false -- otherwise
-
proc isPrintable(): bool
Checks if all the characters in the string are printable. Characters are defined as being printable if they are within the range of 0x20 - 0x7e including the bounds.
Returns:
- true -- when the characters are printable.
- false -- otherwise
-
proc isTitle(): bool
Checks if all uppercase characters are preceded by uncased characters, and if all lowercase characters are preceded by cased characters.
Returns:
- true -- when the condition described above is met.
- false -- otherwise
-
proc toLower(): string
Returns: A new string with all uppercase characters replaced with their lowercase counterpart.
-
proc toUpper(): string
Returns: A new string with all lowercase characters replaced with their uppercase counterpart.
-
proc toTitle(): string
Returns: A new string with all cased characters following an uncased character converted to uppercase, and all cased characters following another cased character converted to lowercase.
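The three case conversions can be sketched on one ASCII input. This assumes the start of the string is treated like an uncased predecessor, so the first letter is capitalized by toTitle:

```chapel
var phrase = "hello World";

var upper = phrase.toUpper();
var lower = phrase.toLower();
var title = phrase.toTitle();

writeln(upper);  // HELLO WORLD
writeln(lower);  // hello world
writeln(title);  // Hello World
```

Each call returns a new string, so phrase itself is left as written.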
-
proc =(ref lhs: string, rhs: string)
Copies the string rhs into the string lhs.
-
proc =(ref lhs: string, rhs_c: c_string)
Copies the c_string rhs_c into the string lhs.
Halts if lhs is a remote string.
-
proc +(s0: string, s1: string)
Returns: A new string which is the result of concatenating s0 and s1
-
proc *(s: string, n: integral)
Returns: A new string which is the result of repeating s n times. If n is less than or equal to 0, an empty string is returned.
For example:
    writeln("Hello! " * 3);
Results in:
    Hello! Hello! Hello!
-
proc +(s: string, x: numeric)
The following concatenation functions return a new string which is the result of casting the non-string argument to a string, and concatenating that result with s.
-
proc +(x: numeric, s: string)
-
proc +(s: string, x: enumerated)
-
proc +(x: enumerated, s: string)
-
proc +(s: string, x: bool)
-
proc +(x: bool, s: string)
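A small sketch of the mixed-type concatenation overloads listed above, with illustrative variable names:

```chapel
var count = 3;
var ready = true;

var countMsg = "count = " + count;  // int is cast to string, then appended
var readyMsg = ready + " is the flag";  // bool on the left-hand side

writeln(countMsg);  // count = 3
writeln(readyMsg);  // true is the flag
```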
-
proc +=(ref lhs: string, const ref rhs: string): void
Appends the string rhs to the string lhs.
-
proc ascii(a: string): uint(8)
Returns: The byte value of the first character in a as an integer.
-
proc asciiToString(i: uint(8))
Returns: A string with the single character with the ASCII value i.
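The two routines above are inverses of each other for single ASCII characters, as this sketch shows. The explicit cast to uint(8) is a defensive choice for the example:

```chapel
var code = ascii("A");                     // byte value of the first character
var letter = asciiToString(97: uint(8));   // single-character string
var roundTrip = asciiToString(ascii("z"));

writeln(code);       // 65
writeln(letter);     // a
writeln(roundTrip);  // z
```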