Chapter 11

Strings

string is a built-in primitive in Glide: an immutable, UTF-8 encoded byte sequence backed by a const char*. Most methods listed here are declared in impl string { ... }; len, at, eq, concat, and substring are runtime helpers that those methods build on. Everything operates on raw bytes: index offsets, len, and cmp are byte-based, so ASCII behaves intuitively while multi-byte UTF-8 sequences sort and index by their raw bytes.

Import

No import needed — string is a built-in primitive type. Methods are always in scope.

fn main() -> i32 {
    let s: string = "hello";
    println!(s.len());   // 5
    return 0;
}

Method catalog

The full public surface of impl string, plus the runtime helpers (len/at/eq/concat/substring) that the methods build on:

| Method | Signature | Returns |
| --- | --- | --- |
| is_empty | pub fn is_empty(self: string) -> bool | bool |
| len | runtime: self.len() | i32 (bytes) |
| eq | runtime: self.eq(other) | bool |
| at | runtime: self.at(i) | char |
| cmp | pub fn cmp(self: string, other: string) -> i32 | i32 (sign) |
| contains | pub fn contains(self: string, sub: string) -> bool | bool |
| index_of | pub fn index_of(self: string, sub: string) -> i32 | i32 (-1 if absent) |
| starts_with | pub fn starts_with(self: string, pre: string) -> bool | bool |
| ends_with | pub fn ends_with(self: string, suf: string) -> bool | bool |
| substring | runtime: self.substring(a, b) | string |
| split | pub fn split(self: string, sep: string) -> *Vector<string> | *Vector<string> |
| replace | pub fn replace(self: string, find: string, repl: string) -> string | string |
| trim | pub fn trim(self: string) -> string | string |
| trim_left | pub fn trim_left(self: string) -> string | string |
| trim_right | pub fn trim_right(self: string) -> string | string |
| to_upper | pub fn to_upper(self: string) -> string | string |
| to_lower | pub fn to_lower(self: string) -> string | string |
| repeat | pub fn repeat(self: string, n: i32) -> string | string |
| concat | runtime: self.concat(other) | string |
| parse_int | pub fn parse_int(self: string) -> i32 | i32 (0 on failure) |
| try_parse_int | pub fn try_parse_int(self: string) -> !i32 | !i32 |
| try_parse_float | pub fn try_parse_float(self: string) -> !f64 | !f64 |
| try_parse_bool | pub fn try_parse_bool(self: string) -> !bool | !bool |

Inspection & comparison

| Method | Signature | Description |
| --- | --- | --- |
| is_empty | pub fn is_empty(self: string) -> bool | true when the string has zero bytes. |
| len | runtime helper: self.len() | Length of the string in bytes (not code points). |
| eq | runtime helper: self.eq(other) | Byte-exact equality; returns bool. |
| at | runtime helper: self.at(i) | Byte at offset i as a char; use .to_int() for its code value. |
| cmp | pub fn cmp(self: string, other: string) -> i32 | Lexicographic byte comparison: negative if self sorts first, 0 if equal, positive if after. |

is_empty is just self.len() == 0:

"".is_empty();        // true
" ".is_empty();       // false  (a space is a byte)
"abc".is_empty();     // false

cmp compares byte-by-byte. For ASCII this matches code-point order; on a prefix match, the longer string sorts after the shorter. The source returns exactly -1, 0, or 1:

"apple".cmp("banana");   // -1
"banana".cmp("apple");   //  1
"apple".cmp("apple");    //  0
"apple".cmp("app");      //  1   (longer wins on prefix match)

Branching on the sign of cmp to drive an ordering decision:

fn order(a: string, b: string) -> string {
    let c: i32 = a.cmp(b);
    if c < 0 { return a.concat(" < ").concat(b); }
    if c > 0 { return a.concat(" > ").concat(b); }
    return a.concat(" == ").concat(b);
}

fn main() -> i32 {
    println!(order("apple", "banana"));   // apple < banana
    println!(order("pear", "pear"));      // pear == pear
    return 0;
}
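
Because cmp orders by raw bytes, every ASCII upper-case letter sorts before every lower-case one. The sketch below shows that, then uses the sign of cmp to pick the smallest piece of a split result. The helper name smallest is hypothetical (it is not part of impl string), and it assumes the vector is non-empty.

```glide
// Hypothetical helper: the byte-wise smallest string in a non-empty vector.
fn smallest(parts: *Vector<string>) -> string {
    let mut best: string = parts.get(0);
    for let i: i32 = 1; i < parts.len(); i++ {
        // A negative cmp means parts.get(i) sorts before the current best.
        if parts.get(i).cmp(best) < 0 { best = parts.get(i); }
    }
    return best;
}

fn main() -> i32 {
    println!("Zebra".cmp("apple"));                   // -1  ('Z' is 90, 'a' is 97)
    println!(smallest("pear,apple,fig".split(",")));  // apple
    return 0;
}
```

For a case-insensitive ordering of ASCII text, one option is to compare a.to_lower() with b.to_lower() instead.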

A combined inspection program — note at(i) yields a char, so call .to_int() for its byte value:

fn main() -> i32 {
    let s: string = "hello";
    println!(s.len());            // 5
    println!(s.is_empty());       // false
    println!("".is_empty());      // true
    println!(s.eq("hello"));      // true
    println!("apple".cmp("banana"));  // -1
    let c: char = s.at(0);
    println!(c.to_int());         // 104 ('h')
    return 0;
}

Searching

| Method | Signature | Description |
| --- | --- | --- |
| contains | pub fn contains(self: string, sub: string) -> bool | true when sub appears anywhere in self. |
| index_of | pub fn index_of(self: string, sub: string) -> i32 | Byte offset of the first occurrence of sub, or -1 if absent. Empty needle returns 0. |
| starts_with | pub fn starts_with(self: string, pre: string) -> bool | true when self begins with pre byte-for-byte. |
| ends_with | pub fn ends_with(self: string, suf: string) -> bool | true when self ends with suf byte-for-byte. |

All four are case-sensitive. contains is defined as self.index_of(sub) >= 0. index_of does a naive left-to-right byte scan and returns the offset of the first match.

fn main() -> i32 {
    let url: string = "https://example.com/report.pdf";
    println!(url.contains("example"));        // true
    println!(url.index_of("example"));        // 8
    println!(url.index_of("missing"));        // -1
    println!(url.starts_with("https://"));    // true
    println!(url.ends_with(".pdf"));          // true
    println!(url.ends_with(".PDF"));          // false (case-sensitive)
    return 0;
}

For an ASCII case-insensitive search, lower-case both sides before calling contains:

fn icontains(hay: string, needle: string) -> bool {
    return hay.to_lower().contains(needle.to_lower());
}

fn main() -> i32 {
    println!(icontains("Report.PDF", "pdf"));   // true
    println!(icontains("Report.PDF", "xml"));   // false
    return 0;
}
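
The naive left-to-right scan described for index_of can be approximated with substring and eq alone. This is only an illustrative sketch, not the actual source of index_of; the name naive_index_of is hypothetical.

```glide
// Hypothetical re-implementation of the scan: try every byte offset i
// at which needle could still fit, and return the first one that matches.
fn naive_index_of(hay: string, needle: string) -> i32 {
    let n: i32 = needle.len();
    for let i: i32 = 0; i + n <= hay.len(); i++ {
        if hay.substring(i, i + n).eq(needle) { return i; }
    }
    return -1;   // no offset matched
}

fn main() -> i32 {
    println!(naive_index_of("https://example.com", "example"));  // 8
    println!(naive_index_of("abc", "zz"));                       // -1
    println!(naive_index_of("abc", ""));                         // 0 (empty needle)
    return 0;
}
```

The empty-needle case falls out of the loop bounds: at i = 0 the window substring(0, 0) is "", which equals the needle, so the sketch returns 0 just as the table states for index_of.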

Slicing & splitting

| Method | Signature | Description |
| --- | --- | --- |
| substring | runtime helper: self.substring(a, b) | Bytes in the half-open range [a, b). |
| split | pub fn split(self: string, sep: string) -> *Vector<string> | Split on every occurrence of sep. Empty sep yields one slice per byte. |

substring(a, b) returns the bytes from offset a up to (but not including) b.

split returns a *Vector<string> (see the Vector reference for .len(), .get(i), iteration, etc.). Two adjacent separators produce an empty piece between them; an empty separator splits into one slice per byte, which is handy for walking characters.

fn main() -> i32 {
    let s: string = "a,b,c";
    println!(s.substring(0, 1));   // "a"
    println!(s.substring(2, 3));   // "b"

    let parts: *Vector<string> = s.split(",");
    println!(parts.len());         // 3
    println!(parts.get(1));        // "b"

    let chars: *Vector<string> = "abc".split("");
    println!(chars.len());         // 3   ["a", "b", "c"]
    return 0;
}
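
When only the first separator matters, such as in key=value pairs, index_of combined with the half-open substring range avoids building a vector. The helpers before_first and after_first below are hypothetical names used for illustration.

```glide
// Hypothetical helper: bytes before the first sep, or all of s when sep is absent.
fn before_first(s: string, sep: string) -> string {
    let i: i32 = s.index_of(sep);
    if i < 0 { return s; }
    return s.substring(0, i);             // [0, i) stops just before sep
}

// Hypothetical helper: bytes after the first sep, or "" when sep is absent.
fn after_first(s: string, sep: string) -> string {
    let i: i32 = s.index_of(sep);
    if i < 0 { return ""; }
    return s.substring(i + sep.len(), s.len());
}

fn main() -> i32 {
    println!(before_first("host=example.com", "="));  // host
    println!(after_first("host=example.com", "="));   // example.com
    println!(after_first("a=b=c", "="));              // b=c (only the first "=" splits)
    return 0;
}
```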

Two adjacent separators leave an empty piece between them:

"a,,b".split(",").len();   // 3   (the middle piece is "")

split → loop

The common idiom is split then iterate the resulting vector, trimming or parsing each field:

fn main() -> i32 {
    let csv: string = "  alice , bob ,carol  ";
    let parts: *Vector<string> = csv.split(",");
    for let i: i32 = 0; i < parts.len(); i++ {
        let field: string = parts.get(i).trim();
        println!(i, field);
    }
    return 0;
}
output
0 alice
1 bob
2 carol

Walking bytes with at

For a single pass over bytes you don't need split("") — index with at(i) and inspect the code value. This counts ASCII digits:

fn main() -> i32 {
    let s: string = "ab12c3";
    let mut digits: i32 = 0;
    for let i: i32 = 0; i < s.len(); i++ {
        let c: i32 = s.at(i).to_int();
        if c >= 48 && c <= 57 { digits = digits + 1; }
    }
    println!(digits);   // 3
    return 0;
}

Transformation

Every method here returns a brand-new string.

| Method | Signature | Description |
| --- | --- | --- |
| replace | pub fn replace(self: string, find: string, repl: string) -> string | Replace every non-overlapping occurrence of find with repl. Empty find returns self unchanged. |
| trim | pub fn trim(self: string) -> string | Strip whitespace (space, tab, CR, LF, FF, VT) from both ends. |
| trim_left | pub fn trim_left(self: string) -> string | Strip leading whitespace only. |
| trim_right | pub fn trim_right(self: string) -> string | Strip trailing whitespace only. |
| to_upper | pub fn to_upper(self: string) -> string | Upper-case ASCII letters; non-ASCII bytes pass through unchanged. |
| to_lower | pub fn to_lower(self: string) -> string | Lower-case ASCII letters; non-ASCII bytes pass through unchanged. |
| repeat | pub fn repeat(self: string, n: i32) -> string | Concatenate n copies of self. Returns "" when n <= 0. |
| concat | runtime helper: self.concat(other) | Join two strings into a new string. |

replace matches left-to-right and is non-overlapping, so "aaaa".replace("aa", "b") yields "bb". The whitespace set recognised by trim* is space (0x20), tab (0x09), LF (0x0A), CR (0x0D), FF (0x0C), and VT (0x0B).

fn main() -> i32 {
    println!("hello world".replace("world", "there"));  // "hello there"
    println!("aaaa".replace("aa", "b"));                // "bb"

    println!("  hello  ".trim());        // "hello"
    println!("   hello".trim_left());    // "hello"
    println!("hello   ".trim_right());   // "hello"

    println!("Hello, World!".to_upper());  // "HELLO, WORLD!"
    println!("Hello, World!".to_lower());  // "hello, world!"

    println!("ab".repeat(3));   // "ababab"
    println!("-".repeat(10));   // "----------"

    println!("foo".concat("bar"));  // "foobar"
    return 0;
}
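
Since each of these methods returns a new string, calls chain naturally. The sketch below builds two small utilities from them; pad_left and slug are hypothetical names, and pad_left assumes fill is a single byte.

```glide
// Hypothetical helper: left-pad s with fill up to width bytes.
// repeat returns "" when its count is <= 0, so no explicit guard is needed
// for strings that are already wide enough.
fn pad_left(s: string, width: i32, fill: string) -> string {
    return fill.repeat(width - s.len()).concat(s);
}

// Hypothetical helper: trim, lower-case, and turn spaces into hyphens.
fn slug(title: string) -> string {
    return title.trim().to_lower().replace(" ", "-");
}

fn main() -> i32 {
    println!(pad_left("7", 3, "0"));      // 007
    println!(pad_left("1234", 3, "0"));   // 1234
    println!(slug("  Hello World  "));    // hello-world
    return 0;
}
```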

Building strings

concat is the primitive for assembling text. For a join with a separator, guard the separator on the first element:

fn join(parts: *Vector<string>, sep: string) -> string {
    let mut out: string = "";
    for let i: i32 = 0; i < parts.len(); i++ {
        if i > 0 { out = out.concat(sep); }
        out = out.concat(parts.get(i));
    }
    return out;
}

fn main() -> i32 {
    let words: *Vector<string> = "a,b,c".split(",");
    println!(join(words, "-"));   // "a-b-c"
    return 0;
}

Parsing

| Method | Signature | Description |
| --- | --- | --- |
| parse_int | pub fn parse_int(self: string) -> i32 | Decimal integer parse with optional +/- sign. Returns 0 on any failure. |
| try_parse_int | pub fn try_parse_int(self: string) -> !i32 | Like parse_int but reports failure as err(...). |
| try_parse_float | pub fn try_parse_float(self: string) -> !f64 | Parse a decimal float (sign, integer/fraction parts, optional e exponent). |
| try_parse_bool | pub fn try_parse_bool(self: string) -> !bool | Parse exactly "true" or "false" (case-sensitive). |

parse_int vs try_parse_int

parse_int returns 0 for empty input, a lone sign, or any non-digit byte — which is ambiguous with a legitimate "0". Use try_parse_int when you must distinguish "invalid input" from "the value really was 0".

"42".parse_int();      // 42
"-7".parse_int();      // -7
"abc".parse_int();     // 0   (and so does "0" — be careful)

try_parse_int surfaces the failure reason — "empty string", "no digits" (a lone sign), or "non-digit byte" — through the !i32 result. Read it via .ok / .val / .err, propagate with postfix ?, or supply a fallback with ??:

fn main() -> i32 {
    let r: !i32 = "42".try_parse_int();
    if r.ok { println!(r.val); }   // 42

    let bad: !i32 = "abc".try_parse_int();
    if !bad.ok { println!(bad.err); }   // "non-digit byte"
    return 0;
}

You can also match the result to handle both arms at once:

fn classify(s: string) -> string {
    let r: !i32 = s.try_parse_int();
    match r {
        ok(v) => format!("ok: {}", v),
        err(e) => format!("bad: {}", e),
    }
}

fn main() -> i32 {
    println!(classify("42"));    // ok: 42
    println!(classify(""));      // bad: empty string
    println!(classify("-"));     // bad: no digits
    println!(classify("4x"));    // bad: non-digit byte
    return 0;
}

In a function (or main) returning !T, postfix ? propagates the error and ?? provides a default:

fn main() -> !i32 {
    let n: i32 = "100".try_parse_int()?;
    let m: i32 = "23".try_parse_int()?;
    println!(n + m);   // 123

    let port: i32 = "8080".try_parse_int() ?? 80;
    println!(port);    // 8080
    return ok(0);
}
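
The difference between the two parsers matters most when input is untrusted. As a sketch, the hypothetical helper sum_fields below adds up the valid integers in a comma-separated line and skips anything try_parse_int rejects. With parse_int the bad fields would silently count as 0, which happens to give the same sum but hides the fact that input was dropped.

```glide
// Hypothetical helper: sum the fields that parse as integers, ignoring the rest.
fn sum_fields(csv: string) -> i32 {
    let parts: *Vector<string> = csv.split(",");
    let mut total: i32 = 0;
    for let i: i32 = 0; i < parts.len(); i++ {
        // trim first: the fields may carry surrounding spaces
        let r: !i32 = parts.get(i).trim().try_parse_int();
        if r.ok { total = total + r.val; }
    }
    return total;
}

fn main() -> i32 {
    println!(sum_fields("1, 2, x, 4"));   // 7  ("x" is skipped)
    println!(sum_fields("10,-3"));        // 7
    return 0;
}
```

To report the rejected fields rather than skip them, the same loop could print r.err whenever r.ok is false.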

try_parse_float

Accepts an optional sign, an integer part, an optional fractional part (.NNN), and an optional exponent (e[+-]NNN / E[+-]NNN). Either the integer or the fractional part may be empty, but at least one digit must be present overall. Failure modes: "empty string", "no digits", "malformed exponent", and "non-digit byte".

"3.14".try_parse_float();    // ok(3.14)
".5".try_parse_float();      // ok(0.5)
"1e3".try_parse_float();     // ok(1000.0)
"1e".try_parse_float();      // err("malformed exponent")
"abc".try_parse_float();     // err("non-digit byte")

try_parse_bool

Strictly "true" or "false". Anything else — including "True", "1", or "yes" — is rejected with err("not a bool").

"true".try_parse_bool();    // ok(true)
"false".try_parse_bool();   // ok(false)
"yes".try_parse_bool();     // err("not a bool")

A program exercising try_parse_float and try_parse_bool with .ok/.val/.err:

fn main() -> i32 {
    let f: !f64 = "3.14".try_parse_float();
    if f.ok { println!(f.val); }     // 3.14

    let g: !f64 = "1e3".try_parse_float();
    if g.ok { println!(g.val); }     // 1000

    let bad: !f64 = "1e".try_parse_float();
    if !bad.ok { println!(bad.err); } // malformed exponent

    let b: !bool = "true".try_parse_bool();
    if b.ok { println!(b.val); }     // true

    let nb: !bool = "yes".try_parse_bool();
    if !nb.ok { println!(nb.err); }  // not a bool
    return 0;
}
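
Because try_parse_bool is strict, programs that accept looser input usually normalise it first. The sketch below is one possible wrapper, not part of impl string: parse_flag is a hypothetical name, and treating "1" and "yes" as true and everything unrecognised as false is an assumption made for the example.

```glide
// Hypothetical helper: lenient boolean parsing layered on try_parse_bool.
fn parse_flag(s: string) -> bool {
    let t: string = s.trim().to_lower();   // accept " True ", "TRUE", ...
    if t.eq("1") { return true; }
    if t.eq("yes") { return true; }
    let r: !bool = t.try_parse_bool();     // strict "true" / "false"
    if r.ok { return r.val; }
    return false;                          // unrecognised input counts as false
}

fn main() -> i32 {
    println!(parse_flag(" True "));   // true
    println!(parse_flag("yes"));      // true
    println!(parse_flag("off"));      // false
    return 0;
}
```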

Byte (UTF-8) semantics

len, at, substring, index_of, and cmp all work on raw bytes, not Unicode code points. ASCII text behaves intuitively, but multi-byte UTF-8 sequences count as more than one byte, and slicing/indexing in the middle of a sequence splits it.

fn main() -> i32 {
    let s: string = "café";          // 'é' is 2 bytes in UTF-8
    println!(s.len());                // 5 bytes, not 4 code points
    println!(s.substring(0, 3));      // "caf"
    println!("naive".to_upper());     // "NAIVE" (ASCII only)
    return 0;
}
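
When a code-point count is needed, it can be derived from the bytes: in valid UTF-8 every code point has exactly one byte that is not a continuation byte (continuation bytes are 0x80 to 0xBF, that is 128 to 191). The sketch below is a hypothetical helper built on at and to_int. It assumes the input is valid UTF-8, and because this page does not say whether to_int() yields a signed or unsigned value for bytes above 127, it normalises negative values first.

```glide
// Hypothetical helper: count code points by skipping UTF-8 continuation bytes.
fn count_code_points(s: string) -> i32 {
    let mut n: i32 = 0;
    for let i: i32 = 0; i < s.len(); i++ {
        let mut b: i32 = s.at(i).to_int();
        if b < 0 { b = b + 256; }      // assumption: char may be signed
        if b < 128 { n = n + 1; }      // ASCII byte
        if b > 191 { n = n + 1; }      // lead byte of a multi-byte sequence
    }
    return n;
}

fn main() -> i32 {
    let s: string = "café";
    println!(s.len());                 // 5 bytes
    println!(count_code_points(s));    // 4 code points
    return 0;
}
```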

Edge cases at a glance

fn main() -> i32 {
    println!("a,,b".split(",").len());   // 3 (middle piece is "")
    println!("x".repeat(0));             // "" (n <= 0)
    println!("abc".replace("z", "?"));   // "abc" (no match)
    println!("abc".replace("", "?"));    // "abc" (empty find is a no-op)
    println!("abc".index_of(""));        // 0 (empty needle)
    println!("aaaa".replace("aa", "b")); // "bb" (non-overlapping)
    return 0;
}

See also

  • Vector: the *Vector<string> returned by split.
  • Formatting: format!/println! for assembling output from mixed values.