Chapter 24

Encoding, crypto & compression

Glide's standard library ships the byte-level primitives most wire protocols need: Base64 and hex text codecs, a growable ByteBuffer, SHA-256/SHA-1 and HMAC-SHA-256 hashing, gzip/zlib inflate, and an in-memory tar extractor. All of these treat a Glide string as an opaque byte sequence (the same convention across the whole module group).

Import

import stdlib::base64::*;
import stdlib::hex::*;
import stdlib::bytes::*;
import stdlib::crypto::*;
import stdlib::compress::*;
import stdlib::tar::*;

Each section below corresponds to one module; import only the ones you use.

Public surface at a glance

Module	Items
`base64`	`b64_encode`, `b64_decode`
`hex`	`hex_encode`, `hex_decode`
`bytes`	struct `ByteBuffer` + `new`/`free`/`len`/`cap`/`push`/`push_str`/`at`/`clear`/`as_string`
`crypto`	`sha256`, `sha256_hex`, `sha1`, `sha1_hex`, `hmac_sha256`, `hmac_sha256_hex`
`compress`	`gzip_decode`
`tar`	`tar_extract`

base64

Standard Base64 (RFC 4648, with = padding). Encoding always returns a string; decoding returns a !string because it can reject malformed input.

Function	Signature	Description
`b64_encode`	`pub fn b64_encode(s: string) -> string`	Encode bytes to Base64. Output length is always a multiple of four.
`b64_decode`	`pub fn b64_decode(s: string) -> !string`	Decode Base64. `err` on a bad alphabet char, bad padding, or invalid length.

b64_encode maps each group of three input bytes to four output characters, padding the final group with = so the output length is always a multiple of four. b64_decode is its exact inverse: it validates the padding and alphabet and rejects any stray character, whitespace included. The alphabet is the standard one (A-Z, a-z, 0-9, + and /); there is no URL-safe variant (the one that substitutes - and _ for + and /).

Decode errors surface through the !string result:

Condition	`err` message
Char outside the Base64 alphabet	`"invalid base64 character"`
Non-`=` byte after padding started	`"invalid base64 padding"`
Length excluding `=` padding has `% 4 == 1`	`"invalid base64 length"`

import stdlib::base64::*;

fn main() -> i32 {
    let enc: string = b64_encode("Hello");   // "SGVsbG8="
    println!(enc);
    println!(b64_encode("foobar"));          // "Zm9vYmFy"
    println!(b64_encode(""));                // "" (empty in, empty out)

    let r: !string = b64_decode(enc);
    if r.ok {
        println!(r.val);                     // "Hello"
    }

    // Errors: one per failure mode.
    let e1: !string = b64_decode("***");     // out-of-alphabet byte
    if !e1.ok { println!(e1.err); }          // "invalid base64 character"

    let e2: !string = b64_decode("AB=C");    // data after a pad byte
    if !e2.ok { println!(e2.err); }          // "invalid base64 padding"

    let e3: !string = b64_decode("A");       // length % 4 == 1
    if !e3.ok { println!(e3.err); }          // "invalid base64 length"

    // Default-on-error with `??`.
    let dec: string = b64_decode("not!valid") ?? "fallback";
    println!(dec);                           // "fallback"

    return 0;
}
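
The padding rule is easiest to see on the RFC 4648 section 10 test vectors, where one, two and three input bytes end in two, one and zero = characters respectively. The sketch below prints those vectors and then probes the two rejection cases described above (embedded whitespace and the URL-safe alphabet). It uses only the calls shown in this section; the err text for the rejected inputs is printed rather than asserted.

```glide
import stdlib::base64::*;

fn main() -> i32 {
    // RFC 4648 section 10 test vectors.
    println!(b64_encode("f"));       // "Zg=="  (1 byte  -> 2 chars + "==")
    println!(b64_encode("fo"));      // "Zm8="  (2 bytes -> 3 chars + "=")
    println!(b64_encode("foo"));     // "Zm9v"  (3 bytes -> 4 chars, no padding)

    // A space inside otherwise valid input is rejected, not skipped.
    let ws: !string = b64_decode("Z m9");
    if !ws.ok { println!(ws.err); }

    // '-' and '_' belong to the URL-safe alphabet, which is not supported.
    let url: !string = b64_decode("-_-_");
    if !url.ok { println!(url.err); }

    return 0;
}
```

Input that may contain line breaks or URL-safe characters therefore has to be normalised by the caller before it reaches b64_decode.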

hex

Lowercase hexadecimal codec — two hex digits per input byte. Like base64, encoding is infallible and decoding returns !string.

Function	Signature	Description
`hex_encode`	`pub fn hex_encode(s: string) -> string`	Lowercase hex of every byte (two digits each).
`hex_decode`	`pub fn hex_decode(s: string) -> !string`	Decode hex. Accepts upper- and lower-case digits.

hex_encode always emits lowercase (a-f). hex_decode accepts both cases ("4a4b4c" and "4A4B4C" decode to the same bytes) and returns an error on:

Condition	`err` message
Odd-length input	`"odd length"`
Any non-hex byte	`"non-hex char"`

import stdlib::hex::*;

fn main() -> !i32 {
    let enc: string = hex_encode("ABC");     // "414243"
    println!(enc);
    println!(hex_encode(""));                // "" (empty)

    let dec: string = hex_decode(enc)?;      // propagate on error
    println!(dec);                           // "ABC"

    // Case-insensitive decode: both spellings yield "JKL".
    let up: string = hex_decode("4A4B4C")?;
    let lo: string = hex_decode("4a4b4c")?;
    if up.eq(lo) { println!("case-insensitive ok"); }

    let e1: !string = hex_decode("abc");     // odd number of digits
    if !e1.ok { println!(e1.err); }          // "odd length"

    let e2: !string = hex_decode("xy");      // not hex digits
    if !e2.ok { println!(e2.err); }          // "non-hex char"

    return ok(0);
}

bytes — `ByteBuffer`

A mutable, growable byte buffer backed by malloc'd memory. Use it to build binary payloads (wire formats, outgoing HTTP bodies) without the per-concat allocation that string operators trigger. Capacity doubles on growth, so appends are amortised O(1).

Struct

pub struct ByteBuffer {
    pub data: *u8,   // raw byte storage (hand to C-side write syscalls)
    pub len: i32,    // number of valid bytes; always <= cap
    pub cap: i32,    // allocated capacity; always >= len
}

data is public on purpose: binary protocols (HTTP/2, TLS, custom wire formats) can pass the pointer plus len straight to a C write.

Methods

Method	Signature	Description
`new`	`pub fn new() -> *ByteBuffer`	Allocate an empty buffer with 16-byte initial capacity.
`free`	`pub fn free(self: *ByteBuffer)`	Release the data buffer and the struct. Pointer dangles afterwards.
`len`	`pub fn len(self: *ByteBuffer) -> i32`	Bytes currently in the buffer.
`cap`	`pub fn cap(self: *ByteBuffer) -> i32`	Allocated capacity (`>= len()`).
`push`	`pub fn push(self: *ByteBuffer, b: u8)`	Append one byte; reallocates when full.
`push_str`	`pub fn push_str(self: *ByteBuffer, s: string)`	Append every byte of `s` (no NUL appended).
`at`	`pub fn at(self: *ByteBuffer, i: i32) -> u8`	Read the byte at index `i`. No bounds check.
`clear`	`pub fn clear(self: *ByteBuffer)`	Reset `len` to 0; capacity preserved for reuse.
`as_string`	`pub fn as_string(self: *ByteBuffer) -> string`	Borrow as a NUL-terminated string aliasing `data`.
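
Because at performs no bounds check, code that indexes with untrusted values may want a guard in front of it. The following is a minimal sketch built only from the methods in the table above. at_checked is a hypothetical helper, not part of stdlib::bytes, and returning -1 for an out-of-range index is an illustrative convention rather than a library one.

```glide
import stdlib::bytes::*;

// Hypothetical helper (not in stdlib): the byte at `i` widened to i32,
// or -1 when `i` falls outside the valid range [0, len).
fn at_checked(b: *ByteBuffer, i: i32) -> i32 {
    if i < 0 { return -1; }
    if i >= b.len() { return -1; }
    let v: u8 = b.at(i);
    return v as i32;
}

fn main() -> i32 {
    let b: *ByteBuffer = ByteBuffer::new();
    defer b.free();
    b.push_str("hi");

    println!(at_checked(b, 0));    // 104 = 'h'
    println!(at_checked(b, 2));    // -1: index == len is out of range
    return 0;
}
```

The guard compares against len(), not cap(): bytes between len and cap are allocated but hold no valid data.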

Building and inspecting a buffer

import stdlib::bytes::*;

fn main() -> i32 {
    let b: *ByteBuffer = ByteBuffer::new();
    defer b.free();

    println!(b.cap());            // 16 (initial capacity)

    b.push_str("hello, ");
    b.push_str("world");
    b.push(33 as u8);             // '!'

    println!(b.len());            // 13

    let first: u8 = b.at(0);
    println!(first as i32);       // 104 = 'h'

    println!(b.as_string());      // "hello, world!"

    // Push past the initial capacity to force a reallocation.
    let mut i: i32 = 0;
    while i < 50 { b.push(65 as u8); i = i + 1; }
    println!(b.len());            // 63
    println!(b.cap() >= b.len()); // true

    b.clear();
    println!(b.len());            // 0 (capacity is preserved)

    return 0;
}
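
Since clear keeps the allocation, one buffer can serve many messages in sequence. The sketch below reuses a single ByteBuffer across three iterations; it relies only on the documented behaviour (16-byte initial capacity, clear resetting len and preserving cap), and the message content is purely illustrative.

```glide
import stdlib::bytes::*;

fn main() -> i32 {
    let b: *ByteBuffer = ByteBuffer::new();
    defer b.free();

    let mut i: i32 = 0;
    while i < 3 {
        b.clear();                 // len back to 0, allocation kept
        b.push_str("PING\r\n");    // 6 bytes, well under the initial capacity
        println!(b.len());         // 6 on every iteration
        i = i + 1;
    }

    println!(b.cap() >= b.len());  // true
    return 0;
}
```

Any string previously obtained from as_string aliases data, so it should be treated as stale once the buffer is cleared and refilled.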

Handing the raw pointer to a syscall

data + len are exactly what a C write wants — build the payload once, then pass the pointer.

import stdlib::bytes::*;

fn main() -> i32 {
    let req: *ByteBuffer = ByteBuffer::new();
    defer req.free();

    req.push_str("GET / HTTP/1.1\r\n");
    req.push_str("Host: example.com\r\n");
    req.push_str("\r\n");

    let p: *u8 = req.data;        // hand `p` + `req.len` to a write()
    let n: i32 = req.len;
    println!(n);

    return 0;
}

crypto — hashes & HMAC

SHA-256 (RFC 6234), SHA-1 (RFC 3174) and HMAC-SHA-256 (RFC 2104). The implementation is hand-written C inside a c_raw! block (no OpenSSL dependency). Each algorithm comes in two flavours: a raw variant returning the digest as raw bytes packed in a string, and a `_hex` variant returning lowercase hex.

Function	Signature	Output
`sha256`	`pub fn sha256(data: string) -> string`	32 raw bytes
`sha256_hex`	`pub fn sha256_hex(data: string) -> string`	64 lowercase hex chars
`sha1`	`pub fn sha1(data: string) -> string`	20 raw bytes
`sha1_hex`	`pub fn sha1_hex(data: string) -> string`	40 lowercase hex chars
`hmac_sha256`	`pub fn hmac_sha256(key: string, msg: string) -> string`	32 raw bytes
`hmac_sha256_hex`	`pub fn hmac_sha256_hex(key: string, msg: string) -> string`	64 lowercase hex chars

The _hex and raw flavours hash the same bytes, so hex_encode(sha256(x)) equals sha256_hex(x) (and likewise for SHA-1 / HMAC).

import stdlib::crypto::*;
import stdlib::hex::*;

fn main() -> i32 {
    // SHA-256
    println!(sha256_hex(""));
    // e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
    println!(sha256_hex("abc"));
    // ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad

    // Raw digest -> hex equals the _hex variant.
    let raw: string = sha256("hello");
    println!(hex_encode(raw));
    // 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824

    // SHA-1
    println!(sha1_hex(""));      // da39a3ee5e6b4b0d3255bfef95601890afd80709
    println!(sha1_hex("abc"));   // a9993e364706816aba3e25717850c26c9cd0d89d
    println!(hex_encode(sha1("abc")));

    // HMAC-SHA-256
    let mac: string = hmac_sha256_hex(
        "key", "The quick brown fox jumps over the lazy dog");
    println!(mac);
    // f7bc83f430538424b13298e6aa6fb143ef4d59a14946175997479dbc2d1a3cd8

    let macraw: string = hmac_sha256("secret", "payload");
    println!(hex_encode(macraw));
    // b82fcb791acec57859b989b430a826488ce2e479fdf92326bd0a2e8375a42ba4

    return 0;
}

Known reference digests:

Call	Result
`sha256_hex("")`	`e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855`
`sha256_hex("abc")`	`ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad`
`sha1_hex("")`	`da39a3ee5e6b4b0d3255bfef95601890afd80709`
`sha1_hex("abc")`	`a9993e364706816aba3e25717850c26c9cd0d89d`
`hmac_sha256_hex("key", "The quick brown fox jumps over the lazy dog")`	`f7bc83f430538424b13298e6aa6fb143ef4d59a14946175997479dbc2d1a3cd8`
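
Because hmac_sha256 takes its key as an opaque byte string, a binary key can be supplied by hex-decoding it first. The sketch below runs RFC 4231 test case 1 (a key of twenty 0x0b bytes, message "Hi There") and then checks the raw/_hex equivalence stated earlier. It assumes, as the chapter introduction says, that strings are treated as opaque bytes across all of these calls.

```glide
import stdlib::crypto::*;
import stdlib::hex::*;

fn main() -> !i32 {
    // RFC 4231 test case 1: the key is 20 bytes of 0x0b.
    let key: string = hex_decode("0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b")?;
    println!(hmac_sha256_hex(key, "Hi There"));
    // b0344c61d8db38535ca8afceaf0bf12b881dc200c9833da726e9376c2e32cff7

    // Raw and _hex flavours agree once the raw digest is hex-encoded.
    let a: string = hex_encode(hmac_sha256(key, "Hi There"));
    let b: string = hmac_sha256_hex(key, "Hi There");
    if a.eq(b) { println!("raw and hex variants agree"); }

    return ok(0);
}
```

When a MAC received from a peer is compared this way, note that nothing in this chapter says eq runs in constant time; treat that as an open question for security-sensitive code.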

Worked example: the RFC 6455 WebSocket handshake

One place where SHA-1 is still mandated is the WebSocket opening handshake: a server computes Sec-WebSocket-Accept as base64(sha1(client_key + magic_guid)).

import stdlib::crypto::*;
import stdlib::base64::*;

const WS_GUID: string = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

fn ws_accept(client_key: string) -> string {
    let combined: string = client_key.concat(WS_GUID);
    let digest: string = sha1(combined);     // 20 raw bytes
    return b64_encode(digest);
}

fn main() -> i32 {
    // RFC 6455 example vector.
    println!(ws_accept("dGhlIHNhbXBsZSBub25jZQ=="));
    // s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
    return 0;
}

compress — gzip / zlib inflate

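A single decode function backed by zlib (linked with -lz). It auto-detects a gzip or zlib wrapper by initialising the stream with inflateInit2(&zs, 15 + 32): 15 selects the maximum window size, and adding 32 enables zlib's automatic header detection. There is no encode counterpart in this module; it exists to inflate fetched tarballs.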

Function	Signature	Description
`gzip_decode`	`pub fn gzip_decode(input: string, in_len: i32, out_len: *i32) -> *void`	Inflate a gzip/zlib buffer. Returns a fresh `malloc`'d buffer; the decoded byte count is written through `out_len`.

The returned pointer is null on any decode error (and out_len is set to 0). On success the caller owns the buffer: either free it, or wrap it with the internal __glide_string_from_buf(buf, out_len) (which copies into arena-tracked memory) and free the original.

import stdlib::compress::*;
import stdlib::fs::*;

extern fn __glide_string_from_buf(buf: *void, n: i32) -> string;

fn main() -> i32 {
    let mut raw_len: i32 = 0;
    let payload: string = fs_read_bytes("archive.gz", &raw_len);
    if raw_len <= 0 { return 1; }

    let mut out_len: i32 = 0;
    let buf: *void = gzip_decode(payload, raw_len, &out_len);
    if buf == null { return 1; }             // null on any decode error

    let decoded: string = __glide_string_from_buf(buf, out_len);
    free(buf);
    println!(out_len);

    return 0;
}

tar — in-memory extraction

A USTAR / GNU tar reader. It takes an archive held in memory as a Glide string (binary-safe when produced by __glide_string_from_buf) and writes each entry to out_dir.

Function	Signature	Description
`tar_extract`	`pub fn tar_extract(data: string, n: i32, out_dir: string, strip_components: i32) -> i32`	Extract `n` bytes of tar `data` into `out_dir`. Returns the count of regular files written, or `-1` if a write fails.

strip_components matches GNU tar's --strip-components: it discards that many leading /-separated (or \-separated) path segments from each entry name before writing. GitHub codeload tarballs wrap everything under a top-level <repo>-<sha>/ directory, so callers fetching from there pass strip_components = 1. An entry that has fewer segments than strip_components is silently skipped.
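
Since over-stripped entries are skipped without error, a call can succeed and still write nothing. One way to surface that is to treat a zero count as suspicious. This is only a sketch: extract_checked is a hypothetical wrapper, not part of stdlib::tar, and whether an empty result should be fatal is left to the caller.

```glide
import stdlib::tar::*;

// Hypothetical wrapper: same arguments and return value as tar_extract,
// plus a diagnostic line when the result looks wrong.
fn extract_checked(data: string, n: i32, out_dir: string, strip: i32) -> i32 {
    let written: i32 = tar_extract(data, n, out_dir, strip);
    if written < 0 { println!("tar: a file write failed"); }
    if written == 0 { println!("tar: no regular files written; is strip_components too large?"); }
    return written;
}
```

It can stand in for a direct tar_extract call wherever the archive is expected to contain at least one regular file.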

Recognised entry types:

Type byte	Meaning	Action
`'0'` / `0x00`	Regular file	Written to disk (parent dirs created)
`'5'`	Directory	`mkdir -p`
`'L'`	GNU long name	Buffered for the next header
`'x'` / `'g'`	pax extended header	Skipped

The typical pipeline is fetch -> gzip_decode -> tar_extract:

import stdlib::compress::*;
import stdlib::tar::*;
import stdlib::fs::*;

extern fn __glide_string_from_buf(buf: *void, n: i32) -> string;

fn main() -> i32 {
    let mut raw_len: i32 = 0;
    let payload: string = fs_read_bytes("repo.tar.gz", &raw_len);
    if raw_len <= 0 { return 1; }

    // Inflate the gzip wrapper.
    let mut out_len: i32 = 0;
    let buf: *void = gzip_decode(payload, raw_len, &out_len);
    if buf == null { return 1; }
    let decoded: string = __glide_string_from_buf(buf, out_len);
    free(buf);

    // Extract, stripping the leading <repo>-<sha>/ directory.
    let n: i32 = tar_extract(decoded, out_len, "/tmp/extract", 1);
    if n < 0 { return 1; }                   // -1 = a file write failed
    println!(n);                             // regular files written

    // Or keep full paths with strip_components = 0.
    let m: i32 = tar_extract(decoded, out_len, "/tmp/full", 0);
    println!(m);

    return 0;
}
