Chapter 24

Encoding, crypto & compression

Glide's standard library ships the byte-level primitives most wire protocols need: Base64 and hex text codecs, a growable ByteBuffer, SHA-256/SHA-1 and HMAC-SHA-256 hashing, gzip/zlib inflate, and an in-memory tar extractor. All of these treat a Glide string as an opaque byte sequence (the same convention across the whole module group).

Import

import stdlib::base64::*;
import stdlib::hex::*;
import stdlib::bytes::*;
import stdlib::crypto::*;
import stdlib::compress::*;
import stdlib::tar::*;

Each section below corresponds to one module; import only the ones you use.

Public surface at a glance

| Module | Items |
| --- | --- |
| `base64` | `b64_encode`, `b64_decode` |
| `hex` | `hex_encode`, `hex_decode` |
| `bytes` | struct `ByteBuffer` + `new`/`free`/`len`/`cap`/`push`/`push_str`/`at`/`clear`/`as_string` |
| `crypto` | `sha256`, `sha256_hex`, `sha1`, `sha1_hex`, `hmac_sha256`, `hmac_sha256_hex` |
| `compress` | `gzip_decode` |
| `tar` | `tar_extract` |

base64

Standard Base64 (RFC 4648, with = padding). Encoding always returns a string; decoding returns a !string because it can reject malformed input.

| Function | Signature | Description |
| --- | --- | --- |
| `b64_encode` | `pub fn b64_encode(s: string) -> string` | Encode bytes to Base64. Output length is always a multiple of four. |
| `b64_decode` | `pub fn b64_decode(s: string) -> !string` | Decode Base64. err on a bad alphabet char, bad padding, or invalid length. |

b64_encode maps each group of three input bytes to four output characters, padding the final group with = so the output is always a multiple of four. b64_decode is its exact inverse; it validates the padding and alphabet and rejects any stray character — whitespace included. The alphabet is the standard one (A-Z a-z 0-9 + /); there is no URL-safe (- / _) variant.
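
The three-bytes-to-four-characters grouping described above can be sketched in a few lines. This is a Python illustration of RFC 4648 encoding, not Glide's implementation; `b64_encode_sketch` is a hypothetical name, and the result is checked against Python's own `base64` module.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def b64_encode_sketch(data: bytes) -> str:
    """Map each 3-byte group to 4 characters, padding the last group with '='."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Pack up to three bytes into a 24-bit integer (missing bytes count as zero).
        n = int.from_bytes(chunk.ljust(3, b"\0"), "big")
        # Split the 24 bits into four 6-bit alphabet indexes.
        chars = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # A chunk of k bytes carries k + 1 data characters; the rest is padding.
        keep = len(chunk) + 1
        out.append("".join(chars[:keep]) + "=" * (4 - keep))
    return "".join(out)

# Cross-check against the standard library for the inputs used in this chapter.
for sample in (b"", b"Hello", b"foobar"):
    assert b64_encode_sketch(sample) == base64.b64encode(sample).decode("ascii")

print(b64_encode_sketch(b"Hello"))   # SGVsbG8=
print(b64_encode_sketch(b"foobar"))  # Zm9vYmFy
```

"Hello" is five bytes, so its second group holds only two bytes and ends in a single `=`; "foobar" is an exact multiple of three and needs no padding.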

Decode errors surface through the !string result:

| Condition | err message |
| --- | --- |
| Char outside the Base64 alphabet | "invalid base64 character" |
| Non-= byte after padding started | "invalid base64 padding" |
| Non-pad length % 4 == 1 | "invalid base64 length" |

import stdlib::base64::*;

fn main() -> i32 {
    let enc: string = b64_encode("Hello");   // "SGVsbG8="
    println!(enc);
    println!(b64_encode("foobar"));          // "Zm9vYmFy"
    println!(b64_encode(""));                // "" (empty in, empty out)

    let r: !string = b64_decode(enc);
    if r.ok {
        println!(r.val);                     // "Hello"
    }

    // Errors: one per failure mode.
    let e1: !string = b64_decode("***");     // out-of-alphabet byte
    if !e1.ok { println!(e1.err); }          // "invalid base64 character"

    let e2: !string = b64_decode("AB=C");    // data after a pad byte
    if !e2.ok { println!(e2.err); }          // "invalid base64 padding"

    let e3: !string = b64_decode("A");       // length % 4 == 1
    if !e3.ok { println!(e3.err); }          // "invalid base64 length"

    // Default-on-error with `??`.
    let dec: string = b64_decode("not!valid") ?? "fallback";
    println!(dec);                           // "fallback"

    return 0;
}

hex

Lowercase hexadecimal codec — two hex digits per input byte. Like base64, encoding is infallible and decoding returns !string.

| Function | Signature | Description |
| --- | --- | --- |
| `hex_encode` | `pub fn hex_encode(s: string) -> string` | Lowercase hex of every byte (two digits each). |
| `hex_decode` | `pub fn hex_decode(s: string) -> !string` | Decode hex. Accepts upper- and lower-case digits. |

hex_encode always emits lowercase (a-f). hex_decode accepts both cases ("4a4b4c" and "4A4B4C" decode to the same bytes, "JKL") and returns an err on:

| Condition | err message |
| --- | --- |
| Odd-length input | "odd length" |
| Any non-hex byte | "non-hex char" |

import stdlib::hex::*;

fn main() -> !i32 {
    let enc: string = hex_encode("ABC");     // "414243"
    println!(enc);
    println!(hex_encode(""));                // "" (empty)

    let dec: string = hex_decode(enc)?;      // propagate on error
    println!(dec);                           // "ABC"

    // Case-insensitive decode: both spellings yield "JKL".
    let up: string = hex_decode("4A4B4C")?;
    let lo: string = hex_decode("4a4b4c")?;
    if up.eq(lo) { println!("case-insensitive ok"); }

    let e1: !string = hex_decode("abc");     // odd number of digits
    if !e1.ok { println!(e1.err); }          // "odd length"

    let e2: !string = hex_decode("xy");      // not hex digits
    if !e2.ok { println!(e2.err); }          // "non-hex char"

    return ok(0);
}
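
The decode rules above (two digits per byte, either case accepted, two error conditions) can be sketched as follows. This is a Python illustration, not Glide's source; `hex_decode_sketch` is a hypothetical name, and checking the length before the digits is an assumption, since the order only matters for inputs that are both odd-length and non-hex.

```python
HEX_DIGITS = "0123456789abcdef"

def hex_decode_sketch(text: str) -> bytes:
    """Decode a hex string, mirroring the two documented error messages."""
    if len(text) % 2 != 0:
        raise ValueError("odd length")
    out = bytearray()
    for i in range(0, len(text), 2):
        # Lowercasing first makes 'A'-'F' and 'a'-'f' equivalent.
        hi = HEX_DIGITS.find(text[i].lower())
        lo = HEX_DIGITS.find(text[i + 1].lower())
        if hi < 0 or lo < 0:
            raise ValueError("non-hex char")
        out.append(hi * 16 + lo)
    return bytes(out)

print(hex_decode_sketch("414243"))   # b'ABC'
print(hex_decode_sketch("4A4b4C"))   # b'JKL'
```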

bytes — ByteBuffer

A mutable, growable byte buffer backed by malloc'd memory. Use it to build binary payloads (wire formats, outgoing HTTP bodies) without the per-concat allocation that string operators trigger. Capacity doubles on growth, so appends are amortised O(1).

Struct

pub struct ByteBuffer {
    pub data: *u8,   // raw byte storage (hand to C-side write syscalls)
    pub len: i32,    // number of valid bytes; always <= cap
    pub cap: i32,    // allocated capacity; always >= len
}

data is public on purpose: binary protocols (HTTP/2, TLS, custom wire formats) can pass the pointer plus len straight to a C write.

Methods

| Method | Signature | Description |
| --- | --- | --- |
| `new` | `pub fn new() -> *ByteBuffer` | Allocate an empty buffer with 16-byte initial capacity. |
| `free` | `pub fn free(self: *ByteBuffer)` | Release the data buffer and the struct. Pointer dangles afterwards. |
| `len` | `pub fn len(self: *ByteBuffer) -> i32` | Bytes currently in the buffer. |
| `cap` | `pub fn cap(self: *ByteBuffer) -> i32` | Allocated capacity (>= len()). |
| `push` | `pub fn push(self: *ByteBuffer, b: u8)` | Append one byte; reallocates when full. |
| `push_str` | `pub fn push_str(self: *ByteBuffer, s: string)` | Append every byte of s (no NUL appended). |
| `at` | `pub fn at(self: *ByteBuffer, i: i32) -> u8` | Read the byte at index i. No bounds check. |
| `clear` | `pub fn clear(self: *ByteBuffer)` | Reset len to 0; capacity preserved for reuse. |
| `as_string` | `pub fn as_string(self: *ByteBuffer) -> string` | Borrow as a NUL-terminated string aliasing data. |

Building and inspecting a buffer

import stdlib::bytes::*;

fn main() -> i32 {
    let b: *ByteBuffer = ByteBuffer::new();
    defer b.free();

    println!(b.cap());            // 16 (initial capacity)

    b.push_str("hello, ");
    b.push_str("world");
    b.push(33 as u8);             // '!'

    println!(b.len());            // 13

    let first: u8 = b.at(0);
    println!(first as i32);       // 104 = 'h'

    println!(b.as_string());      // "hello, world!"

    // Push past the initial capacity to force a reallocation.
    let mut i: i32 = 0;
    while i < 50 { b.push(65 as u8); i = i + 1; }
    println!(b.len());            // 63
    println!(b.cap() >= b.len()); // true

    b.clear();
    println!(b.len());            // 0 (capacity is preserved)

    return 0;
}
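
To see why doubling gives amortised O(1) appends, the growth policy can be simulated. This Python sketch assumes the buffer starts at 16 bytes and doubles exactly when a push finds it full; the real reallocation trigger in ByteBuffer may differ in detail, and `simulate_pushes` is a hypothetical helper.

```python
def simulate_pushes(n_bytes: int, initial_cap: int = 16):
    """Return (final_capacity, reallocations, bytes_copied) for n single-byte pushes."""
    cap, length = initial_cap, 0
    reallocs = copied = 0
    for _ in range(n_bytes):
        if length == cap:        # buffer full: double before appending
            copied += length     # worst case, a realloc moves every live byte
            cap *= 2
            reallocs += 1
        length += 1
    return cap, reallocs, copied

print(simulate_pushes(13))    # (16, 0, 0)
print(simulate_pushes(63))    # (64, 2, 48)

# Total copy work stays below 2 bytes per pushed byte, however large n gets.
n = 1_000_000
_, _, copied = simulate_pushes(n)
assert copied < 2 * n
```

Under these assumptions the 63-byte buffer built in the example above ends with a capacity of 64 after two reallocations.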

Handing the raw pointer to a syscall

data + len are exactly what a C write wants — build the payload once, then pass the pointer.

import stdlib::bytes::*;

fn main() -> i32 {
    let req: *ByteBuffer = ByteBuffer::new();
    defer req.free();

    req.push_str("GET / HTTP/1.1\r\n");
    req.push_str("Host: example.com\r\n");
    req.push_str("\r\n");

    let p: *u8 = req.data;        // hand `p` + `req.len` to a write()
    let n: i32 = req.len;
    println!(n);

    return 0;
}
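
The point of passing data together with len is that only the first len bytes are valid; the rest of the allocation, up to cap, is unused. A Python analogue of that write, using a pipe in place of a socket (the 64-byte `bytearray` stands in for the allocation and is purely illustrative):

```python
import os

buf = bytearray(64)             # stands in for the allocation (cap = 64)
payload = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
buf[:len(payload)] = payload    # only the first `length` bytes are valid
length = len(payload)

read_fd, write_fd = os.pipe()
# Write exactly `length` bytes, not the whole allocation.
written = os.write(write_fd, memoryview(buf)[:length])
os.close(write_fd)
received = os.read(read_fd, 1024)
os.close(read_fd)

print(written)                  # 37
assert received == payload
```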

crypto — hashes & HMAC

SHA-256 (RFC 6234), SHA-1 (RFC 3174) and HMAC-SHA-256 (RFC 2104). The implementation is hand-written C inside a c_raw! block (no OpenSSL dependency). Each algorithm comes in two flavours: a raw variant returning the digest as raw bytes packed in a string, and a `_hex` variant returning lowercase hex.

| Function | Signature | Output |
| --- | --- | --- |
| `sha256` | `pub fn sha256(data: string) -> string` | 32 raw bytes |
| `sha256_hex` | `pub fn sha256_hex(data: string) -> string` | 64 lowercase hex chars |
| `sha1` | `pub fn sha1(data: string) -> string` | 20 raw bytes |
| `sha1_hex` | `pub fn sha1_hex(data: string) -> string` | 40 lowercase hex chars |
| `hmac_sha256` | `pub fn hmac_sha256(key: string, msg: string) -> string` | 32 raw bytes |
| `hmac_sha256_hex` | `pub fn hmac_sha256_hex(key: string, msg: string) -> string` | 64 lowercase hex chars |

The _hex and raw flavours hash the same bytes, so hex_encode(sha256(x)) equals sha256_hex(x) (and likewise for SHA-1 / HMAC).

import stdlib::crypto::*;
import stdlib::hex::*;

fn main() -> i32 {
    // SHA-256
    println!(sha256_hex(""));
    // e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
    println!(sha256_hex("abc"));
    // ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad

    // Raw digest -> hex equals the _hex variant.
    let raw: string = sha256("hello");
    println!(hex_encode(raw));
    // 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824

    // SHA-1
    println!(sha1_hex(""));      // da39a3ee5e6b4b0d3255bfef95601890afd80709
    println!(sha1_hex("abc"));   // a9993e364706816aba3e25717850c26c9cd0d89d
    println!(hex_encode(sha1("abc")));

    // HMAC-SHA-256
    let mac: string = hmac_sha256_hex(
        "key", "The quick brown fox jumps over the lazy dog");
    println!(mac);
    // f7bc83f430538424b13298e6aa6fb143ef4d59a14946175997479dbc2d1a3cd8

    let macraw: string = hmac_sha256("secret", "payload");
    println!(hex_encode(macraw));
    // b82fcb791acec57859b989b430a826488ce2e479fdf92326bd0a2e8375a42ba4

    return 0;
}
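
For readers who want to see what HMAC-SHA-256 computes, the RFC 2104 construction is short enough to write out. This is a Python sketch built on `hashlib`, not the C code inside Glide's crypto module; `hmac_sha256_sketch` is a hypothetical name, and the result is checked against Python's `hmac` module.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 processes input in 64-byte blocks

def hmac_sha256_sketch(key: bytes, msg: bytes) -> bytes:
    """HMAC per RFC 2104: H((K ^ opad) || H((K ^ ipad) || msg))."""
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()   # over-long keys are hashed first
    key = key.ljust(BLOCK_SIZE, b"\0")       # then zero-padded to one block
    inner_pad = bytes(b ^ 0x36 for b in key)
    outer_pad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(inner_pad + msg).digest()
    return hashlib.sha256(outer_pad + inner).digest()

fox = b"The quick brown fox jumps over the lazy dog"
mac = hmac_sha256_sketch(b"key", fox)
assert mac == hmac.new(b"key", fox, hashlib.sha256).digest()
print(mac.hex())
# f7bc83f430538424b13298e6aa6fb143ef4d59a14946175997479dbc2d1a3cd8
```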

Known reference digests:

| Call | Result |
| --- | --- |
| `sha256_hex("")` | e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 |
| `sha256_hex("abc")` | ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad |
| `sha1_hex("")` | da39a3ee5e6b4b0d3255bfef95601890afd80709 |
| `sha1_hex("abc")` | a9993e364706816aba3e25717850c26c9cd0d89d |
| `hmac_sha256_hex("key", "The quick brown fox jumps over the lazy dog")` | f7bc83f430538424b13298e6aa6fb143ef4d59a14946175997479dbc2d1a3cd8 |
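
These vectors are easy to reproduce with an independent implementation, which is a handy sanity check when porting or debugging. A Python cross-check using `hashlib` and `hmac`; the dictionary keys are just labels that echo the Glide calls:

```python
import hashlib
import hmac

fox = b"The quick brown fox jumps over the lazy dog"
reference = {
    'sha256_hex("")': hashlib.sha256(b"").hexdigest(),
    'sha256_hex("abc")': hashlib.sha256(b"abc").hexdigest(),
    'sha1_hex("")': hashlib.sha1(b"").hexdigest(),
    'sha1_hex("abc")': hashlib.sha1(b"abc").hexdigest(),
    'hmac_sha256_hex("key", fox)': hmac.new(b"key", fox, hashlib.sha256).hexdigest(),
}
for call, digest in reference.items():
    print(call, digest)

# Raw and hex flavours describe the same bytes: 32 raw bytes <-> 64 hex chars.
raw = hashlib.sha256(b"abc").digest()
assert len(raw) == 32 and raw.hex() == reference['sha256_hex("abc")']
assert len(hashlib.sha1(b"abc").digest()) == 20
```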

Worked example: the RFC 6455 WebSocket handshake

One place where SHA-1 is still mandated: during the opening handshake, a server computes Sec-WebSocket-Accept as base64(sha1(client_key + magic_guid)), where sha1 is the raw 20-byte digest rather than its hex form.

import stdlib::crypto::*;
import stdlib::base64::*;

const WS_GUID: string = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

fn ws_accept(client_key: string) -> string {
    let combined: string = client_key.concat(WS_GUID);
    let digest: string = sha1(combined);     // 20 raw bytes
    return b64_encode(digest);
}

fn main() -> i32 {
    // RFC 6455 example vector.
    println!(ws_accept("dGhlIHNhbXBsZSBub25jZQ=="));
    // s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
    return 0;
}
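
A common mistake with this handshake is Base64-encoding the hex digest instead of the raw one. The Python sketch below reproduces the RFC 6455 vector and shows what the wrong variant produces; `ws_accept_wrong` is a deliberately incorrect, hypothetical helper.

```python
import base64
import hashlib

WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def ws_accept(client_key: str) -> str:
    """Correct: Base64 of the raw 20-byte SHA-1 digest."""
    digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

def ws_accept_wrong(client_key: str) -> str:
    """Incorrect: Base64 of the 40-character hex digest."""
    hex_digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).hexdigest()
    return base64.b64encode(hex_digest.encode("ascii")).decode("ascii")

key = "dGhlIHNhbXBsZSBub25jZQ=="
print(ws_accept(key))                  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
print(len(ws_accept(key)))             # 28
print(len(ws_accept_wrong(key)))       # 56
```

A 28-character header value is the quick tell that the raw digest was used; in Glide terms, that means calling sha1 rather than sha1_hex before b64_encode.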

compress — gzip / zlib inflate

A single decode function backed by zlib (-lz). It auto-detects a gzip or zlib wrapper (inflateInit2(&zs, 15 + 32)). There is no encode counterpart in this module — it exists to inflate fetched tarballs.

| Function | Signature | Description |
| --- | --- | --- |
| `gzip_decode` | `pub fn gzip_decode(input: string, in_len: i32, out_len: *i32) -> *void` | Inflate a gzip/zlib buffer. Returns a fresh malloc'd buffer; the decoded byte count is written through out_len. |

The returned pointer is null on any decode error (and out_len is set to 0). On success the caller owns the buffer: either free it, or wrap it with the internal __glide_string_from_buf(buf, out_len) (which copies into arena-tracked memory) and free the original.

import stdlib::compress::*;
import stdlib::fs::*;

extern fn __glide_string_from_buf(buf: *void, n: i32) -> string;

fn main() -> i32 {
    let mut raw_len: i32 = 0;
    let payload: string = fs_read_bytes("archive.gz", &raw_len);
    if raw_len <= 0 { return 1; }

    let mut out_len: i32 = 0;
    let buf: *void = gzip_decode(payload, raw_len, &out_len);
    if buf == null { return 1; }             // null on any decode error

    let decoded: string = __glide_string_from_buf(buf, out_len);
    free(buf);
    println!(out_len);

    return 0;
}
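
The `15 + 32` window-bits value is zlib's own auto-detection switch, so its behaviour can be observed from any zlib binding. A Python sketch using the standard `zlib` module, where `wbits=47` corresponds to the same setting; `inflate` is a hypothetical wrapper that returns `None` on failure, loosely mirroring the null return described above.

```python
import gzip
import zlib

AUTO_DETECT = zlib.MAX_WBITS + 32   # 15 + 32: accept a gzip or a zlib header

def inflate(data: bytes):
    """Return the decompressed bytes, or None on any decode error."""
    try:
        return zlib.decompress(data, wbits=AUTO_DETECT)
    except zlib.error:
        return None

original = b"hello, inflate " * 10
print(inflate(gzip.compress(original)) == original)   # True  (gzip wrapper)
print(inflate(zlib.compress(original)) == original)   # True  (zlib wrapper)
print(inflate(b"not compressed"))                     # None  (bad header)
```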

tar — in-memory extraction

A USTAR / GNU tar reader. It takes an archive held in memory as a Glide string (binary-safe when produced by __glide_string_from_buf) and writes each entry to out_dir.

| Function | Signature | Description |
| --- | --- | --- |
| `tar_extract` | `pub fn tar_extract(data: string, n: i32, out_dir: string, strip_components: i32) -> i32` | Extract n bytes of tar data into out_dir. Returns the count of regular files written, or -1 if a write fails. |

strip_components matches GNU tar's --strip-components: it discards that many leading /-separated (or \-separated) path segments from each entry name before writing. GitHub codeload tarballs wrap everything under a top-level <repo>-<sha>/ directory, so callers fetching from there pass strip_components = 1. An entry that has fewer segments than strip_components is silently skipped.
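
The path rewriting can be sketched as a small pure function. This Python version follows the description above; two details are assumptions rather than documented behaviour: an entry with exactly strip_components segments (nothing left after stripping) is also skipped, and the remaining segments are rejoined with `/`. The function name is hypothetical.

```python
import re
from typing import Optional

def strip_components(name: str, count: int) -> Optional[str]:
    """Drop `count` leading path segments; None means 'skip this entry'."""
    # Split on either separator and ignore empty segments (e.g. a trailing '/').
    segments = [s for s in re.split(r"[/\\]", name) if s]
    if len(segments) <= count:       # too few segments, or nothing left (assumed)
        return None
    return "/".join(segments[count:])

print(strip_components("repo-abc123/src/main.txt", 1))      # src/main.txt
print(strip_components("repo-abc123/README.md", 1))         # README.md
print(strip_components("README.md", 2))                     # None
print(strip_components("repo-abc123\\src\\main.txt", 1))    # src/main.txt
```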

Recognised entry types:

| Type byte | Meaning | Action |
| --- | --- | --- |
| `'0'` / `0x00` | Regular file | Written to disk (parent dirs created) |
| `'5'` | Directory | mkdir -p |
| `'L'` | GNU long name | Buffered for the next header |
| `'x'` / `'g'` | pax extended header | Skipped |
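
The type byte lives at a fixed offset in each 512-byte tar header, next to the name and an octal size field. The Python sketch below walks the headers of a small USTAR archive built in memory with `tarfile`; it is a simplified reader for illustration (no USTAR prefix field, no GNU long names, no base-256 sizes) and says nothing about how tar_extract is written. `list_entries` is a hypothetical name.

```python
import io
import tarfile

BLOCK = 512

def list_entries(archive: bytes):
    """Return (name, type_byte, size) for each header in a tar byte string."""
    entries, offset = [], 0
    while offset + BLOCK <= len(archive):
        header = archive[offset:offset + BLOCK]
        if header == bytes(BLOCK):            # an all-zero block marks the end
            break
        name = header[0:100].split(b"\0", 1)[0].decode("utf-8")
        size = int(header[124:136].strip(b" \0") or b"0", 8)   # octal text
        typeflag = header[156:157].decode("ascii")
        entries.append((name, typeflag, size))
        # Skip the header plus the data, rounded up to whole 512-byte blocks.
        offset += BLOCK + (size + BLOCK - 1) // BLOCK * BLOCK
    return entries

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
    top = tarfile.TarInfo("repo-abc123")
    top.type = tarfile.DIRTYPE
    tar.addfile(top)
    readme = tarfile.TarInfo("repo-abc123/README.md")
    readme.size = 6
    tar.addfile(readme, io.BytesIO(b"hello\n"))

print(list_entries(buf.getvalue()))
# [('repo-abc123/', '5', 0), ('repo-abc123/README.md', '0', 6)]
```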

The typical pipeline is fetch -> gzip_decode -> tar_extract:

import stdlib::compress::*;
import stdlib::tar::*;
import stdlib::fs::*;

extern fn __glide_string_from_buf(buf: *void, n: i32) -> string;

fn main() -> i32 {
    let mut raw_len: i32 = 0;
    let payload: string = fs_read_bytes("repo.tar.gz", &raw_len);
    if raw_len <= 0 { return 1; }

    // Inflate the gzip wrapper.
    let mut out_len: i32 = 0;
    let buf: *void = gzip_decode(payload, raw_len, &out_len);
    if buf == null { return 1; }
    let decoded: string = __glide_string_from_buf(buf, out_len);
    free(buf);

    // Extract, stripping the leading <repo>-<sha>/ directory.
    let n: i32 = tar_extract(decoded, out_len, "/tmp/extract", 1);
    if n < 0 { return 1; }                   // -1 = a file write failed
    println!(n);                             // regular files written

    // Or keep full paths with strip_components = 0.
    let m: i32 = tar_extract(decoded, out_len, "/tmp/full", 0);
    println!(m);

    return 0;
}
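
For comparison, the same inflate-then-extract flow can be exercised end to end in Python with only the standard library. This is a sketch of the pipeline's shape under stated assumptions, not a port of tar_extract: the archive contents are made up, `extract` is a hypothetical helper, and it does no path sanitising, so it is only suitable for trusted input.

```python
import io
import tarfile
import tempfile
import zlib
from pathlib import Path

def make_tarball() -> bytes:
    """Build a tiny .tar.gz in memory, laid out like a <repo>-<sha>/ download."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, body in (("repo-abc123/README.md", b"hi\n"),
                           ("repo-abc123/src/main.txt", b"code\n")):
            info = tarfile.TarInfo(name)
            info.size = len(body)
            tar.addfile(info, io.BytesIO(body))
    return buf.getvalue()

def extract(tarball: bytes, out_dir: Path, strip: int) -> int:
    """Inflate, then write regular files under out_dir; return how many were written."""
    raw = zlib.decompress(tarball, wbits=zlib.MAX_WBITS + 32)   # gzip or zlib
    written = 0
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:") as tar:
        for member in tar:
            if not member.isfile():
                continue
            parts = member.name.split("/")[strip:]
            if not parts:                     # nothing left after stripping
                continue
            target = out_dir.joinpath(*parts)
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(tar.extractfile(member).read())
            written += 1
    return written

with tempfile.TemporaryDirectory() as tmp:
    count = extract(make_tarball(), Path(tmp), strip=1)
    names = sorted(p.relative_to(tmp).as_posix()
                   for p in Path(tmp).rglob("*") if p.is_file())

print(count, names)   # 2 ['README.md', 'src/main.txt']
```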