engram listens to a live performance and generates a continuous strip of procedural glyphs, one every five seconds, formatted for a thermal receipt printer. The strip is a physical artifact of the set: a 384-pixel-wide scroll that unspools in real time, accumulating until you stop the recording.
Each glyph is 32×32 pixels, 1-bit black and white. Twelve glyphs composite into a full-width row. The renderer draws nothing symbolic — no notation, no waveform, no meter. Every pixel is computed from the spectral analysis of that five-second window.
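The row arithmetic (12 glyphs × 32 pixels = 384 pixels) can be sketched as a plain buffer copy. This is a minimal illustration, assuming row-major buffers with one byte per pixel; the names `compositeRow`, `glyphSide`, and `glyphsPerRow` are hypothetical and not taken from engram's source.

```swift
// Hypothetical names for illustration; engram's real types may differ.
let glyphSide = 32
let glyphsPerRow = 12
let rowWidth = glyphSide * glyphsPerRow   // 384 pixels

/// Copies twelve 32×32 glyphs side by side into one 384×32 row.
/// Buffers are row-major with one byte per pixel (0 = paper, 1 = ink).
func compositeRow(_ glyphs: [[UInt8]]) -> [UInt8] {
    precondition(glyphs.count == glyphsPerRow, "a row needs exactly twelve glyphs")
    var row = [UInt8](repeating: 0, count: rowWidth * glyphSide)
    for (slot, glyph) in glyphs.enumerated() {
        precondition(glyph.count == glyphSide * glyphSide, "glyph must be 32×32")
        for y in 0..<glyphSide {
            let src = y * glyphSide                    // start of scanline y in the glyph
            let dst = y * rowWidth + slot * glyphSide  // same scanline, shifted to this slot
            row.replaceSubrange(dst..<dst + glyphSide, with: glyph[src..<src + glyphSide])
        }
    }
    return row
}
```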
The Problem
Live audio performance leaves almost nothing behind. Video capture is passive and bulky. Spectrograms are readable to engineers but opaque to everyone else. The question was whether there was a middle ground: a visual record that is obviously tied to the music without being a literal transcription of it. Something you could look at and recognize the shape of a long harmonic drone versus a busy percussive section, but that also looks like its own thing — not a graph.
The thermal printer format made the constraints tight enough to be interesting. 384 pixels wide, one bit per pixel, paper scrolling out at the speed of the performance.
Key Decisions
Flow fiber instead of symbolic notation. The first version drew a baseline, a per-pitch-class symbol, onset ticks, and frequency risers — closer to a notation system than an image. It was readable but looked mechanical. The replacement is a flow fiber renderer: a vector field built from the audio features, with streamlines traced through it. The field shape comes from the chroma data, the line density from loudness, the line length from harmonic content. The result looks like fiber or smoke at 32×32 pixels and reads as an image rather than a diagram.
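A streamline tracer of this kind can be sketched in a few lines. This is an illustration under stated assumptions, not engram's renderer: the field is passed in as a closure returning a direction angle, the step size is one pixel, and the seed placement, `traceFiber`, and `renderFibers` are all hypothetical.

```swift
import Foundation

let glyphSide = 32

/// Follows the field from a start point one unit step at a time,
/// inking every pixel it visits, until it runs out of steps or leaves the glyph.
func traceFiber(from startX: Double, _ startY: Double,
                steps: Int,
                angleAt: (Double, Double) -> Double,
                into pixels: inout [UInt8]) {
    var x = startX, y = startY
    for _ in 0..<steps {
        let px = Int(floor(x)), py = Int(floor(y))
        guard px >= 0, px < glyphSide, py >= 0, py < glyphSide else { return }
        pixels[py * glyphSide + px] = 1
        let theta = angleAt(x, y)   // local flow direction in radians
        x += cos(theta)
        y += sin(theta)
    }
}

/// Renders one glyph: `fiberCount` controls line density, `fiberLength` line length.
func renderFibers(fiberCount: Int, fiberLength: Int,
                  angleAt: (Double, Double) -> Double) -> [UInt8] {
    var pixels = [UInt8](repeating: 0, count: glyphSide * glyphSide)
    for i in 0..<fiberCount {
        // Assumed seed placement: a deterministic low-discrepancy scatter.
        let sx = (Double(i) * 0.618033988749895).truncatingRemainder(dividingBy: 1) * Double(glyphSide)
        let sy = (Double(i) * 0.754877666246693).truncatingRemainder(dividingBy: 1) * Double(glyphSide)
        traceFiber(from: sx, sy, steps: fiberLength, angleAt: angleAt, into: &pixels)
    }
    return pixels
}
```

In terms of the mapping described above, loudness would drive `fiberCount`, harmonic content would drive `fiberLength`, and the chroma data would supply `angleAt`.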
Chroma as the field. The twelve pitch classes each contribute a sinusoidal wave in their angular direction (C at 0°, C# at 30°, and so on). The full chroma vector — which pitch classes are active and how strongly — determines how those twelve waves interfere. A sustained C major chord produces a characteristic radial interference pattern. A different chord rotates and reshapes it. Spread chroma (as in jazz harmony or noise) produces a more complex and turbulent field than a clean tonal center. The visual fingerprint of the key is stable enough to be legible across multiple glyphs but sensitive enough that chord changes register.
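One way to read this is as a sum of twelve plane waves, each travelling along its pitch class's direction and weighted by its chroma value. The sketch below is an assumption about the shape of the computation, not engram's formula: the `wavelength` parameter, the finite-difference gradient, and the choice to follow contour lines (the gradient rotated by 90°) are all illustrative.

```swift
import Foundation

/// Sum of twelve plane waves. Pitch class k travels along k × 30°
/// (C = 0°, C# = 30°, ...) and is weighted by chroma[k].
/// `wavelength` (in pixels) is an assumed tuning parameter.
func chromaPotential(x: Double, y: Double, chroma: [Double], wavelength: Double = 16) -> Double {
    precondition(chroma.count == 12, "expected one weight per pitch class")
    var sum = 0.0
    for k in 0..<12 {
        let theta = Double(k) * Double.pi / 6
        let phase = 2 * Double.pi * (x * cos(theta) + y * sin(theta)) / wavelength
        sum += chroma[k] * sin(phase)
    }
    return sum
}

/// Flow direction at (x, y): the central-difference gradient of the potential,
/// rotated 90° so that streamlines follow the interference pattern's contours.
func flowAngle(x: Double, y: Double, chroma: [Double]) -> Double {
    let h = 0.5
    let dx = chromaPotential(x: x + h, y: y, chroma: chroma)
           - chromaPotential(x: x - h, y: y, chroma: chroma)
    let dy = chromaPotential(x: x, y: y + h, chroma: chroma)
           - chromaPotential(x: x, y: y - h, chroma: chroma)
    return atan2(dx, -dy)   // angle of the vector (-dy, dx)
}
```

With only C active, the potential varies along x alone, so the contours (and the fibers) run vertically; adding other pitch classes bends and breaks them up.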
hpssRatio for fiber length. Separating harmonic from percussive content is the single most musically meaningful distinction the spectral analysis makes. Mapping it to fiber length has an immediate perceptual payoff: highly harmonic windows produce long smooth curves that read as sustained; highly percussive windows produce short tangled fragments that read as rhythmic. A transition from one to the other — a drum fill entering a chord, or a chord dropping out into a beat — is visible as a texture change across adjacent glyphs on the strip.
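The mapping itself can be as simple as a clamped linear interpolation. The sketch below assumes hpssRatio is normalized to 0...1 with 1 meaning fully harmonic; the 4 to 48 step range and the function name `fiberLength` are illustrative guesses rather than engram's actual tuning.

```swift
/// Maps an HPSS ratio (assumed 0 = fully percussive, 1 = fully harmonic)
/// to a streamline length in steps. The default range is an assumed tuning.
func fiberLength(hpssRatio: Double, minSteps: Int = 4, maxSteps: Int = 48) -> Int {
    let t = min(max(hpssRatio, 0), 1)   // clamp out-of-range input
    return minSteps + Int((Double(maxSteps - minSteps) * t).rounded())
}
```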
Direct pixel buffer rendering. The original renderer used CGContext path drawing: stroking lines, filling ellipses, translating coordinate systems. The flow fiber algorithm sets individual pixels, which is awkward in CGContext (one single-pixel rectangle fill per call). Switching to a raw [UInt8] buffer and constructing the CGImage directly via CGDataProvider is both faster and a better fit: the field-tracing loop just writes pixel values at computed indices in an array, and the image is assembled once at the end.
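A minimal sketch of that buffer-to-image path follows. It assumes one byte per pixel in an 8-bit grayscale layout, which may not match engram's actual buffer format, and the names `grayscaleBytes` and `makeImage` are hypothetical. The CoreGraphics part is guarded so the buffer logic also compiles on platforms without it.

```swift
import Foundation
#if canImport(CoreGraphics)
import CoreGraphics
#endif

let side = 32

/// Converts 0/1 ink flags to 8-bit grayscale (ink becomes 0 = black, paper 255 = white).
func grayscaleBytes(fromInk ink: [UInt8]) -> [UInt8] {
    ink.map { $0 == 0 ? 255 : 0 }
}

#if canImport(CoreGraphics)
/// Wraps the finished buffer in a CGImage once, without any CGContext drawing calls.
func makeImage(gray: [UInt8], width: Int, height: Int) -> CGImage? {
    guard gray.count == width * height,
          let provider = CGDataProvider(data: Data(gray) as CFData) else { return nil }
    return CGImage(width: width, height: height,
                   bitsPerComponent: 8, bitsPerPixel: 8, bytesPerRow: width,
                   space: CGColorSpaceCreateDeviceGray(),
                   bitmapInfo: CGBitmapInfo(rawValue: CGImageAlphaInfo.none.rawValue),
                   provider: provider, decode: nil,
                   shouldInterpolate: false, intent: .defaultIntent)
}
#endif

// The tracing loop only ever does this: one array write per pixel.
var ink = [UInt8](repeating: 0, count: side * side)
for i in 0..<side { ink[i * side + i] = 1 }   // a diagonal line as stand-in content
let gray = grayscaleBytes(fromInk: ink)
```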
Hardware sample rate detection. AVFoundation's input node runs at the hardware's native sample rate (48 kHz on device), not the 44.1 kHz the original code assumed. Installing the tap with a mismatched format crashes the app immediately on launch. The fix is to read inputNode.outputFormat(forBus: 0) after the audio session activates, size the ring buffer from that value, and propagate the detected rate to the feature extractor before the first analysis tick.
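The fix can be sketched as follows. This assumes capture goes through AVAudioEngine and that the ring buffer holds one five-second window; `ringBufferCapacity`, `startCapture`, and the 4096-frame tap size are hypothetical choices, not taken from engram's source. The AVFoundation part is guarded so the sizing logic compiles anywhere.

```swift
import Foundation
#if canImport(AVFoundation)
import AVFoundation
#endif

/// Samples needed to hold one analysis window at the detected rate.
func ringBufferCapacity(sampleRate: Double, windowSeconds: Double = 5) -> Int {
    Int((sampleRate * windowSeconds).rounded(.up))
}

#if canImport(AVFoundation)
/// Installs the tap using the input node's own format instead of a hard-coded one,
/// then returns the detected sample rate. Call after the audio session is active.
func startCapture(engine: AVAudioEngine,
                  onSamples: @escaping @Sendable (AVAudioPCMBuffer) -> Void) throws -> Double {
    let input = engine.inputNode
    let format = input.outputFormat(forBus: 0)   // hardware format, e.g. 48 kHz
    input.installTap(onBus: 0, bufferSize: 4096, format: format) { buffer, _ in
        onSamples(buffer)
    }
    try engine.start()
    return format.sampleRate
}
#endif
```

The returned rate is what would be handed to the ring buffer and the feature extractor before the first analysis tick.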
Architecture
MicCapture        ring buffer at hardware sample rate
    ↓
FeatureExtractor  vDSP FFT → spectral centroid, rolloff, bandwidth, flatness,
                  chroma[12], MFCC[4], onset detection, HPSS ratio, tempo
    ↓
SigilRenderer     builds 32×32 vector field from chroma interference,
                  traces streamlines → [UInt8] pixel buffer → CGImage
    ↓
GlyphState        accumulates 12 glyphs → composites 384×32 row
    ↓
PrintQueue        actor-backed async queue → PrinterProtocol.printRow()
SessionStore      saves glyphs and rows as PNGs to Documents/sets/
StripStitcher     composites full session strip on stop
AnalysisCycle drives the whole pipeline from a 5-second Timer on the main actor. Feature extraction runs in a detached task to keep the main thread free; rendering and state updates then run back on the main actor.
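The hop between actors can be sketched like this. It is a simplified illustration, not engram's AnalysisCycle: the `Features` struct carries a single RMS value as a stand-in for the real analysis, the timer is omitted, and `tick(samples:)` is a hypothetical entry point that the 5-second timer would call.

```swift
import Foundation

/// Hypothetical feature bundle; the real extractor produces many more fields.
struct Features: Sendable {
    let rms: Double
}

/// Stand-in for the spectral analysis: pure and synchronous, so it is safe
/// to run off the main actor.
func extractFeatures(from samples: [Float]) -> Features {
    guard !samples.isEmpty else { return Features(rms: 0) }
    let meanSquare = samples.reduce(0.0) { $0 + Double($1) * Double($1) } / Double(samples.count)
    return Features(rms: meanSquare.squareRoot())
}

@MainActor
final class AnalysisCycle {
    private(set) var glyphCount = 0
    private(set) var lastFeatures: Features?

    /// One analysis tick (in the app, fired by the 5-second timer, not shown here).
    func tick(samples: [Float]) async {
        // Heavy work runs in a detached task, off the main actor.
        let features = await Task.detached(priority: .userInitiated) {
            extractFeatures(from: samples)
        }.value
        // Execution resumes on the main actor: render and update state here.
        lastFeatures = features
        glyphCount += 1
    }
}
```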
Status / What's Next
The full pipeline runs: mic capture, spectral analysis, glyph rendering, row compositing, session archiving, and strip export. The UI has a live row preview (twelve cells updating as each glyph completes), an archive browser, and a renderer preview for tuning the algorithm without running the full pipeline.
The thermal printer output is currently stubbed: ConsolePrinter logs row dimensions, and MockPrinter is a no-op for debug builds. ESC/POS formatting and Bluetooth thermal printer integration are the next steps.
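As a rough idea of what the ESC/POS step might involve, a 384×32 row could be packed into the raster bit image command (GS v 0) from the common ESC/POS command set. This is a sketch of unimplemented work: whether the target printer supports this command is an assumption, and `escposRaster` is a hypothetical helper.

```swift
/// Packs a one-byte-per-pixel image (0 = paper, non-zero = ink) into an
/// ESC/POS "GS v 0" raster command: header, then rows packed 8 pixels per byte,
/// most significant bit leftmost, 1 = print.
func escposRaster(ink: [UInt8], width: Int, height: Int) -> [UInt8] {
    precondition(width % 8 == 0, "width must be a multiple of 8")
    precondition(ink.count == width * height, "buffer size must match dimensions")
    let bytesPerRow = width / 8
    precondition(bytesPerRow < 65_536 && height < 65_536, "dimensions exceed header fields")
    var out: [UInt8] = [
        0x1D, 0x76, 0x30, 0x00,                                   // GS v 0, normal mode
        UInt8(bytesPerRow & 0xFF), UInt8((bytesPerRow >> 8) & 0xFF), // xL xH
        UInt8(height & 0xFF), UInt8((height >> 8) & 0xFF)            // yL yH
    ]
    out.reserveCapacity(out.count + bytesPerRow * height)
    for y in 0..<height {
        for bx in 0..<bytesPerRow {
            var byte: UInt8 = 0
            for bit in 0..<8 where ink[y * width + bx * 8 + bit] != 0 {
                byte |= UInt8(0x80) >> bit
            }
            out.append(byte)
        }
    }
    return out
}
```

For a full-width row this gives an 8-byte header followed by 48 bytes per scanline, which is what PrinterProtocol.printRow() would eventually need to send over Bluetooth.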