Metatronic Lab
A three-layer audio-reactive visualization that ties geometry, music theory, and digital signal processing together. Below is how each layer works.
The reflective shape at the center is a regular icosahedron — 12 vertices, 30 edges, 20 equilateral faces.
What's special: per the paper "General Theory of Music by Icosahedron" (Imai, Dellby, Tanaka — arXiv:2103.10272), the 12 chromatic pitches (C, C#, D, …, B) map onto the 12 icosahedron vertices in exactly four valid ways that satisfy the solid's symmetry constraints.
We're using Type 1 of the four mappings. When chord notes are detected in your audio, the corresponding golden-triangle outline glows on the surface. When two sounding notes lie a tritone apart, the axis line through the icosahedron's center pulses. Each chroma class also lights its own vertex pip.
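A sketch of the tritone test, assuming chroma classes are numbered 0–11 from C (helper names are illustrative; the paper's actual Type 1 vertex assignment is not reproduced here):

```python
# Illustrative helpers, not the project's API.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def is_tritone(a, b):
    """True when two chroma classes sit 6 semitones apart (e.g. C and F#)."""
    return (a - b) % 12 == 6    # symmetric, since 12 - 6 = 6

def active_axes(chromas):
    """Chroma pairs whose center axis should pulse."""
    return [(NOTE_NAMES[a], NOTE_NAMES[b])
            for i, a in enumerate(chromas)
            for b in chromas[i + 1:]
            if is_tritone(a, b)]

# C-E-F# contains exactly one tritone, C-F#:
axes = active_axes([0, 4, 6])
```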
Six concentric shells of crystals surround the icosahedron: tetrahedra, cubes, octahedra, dodecahedra, icosahedra, and spheres (the five Platonic solids plus a sphere shell). 72 crystals total (6 shells × 12 chroma vertices), each hard-mapped to a specific (note, octave) pair.
So a sustained C-major chord around middle C, for example, lights up three radial spokes (C, E, G) at the matching octave's shell. Bass notes light inner shells; high harmonics light outer shells. The pattern across the scene traces out the harmonic structure of the music.
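A minimal sketch of the (note, octave) → crystal lookup, assuming MIDI note numbers as input and assuming the six shells cover octaves 2 through 7 with clamping at both ends (the project's real shell assignment may differ; all names here are hypothetical):

```python
# Hypothetical lookup; the real shell-to-octave assignment may differ.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
N_SHELLS = 6

def crystal_for_midi(midi):
    """Map a MIDI note to its (shell, chroma) crystal: 6 shells x 12 spokes = 72 slots."""
    chroma = midi % 12                              # which of the 12 radial spokes
    octave = midi // 12 - 1                         # MIDI convention: C4 = 60
    shell = min(max(octave - 2, 0), N_SHELLS - 1)   # assumed: octaves 2-7 -> shells 0-5
    return shell, NOTE_NAMES[chroma]

# A C-major triad at middle C lights three spokes on the same shell:
triad = [crystal_for_midi(m) for m in (60, 64, 67)]   # C4, E4, G4
```

Notes below the assumed octave range clamp to the innermost shell and notes above it to the outermost, matching the "bass inner, highs outer" behavior described above.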
Audio analysis runs a complex-Morlet continuous wavelet transform (CWT) in an AudioWorklet, off the main thread: 96 kernels covering A0 through B7 (8 octaves × 12 semitones), running at 48 kHz with a 256-sample hop (187.5 frames/second).
Why a complex Morlet instead of a regular FFT spectrogram? Complex wavelets give the envelope directly: magnitude = √(real² + imag²), a smooth, phase-invariant amplitude curve. And because each wavelet's window scales with its target frequency, every note band gets a time/frequency trade-off suited to its pitch, whereas a fixed-window FFT imposes the same trade-off on every bin.
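To make the phase invariance concrete, here is a toy sketch (not the project's code): correlate a pure complex exponential with a sine at several starting phases. The window is chosen to hold exactly one cycle, so the negative-frequency term cancels and the magnitude comes out identical every time.

```python
import cmath
import math

FS = 48_000
F = 480.0                 # test tone; N below holds exactly one cycle of it
N = FS // int(F)          # 100 samples

# Plain complex-exponential kernel (no Gaussian), to isolate the phase point.
kernel = [cmath.exp(-2j * math.pi * F * n / FS) for n in range(N)]

def envelope(phase):
    tone = [math.sin(2 * math.pi * F * n / FS + phase) for n in range(N)]
    z = sum(s * k for s, k in zip(tone, kernel))
    return abs(z)         # magnitude = sqrt(real^2 + imag^2)

# Same magnitude (N/2 = 50) no matter where the sine starts:
mags = [envelope(p) for p in (0.0, 1.0, 2.5)]
```

A real-valued correlation (or an FFT bin's real part alone) would swing with the phase; the complex magnitude does not, which is what makes it a usable amplitude envelope per hop.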
Each kernel is energy-normalized and shaped to ~4 cycles of its target frequency (about 9 ms for high notes, ~85 ms for bass). Per hop, every kernel does a sliding inner product against the most recent samples, producing one envelope value per bin.
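Here is a minimal sketch of the kernel math under the parameters above (4 cycles per kernel, energy normalization, inner product per hop); the Gaussian width and all function names are my assumptions, not the project's code:

```python
import cmath
import math

FS = 48_000
CYCLES = 4.0  # each kernel spans ~4 cycles of its target frequency

def morlet_kernel(freq):
    """Complex exponential under a Gaussian window, normalized to unit energy."""
    n = int(CYCLES * FS / freq) | 1          # kernel length in samples (odd)
    half = n // 2
    sigma = n / 6.0                          # window holds ~±3 sigma (assumption)
    k = [cmath.exp(2j * math.pi * freq * (i - half) / FS)
         * math.exp(-0.5 * ((i - half) / sigma) ** 2)
         for i in range(n)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in k))
    return [c / norm for c in k]             # energy-normalized

def envelope(kernel, samples):
    """One envelope value per hop: |inner product| with the newest samples."""
    tail = samples[-len(kernel):]
    z = sum(s * c.conjugate() for s, c in zip(tail, kernel))
    return abs(z)

# Kernel length scales inversely with pitch: 437 samples (~9.1 ms) at A4.
len_a4 = len(morlet_kernel(440.0))
tone = [math.sin(2 * math.pi * 440.0 * n / FS) for n in range(8000)]
```

A sustained A4 tone excites the 440 Hz kernel far more than the 880 Hz one, which is what lets each bin act as a per-note envelope follower.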
All 96 kernels are precomputed once; all per-frame work runs on the AudioWorklet thread at near-native speed (~30M MACs/sec, a small fraction of one core). The main thread reads the latest frame from a ref each render tick, so the audio→visual loop never passes through React: no re-renders, no garbage. That's how it stays smooth even at full visual complexity.
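A quick back-of-envelope check on that compute budget, assuming the 96 kernels are 96 consecutive semitones starting at A0 and counting one complex MAC per kernel tap per frame:

```python
FS, HOP = 48_000, 256
CYCLES, N_KERNELS = 4.0, 96
A0 = 27.5

frame_rate = FS / HOP                                    # 187.5 frames per second
freqs = [A0 * 2 ** (k / 12) for k in range(N_KERNELS)]   # 96 semitones up from A0
taps = sum(int(CYCLES * FS / f) for f in freqs)          # total samples across kernels
macs_per_sec = taps * frame_rate                         # one complex MAC per tap per frame
```

This comes out around 23 million complex MACs per second, the same order as the figure above; the exact tally depends on whether a complex MAC is counted as one operation or as four real multiply-adds.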
Use the right-side panel to tune everything in real time. Notable knobs: