Learn Creative Coding (#22) - Mini-Project: Audio Visualizer
Phase 3 finale. We've learned easing, state machines, physics, audio analysis, recording, and shaders. Time to smash the best parts together into one cohesive piece: a full audio visualizer with microphone input, beat detection, particles, and smooth layered rendering.
This is my favorite mini-project in the series so far. In the galaxy project (episode 15) we combined particles, noise, and color. This time we're combining six different techniques into something you can actually perform with -- put on some music, turn up the volume, and watch your screen react. It doesn't get much better than that :-)
The plan: four visual layers stacked on top of each other, each driven by a different aspect of the audio. A background that breathes with bass, a glowing orb that pulses, a ring of frequency spikes radiating outward, and particles that burst on every beat. Each layer is simple on its own. Together they create something that feels like it understands the music -- even though it's just numbers driving pixels.
Let's build it.
The architecture
Before we write anything, let me explain how this fits together. Our visualizer has four layers, rendered back to front:
- Background -- color shifts slowly with bass energy
- Center orb -- pulses with bass, breathes with midrange
- Frequency ring -- spikes radiating from center, one per frequency bin
- Particles -- burst outward on beats, drift and fade
Each layer reads from the same audio data but responds to different frequency bands. This separation keeps the code clean -- you can swap out any layer without touching the others. Want a different background? Change one function. Want squares instead of particles? Change another. Same modular thinking we used in the galaxy project, just applied to audio-reactive visuals.
Audio setup
We need the microphone. It picks up whatever's playing through your speakers (or you can play music on your phone nearby -- that works surprisingly well).
Remember the Web Audio API from episode 19? Same setup, but this time we're splitting into five frequency bands instead of three, and each band gets its own smoothing speed:
let audioCtx, analyser, dataArray, waveform;
let bands = { bass: 0, lowMid: 0, mid: 0, highMid: 0, treble: 0 };
let smooth = { bass: 0, lowMid: 0, mid: 0, highMid: 0, treble: 0 };
async function initAudio() {
audioCtx = new AudioContext();
analyser = audioCtx.createAnalyser();
analyser.fftSize = 512;
dataArray = new Uint8Array(analyser.frequencyBinCount);
waveform = new Uint8Array(analyser.fftSize);
let stream = await navigator.mediaDevices.getUserMedia({ audio: true });
let source = audioCtx.createMediaStreamSource(stream);
source.connect(analyser);
}
function updateAudio() {
analyser.getByteFrequencyData(dataArray);
let len = dataArray.length;
bands.bass = avg(dataArray, 0, len * 0.08);
bands.lowMid = avg(dataArray, len * 0.08, len * 0.2);
bands.mid = avg(dataArray, len * 0.2, len * 0.45);
bands.highMid = avg(dataArray, len * 0.45, len * 0.7);
bands.treble = avg(dataArray, len * 0.7, len);
// different smoothing per band
smooth.bass = lerp(smooth.bass, bands.bass, 0.12);
smooth.lowMid = lerp(smooth.lowMid, bands.lowMid, 0.15);
smooth.mid = lerp(smooth.mid, bands.mid, 0.18);
smooth.highMid = lerp(smooth.highMid, bands.highMid, 0.22);
smooth.treble = lerp(smooth.treble, bands.treble, 0.25);
}
function avg(arr, start, end) {
start = Math.floor(start);
end = Math.floor(end);
let sum = 0;
for (let i = start; i < end; i++) sum += arr[i];
return sum / (end - start) / 255;
}
function lerp(a, b, t) { return a + (b - a) * t; }
Five bands instead of three gives us more nuance. We split the spectrum unevenly -- bass gets a small slice (0-8%) because FFT bins are spaced linearly in frequency, so the entire low end lives in the first handful of bins, while treble gets a wide range (70-100%). The smoothing speeds follow the same principle from episode 19: bass is slowest (0.12) because kick drums sustain and you want the visual response to match. Treble is fastest (0.25) because hi-hats and cymbals are sharp transients that need to feel snappy. The in-between bands graduate smoothly.
Why five bands when episode 19 used three? Because we have four visual layers now. More bands means we can assign different audio properties to different layers without everything pulsing in sync. If bass drove everything, the whole screen would throb as one blob. With five bands, each layer can have its own rhythm.
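To make the uneven split concrete, here's a quick sanity check of the bin ranges (assuming fftSize 512 as in the setup above, so 256 bins; the helper function is just for illustration):

```javascript
// With fftSize = 512 the analyser exposes 256 frequency bins.
// The percentage split from updateAudio() maps to these bin ranges:
function bandRanges(binCount) {
  return {
    bass:    [0, Math.floor(binCount * 0.08)],
    lowMid:  [Math.floor(binCount * 0.08), Math.floor(binCount * 0.2)],
    mid:     [Math.floor(binCount * 0.2), Math.floor(binCount * 0.45)],
    highMid: [Math.floor(binCount * 0.45), Math.floor(binCount * 0.7)],
    treble:  [Math.floor(binCount * 0.7), binCount],
  };
}

const r = bandRanges(256);
// bass gets just 20 of 256 bins; treble gets 77 -- the uneven split in numbers
console.log(r.bass, r.treble); // [0, 20] [179, 256]
```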
Beat detection
Straight from episode 19, with one addition -- we track the beat's intensity so we can scale the particle burst:
let beatThreshold = 0.5;
let beatIntensity = 0;
let isBeat = false;
function detectBeat(bass) {
if (bass > beatThreshold && bass > 0.3) {
isBeat = true;
beatIntensity = bass;
beatThreshold = bass * 1.2;
} else {
isBeat = false;
beatIntensity *= 0.9;
beatThreshold *= 0.985;
beatThreshold = Math.max(beatThreshold, 0.25);
}
}
The adaptive threshold is the same pattern: raise it when a beat is detected (so we don't double-trigger), decay it gradually so it adjusts to whatever volume the music is playing at. The beatIntensity value is new -- it remembers how hard the last beat hit, so we can spawn more particles for louder beats and fewer for quiet ones. A beat that barely clears the floor spawns around 24 particles; a massive bass drop gets 45. The visual response scales with the musical energy.
That && bass > 0.3 floor prevents false triggers during silence. Without it, the threshold decays to 0.25 and even ambient room noise triggers "beats." Learned that one the hard way during testing with no music playing :-)
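You can sanity-check the detector without a microphone by driving it with synthetic bass values (this is a standalone copy of detectBeat above; the input numbers are made up for the demo):

```javascript
// Standalone copy of the beat detector so we can feed it fake data.
let beatThreshold = 0.5;
let beatIntensity = 0;
let isBeat = false;

function detectBeat(bass) {
  if (bass > beatThreshold && bass > 0.3) {
    isBeat = true;
    beatIntensity = bass;
    beatThreshold = bass * 1.2;   // raise to avoid double-triggering
  } else {
    isBeat = false;
    beatIntensity *= 0.9;
    beatThreshold *= 0.985;       // slow decay toward the floor
    beatThreshold = Math.max(beatThreshold, 0.25);
  }
}

detectBeat(0.9);                 // loud kick: fires, threshold jumps to 1.08
const firstHit = isBeat;
detectBeat(0.9);                 // same level next frame: suppressed
const doubleTrigger = isBeat;
for (let i = 0; i < 200; i++) detectBeat(0.1);  // silence: threshold decays...
const silentBeat = isBeat;                       // ...but the 0.3 floor holds
console.log(firstHit, doubleTrigger, silentBeat); // true false false
```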
Particles
Burst particles with eased fade-out. This combines the particle system from episode 11, the easing from episode 16, and the physics from episode 18:
let particles = [];
class Particle {
constructor(x, y, intensity) {
this.x = x;
this.y = y;
let angle = Math.random() * Math.PI * 2;
let speed = (2 + Math.random() * 4) * intensity;
this.vx = Math.cos(angle) * speed;
this.vy = Math.sin(angle) * speed;
this.life = 1.0;
this.decay = 0.008 + Math.random() * 0.015;
this.size = 1 + Math.random() * 3;
// color based on intensity -- louder beats = warmer hue
this.hue = 200 + intensity * 60;
}
update() {
this.x += this.vx;
this.y += this.vy;
this.vx *= 0.98; // friction
this.vy *= 0.98;
this.life -= this.decay;
}
draw(ctx) {
if (this.life <= 0) return;
let alpha = this.life * this.life; // quadratic fade = eased
ctx.beginPath();
ctx.arc(this.x, this.y, this.size * this.life, 0, Math.PI * 2);
ctx.fillStyle = `hsla(${this.hue}, 80%, 70%, ${alpha})`;
ctx.fill();
}
}
function spawnBurst(cx, cy, count, intensity) {
for (let i = 0; i < count; i++) {
particles.push(new Particle(cx, cy, intensity));
}
}
function updateParticles(ctx) {
for (let i = particles.length - 1; i >= 0; i--) {
particles[i].update();
particles[i].draw(ctx);
if (particles[i].life <= 0) {
particles.splice(i, 1);
}
}
}
See the this.life * this.life in the draw method? That's a quadratic ease-out -- the same principle from episode 16 where we squared the t value for easing curves. Instead of fading linearly (which looks robotic), the particle stays bright for most of its life and then drops off sharply at the end. It's subtle but it makes the particles feel sparkly instead of dull.
The friction at 0.98 means particles keep 98% of their speed each frame. They don't stop abruptly -- they drift outward, slowing gradually, fading as they go. Combine that with the varying decay rates (0.008 to 0.023) and each particle in a burst has a slightly different lifespan, creating a natural dispersal pattern instead of everything vanishing at once.
Each particle also gets its initial speed scaled by the beat intensity. A hard bass hit sends particles flying fast and far. A gentle thump keeps them close. The music shapes the physics. That's the whole point of this project.
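To feel what friction 0.98 actually does: after one second at 60fps a particle retains roughly 30% of its launch speed. That's plain math, no canvas needed:

```javascript
// Speed decay under per-frame friction: v_n = v_0 * 0.98^n
const friction = 0.98;
let speed = 6;                    // a fast launch, e.g. from a hard beat
for (let frame = 0; frame < 60; frame++) speed *= friction;
const retained = speed / 6;       // fraction of launch speed left
console.log(retained.toFixed(2)); // ~0.30 after one second at 60fps
```

So the burst is fast for the first few frames and coasts for the rest of its life, which is exactly the dispersal pattern described above.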
Layer 1: Background
The simplest layer. Bass shifts the hue, mid adjusts saturation:
function drawBackground(ctx, w, h) {
let hue = 220 + smooth.bass * 40; // blue shifts toward purple on bass
let sat = 15 + smooth.mid * 20;
let light = 5 + smooth.bass * 8;
ctx.fillStyle = `hsl(${hue}, ${sat}%, ${light}%)`;
ctx.fillRect(0, 0, w, h);
}
When there's no music, the background is dark muted blue. When bass hits, it brightens and shifts toward purple. When melodic midrange comes in, saturation increases. The changes are subtle because they're smoothed -- you don't notice the background changing consciously, but you feel the overall mood shift. It's like stage lighting that responds to the DJ.
The bass smoothing at 0.12 means this layer moves slowly. Big sweeping color changes that follow the song's structure, not individual beats. If you were to use the raw unsmoothed bass value here, the background would flicker on every kick drum and it would look awful. Smoothing is the difference between "reacts to music" and "has a seizure."
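The per-band lerp speeds are easy to see in isolation. Feed two bands the same step input (silence, then suddenly loud) and the fast band converges sooner (a small sketch using the lerp from the audio setup):

```javascript
function lerp(a, b, t) { return a + (b - a) * t; }

// Step input: the target jumps from 0 to 1. Watch each band chase it.
let bass = 0, treble = 0;
for (let frame = 0; frame < 10; frame++) {
  bass = lerp(bass, 1, 0.12);     // slow: big sweeping response
  treble = lerp(treble, 1, 0.25); // fast: snappy response
}
// After 10 frames: bass ~0.72, treble ~0.94
console.log(bass.toFixed(2), treble.toFixed(2));
```

That gap is the whole trick: the same kick drum reads as a slow swell in the background layer and a sharp flick in treble-driven details.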
Layer 2: Center orb
A radial gradient that pulses with bass and breathes with midrange:
function drawOrb(ctx, cx, cy) {
let baseR = 60;
let pulseR = baseR + smooth.bass * 50 + smooth.mid * 20;
// outer glow
let gradient = ctx.createRadialGradient(cx, cy, pulseR * 0.2, cx, cy, pulseR * 2);
let hue = 200 + smooth.bass * 40;
gradient.addColorStop(0, `hsla(${hue}, 70%, 60%, ${0.3 + smooth.bass * 0.3})`);
gradient.addColorStop(0.5, `hsla(${hue + 20}, 60%, 40%, ${0.1 + smooth.mid * 0.1})`);
gradient.addColorStop(1, 'hsla(220, 50%, 20%, 0)');
ctx.beginPath();
ctx.arc(cx, cy, pulseR * 2, 0, Math.PI * 2);
ctx.fillStyle = gradient;
ctx.fill();
// core
let coreGradient = ctx.createRadialGradient(cx, cy, 0, cx, cy, pulseR);
coreGradient.addColorStop(0, `hsla(${hue - 10}, 80%, 80%, 0.9)`);
coreGradient.addColorStop(0.6, `hsla(${hue}, 70%, 50%, 0.5)`);
coreGradient.addColorStop(1, `hsla(${hue + 20}, 60%, 30%, 0)`);
ctx.beginPath();
ctx.arc(cx, cy, pulseR, 0, Math.PI * 2);
ctx.fillStyle = coreGradient;
ctx.fill();
}
Two radial gradients stacked. The outer glow extends to twice the pulse radius, creating a soft halo. The core is brighter and more saturated, fading at the edge. Together they create that "glowing energy ball" look you see at EDM festivals.
The radius is driven by two bands: smooth.bass * 50 for the big pulse, smooth.mid * 20 for a gentler breathing motion on top. Bass gives you the thump, midrange gives you the swell. They oscillate at different rates because they're tracking different frequency content, so the orb never settles into a boring repetitive pattern.
The alpha values also react to audio -- 0.3 + smooth.bass * 0.3 means the glow is dimmer during quiet sections and brighter when the bass is pumping. This is the "layered mapping" principle from episode 19: audio drives size AND brightness AND color, each band contributing a different dimension. The more properties you map, the more the visual feels connected to the music.
Layer 3: Frequency ring
Spikes radiating outward from the orb, one per frequency bin. This is where the trigonometry from episode 13 comes in hard:
function drawFrequencyRing(ctx, cx, cy) {
let innerR = 80 + smooth.bass * 40;
ctx.lineCap = 'round';
for (let i = 0; i < dataArray.length; i++) {
let value = dataArray[i] / 255;
if (value < 0.05) continue; // skip silence
let angle = (i / dataArray.length) * Math.PI * 2 - Math.PI / 2;
let spikeLen = value * 80 + value * value * 40; // quadratic emphasis on loud
let x1 = cx + Math.cos(angle) * innerR;
let y1 = cy + Math.sin(angle) * innerR;
let x2 = cx + Math.cos(angle) * (innerR + spikeLen);
let y2 = cy + Math.sin(angle) * (innerR + spikeLen);
// color: bass = warm, treble = cool
let ratio = i / dataArray.length;
let hue = 200 + ratio * 160; // blue -> magenta -> red
ctx.beginPath();
ctx.moveTo(x1, y1);
ctx.lineTo(x2, y2);
ctx.strokeStyle = `hsla(${hue}, 80%, ${50 + value * 30}%, ${0.3 + value * 0.5})`;
ctx.lineWidth = 1.5 + value * 2;
ctx.stroke();
}
}
Each frequency bin gets a spike at its corresponding angle around the circle. The angle calculation (i / dataArray.length) * Math.PI * 2 is the same polar coordinate mapping from episode 13 -- evenly distributing N points around a full circle. The - Math.PI / 2 offset rotates everything so the first bin (lowest bass) starts at the top instead of the right.
The spike length has a quadratic component: value * 80 + value * value * 40. This means quiet frequencies get short spikes, but loud frequencies get disproportionately long ones. The loudest bin isn't just twice as tall as a half-loud bin -- it's almost two and a half times as tall (120 versus 50). This visual emphasis makes the ring feel punchy and dramatic instead of flat and uniform.
The color mapping runs from blue (hue 200) through magenta to red (360): bass bins come out blue and treble bins come out red. That's a deliberate inversion -- traditional visualizers put warm colors on bass and cool colors on treble, but I think the blue-to-red sweep around the ring is more interesting this way. Change the hue formula if you disagree; that's the point of having it exposed as a simple calculation.
The inner radius of the ring also moves with bass (80 + smooth.bass * 40), so the whole ring breathes outward on heavy kicks. This links the ring's motion to the orb's pulsing, creating visual cohesion between layers.
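The quadratic emphasis is worth seeing in numbers (this is just the spike-length formula pulled out of the loop):

```javascript
// spikeLen = linear term + quadratic term
const spikeLen = v => v * 80 + v * v * 40;

console.log(spikeLen(0.25)); // 22.5 -- quiet bin: short
console.log(spikeLen(0.5));  // 50   -- half loud
console.log(spikeLen(1.0));  // 120  -- loudest bin: 2.4x the half-loud spike
```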
Main loop
Now we tie it all together:
const canvas = document.getElementById('visualizer');
const ctx = canvas.getContext('2d');
let audioReady = false;
function resize() {
canvas.width = window.innerWidth;
canvas.height = window.innerHeight;
}
window.addEventListener('resize', resize);
resize();
// click to start (browser requires user gesture for audio)
canvas.addEventListener('click', async () => {
if (!audioReady) {
await initAudio();
audioReady = true;
}
});
function animate() {
let cx = canvas.width / 2;
let cy = canvas.height / 2;
if (audioReady) {
updateAudio();
detectBeat(bands.bass);
}
// layer 1: background
drawBackground(ctx, canvas.width, canvas.height);
// layer 2: orb
drawOrb(ctx, cx, cy);
// layer 3: frequency ring
if (audioReady) {
drawFrequencyRing(ctx, cx, cy);
}
// spawn particles on beat
if (isBeat) {
let count = Math.floor(15 + beatIntensity * 30);
spawnBurst(cx, cy, count, beatIntensity);
}
// layer 4: particles
updateParticles(ctx);
// subtle scan line effect
ctx.fillStyle = 'rgba(0, 0, 0, 0.03)';
for (let y = 0; y < canvas.height; y += 3) {
ctx.fillRect(0, y, canvas.width, 1);
}
requestAnimationFrame(animate);
}
animate();
Notice the render order: background first, then orb, then ring, then particles. Back to front, painter's algorithm. Each layer paints over the previous one. The particles are last because they need to be on top of everything -- they burst outward from the center, crossing over the ring and the orb.
The beat spawning line is worth looking at: Math.floor(15 + beatIntensity * 30). A beat at minimum intensity (0.3 floor) spawns about 24 particles. A full-power beat at intensity 1.0 spawns 45. The particle count scales linearly with how hard the beat hits. This means quiet sections produce gentle bursts and drops produce explosions. The visualizer breathes with the music's dynamics.
The scan line overlay at the end of the draw loop is a cheap trick that adds a lot of character. Every third pixel row gets a barely-visible dark line drawn over it, creating a subtle CRT monitor effect. At 3% opacity you barely notice it consciously, but it adds texture to an otherwise clean digital look. Remove it and the visualizer looks flatter. Add it and it has that retro-futuristic vibe. Almost no performance cost since it's just filled rectangles.
The HTML
<!DOCTYPE html>
<html>
<head>
<style>
body { margin: 0; background: #000; overflow: hidden; cursor: crosshair; }
canvas { display: block; }
.hint {
position: fixed; bottom: 20px; left: 50%;
transform: translateX(-50%);
color: rgba(255,255,255,0.3);
font-family: monospace; font-size: 14px;
pointer-events: none;
}
</style>
</head>
<body>
<canvas id="visualizer"></canvas>
<div class="hint">click to start - play music nearby</div>
<script src="visualizer.js"></script>
</body>
</html>
Full page, no scrollbars, crosshair cursor for that techy feel. The hint text is barely visible (30% opacity white) and has pointer-events: none so it doesn't interfere with the click-to-start handler. Once you click and the audio starts, the hint just sits there unobtrusively. You could fade it out after audio starts if you want -- set it to display: none inside the click handler.
Making it yours
The visualizer works. Now let's talk about variations that change the feel completely.
Waveform ring -- replace the frequency spikes with the raw audio waveform drawn around the orb:
function drawWaveformRing(ctx, cx, cy) {
analyser.getByteTimeDomainData(waveform);
let innerR = 80 + smooth.bass * 40;
ctx.beginPath();
for (let i = 0; i < waveform.length; i++) {
let value = (waveform[i] - 128) / 128;
let angle = (i / waveform.length) * Math.PI * 2;
let r = innerR + value * 40;
let x = cx + Math.cos(angle) * r;
let y = cy + Math.sin(angle) * r;
i === 0 ? ctx.moveTo(x, y) : ctx.lineTo(x, y);
}
ctx.closePath();
ctx.strokeStyle = 'rgba(100, 200, 255, 0.5)';
ctx.lineWidth = 2;
ctx.stroke();
}
The waveform wraps around the orb as a continuous line instead of individual spikes. It looks like an oscilloscope display bent into a circle. With percussive music the line gets spiky and chaotic. With smooth pads it gently undulates. Visually very different from the frequency ring, and it uses getByteTimeDomainData instead of getByteFrequencyData -- the raw audio signal shape we explored in episode 19.
Mirrored mode -- only draw frequency spikes on the top half, mirrored to the bottom:
// replace the angle calculation in drawFrequencyRing
let angle = (i / dataArray.length) * Math.PI - Math.PI; // top half only: -PI to 0
// draw each spike twice: once at angle, once at -angle for the mirror
This creates symmetry along the horizontal axis. Symmetrical visuals feel more "designed" and less chaotic. It's a simple change but the visual impact is big -- suddenly the visualizer looks like a butterfly or an eye.
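Here's one way the mirrored variant could be sketched out. Mirroring across the horizontal axis means angle to -angle: cosine is unchanged, sine flips sign, so x stays put and y reflects around the center. (drawSpike below is a hypothetical helper standing in for the stroke code from drawFrequencyRing.)

```javascript
// Each bin gets two angles: one on the top half, one reflected to the bottom.
function mirroredAngles(i, binCount) {
  const angle = (i / binCount) * Math.PI - Math.PI; // top half: -PI..0
  return [angle, -angle];                            // original + mirror
}

// Inside the draw loop you'd call something like (drawSpike is hypothetical):
//   for (const a of mirroredAngles(i, dataArray.length)) drawSpike(a, value);

// The symmetry check: mirrored points share x and flip y around the center.
const [a, m] = mirroredAngles(64, 256);
console.log(Math.abs(Math.cos(a) - Math.cos(m)) < 1e-12); // true: same x
console.log(Math.abs(Math.sin(a) + Math.sin(m)) < 1e-12); // true: mirrored y
```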
Color palette swaps -- change the hue formula for completely different moods:
// warm sunset palette
let hue = 20 + ratio * 40;
// neon cyberpunk
let hue = 280 + ratio * 80;
// monochrome with brightness variation
ctx.strokeStyle = `hsla(200, 10%, ${30 + value * 70}%, ${0.3 + value * 0.5})`;
Color changes everything about how the visualizer feels. The blue-to-red default is energetic and techy. Warm sunset feels organic and mellow. Neon cyberpunk is aggressive and club-ready. Monochrome with brightness is minimal and elegant. Same audio data, same geometry, completely different energy. This is why I keep saying color is the most underrated tool in creative coding -- it shapes the emotional response more than almost anything else.
Performance: keeping 60fps
Audio visualizers are demanding. You're doing FFT analysis, drawing hundreds of lines, managing a particle system, and creating radial gradients -- all every single frame. Here are the things that matter for performance:
FFT size: 512 is the sweet spot. You get 256 frequency bins, which is plenty of visual detail. Going to 1024 doubles the bin count but the CPU cost of the FFT goes up and your eyes can't tell the difference at visualization scale. Going to 2048 is outright wasteful unless you're doing serious audio analysis.
Particle management: Always iterate backwards when removing particles (for (let i = particles.length - 1; i >= 0; i--)). Splicing from the front shifts every subsequent element. Splicing from the back doesn't shift anything. With hundreds of particles, this matters. Also cap your total particle count -- I'd add a safety valve that drops the oldest particles with a single splice whenever the count exceeds 800. Without it, a long session with heavy beats could accumulate thousands of live particles.
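The safety valve might look like this (one splice call, so the front-shifting cost is paid once per frame rather than once per removed particle; MAX_PARTICLES is an assumed tuning constant):

```javascript
const MAX_PARTICLES = 800; // tune to taste and hardware

// Drop the oldest particles in a single splice call.
function capParticles(particles, max = MAX_PARTICLES) {
  if (particles.length > max) {
    particles.splice(0, particles.length - max);
  }
  return particles;
}

// e.g. after a long run of heavy beats:
const pool = Array.from({ length: 1000 }, (_, i) => i);
capParticles(pool);
console.log(pool.length, pool[0]); // 800 200 -- the newest 800 survive
```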
Radial gradients: They're more expensive than linear gradients. The orb uses two per frame, which is fine. But don't put a radial gradient inside a loop that runs 256 times (once per frequency spike). I considered making each spike have its own gradient but the performance hit wasn't worth it. Solid colors with varying alpha look almost as good.
Canvas size: window.innerWidth * window.innerHeight pixels getting redrawn every frame. On a 1920x1080 display that's over 2 million pixels. If you're dropping frames, try reducing the canvas size: canvas.width = window.innerWidth * 0.75. Scale it back up with CSS width: 100vw; height: 100vh. You lose some sharpness but gain a lot of performance. Many professional visualizers render at half resolution and upscale -- nobody notices at 60fps.
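A sketch of the reduced-resolution trick (renderScale is an assumed knob; the CSS keeps the canvas visually full-screen while the backing store shrinks):

```javascript
// Render at a fraction of display resolution, upscale with CSS.
const renderScale = 0.75; // 1.0 = native; 0.5 = a quarter of the pixels

function backingSize(displayW, displayH, scale) {
  return [Math.floor(displayW * scale), Math.floor(displayH * scale)];
}

// In resize() you'd apply it roughly like this:
//   const [w, h] = backingSize(window.innerWidth, window.innerHeight, renderScale);
//   canvas.width = w;  canvas.height = h;   // backing store (what we draw into)
//   canvas.style.width = '100vw';           // CSS size (what we see)
//   canvas.style.height = '100vh';
// Then draw using canvas.width/height, not the window dimensions.

const [w, h] = backingSize(1920, 1080, renderScale);
console.log(w, h, w * h); // 1440 810 1166400 -- ~44% fewer pixels than 1920x1080
```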
A smooth 30fps visualizer looks better than a stuttering 60fps one. If you're on a weaker machine and can't hold 60, target consistency. Drop the particle count, reduce the canvas resolution, simplify the scan line overlay (every 4th row instead of every 3rd, or skip it entirely). Your audience won't notice 30fps, but they will notice frame drops.
What we combined
Let's count the techniques from previous episodes:
- Audio analysis (episode 19) -- FFT frequency data, five-band splitting, smoothing
- Beat detection (episode 19) -- adaptive threshold, intensity tracking
- Easing/lerp (episode 16) -- smoothed audio values, quadratic particle fade
- Particle systems (episode 11) -- burst spawning, lifetime management, backward iteration
- Physics (episode 18) -- velocity, friction, deceleration
- Trigonometry (episode 13) -- polar coordinates for the frequency ring
- Color mapping (episode 7) -- HSL hue shifts, alpha layering
Seven techniques, one cohesive piece. Each one we learned in isolation. Together they create something that's more than the sum of its parts. That's the payoff of building skills incrementally -- you don't just learn techniques, you learn to combine them.
And this is the real skill in creative coding. Anybody can learn the Web Audio API. Anybody can draw a circle. The creative part is deciding that bass should drive glow intensity while treble drives particle sparkle while midrange shifts the color palette. The mapping decisions are the art. The code is just the medium.
Alright, so what do we know now?
- Layer your visualizer: background -> orb -> frequency ring -> particles, back to front
- Smooth audio values with per-band lerp speeds (bass slowest, treble fastest)
- Beat detection drives one-shot events (particle bursts, flashes)
- Continuous audio values drive continuous visual properties (size, color, glow intensity)
- Use quadratic emphasis (value * value) to make loud frequencies visually dominant
- The scan line overlay adds subtle retro texture with almost no performance cost
- Always require a user gesture before starting audio (browser security policy)
- Cap your particle count to prevent memory issues during long sessions
- Render at reduced resolution and upscale with CSS if you need more performance
Phase 3 done! We've gone from basic easing and lerp (episode 16) to state machines (17), springs and flocking (18), sound-reactive input (19), capturing and exporting (20), GPU programming with shaders (21), and now a full mini-project that ties it all together. Our creative coding toolkit is genuinely stacked now. We can make things move naturally, respond to sound, render on the GPU, and export in any format.
Next up is Phase 4 -- generative art. We're going to explore what makes a piece of art "generative" in the first place, how deterministic randomness works, and how to build systems that create art autonomously. The creative coding journey shifts from interactive sketches to pieces that generate themselves. It's going to be a whole different energy.
Cheers! Thanks for reading.
X