Deep Learning Math  ·  Chapter 04

Vectors & Dot Products

how meaning becomes an arrow

A machine has never seen a sunset or heard the word “king.” So how can it tell that king is to queen as man is to woman? It turns every idea into an arrow — and measures meaning by how those arrows line up.

The problem · a number is too thin

Some things can't be captured by a single number.

“Walk 5 miles” is useless without a direction. Wind has speed and heading. A nudge in space has an amount and a way it points. The world is full of quantities that need more than one number to pin down — and a plain number can't hold them.

A vector is the fix: a little bundle of numbers we picture as an arrow, with a direction and a length. And here's the leap that powers modern AI — meaning behaves the same way. “Royal,” “feminine,” “old” become directions, and every word becomes an arrow in a space of meaning.

Intuition · arrows you can stack

Two things you can do with arrows: stack them, and stretch them.

Add them tip to tail: walk one, then the other, and the green arrow is where you end up.

Scale one by a number, say × 2.5: same direction, 2.5 times longer. A number between 0 and 1 shrinks it, and a minus sign flips it.
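Both moves are one line each once an arrow is written as a list of numbers. Here is a minimal sketch in plain Python; the helper names `add` and `scale` and the example arrows are made up for illustration.

```python
def add(a, b):
    """Tip-to-tail addition: add the matching components."""
    return [x + y for x, y in zip(a, b)]


def scale(a, k):
    """Stretch an arrow by the number k (a negative k flips it)."""
    return [k * x for x in a]


# Hypothetical example arrows in 2D.
a = [3.0, 1.0]
b = [1.0, 2.0]

print(add(a, b))       # [4.0, 3.0]   walk a, then b
print(scale(a, 2.5))   # [7.5, 2.5]   same direction, 2.5 times longer
print(scale(a, -1.0))  # [-3.0, -1.0] same length, flipped
```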

The key move · the dot product

One number that says: do these two point the same way?

The dot product takes two arrows and returns a single number measuring their agreement. Pointing the same way → a positive number, and the longer the arrows, the bigger it gets. At right angles → exactly zero, total strangers. Pointing more than a right angle apart → negative. Picture the shadow one arrow casts along the other: the length of that shadow, times the length of the arrow it falls on, is the dot product. Swing the amber arrow and watch.

[Interactive figure: arrows a and b, with readouts for the dot product, how the two arrows relate, and the angle between them. Controls swing arrow b and set its length.]

The green bar is b's shadow on a. When b stands at a right angle to a, the shadow vanishes and the dot product is zero.
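The three cases, and the shadow picture, can be checked with a few lines of plain Python. This is a small sketch, not a library API: the helpers `dot`, `length`, and `shadow` and the example arrows are assumptions chosen to keep the arithmetic easy.

```python
import math


def dot(a, b):
    """Multiply matching components and add them up."""
    return sum(x * y for x, y in zip(a, b))


def length(a):
    """Length of an arrow: the square root of its dot product with itself."""
    return math.sqrt(dot(a, a))


def shadow(b, a):
    """Signed length of the shadow b casts along a."""
    return dot(a, b) / length(a)


a = [4.0, 0.0]  # hypothetical arrow lying flat

print(dot(a, [3.0, 0.0]))   # 12.0   same way: positive
print(dot(a, [0.0, 3.0]))   # 0.0    right angle: zero
print(dot(a, [-3.0, 0.0]))  # -12.0  opposite: negative

b = [3.0, 3.0]
print(shadow(b, a))              # 3.0
print(shadow(b, a) * length(a))  # 12.0, the same number as dot(a, b)
```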

The aha

The dot product is a similarity meter.

That's the whole reason it rules machine learning. Turn two things into arrows, take their dot product, and you get a score for how alike they are. Two documents about the same topic? Their arrows align, big dot product. A search query and the perfect result? Aligned. A cat photo and the word “cat”? Aligned. Divide the dot product by both arrows' lengths and you get cosine similarity: pure agreement of direction, from −1 to +1.
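To see that cosine similarity ignores length, here is a short sketch in plain Python. The function name `cosine_similarity` and the example arrows are illustrative assumptions; the arrows are picked so the lengths come out as whole numbers.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product divided by both lengths: direction only, from -1 to +1."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a = [3.0, 4.0]  # length 5

print(cosine_similarity(a, [6.0, 8.0]))    # 1.0   same direction, twice as long
print(cosine_similarity(a, [-4.0, 3.0]))   # 0.0   right angle
print(cosine_similarity(a, [-3.0, -4.0]))  # -1.0  opposite direction
```

The raw dot product of `a` with `[6.0, 8.0]` is 50, twice what `a` scores with itself, yet the cosine similarity is the same 1.0: only the direction counts.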

Who built this

It started as a way to do physics with geometry.

Portrait: W.R. Hamilton, ~1843

Chasing a way to multiply directions in 3D, William Rowan Hamilton invented quaternions — carving the founding equation into a Dublin bridge in a flash of insight. Out of his system, and Hermann Grassmann's parallel work, the modern vector was distilled.

Portrait: J.W. Gibbs, ~1881

It was Josiah Willard Gibbs and Oliver Heaviside who, in the 1880s, trimmed all that into the clean “dot and cross” we use now. The dot product began life measuring physical work — force along distance — and only later became the meaning-meter inside a neural net.

Where you'll meet it again

This is the engine of attention.

Every large language model runs on dot products, billions per word.

Attention

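A transformer decides how much each word should “listen” to every other word by taking the dot product of two arrows it learns to derive from them: a “query” arrow for the listening word and a “key” arrow for the word being listened to. High agreement = pay attention. It is the literal core of how ChatGPT reads a sentence.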
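A rough sketch of that scoring step, in plain Python. It follows the standard scaled dot-product recipe (divide each dot product by the square root of the arrow size, then apply softmax so the weights sum to 1). The 2D arrows below are hand-picked toys, not values from any real model, and the name `attention_weights` is an assumption for illustration.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """How much one word 'listens' to each of the others."""
    d = len(query)
    scores = [dot(query, k) / math.sqrt(d) for k in keys]
    return softmax(scores)


# Hypothetical toy arrows: one word's query and three other words' keys.
query = [1.0, 0.0]
keys = {
    "aligned": [2.0, 0.0],
    "perpendicular": [0.0, 2.0],
    "opposed": [-2.0, 0.0],
}

weights = attention_weights(query, list(keys.values()))
for name, w in zip(keys, weights):
    print(f"{name}: {w:.3f}")
```

The aligned key gets the largest weight and the opposed key the smallest, which is the "high agreement = pay attention" rule in numbers.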

Embeddings & search

Words, images, and whole documents become arrows. Semantic search and recommendations just hunt for the arrows that point most like yours.
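The "hunt" can be sketched as scoring every stored arrow against the query and sorting. The 3-number arrows and document titles below are invented for illustration; real embeddings come from a trained model and have hundreds or thousands of numbers, and real systems use faster indexes than a full sort.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def search(query, index, top_k=2):
    """Return the top_k titles whose arrows point most like the query."""
    ranked = sorted(
        index,
        key=lambda title: cosine_similarity(query, index[title]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical hand-made "embeddings" (not from a real model).
index = {
    "cat care guide": [0.9, 0.1, 0.0],
    "dog training tips": [0.7, 0.3, 0.1],
    "tax filing deadlines": [0.0, 0.1, 0.9],
}
query = [1.0, 0.2, 0.0]  # an arrow pointing mostly along the "pets" direction

print(search(query, index))  # ['cat care guide', 'dog training tips']
```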

In the real world

Physics (work and energy), 3D graphics and lighting, GPS, navigation, and the forces on every bridge and aircraft — all spoken in vectors.

Now the symbols can't scare you

Two ways to write the same idea.


a · b = |a||b| cos θ = a₁b₁ + a₂b₂ + …

The middle form connects to Chapter 1 — there's the cosine again, measuring alignment. The right form is how a computer actually does it: multiply matching numbers and add them up. Same answer, two faces. You've now met the operation a GPU performs more than almost any other.
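A quick numerical check that the two faces agree, in plain Python. The arrows are arbitrary examples, and the angle is measured separately from each arrow's heading (via `math.atan2`) so the comparison is not circular.

```python
import math

a = [3.0, 4.0]
b = [4.0, 3.0]

# Right-hand form: multiply matching numbers and add them up.
algebraic = sum(x * y for x, y in zip(a, b))

# Middle form: lengths times the cosine of the angle between the arrows.
len_a = math.hypot(a[0], a[1])
len_b = math.hypot(b[0], b[1])
theta = math.atan2(b[1], b[0]) - math.atan2(a[1], a[0])
geometric = len_a * len_b * math.cos(theta)

print(algebraic)  # 24.0
print(math.isclose(algebraic, geometric))  # True
```

The geometric value can differ from 24.0 in the last decimal places because of floating-point rounding, which is why the comparison uses `math.isclose` rather than `==`.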