How meaning becomes an arrow
A machine has never seen a sunset or heard the word “king.” So how can it tell that king is to queen as man is to woman? It turns every idea into an arrow — and measures meaning by how those arrows line up.
“Walk 5 miles” is useless without a direction. Wind has speed and heading. A nudge in space has an amount and a way it points. The world is full of quantities that need more than one number to pin down — and a plain number can't hold them.
A vector is the fix: a little bundle of numbers we picture as an arrow, with a direction and a length. And here's the leap that powers modern AI — meaning behaves the same way. “Royal,” “feminine,” “old” become directions, and every word becomes an arrow in a space of meaning.
Add them tip to tail: walk one, then the other, and the green arrow is where you end up.
Scale one by a number: same direction, longer (or, with a minus sign, flipped).
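As a rough sketch of those two moves, here is how adding and scaling look once an arrow is written as a list of numbers. The helper names `add` and `scale` are made up for this illustration, and the "meaning" arrows at the end use invented axes (royal, feminine) rather than values from any real model.

```python
def add(u, v):
    """Tip-to-tail addition: add the matching components."""
    return [a + b for a, b in zip(u, v)]

def scale(k, v):
    """Stretch an arrow by k; a negative k flips it around."""
    return [k * a for a in v]

a = [3.0, 1.0]
b = [1.0, 2.0]

total = add(a, b)        # walk a, then b
doubled = scale(2, a)    # same direction, twice as long
flipped = scale(-1, a)   # same length, opposite direction

# Toy word arrows on two hypothetical axes: [royal, feminine].
# These numbers are invented for illustration only.
king = [1.0, 0.0]
man = [0.0, 0.0]
woman = [0.0, 1.0]

# "king" minus "man" plus "woman" lands on royal + feminine.
queen_guess = add(add(king, scale(-1, man)), woman)
print(total, doubled, flipped, queen_guess)
```

Real word arrows have hundreds of components, and the analogy only holds approximately there, but the arithmetic is the same component-by-component bookkeeping.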
The dot product takes two arrows and returns a single number measuring their agreement. Pointing the same way → a positive number, largest when they line up exactly. At right angles → exactly zero, total strangers. Pointing in opposing directions → negative. Picture the shadow one arrow casts along the other: the length of that shadow, times the length of the arrow it falls on, is the dot product.
The green bar is b's shadow on a. When b stands at a right angle to a, the shadow vanishes and the dot product is zero.
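The shadow picture can be checked with a few lines of arithmetic. This is a minimal sketch with hand-picked 2D arrows; `dot`, `length`, and `shadow` are illustrative helper names, not functions from a particular library.

```python
import math

def dot(u, v):
    """Multiply matching components and add them up."""
    return sum(a * b for a, b in zip(u, v))

def length(v):
    return math.sqrt(dot(v, v))

def shadow(b, a):
    """Signed length of b's shadow along a (the scalar projection)."""
    return dot(a, b) / length(a)

a = [4.0, 0.0]

same_way = dot(a, [2.0, 0.0])      # positive: pointing the same way
right_angle = dot(a, [0.0, 3.0])   # zero: at right angles
opposing = dot(a, [-2.0, 1.0])     # negative: pointing against a

# Shadow of b on a, times the length of a, gives the dot product back.
b = [3.0, 3.0]
rebuilt = shadow(b, a) * length(a)
print(same_way, right_angle, opposing, rebuilt, dot(a, b))
```

With `a` lying flat along the horizontal axis, the shadow of `b` is just its horizontal component, which is why an upright `b` casts no shadow at all.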
That's the whole reason it rules machine learning. Turn two things into arrows, take their dot product, and you get a score for how alike they are. Two documents about the same topic? Their arrows align, big dot product. A search query and the perfect result? Aligned. A cat photo and the word “cat”? Aligned. Divide out both lengths and you get cosine similarity: pure agreement of direction, from −1 to +1.
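A small sketch of cosine similarity, assuming plain Python lists stand in for the arrows. The function name `cosine_similarity` and the sample vectors are chosen for illustration; the formula is undefined for a zero-length arrow, so the sketch rejects that case.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine_similarity(u, v):
    """Dot product with both lengths divided out: direction only."""
    len_u = math.sqrt(dot(u, u))
    len_v = math.sqrt(dot(v, v))
    if len_u == 0 or len_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(u, v) / (len_u * len_v)

short = [1.0, 2.0]
long_same_direction = [10.0, 20.0]   # ten times longer, same heading
opposite = [-1.0, -2.0]

aligned = cosine_similarity(short, long_same_direction)   # close to +1
reversed_score = cosine_similarity(short, opposite)       # close to -1
unrelated = cosine_similarity([1.0, 0.0], [0.0, 5.0])     # right angle: 0
print(aligned, reversed_score, unrelated)
```

Length drops out entirely: the short arrow and the one ten times longer score as a near-perfect match because they share a direction.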
Chasing a way to multiply directions in 3D, William Rowan Hamilton invented quaternions — carving the founding equation into a Dublin bridge in a flash of insight. Out of his system, and Hermann Grassmann's parallel work, the modern vector was distilled.
It was Josiah Willard Gibbs and Oliver Heaviside who, in the 1880s, trimmed all that into the clean “dot and cross” we use now. The dot product began life measuring physical work — force along distance — and only later became the meaning-meter inside a neural net.
Every large language model runs on dot products, billions per word.
A transformer decides how much each word should “listen” to every other word by taking the dot product of their arrows. High agreement = pay attention. It is the literal core of how ChatGPT reads a sentence.
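To make the "listening" idea concrete, here is a heavily simplified sketch of attention scoring. It assumes each word already has an arrow, scores one word (the query) against the others (the keys) by dot product, scales by the square root of the dimension as in scaled dot-product attention, and turns the scores into weights with a softmax. Real transformers first pass each word's arrow through learned projections and run many such heads in parallel; the vectors and function names below are invented for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """Score the query against each key by dot product, then normalize."""
    d = len(query)
    scores = [dot(query, k) / math.sqrt(d) for k in keys]
    return softmax(scores)

# Hypothetical 2D arrows: one aligned with the query, one at a right
# angle to it, one pointing the opposite way.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

weights = attention_weights(query, keys)
print(weights)
```

The aligned key receives the largest weight and the opposing key the smallest, which is the "high agreement = pay attention" rule in miniature.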
Words, images, and whole documents become arrows. Semantic search and recommendations just hunt for the arrows that point most like yours.
Physics (work and energy), 3D graphics and lighting, GPS, navigation, and the forces on every bridge and aircraft — all spoken in vectors.
The middle form, |a||b| cos θ, connects to Chapter 1: there's the cosine again, measuring alignment. The right form, a₁b₁ + a₂b₂ + …, is how a computer actually does it: multiply matching numbers and add them up. Same answer, two faces. You've now met the operation a GPU performs more than almost any other.
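The claim that both forms give the same answer can be checked numerically. In this sketch the angle between the arrows is measured independently with `math.atan2`, so the geometric form (lengths times cosine) does not secretly reuse the component form (multiply and add). The example arrows are arbitrary.

```python
import math

a = [3.0, 4.0]
b = [4.0, 3.0]

# Component form: multiply matching numbers and add them up.
by_components = sum(x * y for x, y in zip(a, b))

# Geometric form: length of a, times length of b, times cos(angle between).
len_a = math.hypot(a[0], a[1])
len_b = math.hypot(b[0], b[1])
angle = math.atan2(a[1], a[0]) - math.atan2(b[1], b[0])
by_geometry = len_a * len_b * math.cos(angle)

print(by_components, by_geometry)
```

The two results agree up to floating-point rounding. The component form needs no angles or square roots, which is part of why it is the one hardware actually computes.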