One-line definition
A matrix $A \in \mathbb{R}^{m \times n}$ represents a linear map $T : \mathbb{R}^n \to \mathbb{R}^m$ defined by $T(x) = Ax$. Composition of linear maps corresponds to matrix multiplication; the columns of $A$ are the images of the standard basis vectors.
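A quick numerical check of the column picture, using NumPy (the matrix entries here are arbitrary illustration data):

```python
import numpy as np

# A 2x3 matrix maps R^3 -> R^2; its columns are the images
# of the standard basis vectors e1, e2, e3.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
e1 = np.array([1.0, 0.0, 0.0])

# Applying A to e1 picks out the first column of A.
assert np.allclose(A @ e1, A[:, 0])
```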
Why it matters
Every layer in a neural network is a linear map (followed by a non-linearity). Every embedding lookup, every attention score, every gradient in the backward pass is a matrix multiplication. Understanding what a matrix does geometrically, rather than just how to compute with it, is the foundation for reasoning about model capacity, conditioning, and gradient flow.
The geometry
For $A \in \mathbb{R}^{m \times n}$:
- Columns of $A$ = images of the standard basis vectors $e_1, \dots, e_n$. Their span is the column space (the range of the map).
- Rows of $A$ = linear functionals on $\mathbb{R}^n$; their span is the row space.
- Null space = $\{x \in \mathbb{R}^n : Ax = 0\}$: the directions the map collapses to zero.
- Rank = dimension of the column space = dimension of the row space.
If $A$ is square and invertible, $x \mapsto Ax$ is a bijection: it stretches, rotates, and reflects without losing information. If $\operatorname{rank}(A) < n$, $A$ collapses dimensions.
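A small NumPy sketch of rank and null space, using a deliberately rank-deficient matrix (the specific entries are illustration data):

```python
import numpy as np

# A rank-1 2x2 matrix: the second column is twice the first,
# so the map collapses R^2 onto a 1-dimensional line.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

print(np.linalg.matrix_rank(A))  # rank = 1

# A direction in the null space: A @ [2, -1] = 0, so the
# whole line through [2, -1] is collapsed to the origin.
x = np.array([2.0, -1.0])
print(A @ x)  # [0. 0.]
```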
Composition and multiplication
If $g(x) = Bx$ and $f(y) = Ay$, then $(f \circ g)(x) = A(Bx) = (AB)x$. Matrix multiplication is the composition of linear maps. This is why multiplication is associative ($(AB)C = A(BC)$) but not commutative ($AB \neq BA$ in general: the order in which the maps are applied matters).
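These identities can be verified numerically; here is a sketch with random matrices (shapes chosen arbitrarily for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 4))
x = rng.standard_normal(4)

# Applying B then A equals applying the single map AB.
assert np.allclose(A @ (B @ x), (A @ B) @ x)

# Associative: (AB)C == A(BC).
C = rng.standard_normal((4, 5))
assert np.allclose((A @ B) @ C, A @ (B @ C))

# Not commutative: for random square matrices, PQ != QP
# almost surely.
P = rng.standard_normal((3, 3))
Q = rng.standard_normal((3, 3))
print(np.allclose(P @ Q, Q @ P))
```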
Special families
| Matrix | Geometric action |
|---|---|
| Orthogonal ($Q^\top Q = I$) | Rotation or reflection (preserves lengths and angles) |
| Diagonal | Independent scaling along each axis |
| Symmetric | Has real eigenvalues; orthogonal eigenvector basis |
| Positive definite | Symmetric + all eigenvalues > 0; defines an inner product |
| Permutation | Reorders coordinates |
| Projection ($P^2 = P$) | Maps onto a subspace, kills the orthogonal complement |
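Two rows of the table, checked numerically: a rotation (orthogonal) and a projection onto the x-axis (both matrices are standard textbook examples, not from the source):

```python
import numpy as np

# Orthogonal: a 90-degree rotation satisfies Q^T Q = I and
# preserves lengths.
theta = np.pi / 2
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
v = np.array([3.0, 4.0])
assert np.allclose(Q.T @ Q, np.eye(2))
assert np.isclose(np.linalg.norm(Q @ v), np.linalg.norm(v))

# Projection onto the x-axis: idempotent (P @ P == P) and
# kills the orthogonal complement (the y-axis).
P = np.array([[1.0, 0.0],
              [0.0, 0.0]])
assert np.allclose(P @ P, P)
assert np.allclose(P @ np.array([0.0, 7.0]), 0.0)
```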
Common pitfalls
- Treating matrix multiplication as element-wise. Use the Hadamard product ($\odot$) for element-wise multiplication; matrix multiplication is composition.
- Forgetting that shapes determine the map. $A \in \mathbb{R}^{m \times n}$ is a map $\mathbb{R}^n \to \mathbb{R}^m$, not the other way around.
- Confusing column space with row space. Both have dimension equal to the rank, but they live in different spaces ($\mathbb{R}^m$ vs. $\mathbb{R}^n$).
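The first two pitfalls can be made concrete in NumPy (matrix entries are arbitrary illustration data):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])

# Hadamard (element-wise) product vs. matrix multiplication:
# `*` multiplies entry by entry, `@` composes the maps.
hadamard = A * B   # [[ 5, 12], [21, 32]]
matmul   = A @ B   # [[19, 22], [43, 50]]
assert not np.allclose(hadamard, matmul)

# Shapes determine the map: a 2x3 matrix consumes length-3
# vectors and produces length-2 vectors, i.e. R^3 -> R^2.
M = np.zeros((2, 3))
assert (M @ np.zeros(3)).shape == (2,)
```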