One-line definition
A symmetric matrix A is positive semi-definite (PSD) if x^T A x ≥ 0 for all x, and positive definite (PD) if x^T A x > 0 for all x ≠ 0. Equivalently: all eigenvalues are non-negative (PSD) or strictly positive (PD).
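As a sanity check, the definition can be verified numerically on a small example (illustrative only; the matrix below is made up):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])  # eigenvalues 1 and 3, so A is PD

rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.standard_normal(2)
    assert x @ A @ x > 0     # x^T A x > 0 for every (generic) nonzero x

print(np.linalg.eigvalsh(A))  # → [1. 3.]
```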
Why it matters
PSD matrices are exactly the matrices that can serve as covariance matrices, kernel (Gram) matrices, inner-product weight matrices, and Hessians at local minima. The PSD cone is the natural domain for many optimization problems (semidefinite programming, Gaussian processes, kernel methods).
Equivalent characterizations
For symmetric A:
- x^T A x ≥ 0 for all x (definition).
- All eigenvalues λ_i ≥ 0.
- A = B^T B for some matrix B (factorization; e.g., Cholesky takes B = L^T with L lower triangular).
- A is the covariance matrix of some random vector.
- All principal minors (determinants of submatrices formed by keeping the same subset of rows and columns) are non-negative. Checking only the upper-left (leading) minors is not enough for PSD: diag(0, -1) has leading minors 0 and 0 but is not PSD.
For PD: same with strict inequalities everywhere (and x ≠ 0 in the definition). For minors, Sylvester's criterion applies, so strictly positive leading principal minors suffice.
The Cholesky factorization
Every PD matrix A has a unique decomposition A = L L^T with L lower triangular and positive diagonal. This is the standard way to:
- Solve A x = b when A is PD (about n^3/3 flops instead of 2n^3/3 for general LU).
- Sample from a Gaussian: if z ~ N(0, I) then μ + L z ~ N(μ, A).
- Compute Gaussian log-likelihoods: log det A = 2 Σ_i log L_ii.
PSD (not strictly PD) matrices admit Cholesky-like decompositions but with possible zero diagonal entries; use pivoted Cholesky or an LDL^T factorization.
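The three uses above can be sketched with NumPy (a minimal illustration under made-up data; the matrix A is PD by construction):

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
A = B @ B.T + 4.0 * np.eye(4)    # B B^T + 4I is PD by construction
mu = np.zeros(4)
b = rng.standard_normal(4)

L = np.linalg.cholesky(A)        # A = L L^T with L lower triangular

# 1. Solve A x = b via two triangular solves L y = b, then L^T x = y.
y = np.linalg.solve(L, b)        # (scipy.linalg.solve_triangular would
x = np.linalg.solve(L.T, y)      #  exploit the triangular structure)
assert np.allclose(A @ x, b)

# 2. Sample from N(mu, A): if z ~ N(0, I) then mu + L z ~ N(mu, A).
z = rng.standard_normal(4)
sample = mu + L @ z

# 3. log det A = 2 * sum(log(diag(L))): stable, no explicit determinant.
logdet = 2.0 * np.log(np.diag(L)).sum()
assert np.isclose(logdet, np.linalg.slogdet(A)[1])
```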
The PSD cone
The set of PSD matrices forms a convex cone: it is closed under sums and non-negative scaling. This is why semidefinite programming generalizes linear programming: LP optimizes a linear objective over the non-negative orthant, SDP over the PSD cone.
Operations preserving PSD:
- A + B is PSD if A and B are PSD.
- cA is PSD for c ≥ 0.
- B^T A B is PSD for any compatible B.
- The element-wise (Hadamard) product A ∘ B is PSD if A and B are (Schur product theorem).
Operations not preserving PSD:
- General matrix product: AB need not even be symmetric, and is PSD only if A and B commute.
- Inverse: PD matrices have PD inverses, but a PSD matrix with a zero eigenvalue is not invertible.
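These closure properties are easy to spot-check numerically (a sketch; `rand_psd` and `min_eig` are hypothetical helpers, and the tolerance absorbs rounding error in the eigenvalue computation):

```python
import numpy as np

rng = np.random.default_rng(2)

def rand_psd(n):
    """Illustrative helper: B^T B is PSD for any B."""
    B = rng.standard_normal((n, n))
    return B.T @ B

def min_eig(M):
    # Smallest eigenvalue of the symmetrized matrix; tiny negatives are rounding.
    return np.linalg.eigvalsh((M + M.T) / 2).min()

tol = 1e-10
A, B = rand_psd(3), rand_psd(3)
C = rng.standard_normal((3, 2))

assert min_eig(A + B) >= -tol        # sum of PSD matrices is PSD
assert min_eig(2.5 * A) >= -tol      # scaling by c >= 0 preserves PSD
assert min_eig(C.T @ A @ C) >= -tol  # congruence B^T A B preserves PSD
assert min_eig(A * B) >= -tol        # Hadamard product (Schur product theorem)

# The ordinary product A @ B is generally not even symmetric:
print(np.allclose(A @ B, (A @ B).T))
```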
Geometric intuition
For PD A, the set {x : x^T A x ≤ 1} is a closed ellipsoid centered at the origin. Eigenvectors of A give the axes; eigenvalue λ_i gives semi-axis length 1/√λ_i. For a PSD matrix that is not PD, the set degenerates: it is unbounded along the null space of A, so it is no longer an ellipsoid.
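This can be checked numerically (a sketch with a made-up diagonal matrix, where the ellipsoid axes are the coordinate directions):

```python
import numpy as np

# For PD A, the ellipsoid {x : x^T A x <= 1} has semi-axis length
# 1/sqrt(lambda_i) along the i-th eigenvector of A.
A = np.diag([4.0, 1.0])            # eigenvalues 4 and 1
lam, V = np.linalg.eigh(A)         # ascending: lam = [1., 4.]
semi_axes = 1 / np.sqrt(lam)       # → [1.0, 0.5]

# Boundary points along each eigenvector satisfy x^T A x = 1 exactly:
for i in range(2):
    x = semi_axes[i] * V[:, i]
    assert np.isclose(x @ A @ x, 1.0)
```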
Common pitfalls
- Calling a non-symmetric matrix PSD. PSD is defined for symmetric matrices. For non-symmetric A, the relevant object is the symmetric part (A + A^T)/2.
- Trusting numerical eigenvalues at machine precision. A theoretically PSD covariance computed from data can have tiny negative eigenvalues from rounding. Use jitter (work with A + εI for small ε > 0) before Cholesky.
- Confusing PSD with diagonally dominant. Symmetric and diagonally dominant with positive diagonal implies PSD, but the converse is false.
- Inverting near-singular PSD matrices. Always check the smallest eigenvalue or condition number first; regularize if needed.
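The jitter fix can be sketched as follows (the rank-1 "covariance" and the scale of ε are illustrative choices, not a universal recipe):

```python
import numpy as np

# A rank-1 covariance (a perfectly correlated random vector): PSD but
# exactly singular, so plain Cholesky fails on it.
v = np.array([1.0, 2.0, 3.0])
S = np.outer(v, v)                       # eigenvalues 14, 0, 0

try:
    np.linalg.cholesky(S)
except np.linalg.LinAlgError:
    print("Cholesky failed on the singular covariance")

# Add jitter eps*I; scaling eps to the average diagonal is a common heuristic.
eps = 1e-8 * np.trace(S) / S.shape[0]
L = np.linalg.cholesky(S + eps * np.eye(S.shape[0]))   # now succeeds
assert np.allclose(L @ L.T, S + eps * np.eye(S.shape[0]))
```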