Positive (semi-)definite matrices

Matrices that define inner products and proper covariances. The geometry of PSD: ellipsoids, not arbitrary shapes.

One-line definition

A symmetric matrix $A \in \mathbb{R}^{n \times n}$ is positive semi-definite (PSD) if $x^\top A x \ge 0$ for all $x \in \mathbb{R}^n$, and positive definite (PD) if $x^\top A x > 0$ for all $x \ne 0$. Equivalently: all eigenvalues are non-negative (PSD) or strictly positive (PD).

Why it matters

PSD matrices are exactly the matrices that can serve as covariance matrices and kernel (Gram) matrices. They also appear as Hessians at local minima, and PD matrices are the weight matrices that define inner products. The PSD cone is the natural domain for many optimization problems (semidefinite programming, Gaussian processes, kernel methods).

Equivalent characterizations

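For symmetric $A$, the following are equivalent:

  • $x^\top A x \ge 0$ for all $x$ (definition).
  • All eigenvalues $\lambda_i \ge 0$.
  • $A = B^\top B$ for some matrix $B$ (factorization, e.g., Cholesky $A = LL^\top$ with $L$ lower triangular).
  • $A$ is the covariance matrix of some random vector.
  • All principal minors (determinants of the submatrices obtained by keeping the same subset of rows and columns) are non-negative. Checking only the leading (upper-left) blocks is not enough: $\begin{pmatrix} 0 & 0 \\ 0 & -1 \end{pmatrix}$ has leading minors $0, 0$ but is not PSD.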

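For PD: the same with strict inequalities, with the factor in the factorization having full column rank. For the minor test it then suffices to check that the leading (upper-left) principal minors are positive (Sylvester's criterion).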
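The eigenvalue and factorization characterizations translate directly into numerical checks. The following is a minimal sketch assuming NumPy; the helper names `is_psd` and `is_pd` and the tolerance `tol` are illustrative choices, not part of any library.

```python
import numpy as np

def is_psd(A, tol=1e-10):
    """True if A is symmetric and its smallest eigenvalue is >= -tol."""
    A = np.asarray(A, dtype=float)
    if not np.allclose(A, A.T):
        return False
    # eigvalsh assumes symmetry and returns real eigenvalues in ascending order
    return bool(np.linalg.eigvalsh(A)[0] >= -tol)

def is_pd(A):
    """True if A is symmetric and a Cholesky factorization succeeds."""
    A = np.asarray(A, dtype=float)
    if not np.allclose(A, A.T):
        return False
    try:
        np.linalg.cholesky(A)
        return True
    except np.linalg.LinAlgError:
        return False

B = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
gram = B.T @ B                                   # B^T B is always PSD (here also PD)
rank_one = np.outer([1.0, 1.0], [1.0, 1.0])      # PSD but singular
indefinite = np.array([[0.0, 0.0], [0.0, -1.0]]) # leading minors are 0, 0, yet not PSD

print(is_psd(gram), is_pd(gram))
print(is_psd(rank_one))
print(is_psd(indefinite))
```

The tolerance matters in practice: a singular PSD matrix can report a smallest eigenvalue slightly below zero purely from rounding.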

The Cholesky factorization

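Every PD matrix $A$ has a unique decomposition $A = LL^\top$ with $L$ lower triangular and positive diagonal. This is the standard way to:

  • Solve $Ax = b$ when $A$ is PD (about $n^3/3$ flops instead of $2n^3/3$ for general LU).
  • Sample from a Gaussian: if $z \sim \mathcal{N}(0, I)$ then $\mu + Lz \sim \mathcal{N}(\mu, \Sigma)$ with $\Sigma = LL^\top$.
  • Compute Gaussian log-likelihoods: $\log \det \Sigma = 2 \sum_i \log L_{ii}$.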

PSD (not strictly PD) matrices admit Cholesky-like decompositions but with possible zero diagonal entries; use pivoted Cholesky or LDL.
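The three uses above can be sketched with NumPy as follows. The covariance `Sigma`, mean `mu`, and the helpers `chol_solve` and `gaussian_logpdf` are hypothetical names made up for this illustration. For brevity the triangular systems are solved with the generic `np.linalg.solve`; a dedicated triangular solver (for example `scipy.linalg.solve_triangular`) would exploit the structure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3x3 PD covariance: B B^T is PSD, adding I makes it PD
B = rng.standard_normal((3, 3))
Sigma = B @ B.T + np.eye(3)
mu = np.array([1.0, -2.0, 0.5])

L = np.linalg.cholesky(Sigma)  # lower triangular, Sigma = L @ L.T

def chol_solve(L, b):
    """Solve (L L^T) x = b via L y = b, then L^T x = y."""
    y = np.linalg.solve(L, b)
    return np.linalg.solve(L.T, y)

def gaussian_logpdf(x, mu, L):
    """Log-density of N(mu, L L^T) at x, using only the Cholesky factor."""
    d = x.shape[0]
    r = np.linalg.solve(L, x - mu)             # r @ r = (x-mu)^T Sigma^{-1} (x-mu)
    logdet = 2.0 * np.sum(np.log(np.diag(L)))  # log det Sigma
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + r @ r)

# Sampling: columns of `samples` are draws from N(mu, Sigma)
z = rng.standard_normal((3, 10000))
samples = mu[:, None] + L @ z

b = np.array([1.0, 2.0, 3.0])
x = chol_solve(L, b)
print(np.allclose(Sigma @ x, b))
print(gaussian_logpdf(mu, mu, L))
```

The log-determinant never forms $\det \Sigma$ itself, which avoids overflow or underflow in high dimensions.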

The PSD cone

The set of PSD matrices forms a convex cone (closed under non-negative combinations). This is why semidefinite programming generalizes linear programming: it optimizes a linear objective over the PSD cone instead of the non-negative orthant.

Operations preserving PSD:

  • $A + B$ is PSD if $A, B$ are PSD.
  • $cA$ is PSD for $c \ge 0$.
  • $C^\top A C$ is PSD for any compatible $C$.
  • Element-wise (Hadamard) product $A \circ B$ (Schur product theorem).

Operations not preserving PSD:

  • General matrix product $AB$ (symmetric, and then PSD, only if $A$ and $B$ commute).
  • Inverse: PD matrices have PD inverses; PSD with zero eigenvalue is not invertible.
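A small numerical check of both lists, as a sketch assuming NumPy. The matrices `A`, `B`, `C` and the helper `min_eig` are hand-picked for illustration.

```python
import numpy as np

def min_eig(M):
    """Smallest eigenvalue of the symmetric part of M."""
    return np.linalg.eigvalsh((M + M.T) / 2.0)[0]

A = np.array([[1.0, 2.0], [2.0, 5.0]])            # PD: trace 6, det 1
B = np.array([[5.0, 0.0], [0.0, 1.0]])            # PD: diagonal with positive entries
C = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]])  # any compatible (2x3) matrix

tol = 1e-12
print(min_eig(A + B) >= -tol)        # sum stays PSD
print(min_eig(0.5 * A) >= -tol)      # non-negative scaling stays PSD
print(min_eig(C.T @ A @ C) >= -tol)  # congruence stays PSD (3x3, rank 2)
print(min_eig(A * B) >= -tol)        # Hadamard product stays PSD

P = A @ B                            # [[5, 2], [10, 5]]
print(np.allclose(P, P.T))           # not symmetric, since A and B do not commute
x = np.array([1.0, -1.0])
print(x @ P @ x)                     # negative quadratic form: 5 - 2 - 10 + 5 = -2
```

Note that `A * B` is the element-wise product in NumPy, while `A @ B` is the matrix product; mixing them up silently changes which of the two lists applies.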

Geometric intuition

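For PD $A$, the set $\{x : x^\top A x \le 1\}$ is a closed ellipsoid centered at the origin. Eigenvectors of $A$ give the axes; the eigenvalues $\lambda_i$ give the semi-axis lengths $1/\sqrt{\lambda_i}$. PSD matrices that are not PD give degenerate shapes: this set becomes unbounded along directions with $\lambda_i = 0$, while the covariance ellipsoid $\{A^{1/2} u : \|u\| \le 1\}$ is flat in those directions.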
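To make the ellipsoid picture concrete, here is a minimal NumPy sketch for a hand-picked 2x2 PD matrix (the matrix and variable names are illustrative). It takes the axes from the eigenvectors, uses semi-axis lengths $1/\sqrt{\lambda_i}$ for the set $\{x : x^\top A x \le 1\}$, and verifies that the resulting boundary points satisfy $x^\top A x = 1$.

```python
import numpy as np

A = np.array([[5.0, 3.0], [3.0, 5.0]])  # PD with eigenvalues 2 and 8

# eigh returns ascending eigenvalues and orthonormal eigenvectors as columns
lam, V = np.linalg.eigh(A)
semi_axes = 1.0 / np.sqrt(lam)          # longest axis belongs to the smallest eigenvalue

# Map the unit circle onto the ellipse boundary: x = V diag(semi_axes) u
theta = np.linspace(0.0, 2.0 * np.pi, 200)
U = np.stack([np.cos(theta), np.sin(theta)])   # shape (2, 200), unit vectors
X = V @ (semi_axes[:, None] * U)               # boundary points as columns

# Quadratic form x^T A x for every column of X
q = np.einsum("ij,ik,kj->j", X, A, X)

print(lam)
print(semi_axes)
print(np.allclose(q, 1.0))
```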

Common pitfalls

  • Calling a non-symmetric matrix PSD. PSD is defined for symmetric matrices. For asymmetric $M$, the relevant object is the symmetric part $(M + M^\top)/2$.
  • Trusting numerical eigenvalues at machine precision. A theoretically PSD covariance computed from data can have tiny negative eigenvalues from rounding. Use jitter ($A + \epsilon I$ for a small $\epsilon > 0$) before Cholesky.
  • Confusing PSD with diagonally dominant. Symmetric and diagonally dominant with positive diagonal $\Rightarrow$ PSD, but the converse is false.
  • Inverting near-singular PSD matrices. Always check the smallest eigenvalue or condition number first; regularize if needed.
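The jitter and conditioning advice can be sketched as below, assuming NumPy. The function `cholesky_with_jitter`, its parameters, and the tenfold growth schedule are hypothetical choices for illustration; real code should pick the jitter relative to the scale of the data.

```python
import numpy as np

def cholesky_with_jitter(A, initial_jitter=1e-10, max_tries=8):
    """Try Cholesky on A + eps*I, starting at eps = 0 and growing eps tenfold."""
    A = np.asarray(A, dtype=float)
    A = (A + A.T) / 2.0                 # symmetrize away rounding asymmetry
    identity = np.eye(A.shape[0])
    eps = 0.0
    for i in range(max_tries + 1):
        try:
            return np.linalg.cholesky(A + eps * identity), eps
        except np.linalg.LinAlgError:
            eps = initial_jitter * 10.0 ** i
    raise np.linalg.LinAlgError("not PSD within the jitter budget")

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 5))
S = X.T @ X                             # 5x5 Gram matrix of rank at most 3: PSD, singular

print(np.linalg.eigvalsh(S)[0])         # smallest eigenvalue, near zero, possibly negative
print(np.linalg.cond(S))                # huge condition number flags near-singularity

L, eps = cholesky_with_jitter(S)
print(eps)
print(np.allclose(L @ L.T, S + eps * np.eye(5)))
```

A genuinely indefinite matrix still fails: the jitter budget only absorbs rounding-level negativity, so the function raises instead of silently masking a modelling error.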