
PCA — Step-by-Step Visualization

medium · AIML · Dimensionality Reduction · PCA

Step through Principal Component Analysis: watch the data get centered, the covariance matrix computed, and the eigenvectors found, then see the data projected onto its highest-variance directions to reduce the number of dimensions.

Algorithm Pattern

Eigendecomposition of the Covariance Matrix

Key Idea

PCA finds the orthogonal directions (principal components) along which the data varies most. Keeping only the top few gives a lower-dimensional representation that retains as much variance as any linear projection of that size can.

Step-by-Step Approach

  1. Center the data by subtracting the mean from each feature.
  2. Compute the covariance matrix C = (X_centered.T @ X_centered) / (n−1), where X_centered has one row per sample and n is the number of samples.
  3. Find eigenvectors and eigenvalues of C — eigenvectors are the principal components.
  4. Sort eigenvectors by descending eigenvalue — PC1 captures the most variance.
  5. Project data onto the top-k eigenvectors: X_reduced = X_centered @ PC[:k].T, where each row of PC is one of the sorted eigenvectors.
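
The five steps above can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the function name pca, the synthetic data, and the choice of np.linalg.eigh (suitable because a covariance matrix is symmetric) are assumptions made for this example. PC is stored with one eigenvector per row so that the projection matches the formula in step 5.

```python
import numpy as np

def pca(X, k):
    """Hypothetical helper: reduce X (n_samples x n_features) to k dimensions."""
    # Step 1: center each feature by subtracting its mean.
    X_centered = X - X.mean(axis=0)

    # Step 2: covariance matrix with the (n - 1) denominator.
    n = X.shape[0]
    C = (X_centered.T @ X_centered) / (n - 1)

    # Step 3: eigendecomposition. eigh returns eigenvalues in ascending
    # order and eigenvectors as the COLUMNS of the second result.
    eigvals, eigvecs = np.linalg.eigh(C)

    # Step 4: sort by descending eigenvalue, then transpose so that
    # each ROW of PC is one principal component.
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    PC = eigvecs[:, order].T

    # Step 5: project onto the top-k components.
    X_reduced = X_centered @ PC[:k].T
    return X_reduced, PC, eigvals

# Synthetic 3-feature data (an assumption for illustration only).
rng = np.random.default_rng(0)
mixing = np.array([[3.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 0.0, 0.1]])
X = rng.normal(size=(200, 3)) @ mixing

X_reduced, PC, eigvals = pca(X, k=2)
print(X_reduced.shape)  # (200, 2)
```

The sample variance of each projected column equals the corresponding eigenvalue, which is a convenient sanity check on any implementation of these steps.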

Common Gotchas

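  • PCA is sensitive to scale: a feature measured in large units dominates the covariance matrix. Standardize features (zero mean, unit variance) first unless they already share comparable units.
  • Fraction of variance explained by PC_k = eigenvalue_k / sum(all eigenvalues).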
  • PCA is linear — it cannot capture non-linear structure (use t-SNE or UMAP for that).
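
The first two gotchas can be seen together in a short NumPy sketch. The helper explained_variance_ratio and the two synthetic features (a height in metres and an income in dollars, drawn independently) are hypothetical and chosen only to exaggerate the difference in scale.

```python
import numpy as np

def explained_variance_ratio(X):
    """Hypothetical helper: eigenvalue_k / sum(all eigenvalues), descending."""
    C = np.cov(X, rowvar=False)             # centers internally, divides by n - 1
    eigvals = np.linalg.eigvalsh(C)[::-1]   # eigvalsh is ascending, so reverse
    return eigvals / eigvals.sum()

# Two independent synthetic features on very different scales.
rng = np.random.default_rng(0)
height_m = rng.normal(1.7, 0.1, size=500)
income = rng.normal(50_000, 10_000, size=500)
X = np.column_stack([height_m, income])

# Without standardization, the large-scale feature dominates PC1.
ratio_raw = explained_variance_ratio(X)

# After standardizing to zero mean and unit variance, neither dominates.
X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
ratio_std = explained_variance_ratio(X_std)

print("raw:         ", ratio_raw)
print("standardized:", ratio_std)
```

On the raw data PC1 accounts for almost all of the variance simply because income is numerically larger. After standardization the two roughly uncorrelated features share the variance about evenly.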

Related Problems