pca-in-c/README.md
2025-05-30 23:48:57 +01:00

# Principal Component Analysis in C
The committed files are a fully functioning implementation of the Karhunen-Loève
transform (KLT) as defined on Wikipedia, verified using multiple languages:
Python, R and MATLAB. Note: libraries commonly label this algorithm
'covariance', 'eig' or similar to distinguish it from the SVD-based algorithm
and other variants. The algorithm aims to solve the following set of matrix
equations:
$$
\begin{aligned}
\mathbf{Y} &= \mathbb{KLT}\{\mathbf{X}\}, \\
\mathbf{C} &= \frac{1}{n-1}\mathbf{X}^{*}\mathbf{X}, \\
\mathbf{V}^{-1}\mathbf{C}\mathbf{V} &= \mathbf{D}, \\
\mathbf{W} &\subset \mathbf{V}, \\
\mathbf{T} &= \mathbf{X} \cdot \mathbf{W}.
\end{aligned}
$$
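
As a minimal sketch of the covariance step, the plain C below centres the columns of a small data matrix and then forms $\frac{1}{n-1}\mathbf{X}^{*}\mathbf{X}$ for real-valued data. It is illustrative only: it does not use GSL, the function names `centre_columns` and `covariance` are hypothetical, and the row-major layout (one sample per row) is an assumption rather than something taken from the committed source.

```c
#include <assert.h>
#include <stddef.h>

/* Subtract each column's mean from the row-major n-by-p matrix x, in place. */
void centre_columns(double *x, size_t n, size_t p)
{
    for (size_t j = 0; j < p; ++j) {
        double mean = 0.0;
        for (size_t i = 0; i < n; ++i)
            mean += x[i * p + j];
        mean /= (double)n;
        for (size_t i = 0; i < n; ++i)
            x[i * p + j] -= mean;
    }
}

/* Sample covariance c = x^T x / (n - 1) of an already centred,
 * row-major n-by-p matrix x; c receives the p-by-p result. */
void covariance(const double *x, size_t n, size_t p, double *c)
{
    assert(n > 1); /* the 1/(n-1) factor needs at least two samples */
    for (size_t a = 0; a < p; ++a) {
        for (size_t b = 0; b < p; ++b) {
            double sum = 0.0;
            for (size_t i = 0; i < n; ++i)
                sum += x[i * p + a] * x[i * p + b];
            c[a * p + b] = sum / (double)(n - 1);
        }
    }
}
```

Calling `centre_columns` and then `covariance` on the three samples `{1, 2}`, `{3, 6}`, `{5, 10}` gives a symmetric 2-by-2 matrix whose diagonal holds the per-column sample variances. A BLAS-backed library would normally replace the triple loop with a single matrix product.
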
The initial dataset $\mathbf{X}$ (assumed mean-centred, as the covariance
formula requires) is reduced in dimensionality by the matrix $\mathbf{W}$, whose
columns are a subset of the eigenvectors in $\mathbf{V}$, selected explicitly or
intuitively. Matrix $\mathbf{T}$ is the newly reduced data, with an explained
variance ratio $\lt 1.0$, though typically very high.
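
The explained variance ratio mentioned above can be sketched from the eigenvalues on the diagonal of $\mathbf{D}$: it is the sum of the eigenvalues whose eigenvectors are kept, divided by the sum of all eigenvalues. The plain C below is a hedged illustration, not the repository's code; `explained_variance_ratio` and `components_for_ratio` are hypothetical names, and the eigenvalues are assumed to be non-negative and already sorted in descending order.

```c
#include <assert.h>
#include <stddef.h>

/* Fraction of total variance retained by the first k of p eigenvalues.
 * eval is assumed sorted in descending order and non-negative. */
double explained_variance_ratio(const double *eval, size_t p, size_t k)
{
    assert(k <= p);
    double kept = 0.0, total = 0.0;
    for (size_t i = 0; i < p; ++i) {
        total += eval[i];
        if (i < k)
            kept += eval[i];
    }
    /* Guard against an all-zero spectrum. */
    return total > 0.0 ? kept / total : 0.0;
}

/* Smallest k whose retained variance ratio reaches target (in (0, 1]);
 * falls back to p if no smaller k reaches it. */
size_t components_for_ratio(const double *eval, size_t p, double target)
{
    for (size_t k = 1; k <= p; ++k)
        if (explained_variance_ratio(eval, p, k) >= target)
            return k;
    return p;
}
```

For eigenvalues `{4, 3, 1}`, keeping two components retains 7/8 of the variance, so a target of 0.8 selects two eigenvectors for $\mathbf{W}$. This is one way to make the "explicit" selection concrete; an intuitive choice would instead fix the number of components by hand.
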
The source code is built entirely on the GNU Scientific Library (GSL, version
2.7), to make use of its BLAS-optimised functionality.