Principal Component Analysis in C
The committed files are a fully functioning implementation of the Karhunen–Loève transform (KLT) as defined on Wikipedia, verified against Python, R and MATLAB. Note: libraries commonly label this algorithm 'covariance', 'eig' or similar to distinguish it from the SVD-based algorithm. The algorithm computes the following sequence of expressions:
\mathbf{Y} = \mathbb{KLT}\{\mathbf{X}\}, \\
\mathbf{C} = \frac{1}{n-1}\mathbf{X}^{*}\mathbf{X}, \\
\mathbf{V}^{-1}\mathbf{C}\mathbf{V} = \mathbf{D}, \\
\mathbf{W} \subset \mathbf{V}, \\
\mathbf{T} = \mathbf{B} \cdot \mathbf{W}.
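The initial dataset \mathbf{X} is reduced in dimensionality by the matrix
\mathbf{W}, a subset of the eigenvectors in \mathbf{V} that is selected
explicitly or intuitively. Following the Wikipedia notation, \mathbf{B} is the
mean-centred form of \mathbf{X} and \mathbf{D} is the diagonal matrix of
eigenvalues of \mathbf{C}. Matrix \mathbf{T} is the reduced data, with an
explained variance ratio \lt 1.0, though typically very high.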
For further information, or if the \LaTeX does not render properly or in full,
see Principal Component Analysis
The source code is built entirely on the GNU Scientific Library (GSL 2.7), to make use of its BLAS-optimised functionality.