pca-in-c/README.md

# Principal Component Analysis in C

The committed files are a fully functioning implementation of the Karhunen–Loève
transform (KLT) as defined on Wikipedia, with results verified in multiple
languages: Python, R and MATLAB. Note that libraries commonly call this algorithm
'covariance', 'eig' or similar to distinguish it from the SVD algorithm or
other variants. The algorithm evaluates the following sequence of matrix
equations:

$$
\mathbf{Y} = \mathbb{KLT}\{\mathbf{X}\}, \\
\mathbf{B} = \mathbf{X} - \mathbf{1}_{n}\bar{\mathbf{x}}^{T}, \\
\mathbf{C} = \frac{1}{n-1}\mathbf{B}^{*}\mathbf{B}, \\
\mathbf{V}^{-1}\mathbf{C}\mathbf{V} = \mathbf{D}, \\
\mathit{W} \subset \mathit{V}, \\
\mathbf{T} = \mathbf{B} \cdot \mathbf{W}.
$$

The initial dataset $\mathbf{X}$, with its column means removed to give
$\mathbf{B}$, is reduced in dimensionality by matrix $\mathit{W}$, whose columns
are a subset of the eigenvectors of $\mathbf{C}$, selected either explicitly or
by judgement. Matrix $\mathbf{T}$ is the reduced data; its explained variance
ratio (the fraction of the total variance it retains) is $\lt 1.0$, though
typically very high.
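As a rough illustration of how the eigenvector subset might be chosen, the sketch below computes the explained variance ratio from the eigenvalues on the diagonal of $\mathbf{D}$ and picks the smallest number of components that reaches a target ratio. The function names `explained_variance_ratio` and `components_for_variance` are hypothetical and are not taken from the committed sources; the eigenvalues are assumed to be non-negative and already sorted in descending order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fraction of the total variance retained by the
 * first L components. `eval` holds the p eigenvalues of C, assumed
 * sorted in descending order. */
double explained_variance_ratio(const double *eval, size_t p, size_t L)
{
    assert(p > 0 && L <= p);

    double total = 0.0, kept = 0.0;
    for (size_t i = 0; i < p; ++i) {
        total += eval[i];
        if (i < L)
            kept += eval[i];
    }
    return kept / total;
}

/* Hypothetical helper: smallest number of leading components whose
 * cumulative explained variance ratio reaches `target` (0 < target <= 1). */
size_t components_for_variance(const double *eval, size_t p, double target)
{
    assert(p > 0 && target > 0.0 && target <= 1.0);

    double total = 0.0;
    for (size_t i = 0; i < p; ++i)
        total += eval[i];

    double kept = 0.0;
    for (size_t i = 0; i < p; ++i) {
        kept += eval[i];
        if (kept / total >= target)
            return i + 1;
    }
    return p; /* rounding kept the sum just under target: keep everything */
}
```

With eigenvalues 4, 3 and 1, for instance, the first two components retain 7/8 of the total variance, so a target of 0.8 would select two components.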

For further information, or if the $\LaTeX$ does not render properly or in full,
see [Principal Component Analysis](https://en.wikipedia.org/wiki/Principal_component_analysis#Computation_using_the_covariance_method).

The source code is built entirely on the GNU Scientific Library (GSL, version
2.7) to make use of its BLAS-optimised routines.