# Principal Component Analysis in C
The committed files are a fully functioning implementation of the Karhunen–Loève
transform (KLT) as defined on Wikipedia, and have been verified against results
from Python, R and MATLAB. Note that libraries commonly label this algorithm
'covariance', 'eig' or similar, to distinguish it from the SVD algorithm or
other variants. The algorithm performs the following sequence of matrix
operations:
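$$
\begin{aligned}
\mathbf{Y} &= \mathbb{KLT}\{\mathbf{X}\}, \\
\mathbf{C} &= \frac{1}{n-1}\mathbf{B}^{*}\mathbf{B}, \\
\mathbf{V}^{-1}\mathbf{C}\mathbf{V} &= \mathbf{D}, \\
\mathbf{W} &\subset \mathbf{V}, \\
\mathbf{T} &= \mathbf{B} \cdot \mathbf{W}.
\end{aligned}
$$

Here $\mathbf{B}$ is $\mathbf{X}$ with the mean of each column subtracted, $n$
is the number of observations, $\mathbf{V}$ holds the eigenvectors of
$\mathbf{C}$ as columns, and $\mathbf{D}$ is the diagonal matrix of eigenvalues.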
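
As a rough illustration of the first steps listed above (mean-centring the data
into $\mathbf{B}$ and forming the sample covariance matrix $\mathbf{C}$), here is
a minimal, dependency-free sketch in plain C. It is not the repository's
GSL-based code: the function name `pca_covariance`, the row-major layout and the
error convention are assumptions made for this example only. Calling it with a
3-by-2 array `x` and buffers `b[6]` and `c[4]` fills `b` with the centred data
and `c` with the 2-by-2 covariance matrix.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from the repository.
 * x: n-by-p data matrix, row-major (n observations, p variables).
 * b: n-by-p output, x with each column mean removed.
 * c: p-by-p output, sample covariance C = B^T B / (n - 1).
 * Returns 0 on success, -1 if fewer than two observations are given. */
int pca_covariance(const double *x, size_t n, size_t p, double *b, double *c)
{
    if (n < 2 || p == 0)
        return -1;

    /* Centre each column: subtract the column mean from every entry. */
    for (size_t j = 0; j < p; ++j) {
        double mean = 0.0;
        for (size_t i = 0; i < n; ++i)
            mean += x[i * p + j];
        mean /= (double)n;
        for (size_t i = 0; i < n; ++i)
            b[i * p + j] = x[i * p + j] - mean;
    }

    /* C = B^T B / (n - 1); compute the upper triangle and mirror it. */
    for (size_t j = 0; j < p; ++j) {
        for (size_t k = j; k < p; ++k) {
            double sum = 0.0;
            for (size_t i = 0; i < n; ++i)
                sum += b[i * p + j] * b[i * p + k];
            c[j * p + k] = c[k * p + j] = sum / (double)(n - 1);
        }
    }
    return 0;
}
```
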
The initial dataset $\mathbf{X}$ is reduced in dimensionality by the matrix
$\mathbf{W}$, whose columns are a subset of the eigenvectors, chosen explicitly
or intuitively. Matrix $\mathbf{T}$ is the newly reduced data, with an explained
variance ratio $\lt 1.0$, though typically very high.
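
The selection and projection described above can be sketched as follows. This
assumes the eigenvalues and eigenvectors of $\mathbf{C}$ have already been
computed and sorted in descending order of eigenvalue (GSL offers
`gsl_eigen_symmv` and `gsl_eigen_symmv_sort` for that step). The helper names
`pca_explained_ratio` and `pca_project` and the row-major layout are
hypothetical and are not taken from the repository. For eigenvalues `{3.0, 1.0}`,
keeping only the first component retains a ratio of 0.75.

```c
#include <stddef.h>

/* Hypothetical helper: fraction of the total variance kept by the first k
 * eigenvalues. eigval holds all p eigenvalues of C in descending order and
 * is assumed to be non-negative. Returns -1.0 on invalid input. */
double pca_explained_ratio(const double *eigval, size_t p, size_t k)
{
    if (p == 0 || k > p)
        return -1.0;

    double kept = 0.0, total = 0.0;
    for (size_t i = 0; i < p; ++i) {
        total += eigval[i];
        if (i < k)
            kept += eigval[i];
    }
    return total > 0.0 ? kept / total : -1.0;
}

/* Hypothetical helper: T = B * W.
 * b: n-by-p centred data; w: p-by-k selected eigenvectors stored as columns;
 * t: n-by-k output. All matrices are row-major. */
void pca_project(const double *b, const double *w,
                 size_t n, size_t p, size_t k, double *t)
{
    for (size_t i = 0; i < n; ++i) {
        for (size_t c = 0; c < k; ++c) {
            double sum = 0.0;
            for (size_t j = 0; j < p; ++j)
                sum += b[i * p + j] * w[j * k + c];
            t[i * k + c] = sum;
        }
    }
}
```
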
For further information, or if the $\LaTeX$ does not render properly or in full,
see [Principal Component Analysis](https://en.wikipedia.org/wiki/Principal_component_analysis#Computation_using_the_covariance_method).
The source code is built entirely on the GNU Scientific Library (GSL) 2.7, to
utilise its BLAS-optimised functionality.