# Principal Component Analysis

Principal component analysis (PCA) is a mathematical procedure that uses an orthogonal linear transformation to convert a set of observations of possibly correlated variables into values of linearly uncorrelated variables called principal components.
In other words, PCA summarizes the variation in correlated variables as a set of uncorrelated components, each of which is a particular linear combination of the original variables. The objective is parsimony: to reduce dimensionality by extracting the smallest number of components that account for most of the variation in the original multivariate data, summarizing the data with little loss of information.

Let's define a data matrix $\mathbf{X}^{\rm T}$, where each column corresponds to one of the $p$ variables and each row corresponds to a different repetition (or measurement) of the experiment, so that $\mathbf{X}$ itself is a $p \times n$ matrix:

$$\mathbf{X}^{\rm T} = \begin{pmatrix} \mathbf{x}^{\rm T}_1 \\ \mathbf{x}^{\rm T}_2 \\ \vdots \\ \mathbf{x}^{\rm T}_n \end{pmatrix} = \begin{pmatrix} x_{11} & \cdots & x_{1p} \\ x_{21} & \cdots & x_{2p} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{np} \end{pmatrix}$$

Furthermore, assume that each variable has zero empirical mean, that is, the sample mean of each variable has been subtracted from the data set.

The PCA transformation $\mathbf{Y}$ that preserves dimensionality (that is, gives the same number of principal components as original variables) is then given by:

$$\mathbf{Y}^{\rm T} = \mathbf{X}^{\rm T}\mathbf{W}$$

Using the singular value decomposition (SVD) $\mathbf{X} = \mathbf{W}\mathbf{\Sigma}\mathbf{V}^{\rm T}$, we can express the PCA transform as

$$\mathbf{Y}^{\rm T} = (\mathbf{W}\mathbf{\Sigma}\mathbf{V}^{\rm T})^{\rm T}\mathbf{W} = \mathbf{V}\mathbf{\Sigma}^{\rm T}\mathbf{W}^{\rm T}\mathbf{W}$$

where

- $\mathbf{W}$ is the matrix of eigenvectors of $\mathbf{X} \mathbf{X}^{\rm T}$, which is proportional to the sample covariance matrix of the zero-mean data
- $\mathbf{V}$ is the matrix of eigenvectors of the matrix $\mathbf{X}^{\rm T} \mathbf{X}$
- $\mathbf{\Sigma}$ is a rectangular diagonal matrix with nonnegative real numbers (the singular values of $\mathbf{X}$) on the diagonal

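Since the eigenvectors in $\mathbf{W}$ are orthonormal, $\mathbf{W}^{\rm T}\mathbf{W} = \mathbf{I}$, so the PCA transformation $\mathbf{Y}$ reduces to: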

$$\mathbf{Y}^{\rm T} = \mathbf{V}\mathbf{\Sigma}^{\rm T}$$
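The two expressions for $\mathbf{Y}^{\rm T}$ can be checked numerically. The following is a minimal NumPy sketch, not a reference implementation: the random data set, the `mixing` matrix, and all variable names are assumptions made for illustration. The array `data` holds one observation per row, and `X = data.T` is the matrix whose SVD is $\mathbf{W}\mathbf{\Sigma}\mathbf{V}^{\rm T}$ in the formulas above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: n = 200 observations (rows) of p = 3 correlated variables (columns).
n, p = 200, 3
mixing = np.array([[2.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.5, 0.3, 0.2]])
data = rng.normal(size=(n, p)) @ mixing

# Subtract the sample mean of each variable so every column has zero mean.
data = data - data.mean(axis=0)

# X is the p x n matrix that is decomposed: X = W @ diag(s) @ Vt.
X = data.T
W, s, Vt = np.linalg.svd(X, full_matrices=False)

# Two equivalent ways to obtain the principal component scores Y^T.
scores_projection = X.T @ W   # Y^T = X^T W
scores_svd = Vt.T * s         # Y^T = V Sigma^T (reduced form: scale column j of V by s[j])

# The components are uncorrelated: their sample covariance matrix is diagonal.
score_cov = np.cov(scores_projection, rowvar=False)
```

Here `np.linalg.svd` returns the singular values in descending order, so the columns of `scores_projection` are ordered from largest to smallest variance, and `score_cov` has `s**2 / (n - 1)` on its diagonal.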

## Remarks

1. The number of principal components is less than or equal to the number of original variables.
2. PCA is a popular multivariate technique that is commonly used to reduce data with $p$ attributes to two or three dimensions.
3. This transformation is defined so that the first principal component has the largest possible variance (that is, accounts for as much of the variability in the data as possible). Each succeeding component, in turn, has the highest variance possible under the constraint that it be orthogonal to (i.e., uncorrelated with) the preceding components.
4. PCA is closely related to factor analysis. Factor analysis typically incorporates more domain-specific assumptions about the underlying structure and solves for the eigenvectors of a slightly different matrix.
5. PCA is the simplest of the true eigenvector-based multivariate analyses. Often, its operation can be thought of as revealing the internal structure of the data in a way that best explains the variance in the data.
6. PCA is sensitive to the scaling of the variables.
7. Principal component regression (PCR) is a two-stage procedure: it first reduces the predictor variables using principal component analysis, and then uses the reduced variables in an ordinary least squares (OLS) regression fit.
8. PCR is often used when the number of predictor variables is large, or when strong correlations exist among the predictor variables.
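9. Partial least squares (PLS) regression is an extension of the PCR method. Unlike PCR, which chooses its components without regard to the response variable, PLS takes the response into account when constructing the components.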
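The two-stage PCR procedure from the remarks can be sketched as follows. This is an illustrative NumPy example under stated assumptions: the synthetic predictors and response, the choice of `k = 2` retained components, and all variable names are hypothetical. The SVD is applied directly to the centered array with observations in rows, so the retained loading vectors are the first rows of `Vt`.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: n observations of p strongly correlated predictors,
# generated from two latent factors, plus a response driven by those factors.
n, p, k = 100, 5, 2
latent = rng.normal(size=(n, 2))
predictors = latent @ rng.normal(size=(2, p)) + 0.05 * rng.normal(size=(n, p))
response = latent @ np.array([1.5, -2.0]) + 0.1 * rng.normal(size=n)

# Stage 1: PCA on the centered predictors, keeping the first k components.
centered = predictors - predictors.mean(axis=0)
_, s, Vt = np.linalg.svd(centered, full_matrices=False)
components = Vt[:k].T            # p x k matrix of loading vectors
scores = centered @ components   # n x k matrix of reduced variables

# Stage 2: OLS regression of the response on the reduced variables (with intercept).
design = np.column_stack([np.ones(n), scores])
coef, *_ = np.linalg.lstsq(design, response, rcond=None)
fitted = design @ coef

# Express the fitted slopes in terms of the original (centered) predictors.
beta_original = components @ coef[1:]
```

Because the scores are mutually uncorrelated, the OLS fit in the second stage is well conditioned even when the original predictors are strongly correlated, which is the situation described in remark 8.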