# CollinearityTest - Test for the Presence of Multi-collinearity

Returns the p-value of the multi-collinearity test (i.e. whether one variable can be linearly predicted from the others with a non-trivial degree of accuracy).

## Syntax

X is the independent variables data matrix, such that each column represents one variable.

Mask is the boolean array to select a subset of the input variables in X. If missing, all variables in X are included.

Method is the statistic to compute (1 = Condition Number (default), 2 = VIF, 3 = Determinant, 4 = Eigenvalues).

| Method | Description |
|---|---|
| 1 | Condition Number (Kappa) |
| 2 | Variance Inflation Factor (VIF) |

Column Index is the index of the explanatory variable to examine (not required for the condition number).
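
To illustrate how X and Mask might interact with the missing-value handling, here is a minimal Python sketch. The helper name `select_inputs` is hypothetical, representing a missing value as NaN is an assumption, and so is the order of operations (column mask first, then row removal). It is not the function's actual implementation.

```python
import numpy as np

def select_inputs(X, mask=None):
    """Hypothetical helper: apply a column mask, then drop incomplete rows."""
    X = np.asarray(X, dtype=float)
    if mask is None:
        # A missing mask means all variables (columns) are included.
        mask = np.ones(X.shape[1], dtype=bool)
    mask = np.asarray(mask, dtype=bool)
    X = X[:, mask]
    # Keep only observations (rows) with no missing value (assumed to be NaN).
    complete = ~np.isnan(X).any(axis=1)
    return X[complete]

# Four observations of three variables, two of them incomplete.
X = np.array([[1.0, 2.0, 3.0],
              [4.0, np.nan, 6.0],
              [7.0, 8.0, np.nan],
              [10.0, 11.0, 12.0]])

print(select_inputs(X).shape)                       # (2, 3)
print(select_inputs(X, [True, False, True]).shape)  # (3, 2)
```

Masking out the second variable keeps one more observation in this example, because the missing value in that column no longer disqualifies its row.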

## Remarks

1. The sample data may include missing values.
2. Each column in the input matrix corresponds to a separate variable.
3. Each row in the input matrix corresponds to an observation.
4. Observations (i.e. rows) with missing values are removed.
5. In the variance inflation factor (VIF) method, a series of regression models is constructed, each with one variable as the dependent variable regressed on the remaining predictors.
6. $$\textrm{Tolerance}_i = 1-R_i^2$$

$$\textrm{VIF}_i =\frac{1}{\textrm{Tolerance}_i} = \frac{1}{1-R_i^2}$$
Where:
• $R_i^2$ is the coefficient of determination of a regression of explanatory variable $i$ on all the other explanatory variables.
7. As a rule of thumb, a tolerance below 0.20 (or, less strictly, 0.10), or equivalently a VIF of 5 (or 10) and above, indicates a multi-collinearity problem.
8. The condition number ($\kappa$) test is a standard measure of ill-conditioning in a matrix; a large value indicates that inverting the matrix is numerically unstable with finite-precision numbers (standard computer floats and doubles).
9. $$X = \begin{bmatrix} 1 & X_{11} & \cdots & X_{k1} \\ \vdots & \vdots & & \vdots \\ 1 & X_{1N} & \cdots & X_{kN} \end{bmatrix}$$

$$\kappa = \sqrt{\frac{\lambda_{max}}{\lambda_{min}}}$$
Where:
• $\lambda_{max}$ is the maximum eigenvalue of $X^\top X$.
• $\lambda_{min}$ is the minimum eigenvalue of $X^\top X$.
10. As a rule of thumb, a condition number ($\kappa$) greater than or equal to 30 indicates a severe multi-collinearity problem.
11. The CollinearityTest function is available starting with version 1.60 APACHE.
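
The VIF and condition-number remarks above can be sketched in Python with NumPy. This is an illustration under stated assumptions, not the function's actual implementation: the names `vif` and `condition_number` are hypothetical, the eigenvalues are taken from $X^\top X$ with an intercept column prepended as in the matrix $X$ above, and no column scaling is applied (some implementations scale columns to unit length first).

```python
import numpy as np

def vif(X, i):
    """VIF of column i: regress it on the other columns plus an intercept."""
    X = np.asarray(X, dtype=float)
    y = X[:, i]
    others = np.delete(X, i, axis=1)
    A = np.column_stack([np.ones(len(y)), others])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    tolerance = 1.0 - r2          # Tolerance_i = 1 - R_i^2
    return 1.0 / tolerance        # VIF_i = 1 / Tolerance_i

def condition_number(X):
    """kappa = sqrt(lambda_max / lambda_min) of A'A, A = [1, X] (assumed)."""
    X = np.asarray(X, dtype=float)
    A = np.column_stack([np.ones(X.shape[0]), X])
    eig = np.linalg.eigvalsh(A.T @ A)   # eigenvalues in ascending order
    return np.sqrt(eig[-1] / eig[0])

# Synthetic data: x3 is almost an exact linear combination of x1 and x2.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + x2 + 0.01 * rng.normal(size=200)

X_indep = np.column_stack([x1, x2])
X_collin = np.column_stack([x1, x2, x3])

print(vif(X_indep, 0), condition_number(X_indep))
print(vif(X_collin, 2), condition_number(X_collin))
```

On this synthetic data, the independent pair yields a VIF close to 1 and a small condition number, while adding the near-redundant third column pushes both statistics well past the rule-of-thumb thresholds.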

## References

• Farrar, Donald E. and Glauber, Robert R. (1967). "Multicollinearity in Regression Analysis: The Problem Revisited". The Review of Economics and Statistics 49(1):92-107.