CollinearityTest - Test for the Presence of Multi-collinearity

Returns the p-value of the multi-collinearity test (i.e. whether one variable can be linearly predicted from the others with a non-trivial degree of accuracy).

 

Syntax

CollinearityTest(X, Mask, Method, Column Index)

X is the independent variables data matrix, such that each column represents one variable.

Mask is the boolean array to select a subset of the input variables in X. If missing, all variables in X are included.

Method is the statistic to compute (1 = Condition Number (default), 2 = VIF, 3 = Determinant, 4 = Eigenvalues).

Method  Description
1       Condition Number (Kappa)
2       Variance Inflation Factor (VIF)
3       Determinant
4       Eigenvalues

Column Index is the index of the explanatory variable to examine (not required for the condition number method).

 

Remarks

  1. The sample data may include missing values.
  2. Each column in the input matrix corresponds to a separate variable.
  3. Each row in the input matrix corresponds to an observation.
  4. Observations (i.e. rows) with missing values are removed.
  5. In the variance inflation factor (VIF) method, a series of regression models is constructed, each regressing one variable on the remaining predictors.
  6. $$\textrm{Tolerance}_i = 1-R_i^2$$

    $$\textrm{VIF}_i =\frac{1}{\textrm{Tolerance}_i} = \frac{1}{1-R_i^2}$$
    Where:
    • $R_i^2$ is the coefficient of determination of a regression of explanatory variable $i$ on all the other explanatory variables.
  7. As a rule of thumb, a tolerance below 0.20 (a VIF above 5) or, under a more lenient cutoff, a tolerance below 0.10 (a VIF above 10) indicates a multi-collinearity problem.
  8. The condition number ($\kappa$) test is a standard measure of ill-conditioning in a matrix; a large value indicates that inverting the matrix is numerically unstable in finite-precision arithmetic (standard computer floats and doubles).
  9. $$ X = \begin{bmatrix} 1 & X_{11} & \cdots & X_{k1} \\ \vdots & \vdots & & \vdots \\ 1 & X_{1N} & \cdots & X_{kN} \end{bmatrix} $$

    $$\kappa = \sqrt{\frac{\lambda_{max}}{\lambda_{min}}}$$
    Where:
    • $\lambda_{max}$ is the maximum eigenvalue of $X^\top X$.
    • $\lambda_{min}$ is the minimum eigenvalue of $X^\top X$.
  10. As a rule of thumb, a condition number ($\kappa$) greater than or equal to 30 indicates a severe multi-collinearity problem.
  11. The CollinearityTest function is available starting with version 1.60 APACHE.
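
The remarks on missing-value removal, VIF, and the condition number can be illustrated with a short Python/NumPy sketch. It is not the add-in's implementation: the helper names (`drop_missing_rows`, `vif`, `condition_number`) are hypothetical, the eigenvalues are assumed to be those of $X^\top X$ with an intercept column prepended, and no column scaling is applied, which the actual function may or may not do.

```python
import numpy as np


def drop_missing_rows(X):
    """Listwise deletion: keep only observations (rows) without NaN."""
    X = np.asarray(X, dtype=float)
    return X[~np.isnan(X).any(axis=1)]


def vif(X, i):
    """VIF of column i: regress it on an intercept plus the other columns."""
    X = drop_missing_rows(X)
    y = X[:, i]
    others = np.delete(X, i, axis=1)
    A = np.column_stack([np.ones(len(y)), others])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
    tolerance = 1.0 - r2
    return 1.0 / tolerance


def condition_number(X):
    """kappa = sqrt(lambda_max / lambda_min) of A'A, where A = [1, X]."""
    X = drop_missing_rows(X)
    A = np.column_stack([np.ones(X.shape[0]), X])
    eig = np.linalg.eigvalsh(A.T @ A)  # eigenvalues in ascending order
    return float(np.sqrt(eig[-1] / eig[0]))


if __name__ == "__main__":
    # Made-up data: the second column nearly duplicates the first,
    # and the last row is dropped because it contains a missing value.
    X = np.array([
        [1.0, 1.1],
        [2.0, 1.9],
        [3.0, 3.2],
        [4.0, 3.9],
        [5.0, 5.1],
        [np.nan, 2.0],
    ])
    print("VIF of column 0:", vif(X, 0))
    print("Condition number:", condition_number(X))
```

Under these assumptions, the resulting values can be compared against the rule-of-thumb thresholds given in the remarks (a VIF of 5 or 10, a condition number of 30).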
