Returns the selected multicollinearity test statistic for a set of input variables (multicollinearity arises when one variable can be linearly predicted from the others with a non-trivial degree of accuracy).
Syntax
CollinearityTest(X, Mask, Method, Column Index)
- X
- is the independent variables data matrix, such that each column represents one variable.
- Mask
- is the boolean array to select a subset of the input variables in X. If missing, all variables in X are included.
- Method
- is the statistic to compute (1 = Condition Number (default), 2 = VIF, 3 = Determinant, 4 = Eigenvalues).

| Method | Description |
|--------|-------------|
| 1 | Condition Number (Kappa). |
| 2 | Variance Inflation Factor (VIF). |

- Column Index
- is a switch to designate the explanatory variable to examine (not required for condition number).
Remarks
- The sample data may include missing values.
- Each column in the input matrix corresponds to a separate variable.
- Each row in the input matrix corresponds to an observation.
- Observations (i.e., rows) with missing values are removed.
- In the variance inflation factor (VIF) method, a series of regression models is constructed, each regressing one variable (as the dependent variable) on the remaining predictors.
$$\textrm{Tolerance}_i = 1-R_i^2$$
$$\textrm{VIF}_i =\frac{1}{\textrm{Tolerance}_i} = \frac{1}{1-R_i^2}$$
Where:
- $R_i^2$ is the coefficient of determination of a regression of explanatory variable $i$ on all the other explanatory variables.
- A tolerance below 0.20 (or, under a looser cutoff, below 0.10), which corresponds to a VIF of 5 (or 10) and above, indicates a multicollinearity problem.
- The condition number ($\kappa$) test is a standard measure of ill-conditioning in a matrix. A large value indicates that inverting the matrix is numerically unstable in finite-precision arithmetic (standard computer floats and doubles).
$$ X = \begin{bmatrix} 1 & X_{11} & \cdots & X_{k1} \\ \vdots & \vdots & & \vdots \\ 1 & X_{1N} & \cdots & X_{kN} \end{bmatrix} $$
$$\kappa = \sqrt{\frac{\lambda_{max}}{\lambda_{min}}}$$
Where:
- $\lambda_{max}$ is the maximum eigenvalue.
- $\lambda_{min}$ is the minimum eigenvalue.
- As a rule of thumb, a condition number ($\kappa$) greater than or equal to 30 indicates a severe multi-collinearity problem.
- The CollinearityTest function is available starting with version 1.60 APACHE.
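
The VIF and condition number remarks above can be illustrated outside the spreadsheet with a short Python/NumPy sketch. This is not the CollinearityTest implementation. The function names (`drop_missing_rows`, `vif`, `condition_number`) are hypothetical, the column index is assumed to be 0-based, and the eigenvalues are assumed to be those of $X^TX$ computed on the unscaled matrix with a leading column of ones, as in the matrix shown above.

```python
import numpy as np


def drop_missing_rows(X):
    """Remove observations (rows) that contain any missing value (NaN)."""
    X = np.asarray(X, dtype=float)
    return X[~np.isnan(X).any(axis=1)]


def vif(X, column_index):
    """Variance inflation factor of one column (0-based index, an assumption).

    Regresses the chosen column on all other columns plus an intercept,
    then returns 1 / (1 - R^2).
    """
    X = drop_missing_rows(X)
    y = X[:, column_index]
    others = np.delete(X, column_index, axis=1)
    A = np.column_stack([np.ones(len(y)), others])  # intercept + predictors
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)    # ordinary least squares
    resid = y - A @ beta
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    r_squared = 1.0 - ss_res / ss_tot
    tolerance = 1.0 - r_squared
    return 1.0 / tolerance


def condition_number(X):
    """kappa = sqrt(lambda_max / lambda_min).

    Assumption: the eigenvalues are those of A^T A, where A is X with a
    leading column of ones and no column scaling.
    """
    X = drop_missing_rows(X)
    A = np.column_stack([np.ones(X.shape[0]), X])
    eigenvalues = np.linalg.eigvalsh(A.T @ A)       # symmetric matrix
    return float(np.sqrt(eigenvalues.max() / eigenvalues.min()))
```

For example, calling `vif(X, 2)` on a matrix whose third column is nearly the sum of the first two should return a value far above the 5 to 10 range quoted above, and `condition_number(X)` on the same matrix should exceed 30. Results may differ from CollinearityTest if the add-in standardizes the columns before computing the eigenvalues.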
Related Links
- Wikipedia - Multicollinearity.
- Wikipedia - Variance inflation factor.
- Wikipedia - Condition number.
- Wikipedia - Multiple regression.