Calculates the p-value and related statistics of the partial F-test (used for testing the inclusion/exclusion of variables).

## Syntax

**MLR_PRFTest**(**X**, **Y**, **Intercept**, **Mask1**, **Mask2**, **Return_type**, **Alpha**)

**X** is the independent (explanatory) variables data matrix, such that each column represents one variable.

**Y** is the response or dependent variable data array (a one-dimensional array of cells, i.e. one row or one column).

**Intercept** is the constant or intercept value to fix (e.g. zero). If missing, the intercept is not fixed and is computed normally.

**Mask1** is the Boolean array selecting the explanatory variables in the first (reduced) model. If missing, all variables in X are included.

**Mask2** is the Boolean array selecting the explanatory variables in the second (full) model. If missing, all variables in X are included.

**Return_type** is a switch to select the return output (1 = p-value (default), 2 = test statistic, 3 = critical value).

Method | Description |
---|---|
1 | P-value |
2 | Test statistic (i.e. the F-statistic) |
3 | Critical value |

**Alpha** is the statistical significance level of the test. If missing or omitted, an alpha value of 5% is assumed.

## Remarks

- The underlying model is described here.
- Model 1 must be a sub-model of Model 2. In other words, all variables included in Model 1 must be included in Model 2.
- The coefficient of determination (i.e. $R^2$) increases as variables are added to the regression model, but we often wish to test whether the improvement in $R^2$ from adding those variables is statistically significant.
- To do so, we developed an inclusion/exclusion test for those variables. First, let's start with a regression model with $K_1$ variables:

$$Y_t = \alpha + \beta_1 \times X_1 + \cdots + \beta_{K_1} \times X_{K_1}$$

Now, let's add a few more variables $\left(X_{K_1+1} \cdots X_{K_2}\right)$:

$$Y_t = \alpha + \beta_1 \times X_1 + \cdots + \beta_{K_1} \times X_{K_1} + \beta_{K_1+1} \times X_{K_1+1} + \cdots + \beta_{K_2} \times X_{K_2}$$

- The test of the hypothesis is as follows:

$$H_0 : \beta_{K_1+1} = \beta_{K_1+2} = \cdots = \beta_{K_2} = 0$$

$$H_1 : \exists \beta_{i} \neq 0, i \in \left[K_1+1 \cdots K_2\right]$$

- Using the change in the coefficient of determination (i.e. $R^2$) as we add the new variables, we can calculate the test statistic:

$$\mathrm{f}=\frac{(R^2_{f}-R^2_{r})/(K_2-K_1)}{(1-R^2_f)/(N-K_2-1)}\sim \mathrm{F}_{K_2-K_1,\,N-K_2-1}$$

Where:

- $R^2_f$ is the $R^2$ of the full model (with the added variables).
- $R^2_r$ is the $R^2$ of the reduced model (without the added variables).
- $K_1$ is the number of variables in the reduced model.
- $K_2$ is the number of variables in the full model.
- $N$ is the number of observations in the sample data.
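The statistic above can be computed directly from the two $R^2$ values. The following Python sketch illustrates the calculation; it is not MLR_PRFTest's implementation, and the helper function name and sample numbers are assumptions for illustration:

```python
# Sketch of the partial F-test from the R^2 of the reduced and full models.
# partial_f_test and the example figures below are hypothetical.
from scipy.stats import f as f_dist

def partial_f_test(r2_reduced, r2_full, k1, k2, n, alpha=0.05):
    """Return (f_stat, p_value, critical_value) for H0: added betas = 0."""
    df1 = k2 - k1        # number of added variables
    df2 = n - k2 - 1     # residual degrees of freedom of the full model
    f_stat = ((r2_full - r2_reduced) / df1) / ((1.0 - r2_full) / df2)
    p_value = f_dist.sf(f_stat, df1, df2)         # right-tail probability
    critical = f_dist.ppf(1.0 - alpha, df1, df2)  # critical value at alpha
    return f_stat, p_value, critical

# Example: adding 2 variables raises R^2 from 0.60 to 0.70 with N = 50
f_stat, p_value, critical = partial_f_test(0.60, 0.70, k1=3, k2=5, n=50)
```

Here the improvement is significant at the 5% level when `f_stat` exceeds `critical` (equivalently, when `p_value < alpha`), mirroring the three return types of the function.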

- The sample data may include missing values.
- Each column in the input matrix corresponds to a separate variable.
- Each row in the input matrix corresponds to an observation.
- Observations (i.e. rows) with missing values in X or Y are removed.
- The number of rows of the response variable (Y) must be equal to the number of rows of the explanatory variable (X).
- The MLR_PRFTest function is available starting with version 1.60 APACHE.
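The missing-value handling described above amounts to listwise deletion. A minimal Python sketch of that behavior, with made-up sample data (the arrays and values are assumptions, not part of the function's API):

```python
# Listwise deletion: drop any observation (row) with a missing value in X or Y.
import numpy as np

X = np.array([[1.0, 2.0],
              [np.nan, 3.0],   # missing value in X: row is removed
              [4.0, 5.0],
              [6.0, 7.0]])
Y = np.array([1.0, 2.0, np.nan, 4.0])  # missing value in Y: row is removed

assert X.shape[0] == Y.shape[0]  # rows of X and Y must match

# Keep only observations with no missing values in X or Y
mask = ~np.isnan(X).any(axis=1) & ~np.isnan(Y)
X_clean, Y_clean = X[mask], Y[mask]
```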

## Files Examples

