Returns a list of the selected variables after performing the stepwise regression.

## Syntax

**MLR_STEPWISE**(**X**, Mask, **Y**, Intercept, Method, Alpha)

**X** is the independent (explanatory) variables data matrix, such that each column represents one variable.
**Mask** is the boolean array to choose the explanatory variables in the model. If missing, all variables in X are included.
**Y** is the response or dependent variable data array (a one-dimensional array of cells, e.g., a row or a column).
**Intercept** is the constant or intercept value to fix (e.g., zero). If missing, the intercept is not fixed and is computed normally.
**Method** is a switch to select the variable inclusion/exclusion approach (1 = forward selection (default), 2 = backward elimination, 3 = bi-directional elimination).

| Value | Method |
|---|---|
| 1 | Forward selection (**default**). |
| 2 | Backward elimination. |
| 3 | Bi-directional elimination. |

**Alpha** is the statistical significance level of the inclusion/exclusion test (i.e., alpha). If missing or omitted, an alpha value of 5% is assumed.

## Remarks

- The underlying model is described here.
- Stepwise regression covers regression models in which the choice of predictive variables is carried out by an automatic procedure. The procedure takes the form of a sequence of F-tests for selecting or eliminating explanatory variables.
- The three main approaches are:
  - **Forward Selection**, which involves starting with no variables in the model, testing the addition of each variable using a chosen model comparison criterion, adding the variable (if any) that improves the model the most, and repeating this process until no additional variables improve the model.
  - **Backward Elimination**, which involves starting with all candidate variables, testing the deletion of each variable using a chosen model comparison criterion, deleting the variable (if any) that improves the model the most by being deleted, and repeating this process until no further improvement is possible.
  - **Bidirectional Elimination**, a combination of the above, which involves testing at each step for variables to be included or excluded.
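
The forward selection approach can be sketched in a few lines of Python. This is only an illustration of the general technique, not the MLR_STEPWISE implementation: the function names (`solve`, `rss`, `forward_select`) are hypothetical, the intercept is always estimated, and a fixed F-to-enter threshold (`f_to_enter`, an assumed parameter) stands in for the Alpha argument, since converting a significance level into a critical F value requires the F distribution.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def rss(X, y, cols):
    """Residual sum of squares of an OLS fit of y on an intercept plus columns `cols`."""
    design = [[1.0] + [row[c] for c in cols] for row in X]
    k = len(cols) + 1
    # Normal equations (X'X) beta = X'y; assumes the columns are not collinear.
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(design, y))


def forward_select(X, y, f_to_enter=4.0):
    """Return the column indices chosen by forward selection, in order of entry."""
    n = len(y)
    selected, remaining = [], list(range(len(X[0])))
    current = rss(X, y, selected)  # intercept-only model
    while remaining:
        df = n - len(selected) - 2  # residual d.o.f. after adding one variable
        if df <= 0:
            break
        best = None
        for c in remaining:
            trial = rss(X, y, selected + [c])
            # Partial F statistic for adding column c (one numerator d.o.f.).
            f = float("inf") if trial <= 1e-12 else (current - trial) / (trial / df)
            if best is None or f > best[0]:
                best = (f, c, trial)
        if best[0] < f_to_enter:
            break  # no remaining variable improves the model enough
        _, c, current = best
        selected.append(c)
        remaining.remove(c)
    return selected
```

Calling `forward_select(X, y)` with `X` as a list of observation rows and `y` as a list of responses returns zero-based column indices. Backward elimination follows the same pattern in reverse: start with all candidate columns and repeatedly drop the one with the smallest partial F statistic while it falls below a removal threshold.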

- One of the main issues with stepwise regression is that it searches a large space of possible models. Hence it is prone to overfitting the data.
- The initial values in the mask array define the set of variables that MLR_STEPWISE works with. In other words, variables that are not selected in the mask will not be considered during the regression.
- The sample data may include missing values.
- Each column in the input matrix corresponds to a separate variable.
- Each row in the input matrix corresponds to an observation.
- Observations (i.e., rows) with missing values in X or Y are removed.
- The number of rows of the response variable (Y) must be equal to the number of rows of the explanatory variables matrix (X).
- The MLR_STEPWISE function is available starting with version 1.60 APACHE.
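
The data-handling remarks above (mask, missing values, matching row counts) can be illustrated with a short Python sketch. The helper names `is_missing` and `prepare` are hypothetical and are not part of MLR_STEPWISE. The sketch assumes a missing value is represented as `None` or NaN, and it drops a row when any value in X or Y is missing; whether the actual function ignores missing values in masked-out columns is not stated here.

```python
import math


def is_missing(value):
    """Treat None and NaN as missing (an assumption made for this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def prepare(X, y, mask=None):
    """Apply the mask and drop observations with missing values.

    Returns (X_clean, y_clean, cols), where cols lists the zero-based
    indices of the candidate columns kept by the mask.
    """
    if len(X) != len(y):
        raise ValueError("X and Y must have the same number of rows")
    n_vars = len(X[0]) if X else 0
    if mask is None:
        mask = [True] * n_vars  # a missing mask includes every variable
    cols = [j for j, keep in enumerate(mask) if keep]
    X_clean, y_clean = [], []
    for row, yi in zip(X, y):
        # Remove the whole observation if anything in the row is missing.
        if is_missing(yi) or any(is_missing(v) for v in row):
            continue
        X_clean.append([row[j] for j in cols])
        y_clean.append(yi)
    return X_clean, y_clean, cols
```

Only the columns listed in `cols` would then be offered to the selection procedure as candidates.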
