# SLR_ANOVA - Simple Regression Analysis of Variance (ANOVA)

Calculates the analysis of variance (ANOVA) values for a simple linear regression model.

## Syntax

SLR_ANOVA(X, Y, Intercept, Return_type)

X is the independent (aka explanatory or predictor) variable data array (a one-dimensional array of cells, e.g. a row or a column).

Y is the response (aka dependent) variable data array (a one-dimensional array of cells, e.g. a row or a column).

Intercept is the constant (intercept) value to fix (e.g. zero). If missing, the intercept is not fixed and is estimated from the data.

Return_type is a switch to select the output (1 = SSR (default), 2 = SSE, 3 = SST, 4 = MSR, 5 = MSE, 6 = F-Stat, 7 = Significance F).

| Return_type | Description |
|---|---|
| 1 | SSR (sum of squares of the regression) |
| 2 | SSE (sum of squares of the residuals) |
| 3 | SST (sum of squares of the dependent variable) |
| 4 | MSR (mean squares of the regression) |
| 5 | MSE (mean squares error or residuals) |
| 6 | F-Stat (test score) |
| 7 | Significance F (P-value of the test) |

## Remarks

1. The underlying model is the simple linear regression (SLR) model:
2. $$\mathbf{y} = \alpha + \beta \times \mathbf{x}$$
3. The regression ANOVA table examines the following hypotheses:
$$\mathbf{H}_0: \beta = 0$$
$$\mathbf{H}_1: \beta \neq 0$$
4. In other words, the regression ANOVA tests whether the regression explains the variation in $\mathbf{y}$. Significance F is the probability of observing an F-Stat at least this large when $\beta = 0$, i.e. when any fit is due purely to chance.
5. SLR_ANOVA calculates the values in the ANOVA table as follows:
$$\mathbf{SST}=\sum_{i=1}^N \left(Y_i - \bar Y \right )^2$$
$$\mathbf{SSR}=\sum_{i=1}^N \left(\hat Y_i - \bar Y \right )^2$$
$$\mathbf{SSE}=\sum_{i=1}^N \left(Y_i - \hat Y_i \right )^2$$
Where:
• $N$ is the number of non-missing observations in the sample data.
• $\bar Y$ is the empirical sample average for the dependent variable.
• $\hat Y_i$ is the regression model estimate value for the i-th observation.
• $\mathbf{SST}$ is the total sum of squares for the dependent variable.
• $\mathbf{SSR}$ is the sum of squares of the regression, i.e. of the deviations of the estimates $\hat Y_i$ from $\bar Y$.
• $\mathbf{SSE}$ is the sum of squared errors (aka residuals, $\epsilon_i = Y_i - \hat Y_i$) of the regression.
• $\mathbf{SST} = \mathbf{SSR} + \mathbf{SSE}$.
AND
$$\mathbf{MSR} = \frac{\mathbf{SSR} }{1} = \mathbf{SSR}$$
$$\mathbf{MSE} = \frac{ \mathbf{SSE} }{N-2}$$
$$\textrm{F-Stat} = \frac{\mathbf{MSR} }{ \mathbf{MSE} }$$
Where:
• $\mathbf{MSR}$ is the mean squares of the regression. For SLR, $\mathbf{MSR} = \mathbf{SSR}$ because the regression has one degree of freedom.
• $\mathbf{MSE}$ is the mean squares of the residuals, with $N-2$ degrees of freedom.
• $\textrm{F-Stat}$ is the test score of the hypothesis.
• Under $\mathbf{H}_0$, $\textrm{F-Stat} \sim \mathbf{F}\left(1,N-2\right)$.
6. The sample data may include missing values.
7. Each row in the input matrix corresponds to an observation.
8. Observations (i.e. rows) with missing values in X or Y are removed.
9. The number of rows of the response variable (Y) must be equal to the number of rows of the explanatory variable (X).
10. The SLR_ANOVA function is available starting with version 1.60 APACHE.
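
As an illustration of remarks 5 to 9, the following Python sketch reproduces the ANOVA quantities for the case where the intercept is estimated (not fixed). It is not the add-in's implementation. The names `slr_anova` and `significance_f` are hypothetical, missing values are assumed to be encoded as `None` or NaN, and Significance F is approximated by numerical integration using the identity that an F(1, d) variate is the square of a Student-t(d) variate.

```python
import math


def significance_f(f_stat, dof, steps=2000):
    """Approximate P(F(1, dof) > f_stat) using F = T^2 with T ~ Student-t(dof).

    Assumption: Simpson's rule on the t density; adequate for illustration,
    but it loses relative precision when the p-value is extremely small.
    """
    if math.isinf(f_stat):
        return 0.0
    t = math.sqrt(f_stat)
    # Log of the Student-t normalizing constant (lgamma avoids overflow).
    log_c = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
             - 0.5 * math.log(dof * math.pi))

    def pdf(u):
        return math.exp(log_c - (dof + 1) / 2 * math.log1p(u * u / dof))

    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(t)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pdf(k * h)
    central = 2.0 * total * h / 3.0  # P(-t < T < t)
    return max(0.0, 1.0 - central)


def slr_anova(x, y):
    """ANOVA quantities for y = alpha + beta * x with an estimated intercept."""
    if len(x) != len(y):  # remark 9: X and Y must have the same number of rows
        raise ValueError("X and Y must have the same number of observations")
    # Remarks 6-8: drop observations with a missing value in X or Y.
    pairs = [(xi, yi) for xi, yi in zip(x, y)
             if xi is not None and yi is not None
             and not math.isnan(xi) and not math.isnan(yi)]
    n = len(pairs)
    if n < 3:
        raise ValueError("need at least 3 complete observations")

    x_bar = sum(xi for xi, _ in pairs) / n
    y_bar = sum(yi for _, yi in pairs) / n
    sxx = sum((xi - x_bar) ** 2 for xi, _ in pairs)
    if sxx == 0:
        raise ValueError("X has no variation")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in pairs)

    # Ordinary least-squares estimates and fitted values.
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    y_hat = [alpha + beta * xi for xi, _ in pairs]

    sst = sum((yi - y_bar) ** 2 for _, yi in pairs)
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
    sse = sum((yi - yh) ** 2 for (_, yi), yh in zip(pairs, y_hat))
    msr = ssr / 1          # one regression degree of freedom
    mse = sse / (n - 2)    # N - 2 residual degrees of freedom
    f_stat = msr / mse if mse > 0 else math.inf
    return {"N": n, "SSR": ssr, "SSE": sse, "SST": sst, "MSR": msr,
            "MSE": mse, "F-Stat": f_stat,
            "Significance F": significance_f(f_stat, n - 2)}


# Example: the fourth observation has a missing X and is dropped.
table = slr_anova([1, 2, 3, None, 4, 5], [2, 4, 5, 7, 4, 5])
print(table)  # F-Stat is about 4.5, Significance F about 0.124
```

The fixed-intercept case is deliberately left out, since the degrees of freedom and the decomposition used there are not stated above.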

## References

• Hamilton, J. D.; Time Series Analysis, Princeton University Press (1994), ISBN 0-691-04289-6
• Kenney, J. F. and Keeping, E. S. (1962) "Linear Regression and Correlation." Ch. 15 in Mathematics of Statistics, Pt. 1, 3rd ed. Princeton, NJ: Van Nostrand, pp. 252-285