Text Book Example - Airline Passenger Data

In this paper, we will analyze the international passenger data series (G), as mentioned in the textbook Time Series: Forecast and Control by Box, Jenkins, and Reinsel (ISBN: 978-0470272848). This textbook was first published in the late 1960s and is considered by many practitioners as the definite foundation textbook on the time series topic.

The international airline passenger series describes monthly totals of the international passengers for the period between Jan 1949 and Dec 1960.

The objective here is to follow the same analysis in the book and to demonstrate the accuracy of NumXL calculations. Furthermore, SAS – a leading statistical software vendor – demonstrate their own analysis for the same data set, so we strongly recommend our users to review their results as well, via this link:

SAS Procedural reference - Example 7.2 Seasonal Model for the Airline Series

This figure shows the Airline Passenger Data Plot

Step 1: Data Transformation

Using the Descriptive Statistic wizard (pictured below), examine the different summary stats of the sample data.

This figure shows the NumXL Desc Statistics Wizard

In the summary statistics table (pictured below), the data series exhibits serial correlation (i.e. failed the white noise test) and fat tails (significant excess kurtosis and ARCH effect).

This figure shows the Summary Statistics output for airline passenger data

The original analysis converts the data series using the natural logarithm function (i.e. LN). Follow the same technique, as shown in the graph below:

This figure shows the plot for the log airline passenger data

This technique should yield the following summary statistics:

This figure shows the summary stats for log airline passenger data
Notice that the transformed data series is smoother than the original data and the time trend appears more linear than the original.

Step 2: Correlogram Analysis

Using the NumXL toolbar, launch the Correlogram wizard.

This figure shows the NumXL Correlogram Wizard

Highlight the log data and select 24 lags for ACF and PACF. Then, create a correlogram for the data.

This figure shows the Correlogram output for log airline passenger data

Examining the ACF plot, the data appear to be integrated at lag one(1) and at lag twelve (12). Difference the data for both lags (i.e. $ \left(1-L\right)\left(1-L^{12}\right) $),as shown in the graph below:

This figure shows the Differenced airline passenger data

The differenced data set should generate the following correlogram:

This figure shows the Correlogram for differenced log airline passenger data

Also, notice that the ACF plot of the differenced data series shows a significant autocorrelation at lag one (1) and lag twelve (12).

Step 3: Airline Modeling

The proposed model for the log passenger data series is an airline model with a season length of 12 months.

$$\left(1-L\right)\left(1-l^{12}\right)\ln{X_t}=\mu \left(1+\theta L\right)\left(1+\Theta L^{12}\right)a_t$$

Where

  • L = the backshift operator (aka B).
  • $a_t$ = the error term, shock, innovation, or simply the model residual at time t.
  • $\mu$ = the mean of the seasonal differenced time series.

Using the NumXL toolbar, click on the Airline icon to launch the Airline model wizard.

This figure shows the Airline model wizard

This figure shows the Airline model initial values table

Step 4: Calibration

Select the cell at the top of the airline model table (i.e. “AIRLINE(12)”) and click on the Calibration icon in the toolbar.

This figure shows the Airline passenger data calibration

The Excel solver will try to determine the optimal values for the airline model’s parameters (i.e. $\theta,\Theta$ )

This figure shows the Excel solver results dialog

The new optimal values for the model parameters are shown below:

This figure shows the Calibrated values for Airline passenger data

Examining the residuals analysis table, the calibrated values satisfy all assumptions of the underlying model (i.e. Gaussian distributed residuals).

The parameter values of the calibrated model on the SAS website are slightly different from the ones we calculated earlier:

This figure shows the SAS Airline Model ParametersThis figure shows the Airline model goodness of fit

However, our values are within the error tolerance limits (i.e. ) and our Akaike’s information criterion (AIC) is better.

The core difference between NumXL and SAS values - we believe - can be explained by noting that we did not set the intercept ($\mu$ ) value to zero.

Step 5: Forecast

The residuals of the calibrated model satisfy the assumptions of an airline model. Now, we are ready to conduct a 24-month forecast for the monthly international airline passenger totals.

The forecast will follow two stages:

  • Forecasting for the log of the monthly totals
  • Transforming the forecast back into regular monthly totals

Select the cell labeled “AIRLINE(12)” and click on the Forecast icon in the toolbar.

This figure shows the Airline passenger forecast wizard

Please note that for forecast purposes, the input time series here refers to the latest 13 months, or the observations between Nov 1959 and Dec 1960. The output table is shown below:

This figure shows the Forecast table for log airline passenger monthly totals

This figure shows the Forecast for log airline passenger monthly totals

To convert back to regular monthly totals, use these equations:

$$UL=e^{UL_{log}}$$

$$LL=e^{LL_{log}}$$

$$\mu=e^{\mu_{log}+\frac{\sigma_{log}^2}{2}}$$

This figure shows the Monthly Total Airline passenger Forecast table

This figure shows the Monthly Total Airline passenger Forecast plot

  Attachments

Comments

Please sign in to leave a comment.

Was this article helpful?
0 out of 0 found this helpful