In this tutorial, we demonstrate the steps to construct a periodogram plot and identify the key frequencies (if any) embedded in a time series. We use the monthly residential power demand of the city of Jeddah in the Kingdom of Saudi Arabia (KSA) between 2003 and 2010. The whole tutorial uses only NumXL 1.64 (release name Turret) and Microsoft Excel functionality.
First, we took the natural logarithm of the monthly power demand to stabilize the variance. So, whenever we mention demand in this tutorial, we are referring to the logarithm of the demand.
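To illustrate why the logarithm helps, here is a minimal Python/NumPy sketch. It does not use the Jeddah data: the series, its growth rate, its seasonal swing, and the helper `yearly_range` are all hypothetical, chosen only to show a seasonal swing that grows with the level before the transform and stays constant after it.

```python
import numpy as np

# Hypothetical monthly demand (NOT the Jeddah data): a level that grows over
# time, with a seasonal swing proportional to the level.
months = np.arange(96)
level = 1000.0 * 1.01 ** months
seasonal = 1.0 + 0.3 * np.sin(2.0 * np.pi * months / 12.0)
demand = level * seasonal

# Natural logarithm, as applied to the demand series in this tutorial.
log_demand = np.log(demand)

def yearly_range(x):
    """Max minus min within each block of 12 consecutive months."""
    blocks = x.reshape(-1, 12)
    return blocks.max(axis=1) - blocks.min(axis=1)

raw_ranges = yearly_range(demand)      # grows from year to year
log_ranges = yearly_range(log_demand)  # the same in every year

print("raw first/last year range:", raw_ranges[0], raw_ranges[-1])
print("log first/last year range:", log_ranges[0], log_ranges[-1])
```

In Excel itself, the same transform is simply the built-in LN function applied to each demand cell.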
- Select an empty cell in your worksheet where you wish to generate your output periodogram table and plot. Next, using the NumXL Toolbar (or menu in Excel 2003), locate the “Periodogram” icon and click it.
- The Periodogram wizard (i.e. dialog box) appears. Note that the output range references the currently selected cell in your worksheet.
- Now, select the input data range in column ‘D’ of this worksheet.
Note: Once the input data is selected, the rest of the wizard options become enabled.
- Now, click on the “Options” tab.
- Before we can conduct the periodogram analysis, we need to remove any deterministic trend or stochastic drift (aka random walk). Under the data treatment setting, we can choose the method manually or leave it set to “Auto” to let the data drive the choice.
For the auto data treatment, the periodogram function internally uses the Augmented Dickey-Fuller (ADF) test to check for stationarity and, if needed, to select the best treatment (i.e., differencing or detrending).
Next, by default, the wizard generates a plot for the periodogram and displays the first 42 (half the size of the input data set) frequency components. Let’s leave all the settings unchanged.
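The two data treatments named for the Options tab, differencing and detrending, can be illustrated outside Excel. The following Python/NumPy sketch is not NumXL's implementation and does not reproduce its ADF-based selection. The two series, the random seed, and the helper names `detrend_linear` and `difference` are hypothetical, chosen only for illustration.

```python
import numpy as np

def detrend_linear(x):
    """Subtract a least-squares straight line (a deterministic trend)."""
    t = np.arange(len(x), dtype=float)
    slope, intercept = np.polyfit(t, x, 1)
    return x - (intercept + slope * t)

def difference(x):
    """First difference, a common treatment for a stochastic drift."""
    return np.diff(x)

# Hypothetical data: one series with a linear trend, one random walk.
rng = np.random.default_rng(0)
t = np.arange(84, dtype=float)
trend_series = 5.0 + 0.02 * t + 0.1 * rng.standard_normal(84)
walk_series = np.cumsum(0.1 * rng.standard_normal(84))

detrended = detrend_linear(trend_series)  # same length as the input
differenced = difference(walk_series)     # one observation shorter

print(len(detrended), len(differenced))
```

Note that differencing shortens the series by one observation, while detrending keeps its length.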
- Click on the “Missing Values” tab.
- The settings in this tab are very similar to the missing-values settings of other wizards in NumXL. In essence, you can instruct NumXL on how to handle missing values, if any are found.
By default, the “Don’t accept Missing Values” option is selected. Since our data set does not have any missing values, we will leave the default option unchanged.
- Click “OK” now.
The periodogram table and plot are now generated in your worksheet.
In the periodogram plot, one can easily observe a strong spike at point seven and, to a smaller extent, another spike at point 14. Let’s mark these two data points clearly in our plot by enlarging their corresponding markers and changing their background colors.
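For readers who want to see how such spikes arise, here is a minimal Python/NumPy sketch of a periodogram computed with the discrete Fourier transform. It is not NumXL's implementation: the scaling (squared DFT magnitude divided by the sample size) is one common convention, assumed here. The series is synthetic, with two sinusoids deliberately placed at component indices 7 and 14 to mimic the spikes described above, and the function name `periodogram` is hypothetical.

```python
import numpy as np

def periodogram(x):
    """Return (component indices k, power) for k = 1 .. N//2 via the DFT."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    spectrum = np.fft.rfft(x - x.mean())  # remove the mean (k = 0) first
    power = np.abs(spectrum) ** 2 / n     # assumed scaling convention
    k = np.arange(1, n // 2 + 1)
    return k, power[1 : n // 2 + 1]

# Synthetic series (NOT the Jeddah data): 84 observations, a strong sinusoid
# at component 7, a weaker one at component 14, plus a little noise.
n = 84
t = np.arange(n)
rng = np.random.default_rng(1)
x = (np.sin(2.0 * np.pi * 7 * t / n)
     + 0.4 * np.sin(2.0 * np.pi * 14 * t / n)
     + 0.2 * rng.standard_normal(n))

k, power = periodogram(x)

# Indices of the two largest periodogram values, strongest first.
top_two = k[np.argsort(power)[::-1][:2]]
print("strongest components:", top_two)
```

Under the DFT convention assumed in this sketch, component k completes k cycles over the N observations. The axis units of the NumXL plot are not confirmed here and should be read from the generated table.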
In conclusion, the periodogram plot shows that the residential monthly demand has a strong seasonality with a length of seven months. This may sound counter-intuitive, as we commonly believe in a natural seasonality of 12 months, but this finding can be attributed to the specific climate of Saudi Arabia.
Q1: What about the second spike in the periodogram?
A: The first spike has a much higher periodogram value (aka “power”) than the second, so, for practical purposes, we can focus on modeling this frequency first.
Q2: Is the seasonality deterministic or stochastic?
A: We don’t know yet. The periodogram does not distinguish between the two scenarios, so a separate test is in order. This matters because the nature of the seasonality affects later modeling decisions.
Q3: What is the ideal model for this time series?
Case 1: Assuming a stochastic seasonality, a SARIMA model (with a season length equal to seven) is hypothesized. Further analysis is needed to find the proper order of its seasonal and non-seasonal components.
Case 2: Assuming a deterministic seasonality, an ARMAX-type model may be appropriate. The deterministic seasonality is modeled with exogenous factors (i.e., sine and/or cosine inputs). Again, finding the proper order of the ARMA model component requires further analysis.
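As a rough illustration of Case 2, the Python/NumPy sketch below builds sine and cosine inputs for a given season length and fits them by ordinary least squares. It covers only the exogenous part of an ARMAX-type model; no ARMA component is fitted. The season length of seven is taken from the conclusion above, while the synthetic series, its amplitude and phase, and the helper `harmonic_regressors` are hypothetical.

```python
import numpy as np

def harmonic_regressors(n_obs, period):
    """Sine and cosine columns that complete one cycle every `period` steps."""
    t = np.arange(n_obs, dtype=float)
    angle = 2.0 * np.pi * t / period
    return np.column_stack([np.sin(angle), np.cos(angle)])

# Synthetic series (NOT the Jeddah data): a constant level plus one
# deterministic seasonal wave of amplitude 0.5 and a little noise.
n_obs, period = 84, 7.0
t = np.arange(n_obs, dtype=float)
rng = np.random.default_rng(2)
y = (2.0
     + 0.5 * np.sin(2.0 * np.pi * t / period + 0.8)
     + 0.05 * rng.standard_normal(n_obs))

# Design matrix: intercept plus the two exogenous seasonal inputs.
X = np.column_stack([np.ones(n_obs), harmonic_regressors(n_obs, period)])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# The sine and cosine coefficients together encode amplitude and phase.
amplitude = np.hypot(coef[1], coef[2])
residuals = y - X @ coef  # what an ARMA component would then have to explain

print("intercept:", coef[0], "seasonal amplitude:", amplitude)
```

Using a sine and cosine pair, rather than a single sine, lets the fit recover the phase of the seasonal wave without a nonlinear search.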