Augmented Dickey-Fuller Test in Python

When performing time series analysis, most statistical forecasting methods assume that the time series is approximately stationary. How can you determine whether a time series is stationary? The Augmented Dickey-Fuller (ADF) test is a well-known statistical test that can help answer that question. In this article I will show you how to perform the ADF test in Python.

Stationary vs. Non-Stationary

In a stationary time series, statistical properties such as mean and variance are constant over time. In a non-stationary series, these properties are dependent on time.

Here is an example of a Non-Stationary time series:

import matplotlib.pyplot as plt
import random
random.seed(5)

non_stationary_series = []
for i in range(0,100):
    non_stationary_series.append(random.random()+(i*.02))
    
plt.plot(non_stationary_series)
plt.title('Non-Stationary Time Series')
plt.show()

Non Stationary Time Series

There is clearly a trend in the above series of 100 randomly generated data points. Take a closer look at the code and you’ll spot the cause: the time index i is multiplied by 0.02 and added to each random value, so the mean of the series rises steadily over time.

In a Stationary time series, there is no visible trend. Here is an example of a Stationary time series.

import matplotlib.pyplot as plt
import random
random.seed(5)

stationary_series = []
for i in range(0,100):
    stationary_series.append(random.random())
    
plt.plot(stationary_series)
plt.title('Stationary Time Series')
plt.show()

Stationary Time Series
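
One rough way to see the difference in numbers, rather than by eye, is to compare the mean of the first half of each series with the mean of the second half. The sketch below rebuilds both series with the same seed. The helper names make_series and half_means are introduced here for illustration and are not part of the original code.

```python
import random
from statistics import mean


def make_series(trend_per_step, n=100, seed=5):
    """Rebuild the article's series: uniform noise plus an optional linear trend."""
    rng = random.Random(seed)
    return [rng.random() + i * trend_per_step for i in range(n)]


def half_means(series):
    """Return (mean of first half, mean of second half)."""
    mid = len(series) // 2
    return mean(series[:mid]), mean(series[mid:])


non_stationary_series = make_series(trend_per_step=0.02)
stationary_series = make_series(trend_per_step=0.0)

trend_first, trend_second = half_means(non_stationary_series)
flat_first, flat_second = half_means(stationary_series)

print("Trending series halves:", round(trend_first, 2), round(trend_second, 2))
print("Flat series halves:    ", round(flat_first, 2), round(flat_second, 2))
```

The two halves of the trending series should differ by roughly 1.0 (50 steps times 0.02), while the halves of the flat series should both sit near 0.5. This is only an informal check, not a substitute for a formal test.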

 

Augmented Dickey-Fuller (ADF) Test

To determine whether a time series is stationary, we will use the ADF test, which is a type of unit root test. A unit root is one cause of non-stationarity, and the ADF test checks whether one is present.

A time series is stationary if a shift in time doesn't change the series' statistical properties, in which case a unit root does not exist.

The null and alternative hypotheses of the Augmented Dickey-Fuller test are defined as follows:

  • The null hypothesis states that a unit root is present.
  • The alternative hypothesis states that there is no unit root. In other words, the series is stationary.
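
For reference, these hypotheses are usually stated in terms of the regression that the ADF test fits. The form below is the standard textbook one rather than something taken from this article's code, and the symbols (\alpha, \beta, \gamma, \delta_i, p) are notation introduced here.

```latex
\Delta y_t = \alpha + \beta t + \gamma\, y_{t-1}
           + \sum_{i=1}^{p} \delta_i\, \Delta y_{t-i} + \varepsilon_t

H_0:\ \gamma = 0 \quad \text{(unit root)} \qquad
H_1:\ \gamma < 0 \quad \text{(no unit root)}
```

Here p is the number of lagged differences included in the regression. The trend term \beta t is optional; the statsmodels adfuller function includes only the constant \alpha by default (regression='c').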

 

ADF Python Code

To implement the ADF test in Python, we will use the statsmodels implementation. Statsmodels is a Python module that provides functions and classes for the estimation of many statistical models. The function that performs the ADF test is called adfuller.

First, import the required dependencies. Import the statsmodels module and the adfuller function from the statsmodels.tsa.stattools namespace. We will also need pandas.

import statsmodels
from statsmodels.tsa.stattools import adfuller
import pandas as pd

We will create a class called StationarityTests to hold the ADF function. The class constructor accepts a significance level as a parameter, which defaults to 5%. The class also has an isStationary attribute that holds the result of the Augmented Dickey-Fuller test: True if the time series is stationary, False otherwise.

class StationarityTests:
    def __init__(self, significance=.05):
        self.SignificanceLevel = significance
        self.pValue = None
        self.isStationary = None

Lastly, we add the ADF implementation via a method called ADF_Stationarity_Test. This method takes a 1-D array as input, plus a flag that defaults to True and determines whether the full ADF results are printed. The Akaike Information Criterion (AIC) is used to select the number of lags.

The adfuller function returns a tuple of statistics from the ADF test: the Test Statistic, the P-Value, the Number of Lags Used, the Number of Observations used for the ADF regression, and a dictionary of Critical Values. When autolag is set, as it is here, the tuple ends with one more element: the best information criterion value found during lag selection.

If the P-Value is less than the Significance Level defined, we reject the Null Hypothesis that the time series contains a unit root. In other words, by rejecting the Null hypothesis, we can conclude that the time series is stationary.

If the P-Value is very close to your significance level, you can compare the Test Statistic against the Critical Values to help you reach a conclusion regarding the stationarity of your time series.

    def ADF_Stationarity_Test(self, timeseries, printResults = True):

        #Dickey-Fuller test:
        adfTest = adfuller(timeseries, autolag='AIC')
        
        self.pValue = adfTest[1]
        
        if (self.pValue<self.SignificanceLevel):
            self.isStationary = True
        else:
            self.isStationary = False
        
        if printResults:
            dfResults = pd.Series(adfTest[0:4], index=['ADF Test Statistic','P-Value','# Lags Used','# Observations Used'])

            #Add Critical Values
            for key,value in adfTest[4].items():
                dfResults['Critical Value (%s)'%key] = value

            print('Augmented Dickey-Fuller Test Results:')
            print(dfResults)

Testing Time Series for Stationarity

With our class now defined, it's easy to test a time series for stationarity with the Augmented Dickey-Fuller test. We will test the two series defined earlier in this article.

First, let’s see if our non_stationary_series is stationary. Instantiate the class and pass non_stationary_series to the ADF_Stationarity_Test method as shown below. Then print the result, which is held in the isStationary attribute of the class.

sTest = StationarityTests()
sTest.ADF_Stationarity_Test(non_stationary_series, printResults = True)
print("Is the time series stationary? {0}".format(sTest.isStationary))

ADF Non-Stationary Time Series Statistics

In this case, it is easy to see that the series is not stationary. The P-Value of 0.83 is greater than our 5% significance level, so we fail to reject the null hypothesis that a unit root exists.

Another way to interpret this test is to compare the ADF test statistic, which comes to -0.75, against the critical values. It is greater than the 5% critical value of -2.89 (or the critical value for whichever significance level you need), and therefore we fail to reject the null hypothesis.

 

Let’s run this same test but on our stationary series.

sTest = StationarityTests()
sTest.ADF_Stationarity_Test(stationary_series, printResults = True)
print("Is the time series stationary? {0}".format(sTest.isStationary))

ADF Test on a Stationary Time Series

As you would expect, the results show that the series is stationary. In this case, the P-Value from our ADF test is much smaller than our 5% significance level, so we can reject the null hypothesis in favor of the alternative hypothesis that the series is stationary.

Comparing the test statistic against the critical values yields the same conclusion. The test statistic comes to -10.458, which is much smaller than the 5% critical value of -2.89, so we have enough evidence to conclude that a unit root does not exist. In other words, the series is stationary.
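
The critical-value comparison used for both series can be written as a tiny helper. This is a sketch only: the function name rejects_unit_root is hypothetical, and the dictionary mimics the shape of the one adfuller returns, filled with just the 5% value quoted in this article.

```python
def rejects_unit_root(test_statistic, critical_values, level='5%'):
    """Return True when the ADF statistic is below the chosen critical value.

    The ADF test is left-tailed: more negative statistics are stronger
    evidence against a unit root.
    """
    return test_statistic < critical_values[level]


# Only the 5% critical value is quoted in the article, so only it is used here.
critical_values = {'5%': -2.89}

print(rejects_unit_root(-0.75, critical_values))    # trending series -> False
print(rejects_unit_root(-10.458, critical_values))  # flat series -> True
```

Because the comparison is one-sided, a statistic that is merely large in absolute value but positive would not count as evidence of stationarity.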

Conclusion

You have now learned how to test for stationarity using the Augmented Dickey-Fuller (ADF) test and how to interpret the result using either the P-Value or the Critical Values returned by the test. We created our own class that wraps the ADF test from the statsmodels Python package. Knowing this test will help you when building time series forecasting models, many of which rely on stationarity as a strong underlying assumption.

 

MJ

Advanced analytics professional currently practicing in the healthcare sector. Passionate about Machine Learning, Operations Research and Programming. Enjoys the outdoors and extreme sports.
