## Thursday, 8 December 2011

### Movement around the mean "Stationary" OR "Unit root"

The idea of modelling US GNP, and other macroeconomic time series, as a trend-stationary (TS) process was brought into question by Nelson and Plosser in their groundbreaking 1982 paper. That paper marked a paradigm shift in the way time-series econometrics was done from the 1980s onwards. The observation that prompted them to look for an alternative to the prevalent TS process was that the GNP series shows no tendency to return to a time trend following a shock. That is, after a shock (a technological innovation, for example) the series does not revert to the old trend line; the effect of the shock persists. Movements of this kind cannot be captured by a trend-stationary model.
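To make the difference concrete, here is a minimal simulation sketch in base R. It is not from Nelson and Plosser; the trend slope, AR coefficient, shock size and shock date are all made-up values chosen for illustration. The same one-off shock is added to a trend-stationary series and to a random walk with drift, and we look at how far each shocked path sits from its unshocked twin.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
n     <- 200
t0    <- 100                     # hypothetical date of the shock
shock <- 10                      # hypothetical size of the shock

eps         <- rnorm(n)
eps_shocked <- eps
eps_shocked[t0] <- eps_shocked[t0] + shock

# Trend-stationary: y_t = 0.1 * t + u_t, with u_t = 0.5 * u_{t-1} + eps_t
ts_path <- function(e) {
  0.1 * seq_along(e) + as.numeric(stats::filter(e, 0.5, method = "recursive"))
}
# Difference-stationary: y_t = 0.1 + y_{t-1} + eps_t (random walk with drift)
ds_path <- function(e) cumsum(0.1 + e)

# Gap between the shocked and the unshocked path
ts_gap <- ts_path(eps_shocked) - ts_path(eps)
ds_gap <- ds_path(eps_shocked) - ds_path(eps)

round(ts_gap[c(t0, t0 + 5, t0 + 20)], 4)   # 10, 0.3125, 0: the shock dies out
round(ds_gap[c(t0, t0 + 5, t0 + 20)], 4)   # 10, 10, 10: the shock is permanent
```

In the TS case the gap halves every period and the series is back on its trend within a few observations; in the DS case the level is shifted for good.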

Nelson and Plosser's finding widened the class of models in common use to include difference-stationary (DS) processes, which become stationary only after differencing. More on this in my previous post. But as a student of basic time series, I did not find non-stationarity easy to digest. If a series fluctuates around a mean, is it necessarily stationary? The answer happens to be no (now that I have completed the course I can proudly and confidently answer that question). Loosely, a series is (weakly) stationary if its mean, variance and autocovariances do not change over time, so any two stretches of consecutive data points should have the same mean. Sounds confusing? Let me illustrate this using the example of two Indian macro series and some R code: the daily 3-month MIBOR rates and the daily INR/USD exchange rates for the past 10 years (daily NIFTY returns appear later for comparison).
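Before turning to the real data, the "same mean in every stretch" idea can be checked on simulated series. The sketch below is only an illustration under assumptions of my own: white noise stands in for a stationary series, a driftless random walk for a unit-root series, and 2500 observations for roughly ten years of daily data. It splits each series into five consecutive windows and compares the window means.

```r
set.seed(42)                       # arbitrary seed, only for reproducibility
n     <- 2500                      # assumed: about ten years of daily data
noise <- rnorm(n)                  # stationary: white noise
walk  <- cumsum(rnorm(n))          # non-stationary: driftless random walk

# Label five consecutive, equally long windows and take the mean of each
block        <- rep(1:5, each = n / 5)
window_means <- function(x) tapply(x, block, mean)

round(window_means(noise), 2)      # all close to zero
round(window_means(walk), 2)       # wander from window to window

# Spread of the window means: tiny for the noise, typically large for the walk
diff(range(window_means(noise)))
diff(range(window_means(walk)))
```

Both series "fluctuate", but only the white noise keeps the same mean from one window to the next.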

###############################
# Access the relevant files ###
###############################

#################################
## Dealing with missing values ##
#################################

## Dealing with blanks in the MIBOR rates ##

mibor_dates <- as.Date(mibor$Dates, '%d-%b-%y')
mibor[, 2] <- approx(mibor_dates, mibor[, 2], mibor_dates)$y  # linear interpolation of the blanks

# Calculating the %age change (vectorised, no loop needed);
# the first observation has no predecessor, hence the leading NA
mibor$Change1 <- c(NA, diff(mibor$MIBOR) / mibor$MIBOR[-length(mibor$MIBOR)])

## Dealing with blanks in the exchange rates ##

exchange_dates <- as.Date(exchange$Year, '%d-%b-%y')
exchange[, 2] <- approx(exchange_dates, exchange[, 2], exchange_dates)$y  # linear interpolation of the blanks

# Calculating the %age change (vectorised, no loop needed);
# the first observation has no predecessor, hence the leading NA
exchange$Change <- c(NA, diff(exchange$Exchange.rates) / exchange$Exchange.rates[-length(exchange$Exchange.rates)])

## Plotting the variables ##

png("indep_var_ns.png", width = 480, height = 480)
par(mfrow = c(2, 1))
plot(as.Date(mibor$Dates,'%d-%b-%y'), mibor$MIBOR, xlab= "Date",
ylab= "3-month MIBOR rates (%age)", type='l', col='red',
main="3-month MIBOR rates")
abline(h = 0, lty = 8, col = "gray")
plot(as.Date(exchange$Year, '%d-%b-%y'), exchange$Exchange.rates, xlab= "Date",
ylab= "INR/USD Exchange rates", type='l', col='red',
main="INR/USD Exchange rate")
abline(h = 0, lty = 8, col = "gray")
dev.off()

Eyeballing the above plots, one can see that the series have no obvious trend; they move more or less about a mean. But look at MIBOR, for example: the mean of the series in 2000-02 is different from the mean in 2003-04. This is the catch, and I think it is easily overlooked. A unit root also causes long forays away from the mean, so to test for non-stationarity we check whether each series has a unit root in its auto-regressive (AR) polynomial using the augmented Dickey-Fuller (ADF) test. Since we can see the mean changing substantially over the time horizon, we would expect to find a unit root. Let us see what the results show.

Dickey-Fuller = -1.9266, Lag order = 13, p-value = 0.6094 ## Cannot reject the null of non-stationarity
alternative hypothesis: stationary
Dickey-Fuller = -2.1925, Lag order = 13, p-value = 0.4968 ## Cannot reject the null of non-stationarity
alternative hypothesis: stationary
Dickey-Fuller = -11.8633, Lag order = 13, p-value = 0.00 ## Can reject the null of non-stationarity
alternative hypothesis: stationary

So we see that the null of a unit root cannot be rejected for MIBOR and INR/USD, but it is rejected for NIFTY returns. It is rejected for NIFTY because the returns fluctuate around their mean at a very high frequency, so even if we took two different time periods the difference between their means would be statistically negligible. Thus the NIFTY returns give us a stationary series. The MIBOR and INR/USD series are made stationary as well by differencing, here by taking the percentage changes computed earlier. The stationary plots look like this:

## Plot for the %age changes of the variables:
png("indep_var.png", width = 480, height = 480)
par(mfrow = c(3, 1))
plot(as.Date(mibor$Dates,'%d-%b-%y'), mibor$Change1, xlab= "Date",
ylab= "Change in 3-month MIBOR(%age)", type='l', col='royalblue',
main="%age change in MIBOR rates")
abline(h = 0, lty = 8, col = "gray")
plot(as.Date(nifty$Date,'%d-%b-%y'), nifty$S...P.Cnx.Nifty, xlab= "Date",
ylab= "NIFTY returns(%age)", type='l', col='royalblue',
main="NIFTY returns")
abline(h = 0, lty = 8, col = "gray")
plot(as.Date(exchange$Year, '%d-%b-%y'), exchange$Change, xlab= "Date",
ylab= "INR/USD Exchange rates change(%age)", type='l', col='royalblue',
main="INR/USD Exchange rate changes(%age)")
abline(h = 0, lty = 8, col = "gray")
dev.off()

So there are two takeaways from the exercise above: (1) a series fluctuating about a mean need not be stationary (shown empirically), and (2) the 3-month MIBOR and INR/USD series exhibit unit roots in the given sample (10 years of daily data) for India. The first point might be trivial for advanced econometricians, but for novices and amateurs I think this serves as a good basic exercise.
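The same contrast can be reproduced without the original data. The sketch below is not the code behind the results above; it is a hand-rolled version of the ADF regression (with a constant and a time trend) in base R, run on a simulated random walk and on its first difference. The function name `adf_stat`, the seed and the sample size are my own choices, the lag order of 13 is simply copied from the output above, and -3.41 is the approximate asymptotic 5% Dickey-Fuller critical value for a regression with constant and trend.

```r
# t-statistic on y_{t-1} in the ADF regression
#   diff(y)_t = a + b * t + g * y_{t-1} + sum_{i = 1..k} d_i * diff(y)_{t-i} + e_t
adf_stat <- function(y, k) {
  dy     <- diff(y)
  # embed() puts diff(y)_t in column 1 and its k lags in columns 2..(k + 1)
  lagged <- embed(dy, k + 1)
  dy_t   <- lagged[, 1]
  y_lag  <- y[(k + 1):(length(y) - 1)]     # y_{t-1}, aligned with dy_t
  trend  <- seq_along(dy_t)
  if (k > 0) {
    fit <- lm(dy_t ~ y_lag + trend + lagged[, -1, drop = FALSE])
  } else {
    fit <- lm(dy_t ~ y_lag + trend)
  }
  coef(summary(fit))["y_lag", "t value"]
}

set.seed(123)                    # arbitrary seed, only for reproducibility
walk <- cumsum(rnorm(2500))      # simulated unit-root series (assumed sample size)
k    <- 13                       # lag order reported in the output above
crit_5pct <- -3.41               # approximate asymptotic 5% critical value

stat_level <- adf_stat(walk, k)          # series in levels
stat_diff  <- adf_stat(diff(walk), k)    # first difference of the series

stat_level < crit_5pct   # for a true random walk this is FALSE about 95% of the time
stat_diff  < crit_5pct   # TRUE: the differenced series rejects the unit root
```

The unit-root null is rejected when the statistic is more negative than the critical value, which is the same decision as comparing the p-value with 0.05.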

In case you wish to replicate the exercise, data can be obtained from here: MIBOR, INR/USD, NIFTY.

1. Thank you very much for such a detailed explanation. I have been roaming the internet in search of something like this on ADF tests but without any success.

2. Also, I am a beginner in R. Could you please explain:
(i) The use of the approx() function you used
(ii) What was your basis for saying that the null hypothesis was not rejected for MIBOR and Exchange Rate and rejected for NIFTY

Thank you very much.

1. (i) The approx() function in R is a simple way of linearly interpolating the missing data. For example in the above illustration the script:

mibor[, 2] <- approx(as.Date(mibor$Dates, '%d-%b-%y'), mibor[ ,2], as.Date(mibor$Dates, '%d-%b-%y'))$y

uses linear interpolation to substitute a numeric value in place of the NAs in the MIBOR rates. The dates (mibor$Dates) are passed twice: once as the x-values of the known points, and once as the points at which interpolated values are wanted, so that every date, including the blank ones, gets a value.

(ii) Looking at the p-values from the ADF test we can reach the above conclusion. For exchange rates and MIBOR, my p-values are quite high, 0.6 and 0.49 respectively. Hence I cannot reject my null hypothesis (of a unit root) given such high p-values.
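As a small illustration of point (i), here is what approx() does on a toy series. The dates and rates below are made up and are not taken from the MIBOR file. Pairs with a missing value are dropped, a straight line is drawn between the remaining points, and the line is read off at every requested date.

```r
# Made-up data: five dates with a weekend gap, two blank rates
dates <- as.Date("2011-12-01") + c(0, 1, 4, 5, 6)
rate  <- c(9.50, NA, 9.80, NA, 9.60)

# Interpolate at every date; dates are converted to day counts explicitly
filled <- approx(x = as.numeric(dates), y = rate, xout = as.numeric(dates))$y
filled   # 9.500 9.575 9.800 9.700 9.600
```

Note that the interpolation respects the spacing of the dates: the first blank is one day into a four-day gap, so it gets a quarter of the rise from 9.50 to 9.80. With the default `rule = 1`, blanks before the first or after the last observed value stay NA.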

Hope this answers your question. Feel free to comment/mail me in case you need any further clarification.

~
Shreyes

3. Thank you for answering so promptly.

My thought is basically that any test statistic should be compared with critical values. In this case, what are the critical values you compared your test statistics 0.6 and 0.49 with?

Also, what does the "lag-order" in the output of the adf.test() function mean? If lag-order = 3, does it mean that my series becomes stationary after differencing the AR process 3 times?

Thank you so much in advance and I am recommending your blogspot in all my social networks. It is JUST GREAT!

1. (i) The value reported in the result is the p-value, which is nothing but the exact level of significance. For example, a p-value of 0.05 means that, if the null hypothesis is actually true, there is a 5% chance of observing a test statistic (t-statistic or F-statistic) at least as extreme as the one obtained. So a p-value of 0.6 means that there is a 60% chance of ending up with a statistic at least as extreme as mine when the null hypothesis is true. In such cases we cannot reject the null hypothesis of a unit root. Comparing the p-value with a chosen significance level (say 0.05) leads to the same decision as comparing the test statistic with the corresponding critical value.

I would recommend you to be thorough with the understanding of p-value as it is very often misunderstood even by statisticians. You can refer to my previous post http://programming-r-pro-bro.blogspot.in/2011/10/predictability-of-stock-returns-using.html#comment-form for a brief discussion on p-value.

(ii) The lag order is nothing but the number of lagged difference terms included in the regression that is run to test for a unit root (you might have to read up about the ADF test in detail to understand the concept of lag order; I can share some helpful lecture notes if you want). What you are talking about is the order of integration (the number of times you have to difference the series to make it stationary).
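Assuming the results came from `adf.test()` in the tseries package (the question and the output format suggest so), the lag order is not estimated from the data at all when left at its default: it is computed from the sample size as `trunc((length(x) - 1)^(1/3))`. The helper name `default_lag` below is mine, for illustration only.

```r
# Default lag order as documented for tseries::adf.test()
default_lag <- function(n) trunc((n - 1)^(1/3))

default_lag(c(100, 500, 2500))   # 4 7 13
```

A lag order of 13 is therefore what one would expect for a sample of roughly 2500 daily observations, and it says nothing about how many times the series must be differenced.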

~
Shreyes

2. I'd be more than glad to have some lecture notes since I am a little confused about the whole thing.

Also, I am very sure that you use Google+. Me too actually. It would be an honor to hangout with you on Google+ for one or two tips in R.

Thank you for making the world a better place.

4. Also, can I please have your email so I can email you for further enquiries (hope you'll have time for me :) )
But I'll use the blogspot more often since it will benefit others who may have similar questions.

1. You can feel free to email me at shreyes.upadhyay@gmail.com
