The practice of modelling US GNP and other macroeconomic time series as
trend-stationary (TS) processes was called into question by Nelson and Plosser
in their groundbreaking 1982 paper. That paper marked a paradigm shift in the
way time-series econometrics was done after the 1980s. What prompted them to
look for an alternative to the prevalent TS model was the observation that the
GNP series shows no tendency to return to a time trend following a shock. In
other words, after a shock (a technological innovation, for example) the series
does not revert to its old trend line; the effect of the shock persists. If the
series does not come back to the time trend, its movements cannot be captured
by a trend-stationary model.
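To make the contrast concrete, here is a minimal simulation sketch in R. It is not Nelson and Plosser's procedure and uses no GNP data: the trend slope of 0.1, the AR coefficient of 0.5 and the shock size of 10 are all values I have assumed for illustration, and the noise is switched off so that only the single shock is visible.

```r
# Sketch: one shock hitting a trend-stationary (TS) series and a
# difference-stationary (DS) series. All numbers are illustrative assumptions.
n     <- 200
t_idx <- 1:n
shock <- numeric(n)
shock[100] <- 10                      # a single shock at t = 100

trend <- 0.1 * t_idx                  # deterministic time trend

# TS: y_t = trend_t + u_t, where u_t = 0.5 * u_{t-1} + shock_t
u_ts      <- as.numeric(stats::filter(shock, filter = 0.5, method = "recursive"))
ts_series <- trend + u_ts

# DS: y_t = 0.1 + y_{t-1} + shock_t (a random walk with drift)
ds_series <- cumsum(0.1 + shock)

# Deviation from the original trend line, 100 periods after the shock
dev_ts <- ts_series[n] - trend[n]
dev_ds <- ds_series[n] - trend[n]
round(c(TS = dev_ts, DS = dev_ds), 4)
```

In the TS series the deviation from trend dies out geometrically, while in the DS series the whole shock is still there at the end of the sample. That permanence is what a trend-stationary model cannot reproduce.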

This marked a radical change, which broadened the idea of stationarity to
include another class of processes, **difference stationary (DS)** processes.
More on this in my previous post.

But as a student of basic time series, I did not find the phenomenon of non-stationarity easy to digest. If a series fluctuates around a mean, is it necessarily stationary? The answer happens to be no (now that I have completed the course I can proudly and confidently answer that question). Loosely speaking, a series is (weakly) stationary if its mean, variance and autocovariances do not change over time, so any two groups of consecutive data points in the series should have the same mean. Sounds confusing? Let me illustrate this using two Indian macro series and some R code: the daily 3-month MIBOR rates and the daily INR/USD exchange rates for the past 10 years.
###############################
# Access the relevant files ###
###############################

library(tseries)  # provides adf.test(), used for the unit root tests below

mibor <- read.csv("MIBOR.csv", na.strings = "#N/A")
exchange <- read.csv("Exchange_rates.csv", na.strings = "#N/A")
nifty <- read.csv("Nifty_returns.csv")

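#################################
## Dealing with missing values ##
#################################

## Dealing with blanks in the MIBOR rates ##
## (linear interpolation over the dates; column 2 holds the rates)
mibor[, 2] <- approx(as.Date(mibor$Dates, '%d-%b-%y'), mibor[, 2],
                     as.Date(mibor$Dates, '%d-%b-%y'))$y

## Calculating the %age change
## diff() is vectorised; NA pads the first row, which has no previous value
mibor$Change1 <- c(NA, diff(mibor[, 2]) / mibor[-nrow(mibor), 2])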

## Dealing with blanks in the exchange rates ##
exchange[, 2] <- approx(as.Date(exchange$Year, '%d-%b-%y'), exchange[, 2],
                        as.Date(exchange$Year, '%d-%b-%y'))$y

## Calculating the %age change
## diff() is vectorised; NA pads the first row, which has no previous value
exchange$Change <- c(NA, diff(exchange$Exchange.rates) /
                           exchange$Exchange.rates[-nrow(exchange)])

## Plotting the variables ##
png("indep_var_ns.png", width = 480, height = 480)
par(mfrow = c(2, 1))

plot(as.Date(mibor$Dates, '%d-%b-%y'), mibor$MIBOR, xlab = "Date",
     ylab = "3-month MIBOR rates (%age)", type = 'l', col = 'red',
     main = "3-month MIBOR rates")
abline(h = 0, lty = 8, col = "gray")

plot(as.Date(exchange$Year, '%d-%b-%y'), exchange$Exchange.rates, xlab = "Date",
     ylab = "INR/USD Exchange rates", type = 'l', col = 'red',
     main = "INR/USD Exchange rate")
abline(h = 0, lty = 8, col = "gray")

dev.off()

Eyeballing the above plots, one can see that the series do not have any obvious
trend in them; they move more or less about a mean. But look at the MIBOR, for
example: the mean of the series in 2000-02 is different from the mean in
2003-04. This is the catch, and I think it is easy to overlook. A unit root
would also cause long forays away from the mean, so to test for
non-stationarity we shall check whether each series has a unit root in its
auto-regressive (AR) polynomial using the **ADF test**. Since we can see that
the mean changes substantially over the time horizon, we would expect to find a
unit root in these series. Let us see what the results show.
> adf.test(exchange$Exchange.rates)
Dickey-Fuller = -1.9266, Lag order = 13, p-value = 0.6094 ## Cannot reject the null of non-stationarity
alternative hypothesis: stationary

> adf.test(mibor$MIBOR)
Dickey-Fuller = -2.1925, Lag order = 13, p-value = 0.4968 ## Cannot reject the null of non-stationarity
alternative hypothesis: stationary

> adf.test(nifty$S...P.Cnx.Nifty)
Dickey-Fuller = -11.8633, Lag order = 13, p-value = 0.00 ## Can reject the null of non-stationarity
alternative hypothesis: stationary

So we see that the null of a unit root cannot be rejected for MIBOR and
INR/USD, but it is rejected for NIFTY returns. It is rejected for NIFTY because
the returns fluctuate around their mean at a very high frequency and come back
to it quickly, so even if we took two different time periods, the difference
between their means would be statistically negligible. Thus the NIFTY returns
give us a stationary series. The MIBOR and INR/USD series can also be made
stationary, by taking the first difference of each series (here, the %age
change from one day to the next). The stationary plots look like:

## Plot for the %age changes of the variables:
png("indep_var.png", width = 480, height = 480)
par(mfrow = c(3, 1))

plot(as.Date(mibor$Dates, '%d-%b-%y'), mibor$Change1, xlab = "Date",
     ylab = "Change in 3-month MIBOR(%age)", type = 'l', col = 'royalblue',
     main = "%age change in MIBOR rates")
abline(h = 0, lty = 8, col = "gray")

plot(as.Date(nifty$Date, '%d-%b-%y'), nifty$S...P.Cnx.Nifty, xlab = "Date",
     ylab = "NIFTY returns(%age)", type = 'l', col = 'royalblue',
     main = "NIFTY returns")
abline(h = 0, lty = 8, col = "gray")

plot(as.Date(exchange$Year, '%d-%b-%y'), exchange$Change, xlab = "Date",
     ylab = "INR/USD Exchange rates change(%age)", type = 'l', col = 'royalblue',
     main = "INR/USD Exchange rate changes(%age)")
abline(h = 0, lty = 8, col = "gray")

dev.off()

So there are two takeaways from the exercise above: (1) a series fluctuating
about a mean need not be stationary (shown empirically); (2) the 3-month MIBOR
and INR/USD series exhibit unit roots in the given sample (10 years of daily
data) for India. The first point might be trivial for advanced econometricians,
but for novices and amateurs I think this serves as a good basic exercise.
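The first takeaway can also be checked on simulated data, where we know the answer in advance. The sketch below is an illustration only: the random walk is a stand-in I have assumed for a unit-root series, not the MIBOR or INR/USD data, and `half_means()` is a helper written for this example.

```r
# Sketch: compare the means of the two halves of a unit-root series
# and of its first difference. Simulated data, not the series from the post.
set.seed(2012)
n      <- 2000
level  <- cumsum(rnorm(n))   # random walk: has a unit root by construction
change <- diff(level)        # first difference: plain white noise

# Hypothetical helper: mean of the first and second half of a vector
half_means <- function(x) {
  mid <- floor(length(x) / 2)
  c(first_half  = mean(x[1:mid]),
    second_half = mean(x[(mid + 1):length(x)]))
}

level_means  <- half_means(level)
change_means <- half_means(change)
level_means
change_means
```

For the differenced series the two half-sample means should both sit close to zero, while for the level series they will typically be far apart. That drifting sub-period mean is the same thing we saw in the MIBOR plot.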

Thank you very much for such a detailed explanation. I have been roaming the internet in search of something like this on ADF tests but without any success.

Also, I am a beginner in R. Could you please explain:

(i) The use of the approx() function you used

(ii) What was your basis for saying that the null hypothesis was not rejected for MIBOR and Exchange Rate and rejected for NIFTY

Thank you very much.

I am glad that you found the post helpful.

(i) The approx() function in R is a simple way of linearly interpolating the missing data. For example in the above illustration the script:

mibor[, 2] <- approx(as.Date(mibor$Dates, '%d-%b-%y'), mibor[ ,2], as.Date(mibor$Dates, '%d-%b-%y'))$y

uses linear interpolation to substitute a numeric value in place of the NAs in the MIBOR rates. The dates (mibor$Dates) are passed in as the x-values, so each blank is filled from the nearest dated observations on either side of it.
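A toy example may make this clearer. The dates and rates below are made up for illustration; only `approx()` itself is the same function used in the post. I convert the dates to plain numbers first, which is a choice for this sketch rather than something the post's script does.

```r
# Sketch: filling NAs by linear interpolation with approx().
# The five dates and rates are invented values for illustration.
dates <- as.Date(c("2011-01-03", "2011-01-04", "2011-01-05",
                   "2011-01-06", "2011-01-07"))
rates <- c(8.0, NA, 9.0, NA, 10.0)

x <- as.numeric(dates)   # days since 1970-01-01, used as the x-axis

# approx() drops the (x, NA) pairs, fits straight lines through the
# remaining points, and evaluates those lines at every x in `xout`
filled <- approx(x = x, y = rates, xout = x)$y
data.frame(dates, rates, filled)
```

Each missing rate lands halfway between its neighbours because the dates are evenly spaced. Note that with the default settings an NA at the very start or end of the series stays NA, since there is nothing to interpolate between.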

(ii) Looking at the p-values from the ADF test we can reach the above conclusion. For exchange rates and MIBOR, my p-values are quite high, 0.6 and 0.49 respectively. Hence I cannot reject my null hypothesis (of a unit root) given such high p-values. For NIFTY the p-value is practically zero, so there the null is rejected.

Hope this answers your question. Feel free to comment/mail me in case you need any further clarification.

~

Shreyes

Thank you for answering so promptly.

My thought is basically that any test statistic should be compared with critical values. In this case, what are the critical values you compared your test statistics 0.6 and 0.49 with?

Also, what does the "lag-order" in the output of the adf.test() function mean? If lag-order = 3, does it mean that my series becomes stationary after differencing the AR process 3 times?

Thank you so much in advance and I am recommending your blogspot in all my social networks. It is JUST GREAT!

(i) The values 0.6 and 0.49 reported in the result are p-values, not test statistics. The p-value is nothing but the exact level of significance, so no separate critical value is needed. For example, a p-value of 0.05 would mean that, if the null hypothesis were actually true, there would be a 5% chance of observing a test statistic (t-statistic or F-statistic) at least as extreme as this one. So a p-value of 0.6 means that there is a 60% chance that I might end up with a t-stat at least as extreme as the one that I have, given that the null hypothesis is true. So in such cases we cannot reject the null hypothesis of a unit root.
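One way to see where critical values and p-values come from is to simulate the test statistic under the null. The sketch below is a simplification I am assuming for illustration: it uses a plain Dickey-Fuller regression with a constant and no lagged differences, which is not the exact regression that adf.test() runs, so its numbers are not comparable with the output in the post. The observed statistic of -1.9 is a hypothetical value.

```r
# Sketch: Monte Carlo null distribution of a simple Dickey-Fuller statistic.
# Regression: diff(y)_t = a + g * y_{t-1} + e_t, with H0: g = 0 (unit root).
set.seed(123)

df_tstat <- function(y) {
  dy   <- diff(y)
  ylag <- y[-length(y)]
  unname(coef(summary(lm(dy ~ ylag)))["ylag", "t value"])
}

n_sims <- 1000
n_obs  <- 200

# Simulate random walks (the null is true) and collect the statistic
null_stats <- replicate(n_sims, df_tstat(cumsum(rnorm(n_obs))))

# 5% critical value: reject the unit root only if the statistic is below it
crit_5pct <- unname(quantile(null_stats, 0.05))

# p-value of a hypothetical observed statistic: share of null draws
# that are at least as negative as the observed value
observed <- -1.9
p_value  <- mean(null_stats <= observed)

c(crit_5pct = crit_5pct, p_value = p_value)
```

The simulated 5% critical value should come out well below the usual -1.645 of a one-sided normal test, which is why unit root tests need their own tables. Comparing the statistic with the critical value and comparing the p-value with 0.05 are two ways of making the same decision.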

I would recommend you to be thorough with the understanding of p-value as it is very often misunderstood even by statisticians. You can refer to my previous post http://programming-r-pro-bro.blogspot.in/2011/10/predictability-of-stock-returns-using.html#comment-form for a brief discussion on p-value.

(ii) The lag order is nothing but the number of lagged difference terms included in the AR regression that is run to test for a unit root (you might have to read up about the ADF test in detail to understand the concept of lag order; I can share some helpful lecture notes if you want). What you are talking about is the order of integration (the number of times you have to difference the series to make it stationary).
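The difference between the two ideas is easier to see in code. Below is a hand-rolled sketch of an ADF-style regression with a constant, a time trend and k lagged differences. `adf_stat()` is a name I made up for this example and it is not the tseries implementation; as far as I know, adf.test() picks its default lag order as trunc((n - 1)^(1/3)), which is where the 13 in the output above would come from for roughly ten years of daily data.

```r
# Sketch: ADF-style regression with k lagged differences (the "lag order").
# diff(y)_t = a + b * t + g * y_{t-1} + sum_{j=1..k} c_j * diff(y)_{t-j} + e_t
adf_stat <- function(y, k) {
  dy     <- diff(y)
  # embed() aligns dy_t (column 1) with dy_{t-1}, ..., dy_{t-k} (columns 2..k+1)
  lagged <- embed(dy, k + 1)
  dy_t   <- lagged[, 1]
  ylag   <- y[(k + 1):(length(y) - 1)]   # y_{t-1}, aligned with dy_t
  tt     <- seq_along(dy_t)              # linear time trend
  if (k > 0) {
    dlags <- lagged[, -1, drop = FALSE]
    fit   <- lm(dy_t ~ ylag + tt + dlags)
  } else {
    fit   <- lm(dy_t ~ ylag + tt)
  }
  # t-statistic on the lagged level: very negative values speak against a unit root
  unname(coef(summary(fit))["ylag", "t value"])
}

set.seed(7)
rw <- cumsum(rnorm(500))   # integrated of order 1 by construction

stat_level <- adf_stat(rw, k = 3)         # levels: unit root present
stat_diff  <- adf_stat(diff(rw), k = 3)   # after one difference: stationary
c(level = stat_level, first_difference = stat_diff)
```

Here k = 3 is used in both calls, yet the series needs only one round of differencing to become stationary. The lag order controls how much serial correlation the test regression soaks up; the order of integration is a property of the series itself.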

I am really glad that you found the posts helpful.

~

Shreyes

I'd be more than glad to have some lecture notes since I am a little confused about the whole thing.

Also, I am very sure that you use Google+. Me too actually. It would be an honor to hangout with you on Google+ for one or two tips in R.

Thank you for making the world a better place.

Also, can I please have your email so I can email you for further enquiries (hope you'll have time for me :) )

But I'll use the blogspot more often since it will benefit others who may have similar questions.

You can feel free to email me at shreyes.upadhyay@gmail.com

Thank you very much. Your email address is noted.

After visiting your blogspot thoroughly, I must say it is one of the best for beginners like me! You're doing a wonderful job man!