Wednesday, 25 July 2012

Measuring persistence in a time series: Application of rolling window regression

During my final semester at IGIDR I wrote a project paper in macroeconomics involving timeseries econometrics. The concept I focused on in my study was the unit root, which I have touched upon in my earlier posts. This post presents an application of the unit root test to measuring persistence. We investigate the level of persistence exhibited by the inflation rate series in India and see how this level has changed over time.

Background and definition:

The persistence level is an important dynamic property of any timeseries and gives us a broad understanding of the series in question. Persistence, in the English language, is defined as "continuance of an effect after the cause is removed". This pretty much captures the econometric definition too. If a series is given an external shock, the level of persistence tells us what the impact of that shock will be: will the series soon revert to its mean path, or will it be pushed further away from it? In a highly persistent series, a shock tends to persist for long and the series drifts away from its historical mean path. The opposite holds for a series with a low level of persistence: after a shock it tends to get back to its historical mean path.
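To make the idea concrete, here is a minimal sketch in Python (used for illustration only, not the R used for the analysis in this post). It assumes the simplest possible model, an AR(1) with coefficient `rho`, in which the deviation from the mean `h` periods after a one-off shock is `shock * rho**h`. The function name `shock_path` and the values of `rho` are hypothetical.

```python
def shock_path(rho, horizon, shock=1.0):
    """Deviation from the mean h = 0..horizon periods after a one-off shock,
    assuming an AR(1) process with coefficient rho (an illustrative assumption)."""
    return [shock * rho ** h for h in range(horizon + 1)]


low = shock_path(0.3, 8)        # low persistence: the shock dies out quickly
high = shock_path(0.95, 8)      # high persistence: the shock fades slowly
unit_root = shock_path(1.0, 8)  # unit root: the shock never fades

for name, path in [("rho=0.30", low), ("rho=0.95", high), ("rho=1.00", unit_root)]:
    print(name, [round(v, 3) for v in path])
```

With `rho` close to 0 the series is back near its mean within a few periods, while with `rho` close to 1 most of the shock is still present many periods later.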

Inflation is measured as the percentage rise in the price index or, informally speaking, a general rise in the prices of all goods and services in the economy. It's important to note that a rise in the prices of just a few commodities could be due to market conditions in that particular sector and might not cause a general rise in the price index. For a non-econ student, inflation can be thought of as the rate at which money is losing its value. If the inflation rate in your economy is 10% year on year (Y-o-Y), it means that what you can buy for 100 INR today would cost 110 INR next year. There are plenty of reasons why keeping inflation in check is important for any economy's policy makers, but I shall not elaborate on that in this post.
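The Y-o-Y arithmetic can be sketched as follows (Python, for illustration only). The function name `yoy_inflation` and the index values are made up; the only assumption is that the index is observed at a fixed number of periods per year.

```python
def yoy_inflation(index, periods_per_year=12):
    """Year-on-year inflation (in percent) from a price index series."""
    return [
        100.0 * (index[t] / index[t - periods_per_year] - 1.0)
        for t in range(periods_per_year, len(index))
    ]


# Hypothetical annual index: 100 this year, 110 next year -> 10% inflation.
rates = yoy_inflation([100.0, 110.0], periods_per_year=1)
print([round(r, 2) for r in rates])
```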

Why is measuring the level of persistence in the inflation series important? A simple commonsense reason is that the level of persistence would play a major role in the RBI's decisions on tackling inflation. If the inflation series is highly persistent, then a shock to it would have to be dealt with much more stringently, as the shock might last for a really long time with detrimental effects. Those familiar with macroeconomics will be able to relate this to the Taylor rule, which shapes how most central bankers weigh inflation against growth. This is of course only one of the reasons why it might be important; there could be others you can think of.

In a timeseries econometrician's world there is a formal mathematical (or rather empirical) definition of persistence. It is intimately related to the concept of the unit root that I have discussed in my earlier posts. I will assume some prior knowledge of timeseries going forward; readers are welcome to correct me wherever I go overboard. The augmented Dickey-Fuller (ADF) test estimates the ADF test regression, computes the coefficient on z(t-1) and tests whether it is statistically different from 1. Empirically, a series that has a unit root is highly persistent, so the closer this coefficient is to 1, the more persistent the series. What we intend to do is run a rolling regression, compute the persistence coefficient for each regression, and plot the persistence values over time along with the 95% confidence interval band. (For a detailed explanation of the maths behind the ADF test refer to Dr. Krishnan's notes here, refer to pg. 13 for the ADF test regression equation.)
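For orientation, one common way of writing the ADF test regression in levels is shown below. This is a sketch of the textbook form, not necessarily the exact specification in the referenced notes (which may, for example, include a trend term); the symbols \alpha, \rho, \delta_i and the lag length k are notation assumed here.

```latex
\[
z_t = \alpha + \rho\, z_{t-1} + \sum_{i=1}^{k} \delta_i\, \Delta z_{t-i} + \varepsilon_t ,
\qquad H_0 : \rho = 1 \quad \text{(unit root)} .
\]
```

Here \rho is the coefficient on z(t-1) and serves as the persistence parameter: values near 1 indicate high persistence, values near 0 indicate low persistence.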


Let me try to explain the rolling window regression that I have used in my analysis. Rolling window regression on timeseries data basically means running multiple regressions, each on a different overlapping (or non-overlapping) window of values. For example, suppose your timeseries has 100 observations and you want to perform a rolling regression, or for that matter any operation on a rolling window. The idea is to start with an initial window of, say, 40 values (1st to 40th observation), perform the operation you wish to, and then roll the window forward by some number of values, let's say 5. The second window would then be the 40 observations starting from the 6th observation (6th to 45th observation). Similarly, the third window would be the 40 values starting from the 11th value (11th to 50th observation), and so on. The advantage of this technique is that it lets you look at any changing property of a series over time. You get an estimate of the property over time instead of one single constant measure for the entire period.
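The window bookkeeping for the example of 100 observations, width 40 and step 5 can be sketched like this (Python, for illustration; `rolling_windows` is a hypothetical helper, and positions are 1-based and inclusive to match the prose):

```python
def rolling_windows(n_obs, width, step):
    """Return (first, last) observation numbers, 1-based and inclusive,
    for every full window of the given width, moving forward by `step`."""
    return [(s, s + width - 1) for s in range(1, n_obs - width + 2, step)]


windows = rolling_windows(n_obs=100, width=40, step=5)
for first, last in windows[:3]:
    print(f"observations {first} to {last}")
print(len(windows), "windows in total; the last one is", windows[-1])
```

Any operation (a mean, a regression, a test statistic) can then be applied to each window in turn.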

I have used the idea discussed above to look at the persistence level of the inflation series over time, using a rolling window ADF test regression to compute the persistence parameter and plotting it over time along with the 95% confidence band. I confess that the code I have used is not the best one can work with. I would be grateful to any reader who could suggest a better way of going about this exercise; otherwise I might be convinced of this "timeseries handling shortcoming" in R.
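As a rough illustration of the procedure (not the R code used for this post), here is a self-contained Python sketch. It makes several simplifying assumptions: the regression is a plain AR(1) with an intercept and no lagged difference terms, the band is the coefficient plus or minus 1.96 standard errors (a normal approximation, which is not strictly valid when the coefficient is near 1), and the data are simulated rather than the CPI/WPI series. The names `ar1_fit` and `rolling_persistence`, the window width and the step are all hypothetical.

```python
import math
import random


def ar1_fit(z):
    """OLS of z[t] on a constant and z[t-1]; returns (rho, standard error of rho)."""
    x, y = z[:-1], z[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    rho = sxy / sxx
    alpha = my - rho * mx
    rss = sum((b - alpha - rho * a) ** 2 for a, b in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return rho, se


def rolling_persistence(z, width, step):
    """Persistence estimate and approximate 95% band for each rolling window.
    Each tuple is (window start index, rho, lower bound, upper bound)."""
    out = []
    for start in range(0, len(z) - width + 1, step):
        rho, se = ar1_fit(z[start:start + width])
        out.append((start, rho, rho - 1.96 * se, rho + 1.96 * se))
    return out


# Simulated low-persistence series (true coefficient 0.3), for illustration only.
random.seed(0)
z = [0.0]
for _ in range(199):
    z.append(0.3 * z[-1] + random.gauss(0, 1))

bands = rolling_persistence(z, width=60, step=10)
for start, rho, lo, hi in bands:
    print(f"window starting at {start:3d}: rho = {rho:5.2f}  band = [{lo:5.2f}, {hi:5.2f}]")
```

Plotting `rho` against the window start, together with the two band columns, gives the kind of persistence-over-time chart described in this post. Wrapping the regression in a single function and applying it window by window keeps the rolling loop short.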


Plots generated from my R code:

Persistence level in CPI over time

Persistence level in WPI over time

Key observations:

The plots give us some interesting observations. The persistence level is generally low for the consumer price index (CPI) series, and it has stayed roughly constant over time. The story is the same for the wholesale price index (WPI) series: the level of persistence is low, and even the 95% confidence band is close to 0. There have been recent arguments about how sticky the inflation rate series is for India, but the empirical investigation above does not support that hypothesis. There are also arguments that the central bank's WPI targeting is a faulty measure. Currently, the RBI looks at the WPI series to keep inflation in check, but when the dynamic properties of the CPI and WPI inflation series diverge, it becomes difficult for policy makers to decide which series to target. There are research papers that throw some light on this recent divergence in the different measures of inflation in India and on its importance for policy makers. Persistence, however, seems to be a property that behaves in a broadly similar way across the 2 series.

The results we obtain above are consistent with this working paper at the RBI (the Indian central bank), which reports low persistence levels across various measures of inflation. We have essentially replicated its methodology using R code and data updated to the present, and our results affirm the argument in the paper.

Data used: 

If you wish to replicate the exercise above, the data can be obtained from here: CPI data, WPI data. The files contain 2 columns, one with the raw data and the other with seasonally adjusted data. I have used the seasonally adjusted data for the analysis here. The seasonal adjustment was done using the X-12-ARIMA filter in EViews. For more about seasonally adjusted and unadjusted data refer to my previous post here.

Readers' critiques and feedback are welcome.


  1. try defining the regression to get the relevant values you want in a function and then use that function in rollapply(), should make your code significantly neater without using all the for loops...also upgrade to xts objects for better timeseries plots...

    1. Thank you for your comment.

      The regression that I am doing is not what is really bothering me, I would like to know if I can work with the values without using the window() function. The lags() function did not give me what I wanted, so I had to manually code it using window().

      But I will try and use the xts() for the timeseries plot.


    2. hi ,

      is there any way out to do the same analysis in SAS?