Tuesday, October 25, 2011

VAR or VECM When Testing for Granger Causality?

It never ceases to amaze me that my post titled "How Many Weeks are There in a Year?" is at the top of my all-time hits list! Interestingly, the second-placed post is the one I titled "Testing for Granger Causality". Let's call that one the number one serious post. As with many of my posts, I've received quite a lot of direct emails about that piece on Granger causality testing, in addition to the published comments.


One question that has come up a few times relates to the use of a VAR model for the levels of the data as the basis for doing the non-causality testing, even when we believe that the series in question may be cointegrated. Why not use a VECM model as the basis for non-causality testing in this case?

On the face of it, this might seem like a good idea. It's been suggested that as the VECM incorporates the information about the short-run dynamics, tests conducted within that framework may be more powerful than their counterparts within a VAR model. In fact, however, there's a very good reason for not using a VECM for this particular purpose.

First, let's recall the main message from my earlier post. A simple definition of Granger causality, in the case of two time-series variables, X and Y, is:
"X is said to Granger-cause Y if Y can be better predicted using the histories of both X and Y than it can by using the history of Y alone."
We can test for the absence of Granger causality by estimating the following VAR model:

Y_t = a_0 + a_1 Y_{t-1} + ..... + a_p Y_{t-p} + b_1 X_{t-1} + ..... + b_p X_{t-p} + u_t      (1)
X_t = c_0 + c_1 X_{t-1} + ..... + c_p X_{t-p} + d_1 Y_{t-1} + ..... + d_p Y_{t-p} + v_t      (2)

Then, testing H0: b_1 = b_2 = ..... = b_p = 0, against HA: 'Not H0', is a test that X does not Granger-cause Y.

Similarly, testing H0: d_1 = d_2 = ..... = d_p = 0, against HA: 'Not H0', is a test that Y does not Granger-cause X. In each case, a rejection of the null implies there is Granger causality.
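
For reference, the Wald statistic for the first of these hypotheses can be written compactly. The notation here (R, \hat{\theta}, \hat{V}) is introduced purely for illustration and is not taken from the earlier post: \hat{\theta} is the OLS coefficient vector for equation (1), \hat{V} is its estimated covariance matrix, and R is the p x (2p+1) matrix that picks out the coefficients on the lags of X.

```latex
W = (R\hat{\theta})^{\top} \left[ R \hat{V} R^{\top} \right]^{-1} (R\hat{\theta}),
\qquad
R\hat{\theta} = (\hat{b}_1, \ldots, \hat{b}_p)^{\top}
```

When all of the series are stationary, W is asymptotically chi-square with p degrees of freedom under H0.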

Now, if any of the variables are non-stationary (whether or not they are cointegrated), the usual Wald test statistic for this testing will not have an asymptotic Chi-Square distribution. An easy way to deal with this is to use the following procedure proposed by Toda and Yamamoto (1995) - more details are provided in my previous post:
  1. Test each of the time-series to determine their order of integration.
  2. Let the maximum order of integration for the group of time-series be m.
  3. Set up a VAR model in the levels (not the differences) of the data, regardless of the orders of integration of the various time-series.
  4. Determine the appropriate maximum lag length for the variables in the VAR, say p, using the usual methods.
  5. Make sure that the VAR is well-specified.
  6. If two or more of the time-series have the same order of integration, at Step 1, then test to see if they are cointegrated.
  7. No matter what you conclude about cointegration at Step 6, this is not going to affect what follows. It just provides a possible cross-check on the validity of your results at the very end of the analysis.
  8. Take the preferred VAR model and add in m additional lags of each of the variables into each of the equations.
  9. Test for Granger non-causality as follows. For expository purposes, suppose that the VAR has two equations, one for X and one for Y. Test the hypothesis that the coefficients of (only) the first p lagged values of X are zero in the Y equation, using a standard Wald test. Then do the same thing for the coefficients of the lagged values of Y in the X equation.
  10. Make sure that you don't include the coefficients for the 'extra' m lags when you perform the Wald tests.
  11. The Wald test statistics will be asymptotically chi-square distributed with p d.o.f., under the null.
  12. Rejection of the null implies a rejection of Granger non-causality.
  13. Finally, look back at what you concluded in Step 6 about cointegration:

"If two or more time-series are cointegrated, then there must be Granger causality between them - either one-way or in both directions. However, the converse is not true."
(This last piece of information may provide a cross-check on your overall conclusions.)
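
To make Steps 8 to 11 a little more concrete, here is a minimal Python sketch for the two-variable case. It is an illustration under stated assumptions, not code from the original post: the function name toda_yamamoto_wald, the simulated data, and the choices p = 1 and m = 1 are all hypothetical, and p and m are taken as already chosen at Steps 1 to 4. The sketch fits the Y equation of the levels VAR with p + m lags by OLS and applies the Wald test to the first p lags of X only.

```python
import numpy as np
from scipy import stats


def toda_yamamoto_wald(y, x, p, m):
    """Wald test that x does not Granger-cause y, from a levels VAR(p + m) equation.

    Only the first p lags of x are tested; the extra m lags are estimated
    but deliberately left out of the test. Returns (wald_statistic, p_value).
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    k = p + m                       # number of lags actually estimated
    n = len(y) - k                  # usable observations
    # Regressors: constant, lags 1..k of y, lags 1..k of x (all in levels).
    y_lags = [y[k - j:len(y) - j] for j in range(1, k + 1)]
    x_lags = [x[k - j:len(x) - j] for j in range(1, k + 1)]
    Z = np.column_stack([np.ones(n)] + y_lags + x_lags)
    target = y[k:]
    beta, _, _, _ = np.linalg.lstsq(Z, target, rcond=None)
    resid = target - Z @ beta
    sigma2 = resid @ resid / (n - Z.shape[1])
    cov = sigma2 * np.linalg.inv(Z.T @ Z)
    # The first p lags of x sit after the constant and the k lags of y.
    idx = np.arange(1 + k, 1 + k + p)
    b = beta[idx]
    wald = float(b @ np.linalg.solve(cov[np.ix_(idx, idx)], b))
    return wald, float(stats.chi2.sf(wald, df=p))


# Hypothetical data: x is a random walk, and y depends on last period's x.
rng = np.random.default_rng(42)
n_obs = 300
x = np.cumsum(rng.standard_normal(n_obs))
y = np.empty(n_obs)
y[0] = rng.standard_normal()
y[1:] = 0.5 * x[:-1] + rng.standard_normal(n_obs - 1)

wald_xy, p_xy = toda_yamamoto_wald(y, x, p=1, m=1)   # H0: x does not cause y
wald_yx, p_yx = toda_yamamoto_wald(x, y, p=1, m=1)   # H0: y does not cause x
print(f"x -> y: W = {wald_xy:.2f}, p-value = {p_xy:.4f}")
print(f"y -> x: W = {wald_yx:.2f}, p-value = {p_yx:.4f}")
```

Note that the degrees of freedom passed to the chi-square distribution are p, not p + m, which is exactly the point of Steps 10 and 11.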
O.K. - now back to VARs vs. VECMs!

Suppose that at Step 6 we come to the conclusion that the time-series are cointegrated. In general, the presence of cointegration would suggest that we should model the data using a VECM model, rather than using a VAR model. That's modelling the data, though, not testing for Granger non-causality.

Here's the deal.

To get to the point where we are considering using a VECM model as the basis for the causality testing, we had to go through the prior step of testing for cointegration; and only if we rejected the hypothesis of "no cointegration" would we even consider estimating a VECM model. This is a classic example of "preliminary test testing". That is, the framework (model) chosen as the basis for the non-causality test is conditional on the outcome of a previous test - a test for non-cointegration. It's as if the choice between a VAR model and a VECM model (as the framework within which to test for non-causality) is made by flipping a biased coin. (Remember that there are always Type I and Type II errors associated with any classical hypothesis test.)

So, one important question that arises is the following one:

If we first test for non-cointegration, and then (conditional on the outcome of this test) we perform another test, what are the properties of this second test?
You see, the second test (the test for non-causality) will be of one form if we decide to use the VAR, and of a different form if we decide to use a VECM. When we pre-test, the second test is actually a random mixture of two tests. The "actual" test statistic is a weighted sum of the test statistic that would be obtained if we used a VAR model, and the test statistic that would be obtained if we used a VECM model. And the weights are random, with values that depend on the properties of the prior (non-cointegration) test.
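
One way to write this down, using notation that is introduced here only for illustration (D, W_VAR, W_VECM and W* do not appear in the papers cited below in this form), is to let D = 1 if the pre-test rejects "no cointegration" and D = 0 otherwise:

```latex
W^{*} = D \, W_{\mathrm{VECM}} + (1 - D) \, W_{\mathrm{VAR}},
\qquad
D = \mathbf{1}\{\text{pre-test rejects ``no cointegration''}\}
```

Because D is itself a random variable, and is generally not independent of the two statistics it selects between, the distribution of W* is not that of either W_VAR or W_VECM alone.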

The upshot of all of this is as follows. When we test for no cointegration, then decide on a VAR model or a VECM model, and then apply a Granger non-causality test, the properties of this last test aren't at all what we think they are. They're really messy, and the best way to find out what's going on is to conduct a Monte Carlo experiment.

In particular, there will almost certainly be some distortion in the significance level (and hence the power) of the final test. We may think we're applying the non-causality test at (say) the 5% level, but the true significance level (the actual rate of rejection of the null hypothesis when this hypothesis is true) may be quite different. And this might (should?) bother us.
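
The "true significance level" of a test can be estimated by simulation: generate data for which the null is true, apply the test many times, and record how often it rejects. The Python sketch below does this for the levels-VAR Wald test, with and without the extra T-Y lag, using two independent random walks. It is only an illustration: the function names, sample size, number of replications and seed are all hypothetical choices, and it examines the Wald test on its own rather than the full pre-test procedure, which would also require a cointegration test and a VECM-based test.

```python
import numpy as np
from scipy import stats


def wald_pvalue(y, x, p, m):
    """p-value for H0: the first p lags of x are zero in a levels VAR(p + m) equation for y."""
    k = p + m
    n = len(y) - k
    y_lags = [y[k - j:len(y) - j] for j in range(1, k + 1)]
    x_lags = [x[k - j:len(x) - j] for j in range(1, k + 1)]
    Z = np.column_stack([np.ones(n)] + y_lags + x_lags)
    target = y[k:]
    beta, _, _, _ = np.linalg.lstsq(Z, target, rcond=None)
    resid = target - Z @ beta
    cov = (resid @ resid / (n - Z.shape[1])) * np.linalg.inv(Z.T @ Z)
    idx = np.arange(1 + k, 1 + k + p)      # columns for lags 1..p of x
    b = beta[idx]
    wald = float(b @ np.linalg.solve(cov[np.ix_(idx, idx)], b))
    return float(stats.chi2.sf(wald, df=p))


def rejection_rate(p, m, n_obs=200, n_reps=500, level=0.05, seed=0):
    """Share of replications in which a true null of non-causality is rejected."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_reps):
        # Two independent random walks: x does not Granger-cause y.
        x = np.cumsum(rng.standard_normal(n_obs))
        y = np.cumsum(rng.standard_normal(n_obs))
        if wald_pvalue(y, x, p, m) < level:
            rejections += 1
    return rejections / n_reps


size_plain = rejection_rate(p=1, m=0)   # levels VAR with no extra lag
size_ty = rejection_rate(p=1, m=1)      # T-Y: one extra, untested lag
print(f"Nominal 5% test, estimated size without extra lag: {size_plain:.3f}")
print(f"Nominal 5% test, estimated size with T-Y extra lag: {size_ty:.3f}")
```

With I(1) data, the first rate need not be close to 5%, because the Wald statistic then has a non-standard limiting distribution; the second should be much closer to the nominal level, subject to simulation error.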

[As an aside, if you think that the "size distortion" that can arise from pre-test testing may not be a big deal, then take a look at the results in Table 2 of King and Giles, 1984.]

So, has anyone investigated the issue of the effects of pre-test testing in the case we're interested in here - the case of testing for Granger non-causality, after first testing to see if there is cointegration, so that we effectively randomize the choice of a VAR or VECM model?

Of course they have! You can take a look at the studies by Toda and Phillips (1994), Dolado and Lütkepohl (1996), Zapata and Rambaldi (1997), and Clarke and Mirza (2006) for lots of interesting details. I particularly recommend the last of these papers, by my colleague Judith Clarke and her former student, Sadaf Mirza.

Zapata and Rambaldi (1997, p.294) find that the T-Y Wald test is clearly preferred to the likelihood ratio test used in the context of a VECM model, unless the sample size is extremely small. (Would we really want to be going through all of this with a very small sample, especially when cointegration is a long-run phenomenon?)

The big take-home message from this research is very simple:
"We find that the practice of pretesting for cointegration can result in severe overrejections of the noncausal null, whereas overfitting [that's the T-Y methodology; DG] results in better control of the Type I error probability with often little loss in power." (Clarke & Mirza, 2006, p.207.)


Note: The links to the following references will be helpful only if your computer's IP address gives you access to the electronic versions of the publications in question. That's why a written References section is provided.

References

Clarke, J. A. and S. Mirza (2006). A comparison of some common methods for detecting Granger noncausality. Journal of Statistical Computation and Simulation, 76, 207-231.

Dolado, J. J and H. Lütkepohl (1996). Making Wald tests work for cointegrated VAR systems. Econometric Reviews, 15, 369-386.


Toda, H. Y. and P. C. B. Phillips (1994). Vector autoregressions and causality: a theoretical overview and simulation study. Econometric Reviews, 13, 259-285.

Toda, H. Y. and T. Yamamoto (1995). Statistical inferences in vector autoregressions with possibly integrated processes. Journal of Econometrics, 66, 225-250.

Zapata, H. O. and A. N. Rambaldi (1997). Monte Carlo evidence on cointegration and causation. Oxford Bulletin of Economics and Statistics, 59, 285-298.




© 2011, David E. Giles

34 comments:

  1. I am a bit ashamed to say this, but my coworkers and I have been kicking this post around and we are unclear on the 'why' of step 3 "Set up a VAR model in the levels (not the differences) of the data, regardless of the orders of integration of the various time-series."

    We've flipped through our text books from back in grad school and couldn't find the answer. Could you help?

  2. Anonymous: Thanks for the comment. Not your fault!
    There's a bit more information in my April post (linked in the post above). You won't find anything about this in any of the texts written prior to 1994. Indeed, even recent general grad. econometric texts don't cover it - you'd need to look at something like Helmut Lütkepohl's "Multiple Time Series" text. This is a great example of the books lagging behind the theory (and practice, actually). The point is this. If the data are non-stationary then the usual Wald test (or the LRT for that matter) for testing the restrictions involved in causality doesn't have its usual asymptotic (chi square) distribution. The distribution is non-standard and involves unknown "nuisance" parameters, so it can't be tabulated, and you don't have proper critical values to use - even with an infinite amount of data. Now, there are basically 2 equivalent ways to deal with this, the simpler of which to apply is the Toda-Yamamoto "trick". That's all it is - a trick to "fix up" the distribution of the Wald test statistic so it is asymptotically chi square. You fit the model in the levels (counter-intuitive, I know, if the data are non-stationary). It's the ADDITION of the extra lags (that are NOT included in the formulation of the test) that gets you the result you want.

    Two things to note: (1) This will still be OK even if the data are stationary, so you can use the T-Y approach as an insurance policy, if you even "suspect" that one or more of the series may be I(1) or I(2); (2) This model in the levels, with the extra lags, is ONLY for causality testing. It's not to be used for forecasting, impulse response function analysis, or anything else. For those purposes you would still use a VAR in the differences, if the data were I(1) but not cointegrated, or a VECM if the data are in fact cointegrated.

    DO take a look at the T-Y paper: even just the abstract, intro. and conclusions. It really will help. I hope that these comments do too.

  3. Dear Prof. Giles,

    To begin with, thank you very much for your extraordinarily clarifying blog. A blog like yours is, I think, an excellent example of how science and teaching can work in the 21st century.

    This and your entry about Granger causality explain the procedure sufficiently, while e.g. in Lütkepohl (it is an excellent book nevertheless) these issues are presented less clearly and are, I think, difficult for many students to understand. Actually, I have already read some (recently) published papers where Granger causality tests were implemented in a questionable way. The preferred procedure (or any other mentioned in Lütkepohl) does not always seem to be known in applied work. Is there some published material that explains testing Granger causality with respect to VEC, VECM, integrated and cointegrated data, etc. in a concise and lucid way like your blog entry? If not, would a clarifying methodological published note not be worth it?

    By the way, another methodological question. Testing for cointegration first and choosing the model for the causality test conditional on the first test is "preliminary test testing". However, why is testing for the order of integration first, and including additional lags for the causality test conditional on the first test, not basically "preliminary test testing" in a similar way? (Additional lags are supposedly not the same as a different model [VECM] as a whole.)

    I will keep in touch with your great blog!

    Kind regards,
    Georg

    1. Georg: Thank you for your kind comments. It's good to know that the blog is being helpful.

      Regarding your first question, I can't think of an easy-to-read piece of material that's published. It's no doubt something that people would find helpful, though.

      Regarding your point about pre-testing: Yes! Absolutely - there are important pre-testing issues when you (i) test for unit roots, and then subsequently test for cointegration; (ii) test for unit roots (and/or cointegration) & then test for Granger non-causality, etc.

      I've published a number of papers on pre-testing in the past - see my c.v. at
      web.uvic.ca/~dgiles/dgiles_cv.pdf .

      I have drafts of a couple of posts on pre-testing in general that I plan to put on the blog before too long: one on pre-test estimation; and one on pre-test testing.

      Hopefully, these will be of some interest.

  4. Dear Prof Giles,
    Aside from the reason you posted in your previous blog entry: "This might occur if your sample size is too small to satisfy the asymptotics that the cointegration and causality tests rely on."
    Is there any other reason why there might be no Granger causality between two cointegrated variables?
    I am investigating oil price benchmarks in real effective exchange rates. In particular, with the Chinese yuan. I have performed the Granger causality test as you have outlined (very clear and helpful by the way) but there is none present. I'm using data from 1994 to the present, seasonal dummy variables are used (monthly and exogenous) and even when I omit the financial crisis data from 2008 onwards, there is still no Granger causality.
    Many thanks in advance for your help!

    Kindest regards
    Anonymous

    1. Thanks for your question. Despite the data you have omitted, there could be structural breaks that are affecting either the cointegration testing or the causality testing. You say you have monthly data, so another possibility is that there are seasonal unit roots and/or seasonal cointegration.

  5. Dear Prof. Giles,
    When you talk about a "VECM model as the basis for non-causality testing" which testing procedure are you referring to? Is it the likelihood ratio test due to Mosconi/Giannini 1992?

    Are these Granger causality tests in a VECM context implemented in any standard econometrics software (I am using Stata but I could not find any Granger causality test in a VECM framework)?

    Thanks to you I can see the problem of a pretest bias when conducting tests in a VECM. But - given that we have cointegrated variables - shouldn't these tests be more efficient as we impose correct and more specific restrictions? Is it perhaps that the negative pretest bias is stronger than the effect of imposing valid restrictions?

    Thanks for this great blog!
    Best regards,
    Manuel

    1. Manuel: Thanks for the comment.

      Yes, I had in mind tests like the Mosconi-Giannini test.

      I'm not aware of this test being incorporated in any of the standard econometrics packages, but other readers of this blog may be able to help on this point.

      You are right that there is a trade-off between the loss of power arising from the pre-test testing, and the gain in power when we impose correct restrictions. This is a common problem, and at the end of the day the net effect will depend on the particular problem we're looking at.

  6. Dear Prof. Giles,

    I would like to ask about the coefficient of the ECT. Some researchers say the coefficient of the ECT is considered good if it lies in the range 0-1. What do you think, Prof.? Please advise.
    Thanks.

    1. Thanks for the question. We want the coefficient of an ECT to be negative, and we'd like it to be statistically significant.

    2. Dear Prof. Giles,

      Wouldn't the desired sign of the coefficient estimate of the ECT be based on which line of the VECM system we're looking at? For example, if we have the second row of the most simple bivariate VECM:

      Delta*x_t = alpha*(y_{t-1}-beta*x_{t-1}) + e_t

      then we would want alpha to be positive such that when y_{t-1} gets "too big", the process x will increase over the next period to correct the disequilibrium?

  7. Dear Prof Giles,

    I am currently researching whether remittances Granger-cause GDP and health expenditure. I have tested and transformed for stationarity (the time series are I(2) processes), selected lag lengths using AIC etc., and following this I had originally planned to simply model my data using a VAR and then implement the Granger test in Stata. After reading your extremely useful (thank you!) blog posts I feel I need to employ a test for cointegration for each set of variables (i.e. remittances and GDP, remittances and health expenditure) and then decide whether my data must be modelled using a VAR or a VECM, rather than go straight to a VAR. Am I correct in my thinking? Econometrics is not my strong point!
    Thanks in advance.

  8. Thanks for your comment.

    The evidence to hand suggests that it is preferable to test for Granger causality using a levels VAR model (modified as per the Toda-Yamamoto procedure), rather than using a VECM model for causality testing.

    If you are using STATA, note that the Granger test there does NOT make the required Toda-Yamamoto adjustment. You will need to include (but not test) 2 extra lags of each variable, as some of your data are I(2).

  9. Dear Prof Giles,

    Once again, I cannot thank you enough for what can only be described as a truly fantastic institution (your blog).

    Having read the TY (1995) paper, and undertaken some tests, I was looking to go further and do some kind of robustness check within a VECM framework, but I am struggling to find any commercial software which tests the restriction which I believe originates in Mosconi-Giannini, and which I believe is being tested in the EXCELLENT Clarke and Mirza paper - that is: the product of the two relevant elements of the cointegrating vector and error correction mechanism, and the coefficients on the lagged differenced variables, are jointly equal to zero.

    If this is something which I want to pursue further, am I going to have to write up a MATLAB file or similar? Can this be coded into EViews somehow? I can obviously estimate the VECM, then estimate this as a system equation by equation, and jointly test the \alpha=\differenced coefficients=0... but this is not quite what we're after, as it is a test which is restricting the whole cointegrating relationship, not just one variable.

    Do you have any suggestions? Presumably, Clarke and Mirza write up their own proprietary code, but this is something which I would obviously be keen to avoid, if at all possible!

    Best wishes, thanks again for all of your hard work that goes into the blog!

    1. Thanks for the kind comments. I'm glad that the blog is helpful.

      I'll talk with Judith Clarke and see what can be done to get you some code, etc.

  10. Dear Prof Giles,
    thank you very much for this helpful blog.
    I am at the first stages of learning econometrics. I am sorry to ask this simple question, but may I know whether, if the time series data are I(0) and I(1) (mixed orders of integration), we can employ Granger causality testing based on a VECM?
    thanks in advance

    1. The VECM model is only defined when the time-series are cointegrated. For this to be the case the series need to be integrated of the same order. So, the answer to your question is "no".

    2. Dear Prof,

      Does a VECM show the direction of dependence in the long run? E.g., if we found that four stock market indices are cointegrated, can a VECM show which is the dependent market in the long run?

      Thank you.

    3. No - this is a matter for causality testing.

      DG

  11. Dear Professor Giles,

    Pesaran's bound test approach is a way to test cointegration when underlying series are not integrated to the same order (am I right on this point?). If this is the case, is there a way to test causality under this situation? Thanks and regards, Kamrul, Murdoch University, Perth, WA

  12. Dear prof,

    How to know which coefficient is significant in the VECM output?

    1. The estimated coefficients will be asymptotically Normal, so if you have a big enough sample, treat the t-statistics as if they are z-statistics.

    2. Thank you prof.

      One more question: if I have four variables (stock indices) and the JJ cointegration test shows two cointegrating equations, should I run the VECM with two cointegrating equations, or run it with one cointegrating equation at a time? I ask because I want to know which index is influenced by the other in the long run.

      Thanks.

    3. Dear Prof Dave Giles,
      I found your posts regarding the T-Y approach to Granger non-causality very helpful. Thank you! But my question is: is it for short-run or long-run Granger causality?

  13. It's short run - one period. See my response (with a reference) to the same question on the "Testing for Granger Causality" post a couple of days ago.

    DG

  14. Dear Prof Dave Giles,

    If the equation contains only 2 variables (one dependent and only one independent variables) and dependent variable is I(0) while independent variable is I(1), can I test Engle and Granger cointegration based on this kind of data?

    And if after the test of cointegration, can I continue to test VEC(if it is cointegrated) and VAR(if it is not cointegrated)?

    Thank you very much in advance.

    1. No - the whole concept of cointegration is based on variables that are integrated of the same order. So, if the variables are all I(1) it makes sense to test if a linear combination of them is I(0). If such a linear combination exists, we say the original variables are cointegrated. If you have just 2 variables, one I(1) and one I(0), cointegration isn't possible.

  15. Dear Prof Dave Giles,

    Thank you for your prompt reply.

    I would like to ask you another question regarding the critical values in unit root testing. I use Stata to run the ADF test at the unit root step, and my data contain 239 observations. In order to determine whether a series is I(0), can I compare the t-value directly with the critical values in the table of ADF results, or do I have to use MacKinnon's critical values for the ADF test?

    Thank you very much for the help.

    1. I'm not a STATA user - you'd need to check if the critical values are the asymptotic ones or exact ones from MacKinnon. In EViews, the exact ones are used, together with p-values. If you have n=239, there won't be much difference between exact and asymptotic values, but if in doubt, use the MacKinnon values. And check the STATA manual or "help" - it's important to know what the package is giving you. :-)

  16. Dear Prof. Giles,

    I'm currently working on a VAR model with one I(0) variable and one I(1) variable. Is there any theoretical foundation on how to do this? Most papers write about VAR models based on differences or levels. Can I model with a differenced and a non-differenced variable? Thank you very much for your time and very clear explanations on this blog!

    Kind regards,
    Robin

    1. Robin - if it's causality testing that you're interested in, see this post: http://davegiles.blogspot.ca/2011/04/testing-for-granger-causality.html

      Putting causality to one side, if you just want to fit the VAR and use it for forecasting or impulse response functions, you have 2 options:
      1. Use the level of the I(0) variable & the first-difference of the I(1) variable.

      2. Difference both variables. The differenced I(0) variable will still be stationary. There is risk of over-differencing the I(0) variable, but overall I'd prefer to choose this option.

      I hope this helps.

  17. Dear prof Giles
    My research title is "Tourism-led growth hypothesis: case study of Libya", and I would like to investigate the relationship between tourism and economic growth. My data are annual, from 1995-2010, and my variables are GDP, international receipts, and the unemployment rate. I would also like to investigate the short-run and long-run relationships and the causality between these variables.
    I would like to ask what steps I should follow to investigate the relationship and causality between the variables.
    My regards
    Nagma

    1. Nagma: I have spelled out the steps in detail in my post here:
      http://davegiles.blogspot.ca/2011/04/testing-for-granger-causality.html

      DG
