## Monday, September 30, 2013

### Solution to the Regression Trick

In a post earlier this month, I posed the following problem:

A researcher wishes to estimate the regression of y on X by OLS, but does not wish to include an intercept term in the model. Unfortunately, the only econometrics package available is one that "automatically" includes the intercept term. A colleague suggests that the following approach may be used to ‘trick’ the computer package into giving the desired result – namely a regression fitted through the origin:
Enter each data point twice, once with the desired signs for the data, and then with the opposite signs. That is, the sample would involve ‘2n’ observations – the first ‘n’ of them would be of the form (yi, xi') and the next ‘n’ of them would be of the form (-yi, -xi'). Then fit the model (with the intercept) using all ‘2n’ observations, and the estimated slope coefficients will be the same as if the model had been fitted with just the first ‘n’ observations but no intercept.
Is your colleague's suggestion going to work?
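
A quick numerical check can help here. The sketch below is illustrative only: it uses simulated data, and the sample size, coefficients, and variable names are all assumptions of mine rather than anything from the exercise. It stacks (y, X) on top of (-y, -X), fits OLS with an intercept to the 2n observations, and compares the slopes with OLS through the origin on the first n observations.

```python
import numpy as np

# Simulated data (hypothetical values, chosen only for illustration).
rng = np.random.default_rng(0)
n, k = 50, 3
X = rng.normal(size=(n, k)) + 1.0  # regressors with non-zero means
y = 2.0 + X @ np.array([1.0, -0.5, 0.25]) + rng.normal(size=n)

# OLS through the origin, using the original n observations.
b_origin, *_ = np.linalg.lstsq(X, y, rcond=None)

# The "trick": stack the data with its sign-reversed copy (2n observations),
# then fit a model that includes an intercept column.
X2 = np.vstack([X, -X])
y2 = np.concatenate([y, -y])
Z = np.column_stack([np.ones(2 * n), X2])
b_trick, *_ = np.linalg.lstsq(Z, y2, rcond=None)

intercept, slopes = b_trick[0], b_trick[1:]
print("Through-origin slopes:", b_origin)
print("Trick slopes:         ", slopes)
print("Trick intercept:      ", intercept)
```

The reason the two sets of slopes agree is that every variable in the stacked sample has a mean of exactly zero, so the column of ones is orthogonal to the stacked regressors. The fitted intercept is then zero, and the slope estimator reduces to (2X'X)^(-1)(2X'y) = (X'X)^(-1)X'y. Note that the standard errors reported by the package would not match those from the through-origin regression: the residual sum of squares is doubled, and the degrees of freedom used are 2n - k - 1 rather than n - k.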

## Friday, September 27, 2013

### More Interesting Papers to Read

Here's my latest list of suggested reading:

• Bayer, C. and C. Hanck, 2012. Combining non-cointegration tests. Journal of Time Series Analysis, DOI: 10.1111/j.1467-9892.2012.814.x
• Cipollina, M., L. De Benedictis, L. Salvatici, and C. Vicarelli, 2013. A note on dummies for policies in gravity models: A Monte Carlo experiment. Working Paper no. 180, Dipartimento di Economia, Università degli studi Roma Tre.
• Fair, R. C., 2013. Reflections on macroeconometric modelling. Cowles Foundation Discussion Paper No. 1908, Yale University.
• Kourouklis, S., 2012. A new estimator of the variance based on minimizing mean squared error. The American Statistician, 66, 234-236.
• Kulish, M. and A. R. Pagan, 2013. Issues in estimating new-Keynesian Phillips curves in the presence of unknown structural change. Research Discussion Paper, RDP 2012-11, Reserve Bank of Australia.
• Little, R. J., 2013. In praise of simplicity, not mathematistry! Ten simple powerful ideas for the statistical scientist. Journal of the American Statistical Association, 108, 359-369.
• Zhang, L., X. Xu, and G. Chen, 2012. The exact likelihood ratio test for equality of two normal populations. The American Statistician, 66, 180-184.

## Wednesday, September 25, 2013

### New Working Paper

Yanan Li (a former graduate student) and I have just released a new Working Paper. It's titled, "Modelling Volatility Spillover Effects Between Developed Stock Markets and Developing Asian Stock Markets".

If you're interested, you can download a copy of the paper from here.

## Friday, September 20, 2013

### Roger Farmer on the Natural Rate Hypothesis and the Phillips Curve

Following yesterday's post about the Phillips Curve, Roger Farmer kindly emailed me and drew my attention to some of his related work.

One of his articles appeared recently in the Bank of England's Quarterly Bulletin - see here. It's titled, "The Natural Rate Hypothesis: An Idea Past its Sell-By Date". The "quick summary" is as follows:
• "Central banks throughout the world predict inflation with New Keynesian models where, after a shock, the unemployment rate returns to its so-called ‘natural rate’. That assumption is called the Natural Rate Hypothesis (NRH).
• This paper reviews a body of work, published over the past decade, in which I argue that the NRH does not hold in the data and provide an alternative paradigm that explains why it does not hold.
• I replace the NRH with the assumption that the animal spirits of investors are a fundamental of the economy that can be modelled by a ‘belief function’. I show how to operationalise that idea by constructing an empirical model that outperforms the New Keynesian Phillips Curve."
On p.246 of his article, Roger has a very nice illustrated summary of the estimation of the first Phillips Curve.

### On Zero Correlation and Statistical Independence

I put the following material together yesterday in response to a request from one of our grad. students. I thought it might be helpful to some readers of the blog.
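
As a small illustration of the distinction in the title, here is a standard textbook example (my own sketch, and not necessarily the one used in the original material). If x is standard normal and y = x², then Cov(x, y) = E[x³] = 0, so the two variables are uncorrelated, yet y is completely determined by x. The seed, sample size, and variable names are arbitrary choices.

```python
import numpy as np

# Draw x from a symmetric distribution and define y as a deterministic
# function of x. The two are dependent by construction.
rng = np.random.default_rng(42)
x = rng.standard_normal(100_000)
y = x ** 2

# The sample correlation between x and y is close to zero ...
corr_xy = np.corrcoef(x, y)[0, 1]

# ... but the dependence shows up once we look at |x| instead of x.
corr_absx_y = np.corrcoef(np.abs(x), y)[0, 1]

print("corr(x, y)   =", corr_xy)
print("corr(|x|, y) =", corr_absx_y)
```

Zero correlation rules out only a linear association. Independence is a much stronger condition, although the two coincide in special cases such as the bivariate normal distribution.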

## Thursday, September 19, 2013

### More on the History of the Phillips Curve

I've had two posts about A. W. (Bill) H. Phillips in the past - here and here. This is Phillips of Phillips Curve fame, of course.

Recently, Michael Mernagh has written two pieces about Phillips' original analysis for the online version of Significance Magazine, a joint publication of the American Statistical Association and the Royal Statistical Society. These pieces are titled A Short Overview of the Phillips Curve, and The Phillips Curve Revisited.

If you have an interest in the history of macroeconometrics, or the contributions of Bill Phillips, then Michael's short articles will interest you.

### P-Values, Statistical Significance, and Logistic Regression

Yesterday, William M. Briggs ("Statistician to the Stars") posted on his blog a piece titled "How to Mislead With P-values: Logistic Regression Example".

Here are some extracts which, hopefully, will encourage you to read the post:

"It’s too easy to generate “significant” answers which are anything but significant. Here’s yet more—how much do you need!—proof. The pictures below show how easy it is to falsely generate “significance” by the simple trick of adding “independent” or “control variables” to logistic regression models, something which everybody does...............

Logistic regression is a common method to identify whether exposure is “statistically significant”. .... (The) Idea is simple enough: data showing whether people have the malady or not and whether they were exposed or not is fed into the model. If the parameter associated with exposure has a wee p-value, then exposure is believed to be trouble.
So, given our assumption that the probability of having the malady is identical in both groups, a logistic regression fed data consonant with our assumption shouldn’t show wee p-values. And the model won’t, most of the time. But it can be fooled into doing so, and easily. Here’s how.
Not just exposed/not-exposed data is input to these models, but “controls” are, too; sometimes called “independent” or “control variables.” These are things which might affect the chance of developing the malady. Age, sex, weight or BMI, smoking status, prior medical history, education, and on and on. Indeed models which don’t use controls aren’t considered terribly scientific.
Let’s control for things in our model, using the same data consonant with probabilities (of having the malady) the same in both groups. The model should show the same non-statistically significant p-value for the exposure parameter, right? Well, it won’t. The p-value for exposure will on average become wee-er (yes, wee-er). Add in a second control and the exposure p-value becomes wee-er still. Keep going and eventually you have a “statistically significant” model which “proves” exposure’s evil effects. Nice, right?"
Oh yes - don't forget to read the responses/comments for this post, here.

## Tuesday, September 17, 2013

### Another Regression Trick

Here's an exercise that I've set for one of my econometrics classes this week.

A researcher wishes to estimate the regression of y on X by OLS, but does not wish to include an intercept term in the model. Unfortunately, the only econometrics package available is one that "automatically" includes the intercept term. A colleague suggests that the following approach may be used to ‘trick’ the computer package into giving the desired result – namely a regression fitted through the origin:
Enter each data point twice, once with the desired signs for the data, and then with the opposite signs. That is, the sample would involve ‘2n’ observations – the first ‘n’ of them would be of the form (yi, xi') and the next ‘n’ of them would be of the form (-yi, -xi'). Then fit the model (with the intercept) using all ‘2n’ observations, and the estimated slope coefficients will be the same as if the model had been fitted with just the first ‘n’ observations but no intercept.
Is your colleague's suggestion going to work?

I'll provide the answer after my students have completed their assignment.

## Sunday, September 15, 2013

### "Replication" and "Reproducibility"

We might be inclined to use the terms "replicate" and "reproduce" interchangeably. However, in the context of scientific verification, a distinction has been drawn between them.

The 2 December 2011 issue of Science devoted a special section to "Data Replication and Reproducibility". Among the contributors was Roger Peng, whom I follow on the Simply Statistics blog. In one of his posts, Roger defines the two terms under discussion:

"......I define “replication” as independent people going out and collecting new data and “reproducibility” as independent people analyzing the same data. Apparently, others have the reverse definitions for the two words. The confusion is unfortunate because one idea has a centuries long history whereas the importance of the other idea has only recently become relevant. I’m going to stick to my guns here but we’ll have to see how the language evolves."
And the discussion has continued.

Recently, Roger has produced a three-part post, titled "Treading a New Path for Reproducible Research": Part 1; Part 2; Part 3.

It's definitely worth a read if you're involved in data-based research.

## Wednesday, September 11, 2013

### Can Your Results be Replicated?

It's a "given" that your empirical results should be able to be replicated by others. That's why more and more journals are encouraging or requiring that authors of such papers "deposit" their data and code with the journal as a condition of acceptance for publication.

That's all well and good. However, replicating someone's results using their data and code may not mean very much!

## Sunday, September 8, 2013

### Ten Things for Applied Econometricians to Keep in Mind

No "must do" list is ever going to be complete, let alone perfect. This is certainly true when it comes to itemizing essential ground-rules for all of us when we embark on applying our knowledge of econometrics.

That said, here's a list of ten things that I like my students to keep in mind:
1. Always, but always, plot your data.
2. Remember that data quality is at least as important as data quantity.
3. Always ask yourself, "Do these results make economic/common sense"?
4. Check whether your "statistically significant" results are also "numerically/economically significant".
5. Be sure that you know exactly what assumptions are used/needed to obtain the results relating to the properties of any estimator or test that you use.
6. Just because someone else has used a particular approach to analyse a problem that looks like yours, that doesn't mean they were right!
7. "Test, test, test"! (David Hendry). But don't forget that "pre-testing" raises some important issues of its own.
8. Don't assume that the computer code that someone gives to you is relevant for your application, or that it even produces correct results.
9. Keep in mind that published results represent only a fraction of the results that the author actually obtained; the rest go unpublished.
10. Don't forget that "peer-reviewed" does NOT mean "correct results", or even "best practices were followed".
I'm sure you can suggest how this list can be extended!

## Saturday, September 7, 2013

### More on Multiple Bubbles

In a recent post I highlighted a new EViews Add-in package, written by Itamar Caspi. His rtdaf package facilitates the application of the Right-Tail Augmented Dickey-Fuller tests that are "....designed to detect the presence of an unobserved bubble component in an observed asset price and to date-stamp its occurrence".

In part, the testing procedures are based on Phillips et al. (2013b). If you're following this literature, there are two other recent papers by those authors that are must-reads - Phillips et al. (2013a, b).

References

Phillips, P. C. B., S. Shi, and J. Yu, 2013a. Specification sensitivity in right-tailed unit root testing for explosive behaviour. Oxford Bulletin of Economics and Statistics, forthcoming.

Phillips, P., S. Shi, and J. Yu, 2013b. Testing for Multiple Bubbles 1: Historical episodes of exuberance and collapse in the S&P 500. Working paper.

Phillips, P., S. Shi, and J. Yu, 2013c. Testing for Multiple Bubbles 2: Limit theory of real time detectors. SMU Economics and Statistics Working Paper Series, No. 05-2013.

## Friday, September 6, 2013

For better or worse, here are some of the papers I've been reading lately:
• Chambers, M. J., J. S. Ercolani, and A. M. R. Taylor, 2013. Testing for seasonal unit roots by frequency domain regression. Journal of Econometrics, in press.
• Chicu, M. and M. A. Masten, 2013. A specification test for discrete choice models. Economics Letters, in press.
• Hansen, P. R. and A. Lunde, 2013. Estimating the persistence and the autocorrelation function of a time series that is measured with error. Econometric Theory, in press.
• Liu, Y., J. Liu, and F. Zhang, 2013. Bias analysis for misclassification in a multicategorical exposure in a logistic regression model. Statistics and Probability Letters, in press.
• Thornton, M., 2013. The aggregation of dynamic relationships caused by incomplete information. Journal of Econometrics, in press.
• Wang, H. and S. Z. F. Zhou, 2013. Interval estimation by frequentist model averaging. Communications in Statistics - Theory and Methods, in press.