Saturday, December 31, 2016

New Year's Reading

New Year's resolution - read more Econometrics!
  • B├╝rgi, C., 2016. What do we lose when we average expectations? RPF Working Paper No. 2016-013, Department of Economics, George Washington University.
  • Cox, D.R., 2016. Some pioneers of modern statistical theory: A personal reflection. Biometrika, 103, 747-759.
  • Golden, R.M., S.S. Henley, H. White, & T.M. Kashner, 2016. Generalized information matrix tests for detecting model misspecification. Econometrics, 4, 46; doi:10.3390/econometrics4040046.
  • Phillips, G.D.A. & Y. Xu, 2016. Almost unbiased variance estimation in simultaneous equations models. Working Paper No. E2016/10, Cardiff Business School, University of Cardiff. 
  • Siliverstovs, B., 2016. Short-term forecasting with mixed-frequency data: A MIDASSO approach. Applied Economics, 49, 1326-1343.
  • Vosseler, A. & E. Weber, 2016. Bayesian analysis of periodic unit roots in the presence of a break. Applied Economics, online.
Best wishes for 2017, and thanks for supporting this blog!

© 2016, David E. Giles

Thursday, December 29, 2016

Why Not Join The Replication Network?

I've been a member of The Replication Network (TRN) for some time now, and I commend it to you.

I received the End-of-the-Year Update for the TRN today, and I'm taking the liberty of reproducing it below in its entirety in the hope that you may consider getting involved.

Here it is:

Wednesday, December 28, 2016

More on the History of Distributed Lag Models

In a follow-up to my recent post about Irving Fisher's contribution to the development of distributed lag models, Mike Belongia emailed me again with some very interesting material. He commented:
"While working with Peter Ireland to create a model of the business cycle based on what were mainstream ideas of the 1920s (including a monetary policy rule suggested by Holbrook Working), I ran across this note on Fisher's "short cut" method to deal with computational complexities (in his day) of non-linear relationships. 
I look forward to your follow-up post on Almon lags and hope Fisher's old, and sadly obscure, note adds some historical context to work on distributed lags."
It certainly does, Mike, and thank you very much for sharing this with us.

The note in question is titled, "Irving Fisher: Pioneer on distributed lags", and was written by J.N.M. Wit (of the Netherlands central bank) in 1998. If you don't have time to read the full version, here's the abstract:
"The theory of distributed lags is that any cause produces a supposed effect only after some lag in time, and that this effect is not felt all at once, but is distributed over a number of points in time. Irving Fisher initiated this theory and provided an empirical methodology in the 1920’s. This article provides a small overview."
Incidentally, the paper co-authored with Peter Ireland that Mike is referring to is titled, "A classical view of the business cycle", and can be found here.

© 2016, David E. Giles

Tuesday, December 27, 2016

More on Orthogonal Regression

Some time ago I wrote a post about orthogonal regression. This is where we fit a regression line so that we minimize the sum of the squares of the orthogonal (rather than vertical) distances from the data points to the regression line.

Subsequently, I received the following email comment:
"Thanks for this blog post. I enjoyed reading it. I'm wondering how straightforward you think this would be to extend orthogonal regression to the case of two independent variables? Assume both independent variables are meaningfully measured in the same units."
Well, we don't have to make the latter assumption about units in order to answer this question. And we don't have to limit ourselves to just two regressors. Let's suppose that we have p of them.

In fact, I hint at the answer to the question posed above towards the end of my earlier post, when I say, "Finally, it will come as no surprise to hear that there's a close connection between orthogonal least squares and principal components analysis."

What was I referring to, exactly?
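To give a flavour of that connection, here's a minimal sketch in Python (the data are simulated, and the variable names are mine). Orthogonal least squares with p regressors amounts to taking the last principal component of the centred data matrix formed by stacking the regressors and the dependent variable, and reading the fitted hyperplane off its normal vector:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated data: p = 2 regressors (no need for common units)
n, p = 500, 2
X = rng.normal(size=(n, p))
y = X @ np.array([1.5, -0.8]) + rng.normal(scale=0.5, size=n)

# Stack all of the variables together and centre them
Z = np.column_stack([X, y])
Zc = Z - Z.mean(axis=0)

# The orthogonal-regression hyperplane is orthogonal to the last
# principal component (smallest singular value) of the centred data
_, _, Vt = np.linalg.svd(Zc, full_matrices=False)
v = Vt[-1]                     # normal vector of the fitted hyperplane

# Re-express the normal vector as slope coefficients for y on X
beta_tls = -v[:p] / v[p]
print(beta_tls)
```

Because the last principal component minimizes the sum of squared perpendicular distances to the fitted hyperplane, this is exactly the multivariate analogue of the two-variable orthogonal regression discussed in the earlier post.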

Monday, December 26, 2016

Specification Testing With Very Large Samples

I received the following email query a while back:
"It's my understanding that in the event that you have a large sample size (in my case, > 2million obs) many tests for functional form mis-specification will report statistically significant results purely on the basis that the sample size is large. In this situation, how can one reasonably test for misspecification?" 
Well, to begin with, that's absolutely correct - if the sample size is very, very large then almost any null hypothesis will be rejected (at conventional significance levels). For instance, see this earlier post of mine.
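To see the point in miniature, here's a toy simulation (mine, not from the post, and assuming SciPy is available): the true mean is a negligible 0.01 rather than the hypothesized zero, and the t-test duly "detects" this once the sample gets big enough.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(123)

# A trivially false null: the true mean is 0.01, but H0 says it is 0.
# As n grows, the tiny effect becomes "statistically significant".
pvals = []
for n in (1_000, 100_000, 2_000_000):
    x = rng.normal(loc=0.01, scale=1.0, size=n)
    t_stat, p_val = stats.ttest_1samp(x, popmean=0.0)
    pvals.append(p_val)
    print(n, p_val)
```

Nothing about the economic importance of the effect has changed across the three samples; only the power of the test has.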

Shmueli (2012) also addresses this point from the p-value perspective.

But the question was, what can we do in this situation if we want to test for functional form mis-specification?

Shmueli offers some general suggestions that could be applied to this specific question:
  1. Present effect sizes.
  2. Report confidence intervals.
  3. Use (certain types of) charts.
This is followed by an empirical example relating to auction prices for camera sales on eBay, using a sample size of n = 341,136.

To this, I'd add, consider alternative functional forms and use ex post forecast performance and cross-validation to choose a preferred functional form for your model.
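As a rough sketch of what that cross-validation step might look like (simulated data, and the candidate functional forms are my own choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(7)

# Simulated data generated from a log-linear relationship
n = 5000
x = rng.uniform(1.0, 10.0, size=n)
y = 2.0 + 1.5 * np.log(x) + rng.normal(scale=0.3, size=n)

def cv_mse(design, y, k=5):
    """k-fold cross-validated MSE for OLS with the given design matrix."""
    idx = np.arange(len(y))
    np.random.default_rng(0).shuffle(idx)
    mse = 0.0
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        beta, *_ = np.linalg.lstsq(design[train], y[train], rcond=None)
        resid = y[fold] - design[fold] @ beta
        mse += np.sum(resid ** 2)
    return mse / len(y)

# Two competing functional forms: linear vs. log-linear
linear = np.column_stack([np.ones(n), x])
loglin = np.column_stack([np.ones(n), np.log(x)])

print(cv_mse(linear, y), cv_mse(loglin, y))
```

Here the log-linear specification delivers the smaller out-of-sample MSE, and it would be preferred on predictive grounds without any reference to a p-value.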

You don't always have to use conventional hypothesis testing for this purpose.


Shmueli, G., 2012. Too big to fail: Large samples and the p-value problem. Mimeo., Institute of Service Science, National Tsing Hua University, Taiwan.

© 2016, David E. Giles

Irving Fisher & Distributed Lags

Some time back, Mike Belongia (U. Mississippi) emailed me as follows: 
"I enjoyed your post on Shirley Almon;  her name was very familiar to those of us of a certain age.
With regard to your planned follow-up post, I thought you might enjoy the attached piece by Irving Fisher who, in 1925, was attempting to associate variations in the price level with the volume of trade.  At the bottom of p. 183, he claims that "So far as I know this is the first attempt to distribute a statistical lag" and then goes on to explain his approach to the question.  Among other things, I'm still struck by the fact that Fisher's "computer" consisted of his intellect and a pencil and paper."
The 1925 paper by Fisher that Mike is referring to can be found here. Here are pages 183 and 184:

Thanks for sharing this interesting bit of econometrics history, Mike. And I haven't forgotten that I promised to prepare a follow-up post on the Almon estimator!

© 2016, David E. Giles

Sunday, December 18, 2016

Not All Measures of GDP are Created Equal

A big hat-tip to one of my former grad. students, Ryan MacDonald at Statistics Canada, for bringing to my attention a really informative C.D. Howe Institute Working Paper by Philip Cross (former Chief Economic Analyst at Statistics Canada).

We all know what's meant by Gross Domestic Product (GDP), don't we? O.K., but do you know that there are lots of different ways of calculating GDP, including the six that Philip discusses in detail in his paper, namely:
  • GDP by industry
  • GDP by expenditure
  • GDP by income
  • The quantity equation
  • GDP by input/output
  • GDP by factor input
So why does this matter?

Well, for one thing - and this is one of the major themes of Philip's paper - how we view (and compute) GDP has important implications for policy-making. And, it's important to be aware that different ways of measuring GDP can result in different numbers.

For instance, consider this chart from p.16 of Philip's paper:

My first reaction when I saw this was "it's not flat". However, as RMM has commented below, "the line actually shows us the fluctuations of industries that are more intermediate compared with industries (or the total) that includes only final goods. Interesting and useful for business cycle analysis..."

Here's my take-away (p.18 of the paper):
"For statisticians, the different measures of GDP act as an internal check on their conceptual and empirical consistency. For economists, the different optics for viewing economic activity lead to a more profound understanding of the process of economic growth. Good analysis and policy prescription often depend on finding the right optic to understand a particular problem."
Let's all keep this in mind when we look at the "raw numbers".

© 2016, David E. Giles

Wednesday, December 14, 2016

Stephen E. Fienberg, 1942-2016

The passing of Stephen Fienberg today is another huge loss for the statistics community. Carnegie Mellon University released this obituary this morning.

Steve was born and raised in Toronto, and completed his undergraduate training in mathematics and statistics at the University of Toronto before moving to Harvard University for his Ph.D. His contributions to statistics, and to the promotion of statistical science, were immense.

As the CMU News noted:
"His many honors include the 1982 Committee of Presidents of Statistical Societies President's Award for Outstanding Statistician Under the Age of 40; the 2002 ASA Samuel S. Wilks Award for his distinguished career in statistics; the first Statistical Society of Canada's Lise Manchester Award in 2008 to recognize excellence in state-of-the-art statistical work on problems of public interest; the 2015 National Institute of Statistical Sciences Jerome Sacks Award for Cross-Disciplinary Research; the 2015 R.A. Fisher Lecture Award from the Committee of Presidents of Statistical Societies and the ISBA 2016 Zellner Medal. 
Fienberg published more than 500 technical papers, brief papers, editorials and discussions. He edited 19 books, reports and other volumes and co-authored seven books, including 1999's "Who Counts? The Politics of Census-Taking in Contemporary America," which he called "one of his proudest achievements.""
There are at least three terrific interviews with Steve to remind us of the breadth of his contributions:

© 2016, David E. Giles

Monday, December 5, 2016

Monte Carlo Simulation Basics, III: Regression Model Estimators

This post is the third in a series of posts that I'm writing about Monte Carlo (MC) simulation, especially as it applies to econometrics. If you've already seen the first two posts in the series (here and here) then you'll know that my intention is to provide a very elementary introduction to this topic. There are lots of details that I've been avoiding, deliberately.

In this post we're going to pick up from where the previous post about estimator properties based on the sampling distribution left off. Specifically, I'll be applying the ideas that were introduced in that post in the context of regression analysis. We'll take a look at the properties of the Least Squares estimator in three different situations. In doing so, I'll be able to illustrate, through simulation, some "text book" results that you will know about already.

If you haven't read the immediately preceding post in this series already, I urge you to do so before continuing. The material and terminology that follow will assume that you have.
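As a taste of what's to come, here's a bare-bones sketch of the kind of experiment involved (my own toy setup, not the one used in the post): hold the design matrix fixed, draw fresh errors many times, and look at the mean of the resulting OLS estimates against the true coefficients.

```python
import numpy as np

rng = np.random.default_rng(2016)

# Fixed-in-repeated-samples design, with known true coefficients
n, reps = 50, 5000
X = np.column_stack([np.ones(n), rng.uniform(0, 10, size=n)])
beta_true = np.array([1.0, 2.0])

estimates = np.empty((reps, 2))
for r in range(reps):
    # New disturbances each replication; X and beta_true stay fixed
    y = X @ beta_true + rng.normal(scale=2.0, size=n)
    estimates[r], *_ = np.linalg.lstsq(X, y, rcond=None)

# Under the classical assumptions, the average of the estimates
# across replications should be close to beta_true (unbiasedness)
print(estimates.mean(axis=0))
```

The Monte Carlo average of the estimates sits very close to the true values, which is the simulation counterpart of the textbook unbiasedness result.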

Saturday, December 3, 2016

December Reading List

Goodness me! November went by really quickly!
© 2016, David E. Giles