Tuesday, February 9, 2016

The Replication Network

This is a "shout out" for The Replication Network.

The full name is, The Replication Network: Furthering the Practice of Replication in Economics. I was alerted to TRN some time ago by co-organiser, Bob Reed, and I'm pleased to be a member.

What's TRN about?
"This website serves as a channel of communication to (i) update scholars about the state of replications in economics, and (ii) establish a network for the sharing of information and ideas.
The goal is to encourage economists and their journals to publish replications."
There's News & Events; Guest Blogs; Research involving replications in economics; and lots more.

Hats off to TRN. We need more of this!

© 2016, David E. Giles

Monday, February 8, 2016

"Using R for Introductory Econometrics"

Recently, I received an email from Florian Heiss, Professor and Chair of Statistics and Econometrics at the Heinrich Heine University of Düsseldorf.

He wrote:
"I'd like to introduce you to a new book I just published that might be of interest to you: Using R for Introductory Econometrics.

The goal: An introduction to R that makes it as easy as possible for undergrad students to link theory to practice without any hurdles regarding material, notation, or terminology. The approach: Take a popular econometrics textbook (Jeff Wooldridge's Introductory Econometrics) and make the whole thing as consistent as possible.

I introduce R and show how to implement all methods Wooldridge mentions mostly using his examples. I also add some Monte Carlo simulation and present tools like R Markdown.

The book is self-published, so I can offer the whole text for free online reading and a hard copy is really cheap as well."
The link for the online version of Florian's book is http://www.urfie.net/.

What you'll find there are two versions of his 365-page book (Flash and HTML5) that you can read online, and all of the related R files for easy download.

Florian has used the CreateSpace publishing platform to produce an extremely professional product.

Using R for Introductory Econometrics is a fabulous modern resource. I know I'm going to be using it with my students, and I recommend it to anyone who wants to learn about econometrics and R at the same time.

If you're after a hard copy of the book you can purchase it for the bargain price of US$26.90 directly from CreateSpace, or from Amazon.

Tuesday, February 2, 2016

February Reading List

Here's a suggested reading list for February:
  • Casey, G. and M. Klemp, 2016. Instrumental variables in the long run. MPRA Paper No. 68696.
  • Coglianese, J., L. W. Davis, L. Kilian, and J. H. Stock, 2016. Anticipation, tax avoidance, and the price elasticity of gasoline demand. Journal of Applied Econometrics, in press.
  • Falorsi, S., A. Naccarato, and A. Pierini, 2015. Using Google trend data to predict the Italian unemployment rate. Working Paper No. 203, Dipartimento di Economia, Università degli studi Roma Tre.
  • Harris, D., S. J. Leybourne, and A. M. Robert, 2016. Test of the co-integration rank in VAR models in the presence of a possible break in trend at an unknown point. Working Paper No. 5 01-2016, Essex Finance Centre, Essex Business School, University of Essex.
  • Inoue, A. and G. Solon, 2010. Two-sample instrumental variables estimators. Review of Economics and Statistics, 92, 557-561.
  • Kim, N., 2016. A robustified Jarque-Bera test for multivariate normality. Economics Letters, in press.

Sunday, January 24, 2016

(Legally) Free Books!

(An earlier version of this post inadvertently included links to "pirated" material. This has now been rectified, and the post has been completely re-written.)

There are several Econometrics books, and comprehensive sets of lecture notes, that can be accessed for free. These include a number of excellent books by world-class econometricians.

Here are a few that will get you started:

Thanks to Donsker Class for supplying several of these links.

If you know of others I'd love to hear about them.

Friday, January 22, 2016

Modelling With the Generalized Hermite Distribution

"Count" data occur frequently in economics. These are simply data where the observations are integer-valued, usually 0, 1, 2, ... . However, the range of values may be truncated (e.g., 1, 2, 3, ...).

To model data of this form we typically resort to distributions such as the Poisson, negative binomial, or variations of these. These variations may account for truncation or censoring of the data, or the over-representation of certain count values (e.g., the "zero-inflated" Poisson distribution).

Covariates (explanatory variables) can be included in the model by making the mean of the distribution a function of these variables. After all, that's exactly what we do in a linear regression model.
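As a rough illustration of making the mean depend on covariates, here is a minimal Python sketch of a Poisson regression with a single covariate. The log link, mean = exp(b0 + b1*x), is an assumption made for this sketch (it keeps the mean positive), the function name fit_poisson is hypothetical, and the data are made up.

```python
import math

def fit_poisson(x, y, iters=25):
    """Maximum likelihood for a Poisson model with mean exp(b0 + b1*x).

    Uses Newton-Raphson steps. The log link is an assumption of this
    sketch, not something taken from the post.
    """
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start at the intercept-only fit
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: derivatives of the log-likelihood
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix (negative Hessian), a symmetric 2x2 matrix
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + inverse(information) * score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Made-up illustrative data: counts that rise with the covariate
x = [0, 1, 2, 3, 4]
y = [1, 2, 4, 7, 12]
b0, b1 = fit_poisson(x, y)

# At the maximum the score should be (numerically) zero
mu_hat = [math.exp(b0 + b1 * xi) for xi in x]
score = (sum(yi - mi for yi, mi in zip(y, mu_hat)),
         sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu_hat)))
print(b0, b1, score)
```

In practice one would use an established routine rather than hand-coded Newton steps; the point here is only that the covariate enters through the mean.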

If the "count" data form a time-series, then there are other issues that have to be taken into account.

However, the discrete distributions that we typically use have a number of limitations. The fact that the Poisson distribution is, of necessity, "equi-dispersed" (its variance equals its mean) is a big limitation. This leads us to consider distributions such as the negative binomial, in which the variance exceeds the mean. This enables us to model "over-dispersed" data, which are encountered frequently in practice.
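The contrast between equi-dispersion and over-dispersion can be checked numerically. The sketch below computes the mean and variance of a Poisson and a negative binomial distribution directly from their probability functions. The parameter values (λ = 3, and r = 3, p = 0.5 for the negative binomial, counting failures before the r-th success) are arbitrary choices for illustration.

```python
import math

def poisson_pmf(k, lam):
    # Work on the log scale to avoid overflow in lam**k and k!
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def negbin_pmf(k, r, p):
    # Number of failures before the r-th success (integer r assumed here)
    return math.comb(k + r - 1, k) * (1 - p) ** k * p ** r

def mean_and_variance(pmf_values):
    mean = sum(k * pk for k, pk in enumerate(pmf_values))
    var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf_values))
    return mean, var

K = 200  # truncation point; the omitted tail mass is negligible for these values
pois_mean, pois_var = mean_and_variance([poisson_pmf(k, 3.0) for k in range(K)])
nb_mean, nb_var = mean_and_variance([negbin_pmf(k, 3, 0.5) for k in range(K)])

print(pois_mean, pois_var)  # variance equals the mean
print(nb_mean, nb_var)      # variance exceeds the mean
```

Both distributions here have the same mean, but the negative binomial's variance is twice as large.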

The standard distributions are also limited in terms of what they can model in terms of distributional shapes. In particular, there are limitations on modal values in the data.

For instance, in the case of the Poisson distribution, these limitations are the following. If the parameter (λ) of the Poisson distribution is an integer, then there are two adjacent modes with equal modal height, at x = λ and x = λ-1. If λ is non-integer, then there is a single mode at int(λ), the integer part of λ.
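The rule for the Poisson modes is easy to verify numerically. The following Python sketch finds the modal values by brute force; the function names and the search bound kmax are choices made for this illustration.

```python
import math

def poisson_pmf(k, lam):
    # Log-scale evaluation of exp(-lam) * lam**k / k!
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def poisson_modes(lam, kmax=200, rel_tol=1e-9):
    """Return every k in 0..kmax whose probability ties for the maximum."""
    pmf = [poisson_pmf(k, lam) for k in range(kmax + 1)]
    top = max(pmf)
    return [k for k, p in enumerate(pmf)
            if math.isclose(p, top, rel_tol=rel_tol)]

print(poisson_modes(4))    # integer parameter: [3, 4]
print(poisson_modes(4.7))  # non-integer parameter: [4]
```

The tolerance is needed because the two equal modal heights are computed in floating point and may differ in the last few digits.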

In the case of the negative binomial distribution, there is a single mode.

This suggests that standard discrete distributions of the type that we typically use to model "count" data will not be very satisfactory if our data exhibit multi-modality.

We need to look to alternative distributions.

Here's an example of what I mean.

In an earlier post, I discussed some of my work involving the use of the so-called Hermite distribution, introduced by Kemp and Kemp (1965). As an example, I showed the distribution of data relating to the number of financial crises in various countries, as reproduced here:

You can see that, apart from being multi-modal, this empirical distribution is over-dispersed (its variance is approximately twice its mean).

In Giles (2010) I used the Hermite distribution, and various covariates, to model these data using maximum likelihood estimation.

The Hermite distribution can be generalized in various ways. Recently, Moriña et al. (2015) have released a terrific R package, called hermite, that makes it really easy to model "count data" using the Generalized Hermite distribution. We now have a convenient way of dealing with data that exhibit both over-dispersion and multi-modality.
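To see how a Hermite-type distribution can be both over-dispersed and multi-modal, recall that a Hermite random variable can be written as X1 + 2*X2, where X1 and X2 are independent Poisson variables. Replacing the 2 by an integer m gives a generalized version. The Python sketch below uses that representation. It is not the interface of the hermite package, the function names are hypothetical, and the parameter values (a1 = 0.3, a2 = 1.5, m = 2) are made up for illustration.

```python
import math

def poisson_pmf(k, lam):
    # Log-scale evaluation of exp(-lam) * lam**k / k!  (requires lam > 0)
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def gen_hermite_pmf(x, a1, a2, m=2):
    """P(X1 + m*X2 = x) with X1 ~ Poisson(a1), X2 ~ Poisson(a2), independent."""
    return sum(poisson_pmf(x - m * j, a1) * poisson_pmf(j, a2)
               for j in range(x // m + 1))

a1, a2, m = 0.3, 1.5, 2  # hypothetical parameter values
pmf = [gen_hermite_pmf(x, a1, a2, m) for x in range(80)]

mean = sum(x * p for x, p in enumerate(pmf))               # a1 + m * a2
var = sum((x - mean) ** 2 * p for x, p in enumerate(pmf))  # a1 + m**2 * a2

print(mean, var)
print([round(p, 4) for p in pmf[:6]])  # peaks at 0, 2 and 4
```

With these values the variance is a little under twice the mean, and the probabilities rise and fall repeatedly over the first few counts, which is the sort of shape the Poisson and negative binomial cannot reproduce.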

I strongly recommend this new addition to R.


Giles, D. E., 2010. Hermite regression analysis of multi-modal count data. Economics Bulletin, 30(4), 2936–2945.

Kemp, C. D. and A. W. Kemp, 1965. Some properties of the ‘Hermite’ distribution. Biometrika, 52, 381-394.

Moriña, D., M. Higueras, P. Puig, and M. Oliveira, 2015. Generalized Hermite distribution modelling with the R package hermite. The R Journal, 7(2), 263-274.

Saturday, January 16, 2016

Why Does "Pi" Appear in the Normal Density

Every now and then a student will ask me why the formula for the density of a Normal random variable includes the constant, π, or more correctly (2π).

The answer is that this term ensures that the density function is "proper" - that is, the integral of the function over the full real line takes the value "1". The area under the density, or "total probability", is "1".

Some students are happy with this (partial) answer, but others want to see a proof. Fair enough!

However, there's a trick to proving that this integral (area) is "1" in value. Let's take a look at it.
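One standard version of the argument, sketched here under the assumption that it is the polar-coordinates trick being referred to, squares the integral and evaluates the resulting double integral. The symbols I, z, r and θ are introduced just for this sketch.

```latex
\begin{align*}
I &= \int_{-\infty}^{\infty} e^{-z^{2}/2}\,dz , \\
I^{2} &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}
         e^{-(x^{2}+y^{2})/2}\,dx\,dy
       = \int_{0}^{2\pi}\int_{0}^{\infty} e^{-r^{2}/2}\, r\,dr\,d\theta \\
      &= 2\pi \left[-e^{-r^{2}/2}\right]_{0}^{\infty} = 2\pi ,
\qquad\text{so}\qquad I = \sqrt{2\pi}\ \ (\text{since } I > 0). \\
\intertext{Substituting $z = (x-\mu)/\sigma$, so that $dx = \sigma\,dz$:}
\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}
   \exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right) dx
 &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-z^{2}/2}\,dz
  = \frac{I}{\sqrt{2\pi}} = 1 .
\end{align*}
```

The 2π comes from integrating over the angle θ, which is why a constant associated with circles shows up in the density.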

Saturday, January 9, 2016

Difference-in-Differences With Missing Data

This brief post is a "shout out" for Irene Botosaru (Economics, Simon Fraser University) who gave a great seminar in our department yesterday.

The paper that she presented (co-authored with Federico Gutierrez) is titled "Difference-in-Differences When the Treatment Status is Observed in Only One Period". So, the title of this post is a bit of an abbreviation of what the paper is really about.

When we conduct DID analysis, we need to be able to classify information about the behaviour/characteristics of survey respondents into a 4-way matrix. Specifically we need to be able to observe the respondents before and after a "treatment"; and in each case we need to know which respondents were treated, and which ones were not.

Usually, a true panel of data, observed at two or more time-periods, facilitates this.
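When all four cells of that matrix are observed, the basic DID estimate is just a difference of two differences in cell means. A minimal Python sketch follows; the function name and the numbers are hypothetical, and it ignores covariates and standard errors.

```python
def did_estimate(cell_means):
    """Difference-in-differences from the four (group, period) cell means."""
    treated_change = (cell_means[("treated", "post")]
                      - cell_means[("treated", "pre")])
    control_change = (cell_means[("control", "post")]
                      - cell_means[("control", "pre")])
    return treated_change - control_change

# Made-up mean outcomes for the 4-way matrix
cell_means = {
    ("treated", "pre"): 10.0,
    ("treated", "post"): 14.0,
    ("control", "pre"): 9.0,
    ("control", "post"): 11.0,
}

print(did_estimate(cell_means))  # (14 - 10) - (11 - 9) = 2.0
```

The calculation needs every one of the four entries, which is why knowing the treatment status in each period matters.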

However, what if we simply have repeated cross-sections of data, taken at different time-periods? In this case we aren't necessarily observing exactly the same respondents when we look at the cross-sections for two different time-periods. Typically, in the cross-section after the treatment we'll know which respondents were treated and which ones weren't. However, there will be no way of partitioning the respondents in the pre-treatment cross-section into "subsequently treated" and "not treated" groups.

Two of the four cells in the matrix of information that we need will be missing, so conventional DID can't be performed.

This is the problem that Irene and Federico consider.

A natural response is to introduce some sort of proxy variable(s) to deal with the missing data, and of course this will introduce an estimation bias, even asymptotically. This paper basically takes this approach. The result is a GMM estimation strategy, together with a test that the underlying assumptions are satisfied.

This is a really nice paper - well motivated, technically solid, and with a nice empirical example and application. I urge you to take a look at it if DID is in your econometrics tool-kit (and even if it's not!)

I'm sure that Irene and Federico would appreciate hearing about situations where you've encountered this missing data problem, and how you've responded to it.

Wednesday, December 30, 2015

The Econometric Game, 2016

I like to think of The Econometric Game as the World Championship of Econometrics.

There have been 16 annual Econometric Games to date, and some of these have been featured previously in this blog. For instance in 2015 there were several posts, such as this one. You'll find links in that post to earlier posts for other years.

I also discussed the cases that formed the basis for the 2015 competition here.

In 2016, the 17th Econometric Game will be held at the University of Amsterdam between 6 and 8 April.

The competing teams will be representing the following universities:

Requests I Ignore

About six months ago I wrote a post titled, "Readers' Forum Page".

Part of my explanation for the creation of the page was as follows:

Tuesday, December 29, 2015

Job Market for Economics Ph.D.'s

In a post in today's Inside Higher Ed, Scott Jaschik discusses the latest annual jobs report from the American Economic Association.

He notes:
"A new report by the American Economic Association found that its listings for jobs for economics Ph.D.s increased by 8.5 percent in 2015, to 3,309. Academic jobs increased to 2,458, from 2,290. Non-academic jobs increased to 846 from 761." 
(That's an 11.2% increase for non-academic jobs, and a 7.3% increase for academic positions.)
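The percentage changes can be re-derived from the counts in the quoted passage. A small Python sketch, with a helper name chosen just for this illustration:

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old

# Counts taken from the quoted passage
academic = pct_change(2290, 2458)
non_academic = pct_change(761, 846)

print(round(academic, 1), round(non_academic, 1))
```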

The bounce-back in demand for graduates since 2008 is impressive:
"Economics, like most disciplines, took a hit after 2008. Between then and 2010, the number of listings fell to 2,285 from 2,914. But this year's 3,309 is greater not only than the 2008 level, but of every year from 2001 on. The number of open positions also far exceeds the number of new Ph.D.s awarded in economics."
And here's the really good news for readers of this blog:
"As has been the case in recent years, the top specialization in job listings is mathematical and quantitative methods."

© 2015, David E. Giles