Exercises

The Data Set

Use the data set prz.dta, which is available in the Downloads Section. It is the data set used in the book “Democracy and Development” (Przeworski et al., 2000). Save it in an appropriate working directory and import it into a data frame called prz. We will only look at a few variables: democ = 1 if democracy, 0 otherwise; gdpw = GDP per worker; g = growth rate; oil = 1 if oil producer, 0 otherwise.[9]
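One way the import step might look is sketched below. It assumes the haven package is installed and that prz.dta sits in the current working directory; `load_prz()` is a hypothetical helper name, not something that ships with the course material.

```r
# Sketch: read the Stata file into a plain data frame.
# Assumption: the haven package is installed; load_prz() is a made-up helper name.
load_prz <- function(path = "prz.dta") {
  if (!file.exists(path)) {
    stop("Cannot find ", path, "; check getwd() or pass the full path.")
  }
  # read_dta() returns a tibble; convert it to an ordinary data frame
  as.data.frame(haven::read_dta(path))
}

# Typical use once the file is in place:
# prz <- load_prz()
# summary(prz[, c("democ", "gdpw", "g", "oil")])
```

The explicit file check is only there to give a clearer error message when the working directory is not what you expected.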


Basics

  1. Run a probit model where democ is the dependent variable and g, gdpw and oil are the independent variables. Put the results in column 1 of Table 1. What is (possibly) wrong with this approach? Interpret the coefficients on one or two of the variables.

  2. Run the same probit model as before but now include a lagged dependent variable. To create the lagged dependent variable, call:

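library(dplyr)  # provides %>%, arrange(), group_by(), mutate() and lag()

prz <- prz %>%
  arrange(country, year) %>%        # lag() works by row order; assumes a `year` column
  group_by(country) %>%
  mutate(l.democ = lag(democ)) %>%  # previous year's regime within each country
  ungroup()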

Put the results in column 2 of Table 1. What are we assuming by including a lagged dependent variable? Do you think that this is appropriate here?

  3. Now estimate a probit “transition to democracy” model, i.e. how do growth, wealth and oil affect the probability that a country is a democracy this year given that it was a dictatorship last year? Here we also lag the independent variables by one year. Put the results in column 3 of Table 1. Interpret the sign of the coefficient on each independent variable.

  4. Now estimate a probit “survival of democracy” model, i.e. how do (lagged) growth, wealth and oil affect the probability that a country is a democracy this year given that it was a democracy last year? Put the results in column 4 of Table 1. Interpret the sign of the coefficient on each independent variable.
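The four specifications above can be sketched on simulated data so that the code runs without prz.dta. Everything about the simulated panel (its size, the coefficients, the `sim` data frame) is made up for illustration, and `lag1()` is a hypothetical base-R helper used only to keep the sketch free of package dependencies. With the real data you would replace `sim` with `prz`.

```r
set.seed(42)

# Hypothetical stand-in for prz: 40 countries x 30 years, sorted by country then year.
n_c <- 40; n_t <- 30
sim <- data.frame(
  country = rep(seq_len(n_c), each = n_t),
  year    = rep(seq_len(n_t), times = n_c),
  g       = rnorm(n_c * n_t, mean = 2, sd = 4),
  gdpw    = rep(exp(rnorm(n_c, mean = 1.5, sd = 0.8)), each = n_t),
  oil     = rep(rbinom(n_c, 1, 0.2), each = n_t)
)

# Made-up persistent regime process: last year's regime strongly predicts this year's.
sim$democ <- 0L
for (i in seq_len(nrow(sim))) {
  p <- if (sim$year[i] == 1) 0.4 else
    pnorm(-1.5 + 2.8 * sim$democ[i - 1] + 0.05 * sim$gdpw[i] - 0.3 * sim$oil[i])
  sim$democ[i] <- rbinom(1, 1, p)
}

# lag1(): shift a variable down by one row within each country (first year becomes NA).
lag1 <- function(x, id) ave(x, id, FUN = function(v) c(NA, head(v, -1)))
for (v in c("democ", "g", "gdpw", "oil")) {
  sim[[paste0("l.", v)]] <- lag1(sim[[v]], sim$country)
}

probit <- binomial(link = "probit")

# Column 1: pooled probit
m1 <- glm(democ ~ g + gdpw + oil, family = probit, data = sim)
# Column 2: pooled probit with a lagged dependent variable
m2 <- glm(democ ~ g + gdpw + oil + l.democ, family = probit, data = sim)
# Column 3: transition to democracy (dictatorship last year), lagged regressors
m3 <- glm(democ ~ l.g + l.gdpw + l.oil, family = probit,
          data = subset(sim, l.democ == 0))
# Column 4: survival of democracy (democracy last year), lagged regressors
m4 <- glm(democ ~ l.g + l.gdpw + l.oil, family = probit,
          data = subset(sim, l.democ == 1))

print(round(coef(m3), 3))
print(round(coef(m4), 3))
```

Note that `subset()` drops the first year of each country, where the lag is missing, so the two conditional models together use exactly the observations that have a lagged regime.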


Advanced

This section draws on the instructions for joint estimation.

  1. Now interact all the lagged independent variables with the lagged dependent variable. Estimate a fully interactive model and include all the constitutive terms. Put the results in column 5 of Table 1. What is the relationship between these coefficients and those in the previous two columns? Is there any extra information provided by this full interaction model that was not available from the previous two models?
  2. Now consider the straight probit model, the probit model with the lagged dependent variable, and the full interaction model. Produce the ROC curve for each of these models. Interpret a point on one of these curves. What do the ROC curves tell you about the fit of these three models?
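A minimal sketch of the two Advanced steps follows, again on simulated data so that it runs on its own. The panel, its coefficients, and the helpers `lag1()`, `roc_points()` and `auc()` are all hypothetical names written for this illustration; the ROC curve is computed by hand in base R rather than with a dedicated package, and all three models are fitted on the same sample (rows with a non-missing lag), which is a choice made here for comparability.

```r
set.seed(7)

# Hypothetical stand-in for prz (sorted by country, then year); all values are made up.
n_c <- 40; n_t <- 30
sim <- data.frame(
  country = rep(seq_len(n_c), each = n_t),
  year    = rep(seq_len(n_t), times = n_c),
  g       = rnorm(n_c * n_t, mean = 2, sd = 4),
  gdpw    = rep(exp(rnorm(n_c, mean = 1.5, sd = 0.8)), each = n_t),
  oil     = rep(rbinom(n_c, 1, 0.2), each = n_t)
)
sim$democ <- 0L
for (i in seq_len(nrow(sim))) {
  p <- if (sim$year[i] == 1) 0.4 else
    pnorm(-1.5 + 2.8 * sim$democ[i - 1] + 0.05 * sim$gdpw[i] - 0.3 * sim$oil[i])
  sim$democ[i] <- rbinom(1, 1, p)
}

# Lag every variable by one year within country (first year becomes NA).
lag1 <- function(x, id) ave(x, id, FUN = function(v) c(NA, head(v, -1)))
for (v in c("democ", "g", "gdpw", "oil")) {
  sim[[paste0("l.", v)]] <- lag1(sim[[v]], sim$country)
}

probit <- binomial(link = "probit")
est <- sim[!is.na(sim$l.democ), ]  # common estimation sample

m_plain <- glm(democ ~ g + gdpw + oil, family = probit, data = est)
m_ldv   <- glm(democ ~ g + gdpw + oil + l.democ, family = probit, data = est)
# `*` expands to all constitutive terms plus every interaction with l.democ
m_full  <- glm(democ ~ l.democ * (l.g + l.gdpw + l.oil), family = probit, data = est)

# ROC by hand: for each cutoff, the share of 1s and of 0s predicted as 1.
roc_points <- function(y, p) {
  cuts <- sort(unique(c(0, p, 1)), decreasing = TRUE)
  data.frame(
    cutoff = cuts,
    fpr = sapply(cuts, function(k) mean(p[y == 0] >= k)),  # false positive rate
    tpr = sapply(cuts, function(k) mean(p[y == 1] >= k))   # true positive rate
  )
}

# Area under the curve by the trapezoid rule.
auc <- function(r) sum(diff(r$fpr) * (head(r$tpr, -1) + tail(r$tpr, -1)) / 2)

models <- list(plain = m_plain, ldv = m_ldv, full = m_full)
aucs <- sapply(models, function(m) auc(roc_points(m$y, fitted(m))))
print(round(aucs, 3))

# To draw one curve:
# r <- roc_points(m_ldv$y, fitted(m_ldv))
# plot(r$fpr, r$tpr, type = "l", xlab = "False positive rate", ylab = "True positive rate")
# abline(0, 1, lty = 2)
```

Each row of `roc_points()` is one point on the curve: the false and true positive rates you would get by classifying every observation with a predicted probability at or above that cutoff as a democracy.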

You can find the solutions to these exercises in the corresponding R script in the Downloads Section. You can also download the results table and the stargazer code that produces it.


  9. I have taken these exercises from some material written by Matt Golder, from whom I learned all this many moons ago.