Worksheet Week 2

Self-Assessment Questions¹

  1. Give an example of a two-sample test for a mean.
  2. Give an example of a two-sample test for a proportion.
  3. Why do we calculate the t-score as \(t = \frac{\text{Estimate of parameter} - \text{Null hypothesis value of parameter}}{\text{Standard error of estimate}}\)?
  4. What is the difference between a t-score and a z-score?
  5. What are the strengths and weaknesses of two-sample tests?

Please stop here and don’t go beyond this point until we have compared notes on your answers.


Two-Sample Tests in R

Data Preparation

  • We are working with the World Development Indicators again. Data are taken from World Bank (2024), Boix et al. (2018), and Marshall & Gurr (2020).
  • To save you clicking back to last week, here is the code book (you are welcome):
Table 2: WDI Codebook
variable       label
Country Name   Country Name
Country Code   Country Code
year           year
democracy      0 = Autocracy, 1 = Democracy (Boix et al., 2018)
gdppc          GDP per capita (constant 2010 US$)
gdpgrowth      Absolute growth of per capita GDP to previous year (constant 2010 US Dollars)
enrl_gross     School enrollment, primary (% gross)
enrl_net       School enrollment, primary (% net)
agri           Employment in agriculture (% of total employment) (modeled ILO estimate)
slums          Population living in slums (% of urban population)
telephone      Fixed telephone subscriptions (per 100 people)
internet       Individuals using the Internet (% of population)
tax            Tax revenue (% of GDP)
electricity    Access to electricity (% of population)
mobile         Mobile cellular subscriptions (per 100 people)
service        Services, value added (% of GDP)
oil            Oil rents (% of GDP)
natural        Total natural resources rents (% of GDP)
literacy       Literacy rate, adult total (% of people ages 15 and above)
prim_compl     Primary completion rate, total (% of relevant age group)
infant         Mortality rate, infant (per 1,000 live births)
hosp           Hospital beds (per 1,000 people)
tub            Incidence of tuberculosis (per 100,000 people)
health_ex      Current health expenditure (% of GDP)
ineq           Income share held by lowest 10%
unemploy       Unemployment, total (% of total labor force) (modeled ILO estimate)
lifeexp        Life expectancy at birth, total (years)
urban          Urban population (% of total population)
polity5        Combined Polity V score
  • Load the data set

  • The Polity V Score (variable polity5) codes regimes from -10 (indicating perfect autocracy) to +10 (indicating perfect democracy). With the tidyverse, create a new variable called politybin which codes all countries with a Polity V score lower than +1 as dictatorships, and all countries with a Polity V score from +1 to +10 as democracies.

  • Apply the same procedure to gdppc, cutting it at its median into two levels, Developing and Developed, creating a new variable called gdpcat.

  • Last up is the variable gdpgrowth. Create a new variable called growth which divides countries into “slow-growing” and “fast-growing” countries, using the mean as the cut-off point.
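The three recodes above could be sketched as follows with dplyr. The toy data frame and all of its values are hypothetical stand-ins for the real wdi data, and putting values that fall exactly on the median or mean into the upper group is an assumption, not something the worksheet specifies.

```r
library(dplyr)

# Hypothetical toy data standing in for the real wdi data set
toy <- tibble(
  country   = c("A", "B", "C", "D", "E", "F"),
  polity5   = c(-7, 0, 1, 6, 10, NA),
  gdppc     = c(800, 1500, 4000, 12000, 45000, 30000),
  gdpgrowth = c(-50, 20, 150, 300, 900, 400)
)

toy <- toy %>%
  mutate(
    # Polity V below +1 -> Dictatorship, +1 to +10 -> Democracy (NA stays NA)
    politybin = factor(
      if_else(polity5 < 1, "Dictatorship", "Democracy"),
      levels = c("Dictatorship", "Democracy")
    ),
    # Cut GDP per capita at its median; the first level is the "lower" category
    gdpcat = factor(
      if_else(gdppc < median(gdppc, na.rm = TRUE), "Developing", "Developed"),
      levels = c("Developing", "Developed")
    ),
    # Cut growth at its mean
    growth = factor(
      if_else(gdpgrowth < mean(gdpgrowth, na.rm = TRUE), "slow-growing", "fast-growing"),
      levels = c("slow-growing", "fast-growing")
    )
  )

toy
```

Setting the factor levels explicitly fixes which category counts as the first (lower) one, which matters later when choosing the direction of a one-sided test.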

Guided Example – Two-Sample Test for a Proportion

  • Let us find out whether a higher proportion of developing countries is autocratic than developed countries.
  • State the null hypothesis and the directional alternative hypothesis for this research question.
  • In order to test this hypothesis, we need to first create a cross-tabulation:
table(wdi$gdpcat, wdi$politybin)
            
             Dictatorship Democracy
  Developing           23        54
  Developed            16        54
  • We now take the number of observations which are classed as dictatorships per development status.
  • We also calculate the row totals, as this gives us the total number of developing and developed countries, respectively.
  • Then we are ready to use the prop.test() command: we first specify the number of countries which are dictatorships, then the total number of developing and developed countries, and finally tell R that a continuity correction (used for small sample sizes) is not necessary in our case.
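One way to pull these two ingredients out of a cross-tabulation, rather than typing them by hand, is sketched below. The matrix simply re-enters the counts from the table above; the object names tab, dictatorships and totals are made up for illustration.

```r
# Counts copied from the cross-tabulation above (filled column by column)
tab <- matrix(
  c(23, 16, 54, 54),
  nrow = 2,
  dimnames = list(c("Developing", "Developed"),
                  c("Dictatorship", "Democracy"))
)

# Number of dictatorships per development status
dictatorships <- tab[, "Dictatorship"]

# Row totals: number of developing and developed countries
totals <- rowSums(tab)

dictatorships
totals
```

With the real data, the same two lines should work on the object returned by table(wdi$gdpcat, wdi$politybin).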

R uses a \(\chi^2\)-test for this, as we are essentially dealing with a cross-tabulation. When you do this by hand, please use z-scores and the normal distribution.
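As a rough sketch of the by-hand route, the z-score for the difference between two proportions can be computed from the counts in the cross-tabulation (23 of 77 and 16 of 70), using the pooled proportion for the standard error. The variable names are illustrative only. The square of this z-score is the X-squared statistic that prop.test() reports when no continuity correction is applied.

```r
# Counts from the cross-tabulation: dictatorships and group sizes
x <- c(23, 16)   # Developing, Developed
n <- c(77, 70)

p_hat  <- x / n              # sample proportions
p_pool <- sum(x) / sum(n)    # pooled proportion under the null hypothesis

# Standard error of the difference under the null hypothesis
se <- sqrt(p_pool * (1 - p_pool) * (1 / n[1] + 1 / n[2]))

# z-score: (estimate - null hypothesis value) / standard error
z <- (p_hat[1] - p_hat[2] - 0) / se

# One-sided p-value for the alternative "greater"
p_value <- 1 - pnorm(z)

z
z^2
p_value
```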

  • Our hypothesis is directional, because we expect a higher proportion of developing countries to be dictatorships than developed countries. The status Developing is the lower category, and we thus expect this proportion to be larger, or “greater”. We add this to the test function as option alternative = "greater".
prop.test(c(23, 16), c(77, 70), correct = FALSE, alternative = "greater")

    2-sample test for equality of proportions without continuity correction

data:  c(23, 16) out of c(77, 70)
X-squared = 0.92517, df = 1, p-value = 0.1681
alternative hypothesis: greater
95 percent confidence interval:
 -0.04893134  1.00000000
sample estimates:
   prop 1    prop 2 
0.2987013 0.2285714 
  • Which proportion of developing and developed countries are dictatorships?
  • Do we reject or fail to reject the null hypothesis at a 95% confidence level?

How would the R code change if we investigated whether a lower proportion of developing countries is autocratic than developed countries?

Exercise – Two-Sample Test for a Proportion

  • Is a higher proportion of fast-growing countries democratic than slow-growing countries? Use a 95% confidence level.

  • Now repeat the exercise, but this time with the democracy variable. Do the results differ? Why? Why not?

  • Calculate the last two-sample test by hand.

Guided Example – Two-Sample Test for a Mean

  • Now we are interested in whether people live longer in developed countries than in developing countries.
  • For this, we use the variable lifeexp.
  • Once again, state the null and the alternative hypothesis.
  • The first step in a t-test for two means is to find out whether the variance is equal in both samples.

Explain why this is required in a two-sample test for a mean, and why we did not have to do this in a two-sample test for a proportion.

To test for equal variances we use the Levene test, where:

  • H\(_{0}\): The variance among the groups is equal.
  • H\(_{\text{A}}\): The variance among the groups is not equal.

This is essentially another two-sample test, in which we ascertain whether the difference between the variances of the two groups is zero (H\(_{0}\)) or different from zero (H\(_{\text{A}}\)).
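To make the logic of the test more concrete, here is a minimal base-R sketch of what a Levene test with median centring does: it takes each observation's absolute deviation from its group median and runs a one-way ANOVA on those deviations. The data are simulated and purely hypothetical (they are not the wdi data), and the names toy, absdev and fit are made up for this illustration.

```r
set.seed(1)

# Hypothetical data: two groups with clearly different spreads
toy <- data.frame(
  y = c(rnorm(30, mean = 67, sd = 9), rnorm(30, mean = 77, sd = 4)),
  g = factor(rep(c("Developing", "Developed"), each = 30),
             levels = c("Developing", "Developed"))
)

# Absolute deviation of each observation from its own group median
toy$absdev <- abs(toy$y - ave(toy$y, toy$g, FUN = median))

# One-way ANOVA on the absolute deviations
fit <- anova(lm(absdev ~ g, data = toy))

f_value <- fit[["F value"]][1]
p_value <- fit[["Pr(>F)"]][1]

f_value
p_value
```

If the car package is installed, car::leveneTest(y ~ g, data = toy) should give the same F value and p-value for these data.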

  • For the Levene test you need the car package, where “car” stands for “Companion to Applied Regression”:
install.packages("car")
  • Now we call the package and perform the test:
library(car)

leveneTest(wdi$lifeexp ~ wdi$gdpcat)
Levene's Test for Homogeneity of Variance (center = median)
       Df F value    Pr(>F)    
group   1  11.544 0.0008498 ***
      167                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  • The result is significant, and we therefore reject the null hypothesis. This means that the variance in the two samples is not equal.

  • Now we can perform the t-test. This time we specify alternative="less" as an option: Developing is the lower category, and we expect its mean life expectancy to be smaller than that of the category Developed.

t.test(lifeexp ~ gdpcat, data=wdi, var.equal = FALSE, alternative="less")

    Welch Two Sample t-test

data:  lifeexp by gdpcat
t = -10.83, df = 159.03, p-value < 2.2e-16
alternative hypothesis: true difference in means between group Developing and group Developed is less than 0
95 percent confidence interval:
      -Inf -8.477836
sample estimates:
mean in group Developing  mean in group Developed 
                66.80181                 76.80833 
  • Can we conclude at a 95% confidence level that people live longer in developed countries than in developing countries?
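The Welch statistic follows the same recipe as any t-score: estimate minus null hypothesis value, divided by the standard error of the estimate. The sketch below reproduces it by hand on simulated, purely hypothetical data (not the wdi data) and compares the result with t.test(); the vectors x and y and all other names are assumptions made for illustration.

```r
set.seed(42)

# Hypothetical samples standing in for life expectancy in two groups
x <- rnorm(80, mean = 67, sd = 9)   # plays the role of "Developing"
y <- rnorm(85, mean = 77, sd = 5)   # plays the role of "Developed"

# Squared standard error of each group mean (variances are not pooled)
se2_x <- var(x) / length(x)
se2_y <- var(y) / length(y)

# t-score: (estimate - null hypothesis value) / standard error
t_stat <- (mean(x) - mean(y) - 0) / sqrt(se2_x + se2_y)

# Welch-Satterthwaite approximation of the degrees of freedom
df <- (se2_x + se2_y)^2 /
  (se2_x^2 / (length(x) - 1) + se2_y^2 / (length(y) - 1))

# One-sided p-value for the alternative "less"
p_less <- pt(t_stat, df)

# The built-in test for comparison
check <- t.test(x, y, var.equal = FALSE, alternative = "less")

c(by_hand = t_stat, t_test = unname(check$statistic))
c(by_hand = df, t_test = unname(check$parameter))
```

Because the group variances are not pooled, the degrees of freedom are generally not a whole number, which is why the output above reports df = 159.03.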

What is the precise p-value for the hypothesis that the true difference in means between group Developing and group Developed is greater than 0?

Exercise – Two-Sample Test for a Mean

  • Do people live longer under democracies than under dictatorships (use the politybin variable)? Use a 95% confidence level.

Homework for Week 3


Solutions

You can find the Solutions in the Downloads Section.



  1. Some of the content of this worksheet is taken from Reiche (forthcoming).