Introduction to Matrices

What is a Matrix?1

The word “matrix” generally conjures up a level of horror in students that even Stephen King would struggle to match2. And I have to admit that I have not always been best friends with matrices myself, either. But they are useful, and in their ability to convey a large amount of information in a structured and logical way, they are even beautiful. Because at the end of the day, a matrix is nothing more than a list.

In the lecture I introduced you to the concept of the Population Regression Function (PRF) which can be written as:

\[\begin{equation} y_{i} = \beta_{0} + \beta_{1} x_{i} + \epsilon_{i} \end{equation}\]

If we wanted to write out this equation for every observation in our data set, it would look something like this:

\[\begin{align*} y_{1} &= \beta_{0} + \beta_{1} x_{1} + \epsilon_{1} \\ y_{2} &= \beta_{0} + \beta_{1} x_{2} + \epsilon_{2} \\ &\vdots \\ y_{n} &= \beta_{0} + \beta_{1} x_{n} + \epsilon_{n} \end{align*}\]

Writing the equations out like this is incredibly wasteful, as it repeats the same notation over and over. Since every equation has the same structure, we could instead collect only the values that actually change into a list, with one column for each unique element of the equations. And there you have a matrix:

\[\begin{equation*} \begin{bmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{n} \end{bmatrix} = \begin{bmatrix} 1 & x_{1} \\ 1 & x_{2} \\ \vdots & \vdots \\ 1 & x_{n} \end{bmatrix} \begin{bmatrix} \beta_{0} \\ \beta_{1} \end{bmatrix} + \begin{bmatrix} \epsilon_{1} \\ \epsilon_{2} \\ \vdots \\ \epsilon_{n} \end{bmatrix} \end{equation*}\]

Whilst this is what we will be using matrices for next week, let us leave regression aside for now, and focus on working with matrices more generally. We will now be using these two sample matrices:

\[\begin{equation*} A = \begin{bmatrix} 1 & 7 & 3\\ 9 & 5 & 4\\ \end{bmatrix} \hspace{0.75cm} B = \begin{bmatrix} 6 & 3 & 8 & 2 \\ 3 & 2 & 1 & 4 \\ 1 & 5 & 3 & 9 \\ \end{bmatrix} \end{equation*}\]


Matrix Notation

We can refer to an individual element of a matrix by stating the name of the matrix, followed by an index that gives first the row and then the column (you might recognise this logic from R, because it is exactly the same). In matrix A, for example, the value in the first row and first column (1) would be referred to as

\(A_{11}=1\)

The number in the first row, but second column, would be referred to as

\(A_{12}=7\)
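This indexing logic carries over directly to code. Here is a minimal sketch in Python (note that Python counts from 0, so the mathematical \(A_{11}\) lives at index [0][0]; R, by contrast, counts from 1, just like the notation above):

```python
# Matrix A stored as a nested list: one inner list per row.
A = [[1, 7, 3],
     [9, 5, 4]]

# Python indexes from 0, so the mathematical A_11 is A[0][0].
assert A[0][0] == 1  # A_11: first row, first column
assert A[0][1] == 7  # A_12: first row, second column
```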

What is the value of \(B_{23}\)?


Calculating with Matrices

We will only need to multiply and divide matrices on this module, so let’s cover these operations now.

Multiplying Matrices

To show you how this is done, I will multiply matrix A with matrix B and record the results in a new matrix called C.

\[\begin{equation*} A \times B = C = \begin{bmatrix} 1 & 7 & 3\\ 9 & 5 & 4\\ \end{bmatrix} \times \begin{bmatrix} 6 & 3 & 8 & 2 \\ 3 & 2 & 1 & 4 \\ 1 & 5 & 3 & 9 \\ \end{bmatrix} = \begin{bmatrix} 30 & 32 & 24 & 57\\ 73 & 57 & 89 & 74\\ \end{bmatrix} \end{equation*}\]

In order to calculate a new element \(C_{i,j}\), we multiply the elements of the \(i^{th}\) row of A with the corresponding elements of the \(j^{th}\) column of B, and then add these products together. The resulting sum is the so-called inner product of the row and the column, and it becomes \(C_{i,j}\). Let me give you some examples in which I have set the values from matrix A in bold to make the process more transparent.

\(C_{11}=\textbf{1}\times6+\textbf{7}\times3+\textbf{3}\times1=30\)

\(C_{12}=\textbf{1}\times3+\textbf{7}\times2+\textbf{3}\times5=32\)

\(C_{21}=\textbf{9}\times6+\textbf{5}\times3+\textbf{4}\times1=73\)
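If you would like to verify the arithmetic in code, here is a minimal sketch in Python; matmul is simply a helper name chosen here, and for real work you would reach for a library such as NumPy rather than rolling your own:

```python
def matmul(A, B):
    """Multiply two matrices given as nested lists of rows."""
    n_rows, n_inner, n_cols = len(A), len(B), len(B[0])
    assert len(A[0]) == n_inner, "columns of A must equal rows of B"
    # C_ij is the inner product of row i of A with column j of B.
    return [[sum(A[i][k] * B[k][j] for k in range(n_inner))
             for j in range(n_cols)]
            for i in range(n_rows)]

A = [[1, 7, 3],
     [9, 5, 4]]
B = [[6, 3, 8, 2],
     [3, 2, 1, 4],
     [1, 5, 3, 9]]

C = matmul(A, B)
print(C)  # → [[30, 32, 24, 57], [73, 57, 89, 74]]
```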

I have prepared a short video taking you through this process step by step. I would encourage you to watch it now.

If you can’t get enough of my delightful German accent, then I have some videos for you in which I go through the respective operation with matrices on screen. Here is the first:

Multiplying Matrices

Assume we multiply a 4x3 matrix by a 3x4 matrix. How many rows and columns does the resulting matrix have?

Dividing Matrices

In order to divide by an entire matrix, we take its inverse and then multiply by that inverse3. We do the same in the non-matrix world: if we want to divide 6 by 3, this is the same as multiplying 6 by \(\frac{1}{3}\), the inverse of 3. Sadly, inverting a matrix is not quite as straightforward as this. In fact, it is one of the most challenging operations you can perform on a matrix. Luckily, inverting a 2 by 2 matrix (which is all we will need) is still possible without a degree in algebra. The inverse \(D^{-1}\) of a 2 by 2 matrix \(D\) is defined as

\[\begin{equation*} D^{-1} = \begin{bmatrix} a & b \\ c & d \\ \end{bmatrix}^{-1} = \frac{1}{ad-bc} \begin{bmatrix} d & -b \\ -c & a \\ \end{bmatrix} \end{equation*}\]

Thus, to arrive at the inverse of a 2 by 2 matrix, we first form the fraction in front of it. Its denominator is \(ad-bc\): the product of the main-diagonal elements minus the product of the off-diagonal elements. We also refer to this denominator as the determinant of the matrix, and the inverse only exists if it is not zero. In a second step – now in the matrix itself – we swap \(a\) and \(d\), and set a minus sign in front of \(b\) and \(c\).
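These two steps translate into only a few lines of code. Here is a minimal sketch in Python, applied to an illustrative matrix of my own choosing (inverse_2x2 is a hypothetical helper name, not a library function):

```python
def inverse_2x2(D):
    """Invert a 2-by-2 matrix [[a, b], [c, d]] given as nested lists."""
    (a, b), (c, d) = D
    det = a * d - b * c  # step 1: the determinant, ad - bc
    if det == 0:
        raise ValueError("matrix is not invertible (determinant is 0)")
    # Step 2: swap a and d, negate b and c, and scale everything by 1/det.
    return [[ d / det, -b / det],
            [-c / det,  a / det]]

# An illustrative matrix (chosen here, not taken from the text):
D = [[4, 7],
     [2, 6]]

print(inverse_2x2(D))  # determinant is 4*6 - 7*2 = 10
```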

Dividing Matrices


Special Matrices

There are two types of matrices we will be using which hold special, useful properties: the transpose of a matrix, and the so-called identity matrix.

Transposing Matrices

Another important operation is transposing a matrix, which turns rows into columns and columns into rows. We denote a transposed matrix with a prime symbol. Transposing matrix \(A\) into matrix \(A^\prime\) gives us:

\[\begin{equation*} A = \begin{bmatrix} 1 & 7 & 3\\ 9 & 5 & 4\\ \end{bmatrix} \hspace{0.75cm} A^\prime = \begin{bmatrix} 1 & 9 \\ 7 & 5 \\ 3 & 4 \\ \end{bmatrix} \end{equation*}\]
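Transposition is equally simple to sketch in code; transpose below is a helper name chosen for illustration:

```python
def transpose(M):
    """Turn rows into columns: element (i, j) becomes element (j, i)."""
    return [[M[i][j] for i in range(len(M))] for j in range(len(M[0]))]

A = [[1, 7, 3],
     [9, 5, 4]]

print(transpose(A))  # → [[1, 9], [7, 5], [3, 4]]
```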

Transposing Matrices

We have the following matrix:

\[\begin{equation*} X = \begin{bmatrix} 2 & 4 \\ 3 & 6 \\ \end{bmatrix} \end{equation*}\]

Calculate \(X^{\prime}X\).

The Identity Matrix

There is one last thing left to show you before we can embark on using matrices for deriving and estimating our regression coefficients: the so-called identity matrix \(I\). This matrix is always square, has the value 1 on every diagonal element, and zeros everywhere else. If a matrix is multiplied by \(I\), we receive the original matrix back. For example, let

\[\begin{equation*} I = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\\ \end{bmatrix} \end{equation*}\]

If we multiply \(I\) by matrix B, we receive

\[\begin{equation*} I \times B = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\\ \end{bmatrix} \times \begin{bmatrix} 6 & 3 & 8 & 2 \\ 3 & 2 & 1 & 4 \\ 1 & 5 & 3 & 9 \\ \end{bmatrix} = \begin{bmatrix} 6 & 3 & 8 & 2 \\ 3 & 2 & 1 & 4 \\ 1 & 5 & 3 & 9 \\ \end{bmatrix} \end{equation*}\]

This feature will be important in the derivation of estimators next week, where we will make use of the fact that a square matrix multiplied by its inverse results in an identity matrix: for example, \(D \times D^{-1} = I\).
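We can check this fact numerically. The sketch below, in Python, inverts an illustrative 2 by 2 matrix (chosen here, not one from the text) using exact fractions, so that the product with the original comes out as the identity matrix without floating-point noise:

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two matrices given as nested lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# An illustrative 2-by-2 matrix, stored as exact fractions.
D = [[Fraction(4), Fraction(7)],
     [Fraction(2), Fraction(6)]]

det = D[0][0] * D[1][1] - D[0][1] * D[1][0]
D_inv = [[ D[1][1] / det, -D[0][1] / det],
         [-D[1][0] / det,  D[0][0] / det]]

# A matrix multiplied by its inverse gives the identity matrix.
assert matmul(D, D_inv) == [[1, 0], [0, 1]]
```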

Identity Matrix



  1. This material is taken from Reiche (forthcoming).↩︎

  2. He is by far my favourite author. If you haven’t already, read IT.↩︎

  3. This draws on https://www.mathsisfun.com/algebra/matrix-inverse.html↩︎