The “Undo” Button of Linear Algebra: Understanding Inverses

Introduction
In basic algebra, the concept of an “inverse” is intuitive. If you have a function that doubles a number, \(f(x) = 2x\), the inverse function is the one that “undoes” that action, bringing you back to where you started. In this case, the inverse would be halving the number, \(g(x) = x/2\). If you apply \(f\) and then \(g\) to a number, say 5, you get \(g(f(5)) = g(10) = 5\). You are back to the beginning.
In linear algebra, we extend this powerful concept to entire vector spaces and the linear transformations that map between them. Understanding inverses is crucial for solving systems of linear equations, changing bases, and understanding the fundamental structure of linear maps.
In this post, I will build up the concept of the inverse step-by-step, starting with the abstract definition for inverse linear transformations, moving to its properties, and finally connecting it to the practical world of matrix computations.
1. The Definition of an Inverse Transformation
At its core, an inverse is about reversing a process. If a linear transformation \(T\) moves a vector from space \(V\) to space \(W\), the inverse transformation should take that resulting vector in \(W\) and move it perfectly back to its original position in \(V\).
Before we give the formal definition, we need to clarify a concept: the Identity Transformation.
What is the Identity Transformation (\(I\))?
The identity transformation is the mathematical equivalent of doing nothing. It is a function mapping a vector space to itself that leaves every vector unchanged.
- Let \(V\) be a vector space. The identity transformation on \(V\), denoted by \(I_V\), is defined by \(I_V(v) = v\) for every vector \(v \in V\).
If you have two different spaces, \(V\) and \(W\), you have two different identity transformations: \(I_V\) and \(I_W\).
The Formal Definition
Now we can define the inverse using the language of linear transformations and identity functions.
Definition: Let \(V\) and \(W\) be vector spaces, and let \(T: V \to W\) be a linear transformation. A function \(U: W \to V\) is said to be an inverse of \(T\) if:
- \(TU = I_W\) (Applying \(U\) then \(T\) brings any vector in \(W\) back to itself)
- \(UT = I_V\) (Applying \(T\) then \(U\) brings any vector in \(V\) back to itself)
If \(T\) has such an inverse, then \(T\) is said to be invertible.
It is a known mathematical fact that if an inverse exists, it is unique; I proved this in the post on functions. (Recall that a linear transformation is a function.) Because it is unique, we don’t just call it “an” inverse; we call it “the” inverse of \(T\) and denote it by \(T^{-1}\).
Therefore, if \(T\) is invertible, \(T^{-1}\) is the unique function such that \(T T^{-1} = I_W\) and \(T^{-1} T = I_V\).
A Simple Example
Let’s look at a concrete example mapping \(\mathbb{R}^2\) to \(\mathbb{R}^2\).
Let \(V = \mathbb{R}^2\) and \(W = \mathbb{R}^2\).
Let’s define a transformation \(T\) that stretches the x-coordinate by 2 and the y-coordinate by 3.
$$
T(x, y) = (2x, 3y)
$$
Intuitively, to “undo” a multiplication by 2, we need to divide by 2. To undo multiplication by 3, we divide by 3. So, let’s propose an inverse function \(U\):
$$
U(x, y) = (\frac{x}{2}, \frac{y}{3})
$$
To verify that \(U\) is actually the inverse of \(T\), we must check both conditions of the definition.
Check 1: Is \(TU = I_{\mathbb{R}^2}\)?
We apply \(U\) first, then \(T\) to a generic vector \((x,y)\):
\(TU(x, y) = T(U(x, y)) = T(\frac{x}{2}, \frac{y}{3})\)
Now apply T to that result:
\(= (2 \cdot \frac{x}{2}, 3 \cdot \frac{y}{3}) = (x, y)\)
Since we started with \((x,y)\) and ended with \((x,y)\), this is the Identity transformation. Check 1 passed.
Check 2: Is \(UT = I_{\mathbb{R}^2}\)?
We apply \(T\) first, then \(U\):
\(UT(x, y) = U(T(x, y)) = U(2x, 3y)\)
Now apply U to that result:
\(= (\frac{2x}{2}, \frac{3y}{3}) = (x, y)\)
Check 2 passed.
Since both conditions hold, \(T\) is invertible, and \(T^{-1}(x, y) = (\frac{x}{2}, \frac{y}{3})\).
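As a quick numerical sanity check, here is a minimal NumPy sketch (my own illustration, not part of the argument), where \(T\) and \(U\) are represented by diagonal matrices in the standard basis:
```python
import numpy as np

# T(x, y) = (2x, 3y) and the proposed inverse U(x, y) = (x/2, y/3),
# written as diagonal matrices in the standard basis of R^2.
T = np.diag([2.0, 3.0])
U = np.diag([1/2, 1/3])

# Both compositions should equal the 2x2 identity matrix.
print(np.allclose(T @ U, np.eye(2)))  # True  (TU = I)
print(np.allclose(U @ T, np.eye(2)))  # True  (UT = I)
```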
2. Properties of Inverses
Before listing the algebraic properties of the inverse, we must establish a fundamental theorem connecting one-to-one, onto, and rank. (We covered ‘one-to-one’ and ‘onto’ in our previous post on functions.) This theorem is incredibly useful because determining if a transformation is invertible often boils down to checking its rank.
Theorem 2.5: Equivalence of Rank, One-to-One, and Onto
Theorem 2.5. Let \(V\) and \(W\) be vector spaces of equal (finite) dimension, and let \(T: V \to W\) be linear. Then the following are equivalent:
(a) \(T\) is one-to-one.
(b) \(T\) is onto.
(c) \(\text{rank}(T) = \dim(V)\).
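To make the equivalence concrete, here is a small NumPy sketch (the matrix is my own example) checking all three conditions for one full-rank map on \(\mathbb{R}^2\):
```python
import numpy as np

# An arbitrary full-rank 2x2 matrix, representing a map R^2 -> R^2.
A = np.array([[2.0, 1.0],
              [1.0, 1.0]])

# (c) Full rank:
print(np.linalg.matrix_rank(A))          # 2, equal to dim(V)

# (b) Onto: every b has a preimage, found by solving Ax = b.
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)
print(np.allclose(A @ x, b))             # True

# (a) One-to-one: the only vector mapped to 0 is 0 itself.
print(np.allclose(np.linalg.solve(A, np.zeros(2)), np.zeros(2)))  # True
```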
Key Algebraic Properties
Now that we understand the structural conditions, here are the essential algebraic rules for handling inverses.
- The “Socks and Shoes” Property: if \(T\) and \(U\) are invertible and the composition \(TU\) is defined, then \(TU\) is invertible and
$$
(TU)^{-1} = U^{-1}T^{-1}
$$
Why is the order reversed? Think of putting on your socks (\(U\)) and then your shoes (\(T\)). To undo this, you must take off your shoes (\(T^{-1}\)) first, and then take off your socks (\(U^{-1}\)). (A numerical sketch of this rule follows the list.)
- The Inverse of an Inverse:
$$
(T^{-1})^{-1} = T
$$
In particular, this says that \(T^{-1}\) is itself invertible, and its inverse is \(T\).
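Here is a small NumPy sketch of the “socks and shoes” rule; the two matrices are arbitrary invertible \(2 \times 2\) examples chosen just for this illustration:
```python
import numpy as np

# Two arbitrary invertible 2x2 matrices.
T = np.array([[1.0, 1.0],
              [0.0, 1.0]])
U = np.array([[2.0, 0.0],
              [1.0, 1.0]])

lhs = np.linalg.inv(T @ U)                    # (TU)^{-1}
rhs = np.linalg.inv(U) @ np.linalg.inv(T)     # U^{-1} T^{-1}
wrong_order = np.linalg.inv(T) @ np.linalg.inv(U)

print(np.allclose(lhs, rhs))          # True: the order really does reverse
print(np.allclose(lhs, wrong_order))  # False: T^{-1} U^{-1} is not the inverse
```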
Restating Theorem 2.5 for Invertibility
We can therefore restate Theorem 2.5 as follows:
Corollary: Let \(T: V \to W\) be a linear transformation, where \(V\) and \(W\) are finite-dimensional spaces of equal dimension. Then \(T\) is invertible if and only if \(\text{rank}(T) = \dim(V)\).
A Simple Example
Let’s test this rank condition with a transformation \(T: \mathbb{R}^2 \to \mathbb{R}^2\).
Here, \(\dim(V) = 2\).
Consider the transformation defined by \(T(x, y) = (x + y, 2x + 2y)\).
Is this invertible?
Let’s check the rank. We can rewrite the output as \(T(x, y) = (x + y,\ 2(x + y)) = (x+y)(1, 2)\).
Every output is a scalar multiple of the single vector \((1, 2)\).
Therefore, the dimension of the range (the rank) is 1.
Since \(\text{rank}(T) = 1\) and \(\dim(V) = 2\):
$$
\text{rank}(T) \neq \dim(V)
$$
According to our corollary, \(T\) is NOT invertible.
(Intuitively, this makes sense because \(T\) collapses the entire 2D plane onto a single line. You cannot reverse this process because you’ve lost information—you don’t know which specific \((x,y)\) created a point on that line.)
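If you prefer to see this numerically, here is a minimal NumPy sketch (my own) using the standard-basis matrix of \(T\):
```python
import numpy as np

# The standard-basis matrix of T(x, y) = (x + y, 2x + 2y). The second row is
# twice the first, so every output lies on the line spanned by (1, 2).
A = np.array([[1.0, 1.0],
              [2.0, 2.0]])

print(np.linalg.matrix_rank(A))   # 1, which is less than dim(V) = 2
# Asking NumPy for the inverse of this singular matrix raises an error:
# np.linalg.inv(A)  ->  numpy.linalg.LinAlgError: Singular matrix
```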
3. Defining Invertibility for Matrices
While linear transformations provide the conceptual framework, matrices give us the calculation power. The definition of an invertible matrix mirrors the definition for transformations, but relies on matrix multiplication.
The Definition
Definition: Let \(A\) be an \(n \times n\) matrix. Then \(A\) is said to be invertible if there exists an \(n \times n\) matrix \(B\) such that:
$$
AB = BA = I
$$
where \(I\) is the \(n \times n\) identity matrix.
Uniqueness of the Inverse
Just like with transformations, if a matrix inverse exists, it is the only one. We can prove this easily using the associative property of matrix multiplication.
- Proof: Suppose \(A\) has two inverses, \(B\) and \(C\). By definition, \(AB = I\) and \(CA = I\). We can write: $$
C = CI = C(AB) = (CA)B = IB = B
$$ Therefore, \(C = B\).
Because the inverse is unique, we call matrix \(B\) the inverse of \(A\) and denote it by \(A^{-1}\).
A Simple Example
Let’s look at a \(2 \times 2\) matrix \(A\):
$$
A = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}
$$
Does this matrix have an inverse? Let’s propose a matrix \(B\):
$$
B = \begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix}
$$
To verify that \(B\) is the inverse (\(A^{-1}\)), we need to check whether their products result in the identity matrix:
$$
I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
$$
Check 1 (\(AB\)):
$$
AB = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}\begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix}
$$
$$
= \begin{pmatrix} (1)(2) + (1)(-1) & (1)(-1) + (1)(1) \\ (1)(2) + (2)(-1) & (1)(-1) + (2)(1) \end{pmatrix}
$$
$$
= \begin{pmatrix} 2-1 & -1+1 \\ 2-2 & -1+2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I
$$
Check 2 (\(BA\)):
$$
BA = \begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}
$$
$$
= \begin{pmatrix} (2)(1) + (-1)(1) & (2)(1) + (-1)(2) \\ (-1)(1) + (1)(1) & (-1)(1) + (1)(2) \end{pmatrix}
$$
$$
= \begin{pmatrix} 2-1 & 2-2 \\ -1+1 & -1+2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I
$$
Since \(AB = I\) and \(BA = I\), the matrix \(A\) is invertible and \(A^{-1} = B\).
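A quick NumPy sketch (purely a sanity check, not part of the argument) confirming the same computation:
```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 2.0]])
B = np.array([[ 2.0, -1.0],
              [-1.0,  1.0]])

print(np.allclose(A @ B, np.eye(2)))      # True
print(np.allclose(B @ A, np.eye(2)))      # True
print(np.allclose(np.linalg.inv(A), B))   # True: NumPy finds the same inverse
```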
4. Lemma: Invertibility and Dimension
We established earlier that for spaces of equal dimension, invertibility is linked to rank. Now, we broaden our scope: what if we don’t know the dimensions are equal yet? This lemma proves that invertibility actually forces the dimensions to be the same.
The Lemma
Lemma: Let \(T\) be an invertible linear transformation from \(V\) to \(W\). Then \(V\) is finite-dimensional if and only if \(W\) is finite-dimensional. In this case, \(\dim(V) = \dim(W)\).
A Simple Example
Can we find an invertible linear transformation (one that has an inverse) that maps the 2D plane \(\mathbb{R}^2\) to 3D space \(\mathbb{R}^3\)?
- \(\dim(\mathbb{R}^2) = 2\)
- \(\dim(\mathbb{R}^3) = 3\)
According to this Lemma, because \(\dim(V) \neq \dim(W)\), no invertible linear transformation \(T: \mathbb{R}^2 \to \mathbb{R}^3\) can exist.
You can try to make one, but you will fail one of the conditions:
- If you map the 2D plane into 3D space, you cannot “cover” the whole 3D space (it won’t be onto).
- If you try to squash 3D space into 2D space, you must collapse some vectors onto each other (it won’t be one-to-one).
Invertibility requires a perfect one-to-one match, which is only possible between spaces of the same dimension.
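Here is a small NumPy sketch (the matrix is an arbitrary illustration of my own) of why the \(\mathbb{R}^2 \to \mathbb{R}^3\) attempt fails:
```python
import numpy as np

# Any linear map R^2 -> R^3 is represented (in the standard bases) by a
# 3x2 matrix; this particular one is just an arbitrary example.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

print(A.shape)                       # (3, 2)
print(np.linalg.matrix_rank(A))      # 2, so the range is only a plane inside R^3
# The map cannot be onto (rank 2 < 3), and np.linalg.inv(A) would raise
# LinAlgError: only square matrices can have a (two-sided) inverse.
```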
5. Theorem 2.18: The Matrix Connection
We have defined what an inverse is, and we have seen how to find the inverse of a matrix. Now, we connect the two. This theorem essentially says: A linear transformation is invertible if and only if its matrix representation is invertible.
This is powerful because it allows us to translate an abstract problem (Is this function reversible?) into a concrete matrix calculation.
The Theorem
Theorem 2.18. Let \(V\) and \(W\) be finite-dimensional vector spaces with ordered bases \(\beta\) and \(\gamma\), respectively. Let \(T: V \to W\) be linear. Then \(T\) is invertible if and only if the matrix \([T]_\beta^\gamma\) is invertible.
Furthermore, if it is invertible, the matrix of the inverse is the inverse of the matrix:
$$
[T^{-1}]_\gamma^\beta = ([T]_\beta^\gamma)^{-1}
$$
(Note: Notice the indices swap. The transformation \(T\) goes from \(\beta\) to \(\gamma\), while the inverse \(T^{-1}\) goes back from \(\gamma\) to \(\beta\).)
Corollary 1
Let \(V\) be a finite-dimensional vector space with an ordered basis \(\beta\), and let \(T: V \to V\) be linear. Then \(T\) is invertible if and only if \([T]_\beta\) is invertible.
Furthermore, \([T^{-1}]_\beta = ([T]_\beta)^{-1}\).
(Note: We simplify the notation \([T]_\beta^\beta\) to just \([T]_\beta\). )
Corollary 2
Let \(A\) be an \(n \times n\) matrix. Then \(A\) is invertible if and only if \(L_A\) is invertible.
Furthermore, \((L_A)^{-1} = L_{A^{-1}}\).
(Note: \(L_A\) denotes the left-multiplication transformation defined by \(L_A(x) = Ax\).)
A Simple Example
Let’s consider a transformation \(T: \mathbb{R}^2 \to \mathbb{R}^2\) defined by:
$$
T(x, y) = (x + 2y, 3x + 4y)
$$
Let’s use the standard basis \(\beta = \{(1,0), (0,1)\}\) for both input and output (so \(\beta = \gamma\)).
Step 1: Find the matrix \([T]_\beta^\beta\).
$$
T(1, 0) = (1, 3) = 1(1,0) + 3(0,1)
$$
$$
T(0, 1) = (2, 4) = 2(1,0) + 4(0,1)
$$
So the matrix is:
$$
A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}
$$
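As a side note, here is a minimal NumPy sketch of how one could assemble this matrix programmatically; the helper function `T` below is my own, not from the text. The columns are simply the images of the basis vectors.
```python
import numpy as np

# T(x, y) = (x + 2y, 3x + 4y)
def T(v):
    x, y = v
    return np.array([x + 2*y, 3*x + 4*y])

# The columns of [T] in the standard basis are T(1, 0) and T(0, 1).
A = np.column_stack([T((1, 0)), T((0, 1))])
print(A)
# [[1 2]
#  [3 4]]
```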
Step 2: Check if the matrix is invertible.
To check if matrix \(A\) is invertible according to the definition, we need to see if there exists a matrix \(B\) such that \(AB = BA = I\).
Let’s test the following matrix \(B\):
$$
B = \begin{pmatrix} -2 & 1 \\ \frac{3}{2} & -\frac{1}{2} \end{pmatrix}
$$
First, we check the product \(AB\):
$$
AB = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} -2 & 1 \\ 1.5 & -0.5 \end{pmatrix}
$$
$$
= \begin{pmatrix} (1)(-2) + (2)(1.5) & (1)(1) + (2)(-0.5) \\ (3)(-2) + (4)(1.5) & (3)(1) + (4)(-0.5) \end{pmatrix}
$$
$$
= \begin{pmatrix} -2 + 3 & 1 - 1 \\ -6 + 6 & 3 - 2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I
$$
Next, we check the product \(BA\):
$$
BA = \begin{pmatrix} -2 & 1 \\ 1.5 & -0.5 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}
$$
$$
= \begin{pmatrix} (-2)(1) + (1)(3) & (-2)(2) + (1)(4) \\ (1.5)(1) + (-0.5)(3) & (1.5)(2) + (-0.5)(4) \end{pmatrix}
$$
$$
= \begin{pmatrix} -2 + 3 & -4 + 4 \\ 1.5 - 1.5 & 3 - 2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I
$$
Since \(AB = I\) and \(BA = I\), the matrix \(A\) is invertible, and \(B\) is its inverse (\(A^{-1} = B\)). According to Theorem 2.18, this guarantees that the abstract transformation \(T\) is also invertible.
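As a final sanity check, here is a small NumPy sketch confirming the computation above (purely illustrative):
```python
import numpy as np

# A is [T] in the standard basis; B is the proposed inverse matrix.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[-2.0,  1.0],
              [ 1.5, -0.5]])

print(np.allclose(np.linalg.inv(A), B))  # True: NumPy recovers the same inverse

# By Theorem 2.18, B represents T^{-1} in the standard basis, so applying
# T and then T^{-1} should return any vector unchanged.
v = np.array([5.0, 7.0])
print(np.allclose(B @ (A @ v), v))       # True
```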
(By the way, are you curious about how we found that specific matrix \(B\)? Don’t worry, finding the inverse is the subject of the next post!)
References
The theorem numbering in this post follows Linear Algebra (4th Edition) by Friedberg, Insel, and Spence. Some explanations and details here differ from the book.