Understanding Change of Basis and Matrix Similarity

1. THE CHANGE OF COORDINATE SYSTEM
In linear algebra, we often get comfortable working with the “standard” basis—the familiar $x$ and $y$ axes. But what happens when the problem you’re trying to solve looks messy in one coordinate system but becomes crystal clear in another?
What is this?
Think of it like language: two observers describing the same object may use different numbers, and to communicate they need a translator. Imagine you are describing the location of a star.
- Astronomer A uses a coordinate system based on the Earth’s horizon.
- Astronomer B uses a coordinate system based on the Sun.
They are looking at the same star (the same vector \(v\)), but they describe it using different numbers (different coordinate vectors).
The Change of Coordinate Matrix is the mathematical “translator” that converts Astronomer B’s numbers into Astronomer A’s numbers. In linear algebra, it converts the coordinate vector of \(v\) relative to one basis (say, \(\beta'\)) into the coordinate vector relative to another basis (say, \(\beta\)).
Why is this necessary?
- Simplification: Some problems are incredibly difficult in the “standard” basis but become trivial in a different basis (e.g., a basis made of eigenvectors makes matrix multiplication act like simple scalar multiplication).
- Perspective: It allows us to switch between different “viewpoints” of the same vector space without changing the underlying vectors themselves.
A Simple Example
Consider the vector space \(V = \mathbb{R}^2\).
Let our vector be \(v = (2, 2)\).
- Basis \(\beta\): \(\{(1,0), (0,1)\}\). In this basis, the coordinates are simply
$$
[v]_\beta=\begin{pmatrix} 2 \\ 2 \end{pmatrix}
$$
- Basis \(\beta'\): \(\{(1,1), (-1,1)\}\). To build \(v=(2,2)\) using these vectors, we need 2 of the first vector and 0 of the second:
$$
2(1,1) + 0(-1,1) = (2,2)
$$
So, the coordinate vector is
$$
[v]_{\beta'}=\begin{pmatrix} 2 \\ 0 \end{pmatrix}
$$
The Change of Coordinate Matrix is the tool that transforms \(\begin{pmatrix} 2 \\ 0 \end{pmatrix}\) directly into \(\begin{pmatrix} 2 \\ 2 \end{pmatrix}\).
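The coordinate computation above can be sketched in plain Python. This is a minimal illustration, not from the book; the helper `coords_in_basis` is a name I made up, and it solves the 2×2 system by Cramer's rule:

```python
# Express a vector v in a basis {b1, b2} of R^2, i.e. solve
# c1*b1 + c2*b2 = v for the coordinates (c1, c2).

def coords_in_basis(v, b1, b2):
    """Solve c1*b1 + c2*b2 = v via Cramer's rule (2x2 case)."""
    det = b1[0] * b2[1] - b2[0] * b1[1]
    c1 = (v[0] * b2[1] - b2[0] * v[1]) / det
    c2 = (b1[0] * v[1] - v[0] * b1[1]) / det
    return (c1, c2)

v = (2, 2)
print(coords_in_basis(v, (1, 0), (0, 1)))   # standard basis beta: (2.0, 2.0)
print(coords_in_basis(v, (1, 1), (-1, 1)))  # basis beta': (2.0, 0.0)
```

The same vector, two different coordinate descriptions, exactly as in the example.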
2. THEOREM 2.22
Theorem 2.22
Let \(\beta\) and \(\beta'\) be two ordered bases for a finite-dimensional vector space \(V\), and let \(Q = [I_V]_{\beta'}^{\beta}\). Then:
(a) \(Q\) is invertible.
(b) For any \(v \in V\), \([v]_\beta = Q[v]_{\beta'}\).
(Note: The matrix \(Q\) is called the change of coordinate matrix from \(\beta'\) to \(\beta\).)
The Meaning
This theorem formalizes the “translator” definition. It defines the change of coordinate matrix \(Q\) specifically as the matrix representation of the Identity Transformation (\(I_V\)).
Why the Identity? Because the Identity transformation takes a vector \(v\) and outputs the same vector \(v\). However, when we write the matrix for this transformation using different bases for the input (\(\beta'\)) and the output (\(\beta\)), the matrix effectively “rewrites” the vector from the language of \(\beta'\) to the language of \(\beta\).
- Part (a) says this translation is reversible (you can always translate back).
- Part (b) gives the formula: New Coordinates = Matrix \(\times\) Old Coordinates.
A Simple Example for Theorem 2.22
Let’s verify this using \(\mathbb{R}^2\).
1. Define the Bases:
- \(\beta = \{e_1, e_2\} = \{\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix}\} \) (The Standard Basis)
- \(\beta' = \{u_1, u_2\} = \{\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \end{pmatrix}\} \)
2. Construct \(Q\):
To find \(Q = [I_V]_{\beta'}^{\beta}\), we take the vectors of \(\beta'\) and write them as coordinates in \(\beta\).
- \(I_V(u_1) = u_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix} = 1e_1 + 1e_2 \Rightarrow [u_1]_\beta = \begin{pmatrix} 1 \\ 1 \end{pmatrix}\)
- \(I_V(u_2) = u_2 = \begin{pmatrix} 1 \\ -1 \end{pmatrix} = 1e_1 - 1e_2 \Rightarrow [u_2]_\beta = \begin{pmatrix} 1 \\ -1 \end{pmatrix}\)
So, the matrix is:
$$
Q = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
$$
3. Verify Part (b):
Let’s pick a vector
$$
v = \begin{pmatrix} 3 \\ 1 \end{pmatrix}
$$
- In \(\beta'\) coordinates: We need to solve \(c_1(1,1) + c_2(1,-1) = (3,1)\). By inspection (or solving), \(c_1 = 2\) and \(c_2 = 1\). So, \([v]_{\beta'} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}\).
- Apply the Theorem: Multiply \(Q\) by the \(\beta'\) coordinates:
$$
Q [v]_{\beta'} = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 2 \\ 1 \end{pmatrix} = \begin{pmatrix} (1)(2) + (1)(1) \\ (1)(2) + (-1)(1) \end{pmatrix} = \begin{pmatrix} 3 \\ 1 \end{pmatrix}
$$
- Result: \(\begin{pmatrix} 3 \\ 1 \end{pmatrix}\) matches the coordinates of \(v\) in the standard basis \(\beta\). The theorem holds!
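The verification of part (b) can also be sketched in a few lines of Python (a minimal illustration; `matvec` is my own helper name):

```python
# Theorem 2.22(b): Q times the beta'-coordinates of v gives the
# beta-coordinates of v.

def matvec(M, x):
    """Multiply a 2x2 matrix (list of rows) by a length-2 vector."""
    return [M[0][0] * x[0] + M[0][1] * x[1],
            M[1][0] * x[0] + M[1][1] * x[1]]

Q = [[1, 1],
     [1, -1]]           # columns are the beta'-vectors written in beta
v_beta_prime = [2, 1]   # [v]_{beta'}

print(matvec(Q, v_beta_prime))  # → [3, 1], which is [v]_beta
```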
3. THEOREM 2.23
Theorem 2.23
Let \(T\) be a linear operator on a finite-dimensional vector space \(V\), and let \(\beta\) and \(\beta'\) be ordered bases for \(V\). Suppose that \(Q\) is the change of coordinate matrix that changes \(\beta'\)-coordinates into \(\beta\)-coordinates. Then:
$$
[T]_{\beta'} = Q^{-1}[T]_{\beta}Q
$$
The Meaning
This theorem tells us how the matrix representation of a linear operator changes when we switch our viewpoint (basis).
- \([T]_\beta\) is the matrix in the “old” basis (usually the standard basis).
- \([T]_{\beta'}\) is the matrix in the “new” basis.
- \(Q\) is the translator between them.
The formula \(Q^{-1}[T]_\beta Q\) represents a three-step cycle to process a vector using the “new” perspective (\(\beta'\)), even if we only know how the machine works in the “old” perspective (\(\beta\)):
- \(Q\): Translate the input from New Language (\(\beta'\)) \(\to\) Old Language (\(\beta\)).
- \([T]_\beta\): Process the vector using the Old machine.
- \(Q^{-1}\): Translate the output from Old Language (\(\beta\)) \(\to\) New Language (\(\beta'\)).
A Simple Example for Theorem 2.23
Let \(V = \mathbb{R}^2\) and let \(T\) be the linear operator defined by \(T(x, y) = (x+y, x-y)\).
1. The “Old” Basis (\(\beta\) – Standard):
\(\beta = \{(1,0), (0,1)\}\).
The matrix is simply:
$$
[T]_\beta = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
$$
2. The “New” Basis (\(\beta'\)):
Let \(\beta' = \{(1,1), (-1,1)\}\).
This is the same basis as in the Section 1 example. Its change of coordinate matrix (columns are the \(\beta'\) vectors) is:
$$
Q = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}
$$
3. Calculate the New Matrix \([T]_{\beta'}\) using the formula:
First, find \(Q^{-1}\) (see the previous post for the method of finding \(Q^{-1}\))
$$
Q^{-1} = \begin{pmatrix} 0.5 & 0.5 \\ -0.5 & 0.5 \end{pmatrix}
$$
Now compute \(Q^{-1}[T]_\beta Q\):
$$
Q^{-1} ([T]_\beta Q) = \begin{pmatrix} 0.5 & 0.5 \\ -0.5 & 0.5 \end{pmatrix} \left[ \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} \right]
$$
$$
= \begin{pmatrix} 0.5 & 0.5 \\ -0.5 & 0.5 \end{pmatrix} \begin{pmatrix} 2 & 0 \\ 0 & -2 \end{pmatrix}
$$
$$
= \begin{pmatrix} 1 & -1 \\ -1 & -1 \end{pmatrix}
$$
Result:
$$
[T]_{\beta'} = \begin{pmatrix} 1 & -1 \\ -1 & -1 \end{pmatrix}
$$
We computed the matrix for basis \(\beta’\) without ever having to plug basis vectors into \(T\) directly!
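The whole conjugation can be checked numerically. A minimal sketch in pure Python (the helpers `matmul` and `inv2` are my own; `inv2` uses the 2×2 adjugate formula):

```python
# Theorem 2.23: [T]_{beta'} = Q^{-1} [T]_beta Q, with 2x2 matrices
# stored as lists of rows.

def matmul(A, B):
    """Multiply two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(M):
    """Invert a 2x2 matrix via the adjugate formula."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[ M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det,  M[0][0] / det]]

T_beta = [[1, 1], [1, -1]]   # matrix of T in the standard basis
Q = [[1, -1], [1, 1]]        # columns are the beta' vectors

T_beta_prime = matmul(inv2(Q), matmul(T_beta, Q))
print(T_beta_prime)  # → [[1.0, -1.0], [-1.0, -1.0]]
```

This reproduces the hand computation above without plugging the basis vectors into \(T\) directly.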
4. COROLLARY
Corollary
Let \(A \in M_{n \times n}(F)\), and let \(\gamma\) be an ordered basis for \(F^n\). Then \([L_A]_\gamma = Q^{-1}AQ\), where \(Q\) is the \(n \times n\) matrix whose \(j\)th column is the \(j\)th vector of \(\gamma\).
The Meaning
This corollary takes the general Theorem 2.23 and applies it to the most common scenario: standard matrices.
Usually, we are given a matrix \(A\) right at the start. Implicitly, this matrix \(A\) represents a transformation using the Standard Basis. This corollary says: “If you want to know what matrix \(A\) looks like in a different basis \(\gamma\), just conjugate it (\(Q^{-1}AQ\)) using the matrix \(Q\) formed by those basis vectors.”
This operation is called a Similarity Transformation. We say \(A\) is similar to the resulting matrix.
A Simple Example for the Corollary
Let
$$
A = \begin{pmatrix} 4 & 2 \\ 3 & 3 \end{pmatrix}
$$
This is our original matrix. We want to represent this transformation in a basis \(\gamma\) composed of its eigenvectors (this is called diagonalization).
1. The Basis \(\gamma\):
Let \(\gamma = \{\begin{pmatrix} 2 \\ -3 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \end{pmatrix} \}\).
2. Form Matrix \(Q\):
Just put the basis vectors as columns:
$$
Q = \begin{pmatrix} 2 & 1 \\ -3 & 1 \end{pmatrix}
$$
3. Apply the Formula \([L_A]_\gamma = Q^{-1}AQ\):
First, find \(Q^{-1}\):
$$
Q^{-1} = \frac{1}{5} \begin{pmatrix} 1 & -1 \\ 3 & 2 \end{pmatrix}
$$
Now multiply:
$$
Q^{-1}AQ = \frac{1}{5} \begin{pmatrix} 1 & -1 \\ 3 & 2 \end{pmatrix} \begin{pmatrix} 4 & 2 \\ 3 & 3 \end{pmatrix} \begin{pmatrix} 2 & 1 \\ -3 & 1 \end{pmatrix}
$$
Let’s do \(A \times Q\) first:
$$
\begin{pmatrix} 4 & 2 \\ 3 & 3 \end{pmatrix} \begin{pmatrix} 2 & 1 \\ -3 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 6 \\ -3 & 6 \end{pmatrix}
$$
Now \(Q^{-1} \times (AQ)\):
$$
\frac{1}{5} \begin{pmatrix} 1 & -1 \\ 3 & 2 \end{pmatrix} \begin{pmatrix} 2 & 6 \\ -3 & 6 \end{pmatrix} = \frac{1}{5} \begin{pmatrix} 5 & 0 \\ 0 & 30 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 6 \end{pmatrix}
$$
Result:
$$
[L_A]_\gamma = \begin{pmatrix} 1 & 0 \\ 0 & 6 \end{pmatrix}
$$
By choosing the right basis \(\gamma\), we turned the complicated matrix \(A\) into a simple diagonal matrix.
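The diagonalization above can be verified exactly in Python. A sketch (my own helper names; `Fraction` keeps the arithmetic exact since \(\det Q = 5\)):

```python
# Corollary: [L_A]_gamma = Q^{-1} A Q, where the columns of Q are the
# basis vectors of gamma (here, eigenvectors of A).
from fractions import Fraction

def matmul(A, B):
    """Multiply two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(M):
    """Invert a 2x2 matrix exactly via the adjugate formula."""
    det = Fraction(M[0][0] * M[1][1] - M[0][1] * M[1][0])
    return [[ M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det,  M[0][0] / det]]

A = [[4, 2], [3, 3]]
Q = [[2, 1], [-3, 1]]   # columns are the basis vectors of gamma

result = matmul(inv2(Q), matmul(A, Q))
print(result == [[1, 0], [0, 6]])  # → True: A is diagonal in this basis
```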
5. DEFINITION (Similarity)
Definition
Let \(A\) and \(B\) be matrices in \(\mathsf{M}_{n\times n}(F)\). We say that \(B\) is similar to \(A\) if there exists an invertible matrix \(Q\) such that \(B = Q^{-1}AQ\).
A Simple Example
Let’s show that the matrix
$$
A = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}
$$
is similar to
$$
B = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}
$$
Intuitively, these matrices do the exact same thing (scale one axis by 1 and the other by 2), just in a different order of axes.
1. Choose a “change of clothes” matrix \(Q\):
Let \(Q = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\).
This matrix simply swaps the standard basis vectors \(e_1\) and \(e_2\).
Note that \(Q^{-1} = Q\) (swapping twice gets you back to the start).
2. Apply the similarity formula \(Q^{-1}AQ\):
$$
\begin{aligned} Q^{-1}AQ &= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \\ &= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 2 & 0 \end{pmatrix} \\ &= \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \end{aligned}
$$
3. Result:
We calculated \(\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}\), which is exactly matrix \(B\).
Thus, \(B = Q^{-1}AQ\), so \(A\) and \(B\) are similar.
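The similarity check is a one-liner once matrix multiplication is in hand. A minimal sketch (the `matmul` helper is my own):

```python
# Similarity: B = Q^{-1} A Q with the swap matrix Q, which is its own
# inverse, so Q^{-1} A Q = Q A Q.

def matmul(A, B):
    """Multiply two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 0], [0, 2]]
B = [[2, 0], [0, 1]]
Q = [[0, 1], [1, 0]]    # swaps e1 and e2; note Q^{-1} = Q

print(matmul(Q, matmul(A, Q)) == B)  # → True: A and B are similar
```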
References
The theorem numbering in this post follows *Linear Algebra* (4th Edition) by Friedberg, Insel, and Spence. Some explanations and details here differ from the book.