From Linear Transformation to Rank & Nullity

A linear transformation is a rule for moving vectors from one vector space to another while keeping the underlying structure intact.
In this post, we will break down the formal definition, work through concrete examples, and explore the fundamental subspaces that define these transformations: the Null Space and the Range.
1. What is a Linear Transformation?
Let \(V\) and \(W\) be vector spaces over a field \(F\). A function \(T: V \rightarrow W\) is called a linear transformation if it satisfies two specific rules for all vectors \(x, y\) in \(V\) and all scalars \(c\) in \(F\):
- Additivity (Preserves Addition)
$$
T(x + y) = T(x) + T(y)
$$
It doesn’t matter if you add two vectors together first and then transform them, or transform them separately and then add the results—you get the same answer.
- Homogeneity (Preserves Scalar Multiplication)
$$
T(cx) = cT(x)
$$
If you stretch a vector by a factor of \(c\) and then transform it, the result is the same as transforming the vector first and then stretching the result by \(c\).
Example A: The Matrix Transformation
Let’s define a transformation \(T: \mathbb{R}^2 \rightarrow \mathbb{R}^2\) by \(T(\mathbf{x}) = A\mathbf{x}\), where:
$$
A = \begin{bmatrix} 2 & 0 \\ 0 & 0.5 \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \in \mathbb{R}^2
$$
To verify that this is a linear transformation, we check that it preserves addition and scalar multiplication.
Let \(\mathbf{u}, \mathbf{v} \in \mathbb{R}^2\) and \(c \in \mathbb{R}\).
$$
\mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}, \quad \mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}
$$
Step 1: Check Additivity (Preserves Addition)
Goal: Show that \(T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v})\).
Left side: \(T(\mathbf{u} + \mathbf{v})\)
First, add the vectors, then apply the transformation matrix \(A\):
$$
\mathbf{u} + \mathbf{v} = \begin{bmatrix} u_1 + v_1 \\ u_2 + v_2 \end{bmatrix}
$$
$$
T(\mathbf{u} + \mathbf{v}) = \begin{bmatrix} 2 & 0 \\ 0 & 0.5 \end{bmatrix} \begin{bmatrix} u_1 + v_1 \\ u_2 + v_2 \end{bmatrix}
$$
$$
= \begin{bmatrix} 2(u_1 + v_1) \\ 0.5(u_2 + v_2) \end{bmatrix} = \begin{bmatrix} 2u_1 + 2v_1 \\ 0.5u_2 + 0.5v_2 \end{bmatrix}
$$
Right side: \(T(\mathbf{u}) + T(\mathbf{v})\)
Apply the transformation to each vector separately, then add the results:
$$
T(\mathbf{u}) = \begin{bmatrix} 2 & 0 \\ 0 & 0.5 \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = \begin{bmatrix} 2u_1 \\ 0.5u_2 \end{bmatrix}
$$
$$
T(\mathbf{v}) = \begin{bmatrix} 2 & 0 \\ 0 & 0.5 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} 2v_1 \\ 0.5v_2 \end{bmatrix}
$$
$$
T(\mathbf{u}) + T(\mathbf{v}) = \begin{bmatrix} 2u_1 \\ 0.5u_2 \end{bmatrix} + \begin{bmatrix} 2v_1 \\ 0.5v_2 \end{bmatrix}
$$
$$
= \begin{bmatrix} 2u_1 + 2v_1 \\ 0.5u_2 + 0.5v_2 \end{bmatrix}
$$
Conclusion: The Left Side equals the Right Side. Additivity holds.
Step 2: Check Homogeneity (Preserves Scalar Multiplication)
Goal: Show that \(T(c\mathbf{u}) = cT(\mathbf{u})\).
Left side: \(T(c\mathbf{u})\)
First, scale the vector by \(c\), then apply matrix \(A\):
$$
c\mathbf{u} = \begin{bmatrix} cu_1 \\ cu_2 \end{bmatrix}
$$
$$
T(c\mathbf{u}) = \begin{bmatrix} 2 & 0 \\ 0 & 0.5 \end{bmatrix} \begin{bmatrix} cu_1 \\ cu_2 \end{bmatrix}
$$
$$
= \begin{bmatrix} 2(cu_1) \\ 0.5(cu_2) \end{bmatrix} = \begin{bmatrix} 2cu_1 \\ 0.5cu_2 \end{bmatrix}
$$
Right side: \(cT(\mathbf{u})\)
Apply the transformation first, then scale the result by \(c\):
$$
T(\mathbf{u}) = \begin{bmatrix} 2u_1 \\ 0.5u_2 \end{bmatrix}
$$
$$
cT(\mathbf{u}) = c \begin{bmatrix} 2u_1 \\ 0.5u_2 \end{bmatrix}
$$
$$
= \begin{bmatrix} c(2u_1) \\ c(0.5u_2) \end{bmatrix} = \begin{bmatrix} 2cu_1 \\ 0.5cu_2 \end{bmatrix}
$$
Conclusion: The Left Side equals the Right Side. Homogeneity holds.
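Both checks can also be run numerically. Below is a minimal sketch, assuming NumPy is available; it tests the two properties on randomly sampled vectors, which is evidence rather than a proof, but a handy sanity check:

```python
import numpy as np

# The scaling matrix from Example A.
A = np.array([[2.0, 0.0],
              [0.0, 0.5]])

def T(x):
    """Apply the linear transformation T(x) = Ax."""
    return A @ x

rng = np.random.default_rng(0)
u = rng.standard_normal(2)
v = rng.standard_normal(2)
c = 3.7

# Additivity: T(u + v) == T(u) + T(v)
assert np.allclose(T(u + v), T(u) + T(v))

# Homogeneity: T(c * u) == c * T(u)
assert np.allclose(T(c * u), c * T(u))

print("Additivity and homogeneity hold for the sampled vectors.")
```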
Example B: Identity and Zero
Two transformations appear so frequently they deserve their own notation.
- The Identity Transformation (\(I\)): Maps every vector to itself (\(I(x) = x\)).
- The Zero Transformation (\(T_0\)): Maps every vector to the zero vector (\(T_0(x) = 0\)).
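On \(\mathbb{R}^n\), these two transformations are represented by the identity matrix and the zero matrix. A short NumPy sketch (assuming NumPy is available):

```python
import numpy as np

x = np.array([3.0, -1.0])

I = np.eye(2)           # identity transformation: I(x) = x
Z = np.zeros((2, 2))    # zero transformation: T_0(x) = 0

assert np.allclose(I @ x, x)             # every vector maps to itself
assert np.allclose(Z @ x, np.zeros(2))   # every vector maps to the zero vector
```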
Properties
- Property 1: Mapping the Zero Vector
- If \(T: V \to W\) is a linear transformation, then \(T(0_V) = 0_W\).
- Property 2: Combined Linearity Test
- \(T\) is linear if and only if \(T(cx + y) = cT(x) + T(y)\) for all \(x, y \in V\) and \(c \in F\).
- Property 3: Preservation of Subtraction
- If \(T\) is linear, then \(T(x - y) = T(x) - T(y)\) for all \(x, y \in V\).
- Property 4: General Linear Combinations (Superposition)
- \(T\) is linear if and only if \(T\left(\sum_{i=1}^{n} a_i x_i\right) = \sum_{i=1}^{n} a_i T(x_i)\).
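Properties 1 and 4 are easy to spot-check numerically. Here is a minimal sketch, assuming NumPy, reusing the matrix from Example A (random samples illustrate the identities; they do not prove them):

```python
import numpy as np

# Reuse the matrix from Example A.
A = np.array([[2.0, 0.0],
              [0.0, 0.5]])
rng = np.random.default_rng(1)

# Property 1: the zero vector of V maps to the zero vector of W.
assert np.allclose(A @ np.zeros(2), np.zeros(2))

# Property 4 (superposition): T(sum a_i x_i) == sum a_i T(x_i).
n = 5
a = rng.standard_normal(n)          # scalars a_1, ..., a_n
xs = rng.standard_normal((n, 2))    # vectors x_1, ..., x_n

lhs = A @ (a[:, None] * xs).sum(axis=0)          # transform the combination
rhs = sum(a[i] * (A @ xs[i]) for i in range(n))  # combine the transforms
assert np.allclose(lhs, rhs)
```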
2. The Core Structures: Null Space and Range
To deeply understand a transformation, we ask: “What gets destroyed?” and “What can we create?”.
Null Space (or Kernel)
- Symbol: \(N(T)\)
- Definition: The set of all vectors \(x\) in \(V\) such that \(T(x) = 0\).
$$
N(T) = \{\, x \in V : T(x) = 0 \,\}
$$
- Meaning: This is the set of inputs that the transformation “crushes” to zero.
Range (or Image)
- Symbol: \(R(T)\)
- Definition: The subset of \(W\) consisting of all outputs (\(T(x)\)) for every \(x\) in \(V\).
$$
R(T) = \{\, T(x) : x \in V \,\}
$$
- Meaning: This is the set of all possible outcomes.
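Both sets can be computed for a concrete matrix. Here is a minimal sketch, assuming NumPy; the helper names are my own, and the bases come from the singular value decomposition of \(A\):

```python
import numpy as np

def null_space_basis(A, tol=1e-10):
    """Orthonormal basis for N(T), where T(x) = Ax."""
    _, s, Vt = np.linalg.svd(A)
    rank = int((s > tol).sum())
    return Vt[rank:].T            # right singular vectors for zero singular values

def range_basis(A, tol=1e-10):
    """Orthonormal basis for R(T), the column space of A."""
    U, s, _ = np.linalg.svd(A)
    rank = int((s > tol).sum())
    return U[:, :rank]            # left singular vectors for nonzero singular values

# A rank-deficient example: the second column is twice the first.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

print(null_space_basis(A))   # spans the "crushed" direction (2, -1)
print(range_basis(A))        # spans the output line through (1, 2)
```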
Why Structure Matters
Are the Null Space and Range just arbitrary sets of vectors? No. The following theorem tells us that they inherit the structure of a vector space.
Theorem 2.1: Let \(V\) and \(W\) be vector spaces and \(T: V \rightarrow W\) be linear. Then \(N(T)\) and \(R(T)\) are subspaces of \(V\) and \(W\), respectively. (See appendix for proof)
3. Measuring the Spaces: Rank and Nullity
Since the Null Space and Range are subspaces, we can measure their dimensions.
- Nullity: The dimension of the Null Space \(N(T)\). It measures how much information is lost.
- Rank: The dimension of the Range \(R(T)\). It measures how many dimensions are kept in the output.
4. Theorem 2.3 (Dimension Theorem)
Let \(V\) and \(W\) be vector spaces, and let \(T: V \to W\) be linear. If \(V\) is finite-dimensional, then
$$
\text{nullity}(T) + \text{rank}(T) = \dim(V).
$$
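The theorem is easy to stress-test on random matrices. A minimal sketch, assuming NumPy and SciPy are available (`scipy.linalg.null_space` returns an orthonormal basis for the null space, so its column count is the nullity):

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(2)

# For T(x) = Ax with A an m-by-n matrix, the domain is R^n, so dim(V) = n.
for m, n in [(3, 5), (4, 4), (6, 2)]:
    A = rng.standard_normal((m, n))
    A[:, -1] = A[:, 0]                    # force a column dependency

    rank = np.linalg.matrix_rank(A)        # dim R(T)
    nullity = null_space(A).shape[1]       # dim N(T)
    assert rank + nullity == n             # nullity(T) + rank(T) = dim(V)
    print(f"A is {m}x{n}: rank {rank} + nullity {nullity} = {n}")
```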
Example: The Projection Matrix
Consider the transformation \(T: \mathbb{R}^3 \rightarrow \mathbb{R}^3\), defined by \(T(\mathbf{x}) = A\mathbf{x}\), that flattens 3D space onto the 2D floor (the \(xy\)-plane):
$$
A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}
$$
Step 1: Find the Nullity
- The Null Space is the entire \(z\)-axis: \(A\mathbf{x} = \mathbf{0}\) forces \(x_1 = 0\) and \(x_2 = 0\), while \(x_3\) is free.
- Any vector on the \(z\)-axis can be written as \(c \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\).
- Since this line is spanned by a single basis vector, it is a 1-dimensional subspace.
- \(\text{nullity}(T) = 1\)
Step 2: Find the Rank
- The Range is the entire \(xy\)-plane: every output has the form \(A\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ 0 \end{bmatrix}\), and every such point is reached.
- Any vector on this plane can be built from 2 basis vectors: the \(x\)-direction \(\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\) and the \(y\)-direction \(\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}\).
- Since a plane has 2 dimensions:
- \(\text{rank}(T) = 2\)
The “Conservation of Dimensions” Check
Notice something interesting?
- Our input space (\(\mathbb{R}^3\)) had 3 dimensions.
- We lost 1 dimension (Nullity).
- We kept 2 dimensions (Rank).
- \(3 = 1 + 2\)
This perfectly illustrates the Rank-Nullity (Dimension) Theorem:
$$
\text{rank}(T) + \text{nullity}(T) = \dim(V)
$$
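For the projection matrix above, the singular values tell the whole story: two nonzero values give the rank, and the single zero value gives the nullity. A quick check, assuming NumPy:

```python
import numpy as np

# Projection onto the xy-plane (the matrix A from the example).
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]])

s = np.linalg.svd(A, compute_uv=False)
print(s)                              # [1. 1. 0.]

rank = int((s > 1e-10).sum())         # nonzero singular values -> rank 2
nullity = int((s <= 1e-10).sum())     # zero singular values -> nullity 1 (A is square)
assert rank + nullity == 3            # 2 + 1 = dim(R^3)
```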
References & Further Reading
The theorem numbering in this post follows *Linear Algebra* (4th Edition) by Friedberg, Insel, and Spence. Some explanations and details here differ from the book. If you want a deeper and more rigorous treatment of linear algebra, this book is an excellent reference.
Appendix 1: Theorem 2.1
Theorem
Let \(V\) and \(W\) be vector spaces and \(T: V \rightarrow W\) be linear. Then \(N(T)\) and \(R(T)\) are subspaces of \(V\) and \(W\), respectively.
The Proof
Let’s prove this formally. To avoid confusion, we will use \(0_V\) to denote the zero vector in \(V\) and \(0_W\) to denote the zero vector in \(W\).
Part A: Proving \(N(T)\) is a subspace of \(V\)
To prove a set is a subspace, we must show it contains the zero vector and is closed under addition and scalar multiplication.
- Contains Zero: Since \(T\) is linear, we know \(T(0_V) = 0_W\). Therefore, \(0_V\) is inside \(N(T)\).
- Closed Under Addition: Let \(x, y \in N(T)\). This means \(T(x) = 0_W\) and \(T(y) = 0_W\). Then: $$T(x + y) = T(x) + T(y) = 0_W + 0_W = 0_W$$ Since the result is \(0_W\), the sum \(x + y\) is also in \(N(T)\).
- Closed Under Scalar Multiplication: Let \(x \in N(T)\) and let \(c\) be a scalar. Then: $$T(cx) = cT(x) = c(0_W) = 0_W$$ Since the result is \(0_W\), the vector \(cx\) is also in \(N(T)\).
Conclusion: \(N(T)\) is a subspace of \(V\).
Part B: Proving \(R(T)\) is a subspace of \(W\)
We use the same three checks for the Range.
- Contains Zero: Because \(T(0_V) = 0_W\), we know that \(0_W\) is an output of the transformation. Thus, \(0_W \in R(T)\).
- Closed Under Addition: Let \(x, y \in R(T)\). This means there exist input vectors \(v, w\) in \(V\) such that \(T(v) = x\) and \(T(w) = y\). Then: $$x + y = T(v) + T(w) = T(v + w)$$ Since \(v + w\) is just another vector in \(V\), its output (\(x + y\)) must be in the Range.
- Closed Under Scalar Multiplication: Let \(x = T(v) \in R(T)\) as above, and let \(c\) be a scalar. Then: $$cx = cT(v) = T(cv)$$ Since \(cv\) is in \(V\), its output (\(cx\)) is in the Range.
Conclusion: \(R(T)\) is a subspace of \(W\).
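The closure arguments above can be illustrated numerically: pick vectors in the null space of a concrete matrix and confirm that sums and scalar multiples still map to \(0_W\). A sketch, assuming NumPy; the matrix is the rank-deficient example from earlier:

```python
import numpy as np

# N(T) for this matrix is the line spanned by d = (2, -1).
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

d = np.array([2.0, -1.0])    # a direction spanning the null space
x = 3.0 * d                  # x is in N(T)
y = -1.5 * d                 # y is in N(T)
c = 7.0

assert np.allclose(A @ (x + y), np.zeros(2))   # closed under addition
assert np.allclose(A @ (c * x), np.zeros(2))   # closed under scalar multiplication
```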
Appendix 2: Linear transformation properties
Property 1: Mapping the Zero Vector
If \(T: V \to W\) is a linear transformation, then \(T(0_V) = 0_W\).
Proof with Explicit Notation
- In the domain vector space \(V\), the zero vector satisfies the identity property:
$$
0_V + 0_V = 0_V
$$
- Apply the transformation \(T\) to both sides of this equation:
$$
T(0_V + 0_V) = T(0_V)
$$
- Because \(T\) is linear, we can use the Additivity property on the left side:
$$
T(0_V) + T(0_V) = T(0_V)
$$
(Note: at this point, \(T(0_V)\) is a vector inside the codomain \(W\).)
- To isolate \(T(0_V)\), we add the additive inverse of the vector \(T(0_V)\), denoted \(-T(0_V)\), to both sides. This inverse exists because \(W\) is a vector space.
$$
T(0_V) + T(0_V) + (-T(0_V)) = T(0_V) + (-T(0_V))
$$
- Simplify both sides using the definition of an additive inverse. On the right side, a vector plus its inverse equals the zero vector of that space (\(0_W\)):
$$
T(0_V) + 0_W = 0_W
$$
- Finally, by the identity property of the zero vector in \(W\):
$$
T(0_V) = 0_W
$$
This clearly shows that the “input zero” comes from \(V\) and the “output zero” lives in \(W\).
Property 2: Combined Linearity Test
Statement:
\(T\) is linear if and only if \(T(cx + y) = cT(x) + T(y)\) for all \(x, y \in V\) and \(c \in F\).
Why it holds (Proof):
This is an “if and only if” statement, so we must prove two directions.
Direction 1 (\(\Rightarrow\)): Assume \(T\) is linear. We show the formula holds.
- Start with \(T(cx + y)\).
- By the Additivity axiom: \(T(cx + y) = T(cx) + T(y)\).
- By the Homogeneity axiom: \(T(cx) = cT(x)\).
- Substitute this back in: \(T(cx + y) = cT(x) + T(y)\).
Direction 2 (\(\Leftarrow\)): Assume \(T(cx + y) = cT(x) + T(y)\) holds. We show \(T\) is linear.
- Check Additivity: Let \(c = 1\). Then:
$$
T(1x + y) = 1T(x) + T(y) \Rightarrow T(x + y) = T(x) + T(y)
$$
- Check Homogeneity: Let \(y = 0\). Then:
$$
T(cx + 0) = cT(x) + T(0)
$$
We know \(T(0) = 0\) because additivity (just established) gives \(T(0) = T(0 + 0) = T(0) + T(0)\), which forces \(T(0) = 0\). The equation therefore simplifies to:
$$
T(cx) = cT(x)
$$
Since both axioms are satisfied, \(T\) is linear.
Property 3: Preservation of Subtraction
Statement:
If \(T\) is linear, then \(T(x - y) = T(x) - T(y)\) for all \(x, y \in V\).
Why it holds (Proof):
Recall that vector subtraction \(x - y\) is defined as adding the negative: \(x + (-1)y\).
- Write the expression as a sum:
$$
T(x - y) = T(x + (-1)y)
$$
- Apply Additivity:
$$
= T(x) + T((-1)y)
$$
- Apply Homogeneity (pull out the scalar \(-1\)):
$$
= T(x) + (-1)T(y)
$$
- Simplify arithmetic:
$$
= T(x) - T(y)
$$
Property 4: General Linear Combinations (Superposition)
Statement:
\(T\) is linear if and only if \(T\left(\sum_{i=1}^{n} a_i x_i\right) = \sum_{i=1}^{n} a_i T(x_i)\).
Why it holds (Proof):
This is a generalization of the definition of linearity to \(n\) terms. The reverse direction is immediate: the \(n = 1\) case is exactly Homogeneity, and the \(n = 2\) case with \(a_1 = a_2 = 1\) is Additivity. We prove the forward direction (if \(T\) is linear, the sum formula holds) by Mathematical Induction.
- Base Case (\(n=1\)): Does \(T(a_1 x_1) = a_1 T(x_1)\)? Yes, by the Homogeneity axiom.
- Inductive Hypothesis: Assume the property holds for \(n = k\). That is:
$$
T\left(\sum_{i=1}^{k} a_i x_i\right) = \sum_{i=1}^{k} a_i T(x_i)
$$
- Inductive Step (\(n = k+1\)): Consider a sum with \(k+1\) terms:
$$
T\left(\sum_{i=1}^{k+1} a_i x_i\right) = T\left(\left(\sum_{i=1}^{k} a_i x_i\right) + a_{k+1}x_{k+1}\right)
$$
Treat the big sum inside the parentheses as a single vector and the last term as another. By the Additivity axiom:
$$
= T\left(\sum_{i=1}^{k} a_i x_i\right) + T(a_{k+1}x_{k+1})
$$
By the Inductive Hypothesis (for the first part) and Homogeneity (for the second part):
$$
= \sum_{i=1}^{k} a_i T(x_i) + a_{k+1}T(x_{k+1}) = \sum_{i=1}^{k+1} a_i T(x_i)
$$
Since the base case holds and the inductive step holds, the property is true for all \(n\).