The Gold Standard: Why We Love Orthonormal Bases

In Linear Algebra, choosing the right basis can be the difference between a five-minute calculation and a five-page headache. While any linearly independent spanning set can act as a basis, not all bases are created equal. In this post, we explore the ‘gold standard’ of coordinate systems: the Orthonormal Basis. We will see how orthogonal vectors allow us to find coordinates using simple dot products—completely bypassing the need for Gaussian elimination. Finally, we will cover the Gram-Schmidt Process, a powerful algorithm that allows us to take any ‘messy’ basis and refine it into a perfectly orthogonal set.

1. Definition: Orthonormal Basis

Let \(V\) be an inner product space. A basis for \(V\) is called an orthonormal basis if it satisfies two conditions:

  • Orthogonal: Every vector in the basis is perpendicular to every other vector. If you take the inner product (dot product) of any two different vectors, the result is 0.
  • Normal: Every vector has a length (norm) of exactly 1.

Why is this important?

An orthonormal basis is the “gold standard” of coordinate systems. It is exactly like the standard \(x, y, z\) axes we use in calculus. Calculations become incredibly simple.

Example: The Standard Basis for \(\mathbb{R}^3\)

The most famous example is the standard basis for 3D Euclidean space, often denoted as \(\{e_1, e_2, e_3\}\).

Let \(V = \mathbb{R}^3\) with the standard dot product.

$$
\beta = \{ (1, 0, 0), (0, 1, 0), (0, 0, 1) \}
$$

Let’s check the conditions:

  1. Is it Orthogonal?
    • Take the first two vectors: \((1, 0, 0) \cdot (0, 1, 0) = 1(0) + 0(1) + 0(0) = 0\).
    • They are perpendicular. This holds true for all pairs.
  2. Is it Normal?
    • Take the first vector: \(||(1, 0, 0)|| = \sqrt{1^2 + 0^2 + 0^2} = \sqrt{1} = 1\).
    • It has unit length. This holds true for all vectors in the set.

Therefore, \(\beta\) is an orthonormal basis.
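If you like to sanity-check these two conditions numerically, here is a minimal NumPy sketch (my own illustration, not from the book) that verifies them for \(\beta\):

```python
# Check the two orthonormality conditions for the standard basis of R^3.
import numpy as np

basis = [np.array([1.0, 0, 0]), np.array([0, 1.0, 0]), np.array([0, 0, 1.0])]

# Orthogonal: the dot product of every pair of different vectors is 0.
for i in range(len(basis)):
    for j in range(i + 1, len(basis)):
        assert np.dot(basis[i], basis[j]) == 0

# Normal: every vector has norm 1.
for v in basis:
    assert np.isclose(np.linalg.norm(v), 1.0)

print("beta is orthonormal")
```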

2. Theorem 6.3: Coordinates in an Orthogonal Set

The Formal Statement

Let \(V\) be an inner product space and \(S = \{v_1, v_2, \dots, v_k\}\) be an orthogonal subset of \(V\) consisting of nonzero vectors. If a vector \(y\) is in the span of \(S\), then:

$$
y = \sum_{i=1}^{k} \frac{\langle y, v_i \rangle}{||v_i||^2} v_i
$$

This is a powerful shortcut. Normally, if you want to write a vector \(y\) as a linear combination of basis vectors (i.e., find its coordinates), you have to solve a system of linear equations (using Gaussian elimination).

However, if your basis vectors are orthogonal (perpendicular), you don’t need to solve a system. You can find the coefficient for each vector \(v_i\) independently using a simple formula.

  • The vector \(\frac{\langle y, v_i \rangle}{||v_i||^2} v_i\) is the orthogonal projection of \(y\) onto \(v_i\).
  • The scalar coefficient \(\frac{\langle y, v_i \rangle}{||v_i||^2}\) tells you “how much” of \(y\) points in the direction of \(v_i\).

Example

Let \(V = \mathbb{R}^2\). Consider an orthogonal set (but not orthonormal):

$$
S = \{ v_1=(3, 0), v_2=(0, 2) \}
$$

Let \(y = (6, 8)\). We want to write \(y = c_1 v_1 + c_2 v_2\).

Using the formula:

  1. Find \(c_1\): $$
    c_1 = \frac{\langle y, v_1 \rangle}{||v_1||^2} = \frac{(6)(3) + (8)(0)}{3^2 + 0^2} = \frac{18}{9} = 2
    $$
  2. Find \(c_2\): $$
    c_2 = \frac{\langle y, v_2 \rangle}{||v_2||^2} = \frac{(6)(0) + (8)(2)}{0^2 + 2^2} = \frac{16}{4} = 4
    $$

So, \(y = 2v_1 + 4v_2\). (We found the coefficients without doing row reduction!)
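The same shortcut in code: a minimal NumPy sketch (my own check, not from the book) that computes \(c_1\) and \(c_2\) with two dot products and confirms they rebuild \(y\):

```python
# Coordinates of y in an orthogonal (but not orthonormal) set via Theorem 6.3.
import numpy as np

v1 = np.array([3.0, 0.0])
v2 = np.array([0.0, 2.0])
y  = np.array([6.0, 8.0])

c1 = np.dot(y, v1) / np.dot(v1, v1)   # <y, v1> / ||v1||^2 = 18 / 9 = 2
c2 = np.dot(y, v2) / np.dot(v2, v2)   # <y, v2> / ||v2||^2 = 16 / 4 = 4

# No row reduction needed: the coefficients already reconstruct y.
assert np.allclose(c1 * v1 + c2 * v2, y)
print(c1, c2)  # 2.0 4.0
```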

3. Corollary 1: Coordinates in an Orthonormal Set

The Formal Statement

If \(S\) is orthonormal (not just orthogonal) and \(y \in \text{span}(S)\), then the formula simplifies to:

$$
y = \sum_{i=1}^{k} \langle y, v_i \rangle v_i
$$

This is the ideal scenario. Because the vectors are orthonormal, their length is 1 (\(||v_i|| = 1\)), so the denominator \(||v_i||^2\) becomes 1 and disappears. The coefficient is simply the dot product of \(y\) and the basis vector. These coefficients are often called Fourier coefficients in generalized contexts.

Example

Let \(V = \mathbb{R}^2\). Consider the standard orthonormal basis:

$$
S = \{ e_1=(1, 0), e_2=(0, 1) \}
$$

Let \(y = (5, -3)\).

  1. Coefficient 1: \(\langle y, e_1 \rangle = (5)(1) + (-3)(0) = 5\).
  2. Coefficient 2: \(\langle y, e_2 \rangle = (5)(0) + (-3)(1) = -3\).

So, \(y = 5e_1 - 3e_2\). (This is trivial in the standard basis, but the math holds for any rotated orthonormal basis).
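To see that the shortcut is not special to the standard axes, here is a hedged NumPy sketch using a basis rotated by 45° (my own choice of basis, not from the book):

```python
# Fourier coefficients in a rotated orthonormal basis of R^2.
import numpy as np

s = 1 / np.sqrt(2)
u1 = np.array([ s, s])   # the standard basis rotated by 45 degrees
u2 = np.array([-s, s])   # still orthonormal: u1 . u2 = 0 and both have norm 1

y = np.array([5.0, -3.0])

c1 = np.dot(y, u1)       # coefficient is simply <y, u1>
c2 = np.dot(y, u2)

assert np.allclose(c1 * u1 + c2 * u2, y)
print(c1, c2)
```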

4. Corollary 2: Orthogonality Implies Independence

Let \(V\) be an inner product space, and let \(S\) be an orthogonal subset of \(V\) consisting of nonzero vectors. Then \(S\) is linearly independent.

This guarantees that “perpendicular” vectors are distinct enough that you can’t build one out of the others.

  • Geometric intuition: If \(v_1\) points North, and \(v_2\) points East, there is no way you can stretch or shrink the North arrow to make it point East. They are independent.
  • This is useful because it means any orthogonal set of \(n\) non-zero vectors in an \(n\)-dimensional space automatically forms a basis.

Example

Consider the vectors in \(\mathbb{R}^3\):

$$
v_1 = (1, 0, 0), \quad v_2 = (0, 1, 0), \quad v_3 = (0, 0, 1)
$$

Since \(v_1 \cdot v_2 = 0\), \(v_1 \cdot v_3 = 0\), and \(v_2 \cdot v_3 = 0\), this set is orthogonal.

Therefore, by Corollary 2, we immediately know \(\{v_1, v_2, v_3\}\) is linearly independent without checking the determinant or row reducing a matrix.

5. Theorem 6.4: The Gram-Schmidt Process

The Formal Statement

Let \(V\) be an inner product space and \(S = \{w_1, w_2, \dots, w_n\}\) be a linearly independent subset of \(V\). Define a new set \(S' = \{v_1, v_2, \dots, v_n\}\) where \(v_1 = w_1\) and the subsequent vectors are defined recursively:

$$
v_k = w_k - \sum_{j=1}^{k-1} \frac{\langle w_k, v_j \rangle}{||v_j||^2} v_j \quad \text{for } 2 \le k \le n
$$

Then \(S'\) is an orthogonal set of nonzero vectors such that the span of \(S'\) is exactly the same as the span of \(S\).

This theorem provides an algorithm (a step-by-step recipe) to “fix” a basis.

Often, we are given a basis that works (it spans the space and is independent), but it is “messy”—the vectors are skewed and not perpendicular to each other.

This process takes those skewed vectors (\(w_k\)) and straightens them out (\(v_k\)) one by one.
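For readers who prefer code, here is a minimal Python sketch of the recursion (for \(\mathbb{R}^n\) with the standard dot product; it produces an orthogonal set and does not normalize). This is my own illustration, not the book's presentation.

```python
# Gram-Schmidt for vectors in R^n with the standard dot product (minimal sketch).
import numpy as np

def gram_schmidt(vectors):
    """Return an orthogonal list of vectors with the same span as `vectors`."""
    orthogonal = []
    for w in vectors:
        w = np.array(w, dtype=float)
        v = w.copy()
        for u in orthogonal:
            # Subtract the projection of w onto the already-built v_j:
            # (<w_k, v_j> / ||v_j||^2) * v_j
            v -= (np.dot(w, u) / np.dot(u, u)) * u
        orthogonal.append(v)
    return orthogonal
```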

Example 1

Let’s convert a “messy” basis in \(\mathbb{R}^2\) into an orthogonal one.

Let \(S = \{ w_1=(1, 1), w_2=(0, 3) \}\).

Note: These are independent, but not orthogonal (\(1 \cdot 0 + 1 \cdot 3 \neq 0\)).

Step 1: Set \(v_1\)

$$
v_1 = w_1 = (1, 1)
$$

Step 2: Calculate \(v_2\)

Formula: \(v_2 = w_2 - \frac{\langle w_2, v_1 \rangle}{||v_1||^2} v_1\)

  • Calculate the dot product: \(\langle w_2, v_1 \rangle = (0)(1) + (3)(1) = 3\).
  • Calculate the length squared: \(||v_1||^2 = 1^2 + 1^2 = 2\).
  • Substitute: $$
    v_2 = (0, 3) - \frac{3}{2}(1, 1)
    $$ $$
    v_2 = (0, 3) - (1.5, 1.5)
    $$ $$
    v_2 = (-1.5, 1.5)
    $$

The Result:

Our new orthogonal set is \(S' = \{ (1, 1), (-1.5, 1.5) \}\).

Check: \((1)(-1.5) + (1)(1.5) = 0\). They are perfectly perpendicular!
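A standalone numerical check of this example (assuming the same standard dot product):

```python
# Verify Example 1: orthogonalize w1 = (1, 1), w2 = (0, 3).
import numpy as np

w1, w2 = np.array([1.0, 1.0]), np.array([0.0, 3.0])
v1 = w1
v2 = w2 - (np.dot(w2, v1) / np.dot(v1, v1)) * v1

print(v2)              # [-1.5  1.5]
print(np.dot(v1, v2))  # 0.0 -- perpendicular, as computed by hand
```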

Example 2: Legendre Polynomials (Orthogonalizing Functions)

Let \(V = P_2(\mathbb{R})\), the space of polynomials of degree at most 2.

Instead of the dot product, we define the inner product using an integral over the interval \([-1, 1]\):

$$
\langle f, g \rangle = \int_{-1}^{1} f(t)g(t) \, dt
$$

Let’s start with the standard, simple basis \(S = \{1, t, t^2\}\).

These are independent, but they are not orthogonal. For example, \(\langle 1, t^2 \rangle = \int_{-1}^{1} t^2 \, dt = \frac{2}{3} \neq 0\). Let’s apply Gram-Schmidt to produce the famous Legendre Polynomials.

Step 1: Set \(v_1\)

$$
v_1 = 1
$$

Step 2: Calculate \(v_2\)

We remove the projection of \(t\) onto \(1\).

$$
v_2 = t - \frac{\langle t, 1 \rangle}{||1||^2} (1)
$$

  • \(\langle t, 1 \rangle = \int_{-1}^{1} t \cdot 1 \, dt = 0\) (Odd function integral is 0).
  • Since the projection is 0, \(v_2\) remains unchanged. $$
    v_2 = t
    $$

Step 3: Calculate \(v_3\)

We remove the components of \(t^2\) that line up with \(1\) and \(t\).

$$
v_3 = t^2 - \frac{\langle t^2, 1 \rangle}{||1||^2} (1) - \frac{\langle t^2, t \rangle}{||t||^2} (t)
$$

  • Compute terms:
    • \(\langle t^2, 1 \rangle = \int_{-1}^{1} t^2 \, dt = [\frac{t^3}{3}]_{-1}^{1} = \frac{2}{3}\)
    • \(||1||^2 = \int_{-1}^{1} 1 \cdot 1 \, dt = 2\)
    • \(\langle t^2, t \rangle = \int_{-1}^{1} t^3 \, dt = 0\)
  • Substitute: $$
    v_3 = t^2 - \frac{2/3}{2}(1) - 0(t)
    $$ $$
    v_3 = t^2 - \frac{1}{3}
    $$

Result:

The orthogonal set is \(\{ 1, t, t^2 - \frac{1}{3} \}\). These are the first three Legendre polynomials (up to a scalar multiple).
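A quick symbolic check of this result (a SymPy sketch of my own, not from the book) confirms the three polynomials are mutually orthogonal under the integral inner product:

```python
# Verify <f, g> = integral_{-1}^{1} f(t) g(t) dt vanishes for each pair.
import sympy as sp

t = sp.symbols('t')
p0, p1, p2 = sp.Integer(1), t, t**2 - sp.Rational(1, 3)

def inner(f, g):
    return sp.integrate(f * g, (t, -1, 1))

print(inner(p0, p1), inner(p0, p2), inner(p1, p2))  # 0 0 0
```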

Example 3: Orthogonalizing Matrices

Let \(V = M_{2 \times 2}(\mathbb{R})\).

We use the Frobenius inner product, which treats matrices like long vectors: \(\langle A, B \rangle = \text{tr}(B^T A)\), which is equivalent to summing the products of corresponding entries: \(\sum a_{ij}b_{ij}\).

Consider the linearly independent set:

$$
w_1 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad w_2 = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \quad w_3 = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}
$$

Step 1: Set \(v_1\)

$$
v_1 = w_1 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}
$$

  • Note: \(||v_1||^2 = 1^2 + 0 + 0 + 0 = 1\).

Step 2: Calculate \(v_2\)

$$
v_2 = w_2 - \frac{\langle w_2, v_1 \rangle}{||v_1||^2} v_1
$$

  • \(\langle w_2, v_1 \rangle = (1)(1) + (1)(0) + (0)(0) + (0)(0) = 1\).
  • Substitute: $$
    v_2 = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} - \frac{1}{1} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}
    $$
  • Note: \(||v_2||^2 = 0 + 1^2 + 0 + 0 = 1\).

Step 3: Calculate \(v_3\)

$$
v_3 = w_3 - \frac{\langle w_3, v_1 \rangle}{||v_1||^2} v_1 - \frac{\langle w_3, v_2 \rangle}{||v_2||^2} v_2
$$

  • \(\langle w_3, v_1 \rangle = (1)(1) + 0 + 0 + 0 = 1\).
  • \(\langle w_3, v_2 \rangle = 0 + (1)(1) + 0 + 0 = 1\).
  • Substitute: $$
    v_3 = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} - 1 \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} - 1 \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}
    $$ $$
    v_3 = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}
    $$

Result:

Our new basis \(\{v_1, v_2, v_3\}\) consists of the standard basis matrices \(E_{11}, E_{12}, E_{21}\). We successfully decomposed the original overlapping matrices into their independent, orthogonal “building blocks.”
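Since the Frobenius inner product just sums entrywise products, the whole computation fits in a few lines of NumPy; this is a sketch of my own for checking the result, not the book's method:

```python
# Gram-Schmidt on 2x2 matrices using the Frobenius inner product.
import numpy as np

def frobenius(A, B):
    # <A, B> = sum of entrywise products, equivalently tr(B^T A)
    return np.sum(A * B)

w1 = np.array([[1.0, 0.0], [0.0, 0.0]])
w2 = np.array([[1.0, 1.0], [0.0, 0.0]])
w3 = np.array([[1.0, 1.0], [1.0, 0.0]])

vs = []
for w in (w1, w2, w3):
    v = w.copy()
    for u in vs:
        v -= (frobenius(w, u) / frobenius(u, u)) * u
    vs.append(v)

print(vs[1])  # [[0. 1.] [0. 0.]]  -> E_12
print(vs[2])  # [[0. 0.] [1. 0.]]  -> E_21, matching the hand computation
```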

References

The theorem numbering in this post follows Linear Algebra (4th Edition) by Friedberg, Insel, and Spence. Some explanations and details here differ from the book.
