The Inner Product: Turning the Lights on in Abstract Spaces

I often find that the most beautiful concepts are the ones that build bridges. In linear algebra, vector spaces can be quite abstract—lists of numbers or functions floating in a void. But once you equip that space with an Inner Product, you suddenly turn the lights on. You gain the ability to measure lengths and to determine whether two things are perpendicular (orthogonality), a notion fundamental to applications such as linear regression.

Today, let’s demystify the inner product, moving from its strict formal definition to the intuition that makes it useful.

1. The Strict Mathematical Definition

Formally, let \(\mathbf{V}\) be a vector space over \(F\) (typically the real numbers \(\mathbb{R}\) or complex numbers \(\mathbb{C}\)).

An inner product on \(\mathbf{V}\) is a function that assigns, to every ordered pair of vectors \(x\) and \(y\) in \(\mathbf{V}\), a scalar in \(F\), denoted \(\langle x, y \rangle\), such that for all \(x\), \(y\), and \(z\) in \(\mathbf{V}\) and all \(c\) in \(F\), the following conditions hold:

(a) Additivity in the first slot

$$
\langle x + z, y \rangle = \langle x, y \rangle + \langle z, y \rangle
$$

(b) Homogeneity (Scaling) in the first slot

$$
\langle cx, y \rangle = c \langle x, y \rangle
$$

(c) Conjugate Symmetry

$$
\overline{\langle x, y \rangle} = \langle y, x \rangle
$$

where the bar denotes complex conjugation. If you swap the order of the vectors, the result is the complex conjugate of the original. If we are working with real numbers, the conjugate does nothing, and \(\langle x, y \rangle = \langle y, x \rangle\).

(d) Positive Definiteness

$$
\langle x, x \rangle > 0 \quad \text{if } x \neq 0
$$

The inner product of any non-zero vector with itself must be a strictly positive real number. This is crucial because it allows us to define the “length” of a vector as the square root of this value.

2. Example: The Standard Inner Product

There is a very important example of an inner product—the Standard Inner Product. This is the fundamental way we measure angles and lengths in standard coordinate spaces like \(\mathbb{R}^n\) (real numbers) or \(\mathbb{C}^n\) (complex numbers).

Mathematical Definition

Let \(F\) be a field (either real numbers \(\mathbb{R}\) or complex numbers \(\mathbb{C}\)). For two vectors \(x = (a_1, a_2, \dots, a_n)\) and \(y = (b_1, b_2, \dots, b_n)\) in \(F^n\), we define the standard inner product as:

$$
\langle x, y \rangle = \sum_{i=1}^{n} a_i \overline{b_i} = a_1\overline{b_1} + a_2\overline{b_2} + \dots + a_n\overline{b_n}
$$

A Note on the Real Case (The Dot Product):

When \(F = \mathbb{R}\), the conjugations are not needed because the conjugate of a real number is just itself. In early calculus or physics courses, this standard inner product is usually called the dot product and is denoted by \(x \cdot y\) instead of \(\langle x, y \rangle\).
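The definition above can be sketched in a few lines of code. This is a minimal illustration (the function name `standard_inner` is mine, not standard library API); note that it conjugates the *second* argument, matching the convention used in this post.

```python
# Sketch of the standard inner product on F^n, conjugating the
# second argument as in the definition above.
def standard_inner(x, y):
    assert len(x) == len(y)
    # <x, y> = sum of a_i * conj(b_i); Python ints, floats, and
    # complex numbers all support .conjugate().
    return sum(a * b.conjugate() for a, b in zip(x, y))

# In the real case the conjugation does nothing, and this reduces
# to the familiar dot product:
print(standard_inner([1.0, 2.0], [3.0, 4.0]))  # 11.0
```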

How the Standard Inner Product Satisfies the 4 Conditions

To confirm this formula is truly an inner product, we must verify that it satisfies the four axioms defined in the previous section.

(a) Additivity in the First Slot

Condition: \(\langle x + z, y \rangle = \langle x, y \rangle + \langle z, y \rangle\)

Verification: Let \(z = (c_1, \dots, c_n)\). When we substitute \((x+z)\) into the sum, we use the distributive property of standard arithmetic.

$$
\langle x + z, y \rangle = \sum_{i=1}^{n} (a_i + c_i)\overline{b_i}
$$

$$
= \sum_{i=1}^{n} (a_i\overline{b_i} + c_i\overline{b_i})
$$

$$
= \sum_{i=1}^{n} a_i\overline{b_i} + \sum_{i=1}^{n} c_i\overline{b_i} = \langle x, y \rangle + \langle z, y \rangle
$$

Result: The sum splits cleanly, so Additivity holds.

(b) Homogeneity (Scaling) in the First Slot

Condition: \(\langle cx, y \rangle = c \langle x, y \rangle\)

Verification: Let \(c\) be a scalar in \(F\). If we scale the first vector \(x\) by \(c\), every component \(a_i\) becomes \(ca_i\).

$$
\langle cx, y \rangle = \sum_{i=1}^{n} (c a_i) \overline{b_i}
$$

Because \(c\) is a constant common to every term in the summation, we can factor it out:

$$
= c \left( \sum_{i=1}^{n} a_i \overline{b_i} \right) = c \langle x, y \rangle
$$

Result: The constant factors out, so Homogeneity holds.

(c) Conjugate Symmetry

Condition: \(\overline{\langle x, y \rangle} = \langle y, x \rangle\)

Verification: We start with the conjugate of the inner product of \(x\) and \(y\). Recall that \(\overline{A + B} = \overline{A} + \overline{B}\) and \(\overline{AB} = \overline{A} \cdot \overline{B}\).

So,

$$
\overline{\langle x, y \rangle} = \overline{\sum_{i=1}^{n} a_i \overline{b_i}} = \sum_{i=1}^{n} \overline{a_i \overline{b_i}}
$$

Since the conjugate of a conjugate is the original number (\(\overline{\overline{b}} = b\)):

$$
= \sum_{i=1}^{n} \overline{a_i} b_i = \sum_{i=1}^{n} b_i \overline{a_i}
$$

This formula \((\sum b_i \overline{a_i})\) is exactly the definition of \(\langle y, x \rangle\).

Result: Flipping the vectors yields the conjugate, so Symmetry holds.

(d) Positive Definiteness

Condition: \(\langle x, x \rangle > 0\) if \(x \neq 0\)

Verification: Take the inner product of a vector with itself. Here, \(y = x\), so \(b_i = a_i\):

$$
\langle x, x \rangle = \sum_{i=1}^{n} a_i \overline{a_i} = \sum_{i=1}^{n} |a_i|^2
$$

In the complex numbers, any number multiplied by its conjugate gives its magnitude squared (\(z\overline{z} = |z|^2\)). Since the squared magnitudes \(|a_i|^2\) are always non-negative real numbers, the sum must be non-negative. If \(x \neq 0\), at least one \(|a_i|^2 > 0\), making the total sum strictly positive.

Result: The length squared is always positive for non-zero vectors.

Numerical Example

Let \(x = (2 + i, 3)\) and \(y = (1 - 2i, 5)\) in \(\mathbb{C}^2\). To find \(\langle x, y \rangle\):

  1. \(\overline{y_1} = 1 + 2i\), \(\overline{y_2} = 5\)
  2. Multiply components:
    • \((2 + i)(1 + 2i) = 2 + 5i + 2i^2 = 5i\)
    • \((3)(5) = 15\)
  3. Sum: \(\langle x, y \rangle = 15 + 5i\).
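The worked example can be checked numerically. One caveat if you reach for NumPy: `np.vdot` conjugates its *first* argument, so the convention used here (\(\langle x, y \rangle = \sum a_i \overline{b_i}\)) corresponds to `vdot(y, x)`, with the arguments swapped.

```python
import numpy as np

# Verifying the example above: x = (2 + i, 3), y = (1 - 2i, 5).
# np.vdot conjugates its first argument, so <x, y> = vdot(y, x).
x = np.array([2 + 1j, 3])
y = np.array([1 - 2j, 5])
print(np.vdot(y, x))  # (15+5j)
```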

3. Example: Conjugate Transpose (Adjoint)

Before moving to inner products on matrices, we need to define a crucial operation: the conjugate transpose (often called the adjoint).

Let \(A\) be an \(m \times n\) matrix with entries in \(F\). We define the conjugate transpose of \(A\), denoted \(A^*\), as the \(n \times m\) matrix where the entry in the \(i\)-th row and \(j\)-th column is the complex conjugate of the entry in the \(j\)-th row and \(i\)-th column of \(A\).

Mathematically:

$$
(A^*)_{ij} = \overline{A_{ji}}
$$

To find \(A^*\), you take the transpose of the matrix (swap rows and columns) and then take the complex conjugate of every entry. If the matrix contains only real numbers, \(A^*\) is simply the standard transpose \(A^T\).

Numerical Example

Let \(A\) be the following \(2 \times 2\) matrix:

$$
A = \begin{pmatrix} 1 & 1+i \\ 3 & 2+5i \end{pmatrix}
$$

To find \(A^*\):

  1. Transpose: Swap rows and columns. $$
A^T = \begin{pmatrix} 1 & 3 \\ 1+i & 2+5i \end{pmatrix}
    $$
  2. Conjugate: Flip the sign of the imaginary part for each entry.
    • \(\bar{1} = 1\) (Real numbers don’t change)
    • \(\overline{1+i} = 1-i\)
    • \(\bar{3} = 3\)
    • \(\overline{2+5i} = 2-5i\)

Result:

$$
A^* = \begin{pmatrix} 1 & 3 \\ 1-i & 2-5i \end{pmatrix}
$$

4. Example: The Frobenius Inner Product

While standard vectors are familiar, we can also apply inner products to matrices. Let \(\mathrm{V} = \mathrm{M}_{n \times n}(F)\) be the vector space of \(n \times n\) matrices.

We define the Frobenius inner product for two matrices \(A, B \in \mathrm{V}\) as:

$$
\langle A, B \rangle = \text{tr}(B^* A)
$$

Here, \(\text{tr}(A)\) denotes the trace of the matrix (the sum of diagonal elements, \(\sum_{i=1}^{n} A_{ii}\)), and \(B^*\) denotes the conjugate transpose of \(B\) (also known as the adjoint).
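A minimal sketch of this definition (the function name `frobenius_inner` is mine). A useful sanity check, which follows from expanding the trace as in the positive-definiteness argument below, is that \(\text{tr}(B^*A)\) equals the entrywise sum \(\sum_{i,k} A_{ki}\overline{B_{ki}}\):

```python
import numpy as np

# Frobenius inner product <A, B> = tr(B* A), where B* is the
# conjugate transpose of B.
def frobenius_inner(A, B):
    return np.trace(B.conj().T @ A)

A = np.array([[1, 1j], [0, 2]])
B = np.array([[1, 0], [1j, 1]])

# The trace formula agrees with the entrywise sum of A * conj(B).
print(frobenius_inner(A, B))
print(np.sum(A * B.conj()))
```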

Verification

Let’s verify that this definition satisfies the axioms (specifically additivity and positive definiteness).

(a) Additivity

Let \(A, B, C \in \mathrm{V}\). Using the properties of the trace and matrix multiplication:

$$
\langle A + B, C \rangle = \text{tr}(C^*(A + B)) = \text{tr}(C^*A + C^*B)
$$

Since the trace is linear (\(\text{tr}(X+Y) = \text{tr}(X) + \text{tr}(Y)\)):

$$
= \text{tr}(C^*A) + \text{tr}(C^*B) = \langle A, C \rangle + \langle B, C \rangle
$$

(b) Homogeneity (Scaling)

Let \(c\) be a scalar in \(F\). We must show \(\langle cA, B \rangle = c \langle A, B \rangle\).

$$
\langle cA, B \rangle = \text{tr}(B^*(cA))
$$

Scalars factor out of matrix multiplication freely (\(M(cN) = c(MN)\)):

$$
= \text{tr}(c(B^*A))
$$

Since the trace is a linear map, scalars can be pulled outside (\(\text{tr}(cX) = c\text{tr}(X)\)):

$$
= c(\text{tr}(B^*A)) = c\langle A, B \rangle
$$

(c) Conjugate Symmetry

We need to show \(\overline{\langle A, B \rangle} = \langle B, A \rangle\).

$$
\overline{\langle A, B \rangle} = \overline{\text{tr}(B^* A)}
$$

Recall that the conjugate of the trace is the trace of the conjugate transpose (\(\overline{\text{tr}(X)} = \text{tr}(X^*)\)).

$$
= \text{tr}((B^* A)^*)
$$

Using the property of the conjugate transpose of a product (\((XY)^* = Y^* X^*\)):

$$
= \text{tr}(A^* (B^*)^*)
$$

Since \((B^*)^* = B\):

$$
= \text{tr}(A^* B) = \langle B, A \rangle
$$

(d) Positive Definiteness

$$
\langle A, A \rangle = \text{tr}(A^* A) = \sum_{i=1}^{n} (A^* A)_{ii}
$$

$$
= \sum_{i=1}^{n} \sum_{k=1}^{n} (A^*)_{ik} A_{ki}
$$

Since \((A^*)_{ik} = \overline{A_{ki}}\), this becomes:

$$
= \sum_{i=1}^{n} \sum_{k=1}^{n} \overline{A_{ki}} A_{ki} = \sum_{i=1}^{n} \sum_{k=1}^{n} |A_{ki}|^2
$$

This is the sum of the squared magnitudes of every entry in the matrix. Now, if \(A \neq O\) (the zero matrix), then \(A_{ki} \neq 0\) for some \(k\) and \(i\), meaning \(\langle A, A \rangle > 0\).
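This identity — the trace of \(A^*A\) equals the sum of squared magnitudes of all entries — is easy to confirm numerically (the matrix below is just an arbitrary example):

```python
import numpy as np

# tr(A* A) should equal the sum of |entry|^2 over all entries.
A = np.array([[1, 1 + 1j],
              [3, 2 + 5j]])
lhs = np.trace(A.conj().T @ A)        # tr(A* A)
rhs = np.sum(np.abs(A) ** 2)          # sum of squared magnitudes
print(lhs, rhs)  # both approximately 41 (= 1 + 2 + 9 + 29)
```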

5. Inner Product Spaces

A vector space \(\mathbf{V}\) over \(F\) endowed with a specific inner product is called an inner product space.

  • If \(F = \mathbb{C}\), we call \(\mathbf{V}\) a complex inner product space.
  • If \(F = \mathbb{R}\), we call \(\mathbf{V}\) a real inner product space.

It is important to realize that if \(\mathbf{V}\) has an inner product \(\langle x, y \rangle\) and \(\mathbf{W}\) is a subspace of \(\mathbf{V}\), then \(\mathbf{W}\) is also an inner product space when the same function \(\langle x, y \rangle\) is restricted to the vectors \(x, y \in \mathbf{W}\).

Thus, the examples above (\(\mathbb{R}^n\), \(\mathbb{C}^n\), and \(\mathrm{M}_{n \times n}(F)\)) act as our primary examples of inner product spaces.

  • Notation Note: For the remainder of this discussion, unless specified otherwise, \(F^n\) denotes the inner product space with the standard inner product. Likewise, \(\mathrm{M}_{n \times n}(F)\) denotes the space with the Frobenius inner product.
  • Caution: Two distinct inner products on a given vector space yield two distinct inner product spaces. The geometry changes depending on how you choose to measure “angles” and “lengths.”

Theorem 6.1

Inner products have the following derived properties. Let \(V\) be an inner product space. Then for \(x, y, z \in V\) and \(c \in F\):

(a) \(\langle x, y + z \rangle = \langle x, y \rangle + \langle x, z \rangle\)

(b) \(\langle x, cy \rangle = \overline{c} \langle x, y \rangle\) (Note: the scalar comes out conjugated from the second slot!)

(c) \(\langle x, 0 \rangle = \langle 0, x \rangle = 0\)

(d) \(\langle x, x \rangle = 0\) if and only if \(x = 0\)

(e) If \(\langle x, y \rangle = \langle x, z \rangle\) for all \(x \in V\), then \(y = z\)
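Property (b) is the one that most often trips people up, so here is a quick numerical spot check. As before, `np.vdot` conjugates its first argument, so \(\langle x, y \rangle\) in our convention is `vdot(y, x)`:

```python
import numpy as np

# Spot-checking Theorem 6.1(b): a scalar exits the *second* slot
# conjugated, i.e. <x, cy> = conj(c) <x, y>.
x = np.array([2 + 1j, 3])
y = np.array([1 - 2j, 5])
c = 1 + 3j

lhs = np.vdot(c * y, x)            # <x, cy>
rhs = np.conj(c) * np.vdot(y, x)   # conj(c) <x, y>
print(np.isclose(lhs, rhs))  # True
```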

References

The theorem numbering in this post follows Linear Algebra (4th Edition) by Friedberg, Insel, and Spence. Some explanations and details here differ from the book.
