2 Linear Algebra
We include two primary references in this section: one is the textbook for Math 232 (Bretscher 2013) and the other is a standard textbook for advanced courses in linear algebra (Petersen 2012). We include the former for its familiarity and the latter for its abstractness and formalism. For example, results in Linear Algebra with Applications (Bretscher 2013) are stated over the real numbers and there is no discussion of Hermitian inner products; much of Linear Algebra (Petersen 2012) is done over an arbitrary field and spectral theory is given a robust treatment. For those interested in a compromise, featuring a pragmatic balance of theory with a special focus on materials of interest to physicists (e.g., separable Hilbert spaces, differential and integral operators, differential equations, etc.), we submit Introduction to Hilbert Spaces with Applications (Debnath and Mikusinski 2005). For another (decidedly abstract) treatment, see Abstract Algebra (Dummit and Foote 2003).
Basic Concepts
Definition 2.1 (Vector spaces) (Bretscher 2013, 167; Petersen 2012, 8) Let \(F\) be a field. An \(F\)-vector space is a set \(V\) with a binary operation \(+: V \times V \to V\) and a map \(F \times V \to V\), denoted by \[ (x,y) \mapsto x+y \quad \text{ and } \quad (\alpha,x) \mapsto \alpha x \]
and called vector addition and scalar multiplication, respectively, so that
- \((V,+)\) is an abelian group.
- (Identity) \(1 x = x\) for all \(x \in V.\)
- (Compatibility) \(\alpha (\beta x) = (\alpha\beta) x\) for all \(\alpha, \beta \in F,\) and \(x \in V.\)
- (Distributivity over scalar addition) \((\alpha+\beta)x = \alpha x + \beta x\) for all \(\alpha, \beta \in F,\) and \(x \in V.\)
- (Distributivity over vector addition) \(\alpha(x + y) = \alpha x + \alpha y\) for all \(\alpha \in F,\) and \(x, y \in V.\)
The identity vector with respect to addition is written \(0 \in V.\)
Definition 2.2 (Algebras) (Dummit and Foote 2003, 342) If an \(F\)-vector space \(V\) is equipped with another binary operation \(\cdot\) giving it a ring structure and also satisfying \[ \alpha(x \cdot y) = (\alpha x) \cdot y = x \cdot (\alpha y), \] for all \(\alpha \in F\) and \(x,y \in V\), then \(V\) is an \(F\)-algebra. If \(\cdot\) is commutative then \(V\) is a commutative \(F\)-algebra; otherwise, we might emphasize that \(V\) is non-commutative. If every nonzero element in \(V\) has a multiplicative inverse, we call it a division algebra.
Example 2.1 Every field \(F\) is a commutative \(F\)-algebra; the trivial ring \(0\) is also an \(F\)-algebra.
Example 2.2 \(\mathbb{C}\) is a commutative \(\mathbb{R}\)-algebra with its usual field operations.
Example 2.3 The set of Hamilton quaternions \(\mathbb{H}= \{ a + bi + cj + dk: a, b, c, d \in \mathbb{R}\}\) is a non-commutative \(\mathbb{R}\)-algebra (though not a \(\mathbb{C}\)-algebra). Here multiplication is defined using the relations \[ i^2 = j^2 = k^2 = ijk = -1. \] Indeed, we will see that \(\mathbb{H}\) is a division algebra.
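As an informal sketch (the function and tuple encoding here are our own, not from the references), one can tabulate the multiplication forced by the relations above and check both the non-commutativity and the division-algebra property numerically:

```python
# Quaternions as 4-tuples (a, b, c, d) representing a + bi + cj + dk.
# The product below is exactly what the relations i^2 = j^2 = k^2 = ijk = -1 force.
def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

i, j = (0, 1, 0, 0), (0, 0, 1, 0)
print(qmul(i, j))   # (0, 0, 0, 1), i.e. ij = k
print(qmul(j, i))   # (0, 0, 0, -1), i.e. ji = -k: multiplication is non-commutative

# Every nonzero q has inverse conj(q)/|q|^2, previewing the division-algebra claim.
q = (1.0, 2.0, 3.0, 4.0)
n2 = sum(x * x for x in q)
qinv = (q[0] / n2, -q[1] / n2, -q[2] / n2, -q[3] / n2)
print(qmul(q, qinv))  # (1.0, 0.0, 0.0, 0.0), up to rounding
```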
Example 2.4 The set \(\operatorname{Mat}_{m \times n}(F)\) of \(m \times n\) matrices with entries in \(F\) is an \(F\)-vector space with entrywise addition. The set of \(n \times n\) matrices, here written more simply as \(\operatorname{Mat}_n(F)\), is a non-commutative \(F\)-algebra with the additional operation of matrix multiplication.
Example 2.5 The ring of polynomials \(F[x]\) is a commutative \(F\)-algebra.
Example 2.6 The set of all real-valued functions on \(\mathbb{R}\), denoted \(\operatorname{Func}(\mathbb{R},\mathbb{R})\), is an \(\mathbb{R}\)-algebra when equipped with pointwise operations: \[ (f+g)(x) \coloneqq f(x)+g(x) \quad \text{ and } \quad (f \cdot g)(x) \coloneqq f(x)g(x). \] Similarly, one might consider functions on a subset \(\Omega \subseteq \mathbb{R}^n\) or with values in \(\mathbb{C}\); we write \(\operatorname{Func}(\Omega,\mathbb{R})\) and \(\operatorname{Func}(\Omega,\mathbb{C})\) for these algebras (over \(\mathbb{R}\) and, for the latter, also over \(\mathbb{C}\)).
Example 2.7 If \(F\) is a field, the set \(F^n\) is an \(F\)-vector space with coordinate-wise addition. We follow the convention of writing vectors as columns: \[ x = \begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_n \end{pmatrix} \normalsize \in F^n. \]
Definition 2.3 (Bretscher 2013, 170; Petersen 2012, 54) Let \(V\) be an \(F\)-vector space. A subset \(U \subseteq V\) is a subspace, denoted \(U \leq V\), if it is an \(F\)-vector space with the operations from \(V.\) Equivalently, a non-empty subset \(U\) is a subspace if, for every \(x,y \in U\) and \(\alpha, \beta \in F\), we have \(\alpha x+\beta y \in U.\) If \(V\) is an \(F\)-algebra, a subspace \(U\) is a subalgebra if it is also a subring.
Example 2.8 Every vector space contains the trivial subspace \(\{0\}.\)
Example 2.9 For \(\Omega \subseteq \mathbb{R}^n\), the following are all subalgebras of \(\operatorname{Func}(\Omega,\mathbb{R})\): \[ \begin{split} C(\Omega) & = \text{all continuous real-valued functions on } \Omega. \\ C^k(\Omega) & = \text{all real-valued functions on } \Omega \text{ with continuous partial derivatives of all orders up to } k. \\ C^\infty(\Omega) & = \text{all infinitely-differentiable real-valued functions on } \Omega. \\ \mathcal{P}(\Omega) & = \text{functions on } \Omega \text{ that can be expressed as polynomials in $n$ variables.} \\ \end{split} \] We write \(C(\Omega,\mathbb{C}), C^k(\Omega,\mathbb{C}), C^\infty(\Omega,\mathbb{C}),\) and \(\mathcal{P}(\Omega,\mathbb{C})\) for the analogous \(\mathbb{C}\)-valued subalgebras.
Proposition 2.1 (Petersen 2012, 56) Let \(V\) be an \(F\)-vector space.
- If \(V_1, V_2 \leq V\), then \(V_1 \cap V_2 \leq V.\)
- If \(V_1 \leq V\) and \(V_2 \leq V_1\), then \(V_2 \leq V.\)
Bases and Dimension
From this point on, all vector spaces are understood with respect to some fixed ground field \(F\) unless stated otherwise.
Definition 2.4 (Complementary subspaces) (Petersen 2012, 57) Let \(V\) be a vector space. We say that \(V_1, V_2 \leq V\) are complementary if \(V_1 \cap V_2 = \{0\}\) and every vector \(x \in V\) can be expressed as \(x = y+z\) for some \(y \in V_1, z \in V_2.\)
Proposition 2.2 Two subspaces \(V_1, V_2 \leq V\) are complementary if and only if each \(x \in V\) can be written uniquely as \(x=y+z\) for \(y \in V_1\) and \(z \in V_2.\)
Definition 2.5 (Bretscher 2013, 171; Petersen 2012, 56–57) Let \(V\) be a vector space. Given \(S \subseteq V\), the span of \(S\) is the smallest subspace of \(V\) containing \(S\), i.e., the intersection of all subspaces containing \(S.\) Equivalently, the span is the collection of all finite linear combinations in \(S\): \[ \operatorname{Span}(S) \coloneqq \left\{ \small \sum_{i=1}^k \alpha_i x_i : k \in \mathbb{N}, \alpha_i \in F, x_i \in S \normalsize \right\}. \tag{2.1}\]
Definition 2.6 (Linear independence) (Bretscher 2013, 171; Petersen 2012, 72) Let \(V\) be a vector space and \(S \subseteq V\) non-empty. We say \(S\) is linearly dependent if there are distinct \(x_1,\dots,x_n \in S\) and \(\alpha_1,\dots,\alpha_n \in F\), not all zero, with \[ \alpha_1 x_1 + \cdots + \alpha_n x_n = 0. \tag{2.2}\] The set \(S\) is instead said to be linearly independent if it is not linearly dependent; that is, the only solutions to Equation 2.2 are given by taking all \(\alpha_i = 0.\)
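In the concrete setting \(F = \mathbb{R}\) and \(V = \mathbb{R}^n\), linear independence of finitely many vectors can be tested numerically: the vectors are independent exactly when the matrix having them as columns has rank equal to the number of vectors. A small sketch (numpy is our assumption here, not part of the text):

```python
import numpy as np

# Third column is the sum of the first two, so these are dependent.
A = np.column_stack([[1, 0, 1], [0, 1, 1], [1, 1, 2]])
print(np.linalg.matrix_rank(A))  # 2 < 3: linearly dependent

# The standard basis vectors e1, e2 together with (1,1,1) are independent.
B = np.column_stack([[1, 0, 0], [0, 1, 0], [1, 1, 1]])
print(np.linalg.matrix_rank(B))  # 3: linearly independent
```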
Definition 2.7 (Bases) (Bretscher 2013, 172; Petersen 2012, 14) Let \(V\) be a vector space. A collection of vectors \(\mathscr{B} \subset V\) is said to be a basis for \(V\) if \(\mathscr{B}\) is linearly independent and \[\operatorname{Span}(\mathscr{B}) = V.\] Often we refer to an ordered basis, which is a basis whose vectors are given a specific ordering.
Theorem 2.1 (Dimension) (Bretscher 2013, 172; Petersen 2012, 15) Let \(V\) be a vector space. All bases of \(V\) have the same cardinality, called the dimension of \(V\) and written \(\dim_F V\) (or just \(\dim V\) if the context is clear). If this cardinality is finite, we say that \(V\) is finite-dimensional and write \(\dim V < \infty\); otherwise, \(V\) is infinite-dimensional.
Theorem 2.2 (Uniqueness) (Bretscher 2013, 172; Petersen 2012, 15) Let \(V\) be a vector space with \[ S = \{x_1,\dots,x_n\} \subset V \] a linearly independent set. Then every element in \(\operatorname{Span}(S)\) can be expressed uniquely as a linear combination of elements in \(S.\) In particular, if \(\dim V < \infty\) and \(\mathscr{B}\) is a basis for \(V\), then every \(x \in V\) can be expressed uniquely as a linear combination of basis elements.
Example 2.10 For \(V = F^n\), the standard basis is given by elements \(e_i \in F^n\) with a \(1\) in the \(i\)th entry and 0 elsewhere. The tuple \((e_1,\dots,e_n)\) is an ordered basis for \(F^n\), which is \(n\)-dimensional.
Example 2.11 The complex numbers \(\mathbb{C}\) are 2-dimensional over \(\mathbb{R}\), with the basis \(\{1,i\}.\)
Example 2.12 The quaternions \(\mathbb{H}\) are 4-dimensional over \(\mathbb{R}\), with the basis \(\{1,i,j,k\}.\)
Example 2.13 The matrix algebra \(\operatorname{Mat}_n(F)\) is \(n^2\)-dimensional over \(F.\)
Example 2.14 The set \(\{ x^n : n \in \mathbb{N}\}\), where we define \(x^0 \coloneqq 1\), is a basis for the polynomial algebra \(F[x]\) over \(F\), which is therefore infinite-dimensional.
Example 2.15 The algebras \(C(\mathbb{R}), C^k(\mathbb{R}),\) and \(C^\infty(\mathbb{R})\) have infinite dimension over \(\mathbb{R}\), each containing \(\mathcal{P}(\mathbb{R})\) as a subalgebra. Note that we know \(\dim_{\mathbb{R}} \mathcal{P}(\mathbb{R})\) is infinite by a (temporarily informal, cf. Definition 2.10) comparison to \(\mathbb{R}[x]\), thinking of polynomials (the formal objects) as the functions they define.
Example 2.16 The trivial vector space (Example 2.8) is \(0\)-dimensional over \(F\) with the basis \(\varnothing.\)
Decompositions
Definition 2.8 (Direct Sum) (Petersen 2012, 58) Given a pair of vector spaces \(V\) and \(W\) over the same field \(F\), we can make the Cartesian product \(V \times W\) into a vector space by defining \[ (x,y) + (x',y') \coloneqq (x+x',y+y') \quad \text{ and } \quad \alpha (x,y) = (\alpha x,\alpha y). \] To avoid such cumbersome notation, we think of elements in \(V \times W\) as formal sums \(x+y\) instead of pairs \((x,y).\) This space is the direct sum of \(V\) and \(W\) and is denoted \(V \oplus W.\)
Remark 2.1. By construction, every \(x \in V \oplus W\) has a unique decomposition \(x = y+z\) for \(y \in V\) and \(z \in W.\) That is, \(V\) and \(W\) can be identified with natural complementary subspaces of \(V \oplus W.\)
Theorem 2.3 (Petersen 2012, 58) If \(\{v_1,\dots,v_n\} \subset V\) and \(\{w_1,\dots,w_m\} \subset W\) are bases, then \[ \{v_1,\dots,v_n,w_1,\dots,w_m\} \] is a basis for \(V \oplus W.\) Hence \(\dim (V \oplus W) = \dim V + \dim W.\)
Theorem 2.4 (Existence of complements) (Petersen 2012, 61) Let \(V\) be finite-dimensional with basis \(\mathscr{B} = \{v_1,\dots,v_n\}\) and \(U \leq V.\) Then it is possible to choose \(\{v_{i_1},\dots,v_{i_k}\} \subseteq \mathscr{B}\) so that \[ V = U \oplus \operatorname{Span}\{v_{i_1},\dots,v_{i_k}\}. \]
Corollary 2.1 If \(V\) is a vector space and \(U \leq V\), then \(\dim U \leq \dim V.\) In particular, if \(V\) is finite-dimensional, then \(\dim U = \dim V\) if and only if \(U = V.\)
Linear Transformations
Definition 2.9 (Bretscher 2013, 178; Petersen 2012, 20) Let \(V\) and \(W\) be vector spaces over a field \(F.\) Then a linear transformation (a.k.a. a linear map, a linear operator, or a homomorphism of \(F\)-vector spaces) is a function \(L: V \to W\) satisfying \[ L(x+y)=L(x)+L(y) \quad \text{ and } \quad L(\alpha x) = \alpha L(x) \] for all \(x,y \in V\) and \(\alpha \in F.\) Equivalently, \[ L(\alpha x+\beta y) = \alpha L(x)+\beta L(y) \text{ for each } x,y \in V, \alpha,\beta \in F. \] We write \(\operatorname{Hom}_F(V,W)\), or simply \(\operatorname{Hom}(V,W)\) if the context is clear, for the set of all such maps. If \(V=W\), we abbreviate further to \(\operatorname{End}(V) \coloneqq \operatorname{Hom}(V,V).\) If \(L\) is a bijection, then its inverse is also linear; we call \(L\) a linear isomorphism and write \(V \cong W.\)
Remark 2.2. Knowing the values of \(L\) on a basis \(\mathscr{B} \subset V\) determines \(L\); we often define linear maps by deciding where to send basis elements, then extending linearly. In addition, we will often write \(Lx\) instead of \(L(x)\) to avoid cumbersome parentheses.
Example 2.17 The identity map \(V \to V\) is a linear isomorphism. More generally, scaling by \(\lambda \in F^\times\) is a linear isomorphism \(V \to V\) (known as a homothety), as is rotation of \(\mathbb{R}^3\) about an axis through the origin.
Proposition 2.3 The composition of linear transformations is a linear transformation. Moreover, the composition of linear isomorphisms is a linear isomorphism.
Example 2.18 Complex conjugation is an \(\mathbb{R}\)-linear isomorphism \(\mathbb{C}\to \mathbb{C}.\) Similarly, if \(V\) is a field and \(F\) is its canonical (a.k.a. prime (Dummit and Foote 2003, 511)) subfield then every element of \(\operatorname{Aut}(V)\) is an \(F\)-linear isomorphism \(V \to V\) (a.k.a., an \(F\)-equivariant automorphism).
Example 2.19 The map \(\mathrm{D}: C^1([0,1]) \to C([0,1])\) given by \((\mathrm{D}f)(t) \coloneqq \frac{\mathop{}\!\mathrm{d}f}{\mathop{}\!\mathrm{d}t}\) is a linear map. Note that \(\mathrm{D}\) is not injective, since every constant function maps to zero.
Example 2.20 The map \(\mathrm{I}: C([0,1]) \to C^1([0,1])\) given by \[ (\mathrm{I}f)(t) = \int_0^t f(s) \mathop{}\!\mathrm{d}s \] is linear. By the fundamental theorem of calculus, \(\mathrm{D}\circ \mathrm{I}= \mathrm{id}.\)
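One can watch the fundamental theorem of calculus act on a sample function symbolically; the use of sympy and the particular test function are our assumptions, not part of the example:

```python
import sympy as sp

t, s = sp.symbols('t s')
f = sp.cos(s) * sp.exp(s)          # a sample continuous function
If = sp.integrate(f, (s, 0, t))    # (I f)(t) = integral of f(s) from 0 to t
# D(I f) - f should vanish identically, reflecting D o I = id.
print(sp.simplify(sp.diff(If, t) - f.subs(s, t)))  # 0
```

Note that the order matters: \(\mathrm{I}\circ \mathrm{D}\) is not the identity, since it kills constants (cf. Example 2.19).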
Cosets and Quotients
Definition 2.10 (Bretscher 2013, 178; Petersen 2012, 64) If \(L: V \to W\) is a linear map, then \[ \ker L \coloneqq \{ x \in V: Lx = 0 \} \] is called the kernel of \(L\); the image of \(L\) is defined as \[ \operatorname{im}L \coloneqq \{ y \in W: y = Lx \text{ for some } x \in V \}. \] Recall that \(\ker L \leq V\) and \(\operatorname{im}L \leq W\) and also that some books call \(\ker L\) the nullspace. We call \(\operatorname{Null}(L) \coloneqq \dim \ker L\) the nullity of \(L\) and \(\operatorname{rank}(L) \coloneqq \dim \operatorname{im}L\) the rank of \(L.\)
Definition 2.11 (Petersen 2012, 67) The map \(V_1 \oplus V_2 \to V_1\) given by forgetting the \(V_2\) term, i.e., \[ x+y \mapsto x, \] is linear and called a projection onto \(V_1.\) Equivalently, a linear map \(P: V \to V\) is called a projection if \(P^2 = P\), i.e., \(Px = x\) for all \(x \in \operatorname{im}P.\) In this setup, \(\mathrm{id}-P: V \to V\) is also a projection such that \(V_1 \coloneqq \operatorname{im}P = \ker(\mathrm{id}-P)\), \(V_2 \coloneqq \ker P = \operatorname{im}(\mathrm{id}-P)\), and \(V = V_1 \oplus V_2.\)
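The identities \(P^2 = P\) and \(V = \operatorname{im}P \oplus \ker P\) can be checked on a concrete projection. The construction below builds an orthogonal projection using the standard inner product on \(\mathbb{R}^3\) (an assumption of this sketch; inner products are only introduced later, and any idempotent map would do):

```python
import numpy as np

A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # columns span a plane V1 in R^3
P = A @ np.linalg.inv(A.T @ A) @ A.T                # projection onto V1 = im(P)
Q = np.eye(3) - P                                   # id - P, projection onto V2 = ker(P)
print(np.allclose(P @ P, P), np.allclose(Q @ Q, Q))  # True True: both are idempotent
print(np.allclose(P @ Q, np.zeros((3, 3))))          # True: the two images only meet at 0
```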
Theorem 2.5 If \(L: V \to W\) is a linear transformation and \(Lx = y\), then \[ L^{-1}(y) = x + \ker L \coloneqq \{ x + z: z \in \ker L \}. \] Informally: fibers of linear maps look the same. In particular, \(L\) is one-to-one if and only if \(\ker L\) is the trivial subspace. We call \(x + \ker L\) the coset of \(x\) with respect to \(\ker L\). This construction makes sense if we replace \(\ker L\) with any subspace of \(V\), since any subspace can be realized as the kernel of some linear map.
Example 2.21 Given a differential equation of the form \[ \mathrm{D}f = g, \] where \(\mathrm{D}\) is some linear differential operator and \(g\) is a given function, a common technique is to first find all solutions to the equation \(\mathrm{D}f = 0\) (i.e., \(\ker D\)) and then find a particular function \(f_p\) so that \(\mathrm{D}f_p = g.\) Then the space of all solutions is the coset \(f_p + \ker \mathrm{D}.\)
To see this in action, consider the differential equation \[ f''(t)+f(t)= - \sin t. \] By the arduous process of educated guesses, one can find a particular solution \(f_p(t) = \tfrac{t}{2} \cos t.\) Moreover, by considering the associated characteristic equation, one can show that all real-valued solutions to \(f''(t)+f(t)=0\) are of the form \(a \cos t + b \sin t\) for \(a,b \in \mathbb{R}\). Hence \[ \left\{ \tfrac{t}{2} \cos t + a \cos t + b \sin t : a, b \in \mathbb{R}\right\} \] is the set of all solutions to \(f''(t) + f(t) = -\sin t\). See, e.g., (Logan 2015, 100) for further reading.
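The claimed particular solution, and the fact that adding any kernel element preserves solutions, can be verified symbolically (sympy is our assumption here):

```python
import sympy as sp

t, a, b = sp.symbols('t a b')
fp = t / 2 * sp.cos(t)                               # candidate particular solution
print(sp.simplify(fp.diff(t, 2) + fp + sp.sin(t)))   # 0, so f_p'' + f_p = -sin t

# Any element of the coset f_p + ker D is also a solution.
f = fp + a * sp.cos(t) + b * sp.sin(t)
print(sp.simplify(f.diff(t, 2) + f + sp.sin(t)))     # 0 for all a, b
```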
Remark 2.3. Recall the infamous \(+C\) that calculus instructors are so fond of. They were generously reminding you that solutions to differential equations are really cosets!
Theorem 2.6 (Linear Quotients) (Dummit and Foote 2003, 108; Petersen 2012, 108) Let \(V\) be an \(F\)-vector space with \(U \leq V\). Then the set of cosets \[ V/U \coloneqq \{ x + U: x \in V \} \] is a (well-defined) \(F\)-vector space with the operations \[ (x+U) + (y+U) \coloneqq (x+y) + U \quad \text{ and } \quad \alpha (x+U) \coloneqq (\alpha x)+U. \] We call \(V/U\) the quotient of \(V\) with respect to \(U\) and \(\dim V/U\) the codimension of \(U \leq V\). If we suppose further that \(V\) is finite-dimensional, then we have \[ \dim V/U = \dim V - \dim U, \] i.e., the codimension of \(U\) is the same as the dimension of any complementary subspace to \(U\).
Proof (Sketch). If \(\{v_1,\dots,v_n\}\) is a basis for \(V\) and \(V = U \oplus \operatorname{Span}\{v_{i_1},\dots,v_{i_k}\}\), then show \[ \{ v_{i_1} + U, \dots, v_{i_k} + U \} \] is a basis for \(V/U\).
Theorem 2.7 (Noether’s isomorphism) (Dummit and Foote 2003, 412; Petersen 2012, 109) Let \(L: V \to W\) be a linear map. Then there is a natural isomorphism \[ V/ \ker L \cong \operatorname{im}L. \] Moreover, there are inclusion-respecting bijections: \[ \{ \text{subspaces of $V$ containing $\ker L$} \} \leftrightarrow \{ \text{subspaces of $V/\ker L$} \} \leftrightarrow \{ \text{subspaces of $\operatorname{im}L$} \}. \]
Corollary 2.2 (Rank-nullity theorem) If \(V\) is finite-dimensional and \(L: V \to W\) is linear, then \[ \dim V = \operatorname{Null}(L) + \operatorname{rank}(L). \]
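A concrete instance (sympy assumed, with a matrix chosen for illustration): for a map \(\mathbb{R}^4 \to \mathbb{R}^3\), the rank and nullity must sum to \(4 = \dim\) of the domain.

```python
import sympy as sp

L = sp.Matrix([[1, 2, 0, 1],
               [0, 0, 1, 1],
               [1, 2, 1, 2]])          # third row = first + second, so rank 2
nullity = len(L.nullspace())           # dimension of ker L
print(L.rank(), nullity)               # 2 2
print(L.rank() + nullity == L.cols)    # True: rank + nullity = dim of domain
```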
Corollary 2.3 (Isomorphism criteria) Suppose \(V\) and \(W\) are vector spaces of the same finite dimension and that \(L: V \to W\) is a linear transformation. Then the following are equivalent:
- \(L\) is an isomorphism.
- \(\ker L = \{0\}\).
- \(\operatorname{im}L = W\).
- \(L\) sends a basis of \(V\) to a basis of \(W\).
Corollary 2.4 Two finite-dimensional vector spaces over the same field are isomorphic if and only if they have the same dimension.
Matrices
Definition 2.12 (Bretscher 2013, 187; Petersen 2012, 48) Given a linear transformation \(L: V \to W\) and a pair of ordered bases \(\mathscr{A} = (v_1,\dots,v_n)\) and \(\mathscr{B} = (w_1,\dots,w_m)\) for \(V\) and \(W\), respectively, we associate a matrix (i.e., an \(m \times n\) array of scalars) to \(L\) as follows. For each basis element \(v_j \in V\), we know the image \(Lv_j \in W = \operatorname{Span}(\mathscr{B})\) can be expressed uniquely as a linear combination: \[ Lv_j = \ell_{1,j} w_1 + \ell_{2,j} w_2 + \cdots + \ell_{m,j} w_m, \] with each \(\ell_{i,j} \in F\). For bookkeeping, we arrange these coefficients into columns: \[ {}_{\mathscr{B}}[L]_{\mathscr{A}} \coloneqq \begin{pmatrix} \ell_{1,1} & \cdots & \ell_{1,n} \\ \vdots & \ddots & \vdots \\ \ell_{m,1} & \cdots & \ell_{m,n} \\ \end{pmatrix} \in \operatorname{Mat}_{m \times n}(F). \] When the bases are clear, we might simply write \(M_L\) for this matrix.
For an arbitrary \(x \in V\) expressed in the basis \(\mathscr{A}\), i.e., \(x = \alpha_1 v_1 + \cdots + \alpha_n v_n\), we can compute \[ Lx = (\ell_{1,1} \alpha_1 + \cdots + \ell_{1,n} \alpha_n) w_1 + \cdots + (\ell_{m,1} \alpha_1 + \cdots + \ell_{m,n} \alpha_n) w_m. \] More compactly, writing \((Lx)_i\) for the \(w_i\) coefficient of \(Lx\), we have \[ (Lx)_i = \sum_{j=1}^n \ell_{i,j} \alpha_j. \] If we write a column vector \([x]_\mathscr{A}\) with the coefficient of \(v_j\) in the \(j\)th entry, then we can compute \(L(x)\) using the standard definition of matrices acting on column vectors: \[ [L(x)]_\mathscr{B} = {}_{\mathscr{B}}[L]_{\mathscr{A}} \; [x]_\mathscr{A}. \] Hence, given bases, much of the theory of linear maps can be reduced to that of matrices from \(\operatorname{Mat}_{m \times n}(F)\) acting on column vectors from \(F^n\) (cf. Example 2.7). For example, the rank of a linear map is equal to the rank of its associated matrix, which can be computed using the concrete techniques of row reduction; see (Bretscher 2013, 26; Petersen 2012, 82).
We will most often be interested in maps \(L: V \to V\) with the same basis \(\mathscr{B}\) for the domain and the codomain, where we simply write \([L]_{\mathscr{B}}\) for the associated matrix.
Theorem 2.8 Composition of linear transformations corresponds to matrix multiplication. That is, if \(T: U \to V\) and \(L: V \to W\) are linear maps with \(\mathscr{A}, \mathscr{B},\) and \(\mathscr{C}\) ordered bases for the vector spaces \(U, V,\) and \(W\), respectively, then \[ {}_{\mathscr{C}}[L \circ T]_{\mathscr{A}} = {}_{\mathscr{C}}[L]_{\mathscr{B}} \; {}_{\mathscr{B}}[T]_{\mathscr{A}}. \]
Example 2.22 Consider the operator \(\mathrm{D}\) from Example 2.19 restricted to polynomials of degree less than \(n+1\). Take the ordered basis \((1, x, x^2, \dots, x^n)\). Then \(\mathrm{D}\) is associated to the matrix \[ \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & n \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix}. \] We can compute \(\frac{\mathop{}\!\mathrm{d}}{\mathop{}\!\mathrm{d}x} (x^4-3x^2+7x+2) = 4x^3-6x+7\) by way of matrix multiplication: \[ \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & 0 \\ 0 & 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 0 & 4 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} 2 \\ 7 \\ -3 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 7 \\ -6 \\ 0 \\ 4 \\ 0 \end{pmatrix}. \]
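The computation above can be reproduced numerically (numpy assumed): the matrix of \(\mathrm{D}\) in the basis \((1, x, x^2, x^3, x^4)\) has \(1, 2, 3, 4\) on its superdiagonal, and acts on coefficient vectors.

```python
import numpy as np

D = np.diag([1, 2, 3, 4], k=1)     # 5x5 matrix of D with superdiagonal (1, 2, 3, 4)
p = np.array([2, 7, -3, 0, 1])     # coefficients of 2 + 7x - 3x^2 + 0x^3 + x^4
print(D @ p)                        # [ 7 -6  0  4  0], i.e. 7 - 6x + 4x^3
```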
Theorem 2.9 (Bretscher 2013, 194; Petersen 2012, 48) Let \(V\) be finite-dimensional with \(L: V \to V\) a linear map and consider two ordered bases \(\mathscr{A}\) and \(\mathscr{B} = (v_1,\dots,v_n)\). Then \[ S \; [L]_{\mathscr{B}} = [L]_{\mathscr{A}} \; S \quad \text{ and } \quad [L]_{\mathscr{B}} = S^{-1} \; [L]_{\mathscr{A}} \; S, \] where \(S\) is the invertible matrix whose \(i\)th column is given by \(v_i\) expressed in the basis \(\mathscr{A}\). The latter is known as conjugation by the matrix \(S\).
Example 2.23 Consider complex conjugation, as in Example 2.18. The change of basis matrix from \(\mathscr{A} = \{1+i, 1-i\}\) to \(\mathscr{B} = \{1,i\}\) is given by \[ S = \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & -1/2 \\ \end{pmatrix}, \] since \(\tfrac{1}{2} (1+i) + \tfrac{1}{2} (1-i) = 1\) and \(\tfrac{1}{2} (1+i) + \tfrac{-1}{2} (1-i) = i\). Noting that \(S^{-1} = 2 S\), we have \[ \begin{pmatrix} 1 & 0 \\ 0 & -1 \\ \end{pmatrix} = S^{-1} \begin{pmatrix} 0 & 1 \\ 1 & 0 \\ \end{pmatrix} S. \] This is an example of diagonalizing a matrix, i.e., changing from a less convenient basis into a more convenient basis, to be reviewed in Section 2.3.3.
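The diagonalization above is easy to confirm symbolically (sympy is our assumption; the matrices are exactly those in the example):

```python
import sympy as sp

half = sp.Rational(1, 2)
S = sp.Matrix([[half, half],
               [half, -half]])          # change of basis matrix from the example
M_A = sp.Matrix([[0, 1], [1, 0]])       # conjugation swaps 1+i and 1-i
print(S.inv() == 2 * S)                 # True, as claimed
print(S.inv() * M_A * S)                # Matrix([[1, 0], [0, -1]])
```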
Definition 2.13 If \(V\) and \(W\) are finite-dimensional and \(L: V \to V\) and \(T: W \to W\) are a pair of linear maps, there is an induced map \(L \oplus T: V \oplus W \to V \oplus W\) given by \[ (L \oplus T)(x+y) \coloneqq L(x) + T(y) \] for each \(x \in V\) and \(y \in W\). Given ordered bases for \(V\) and \(W\), so that \(L\) and \(T\) are expressed by the matrices \(M_L\) and \(M_T\), respectively, the map \(L \oplus T\) is represented by the block matrix \[ \begin{pmatrix} M_L & 0 \\ 0 & M_T \end{pmatrix}. \] Block matrices are especially important because they visually depict how the associated linear map respects (or fails to respect) a subspace decomposition.
Example 2.24 In the previous sense, a diagonal matrix \[ \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix} \] can be understood as the direct sum of many scaling transformations: \(\lambda_1 \oplus \lambda_2 \oplus \cdots \oplus \lambda_n\), which each \(\lambda_i\) acts on the subspace spanned by \(e_i\). In this sense, diagonalized matrices are nice because they have been decomposed into many \(1\)-dimensional linear transformations (a.k.a. scaling) that do not interact with one another.
Definition 2.14 There are many equivalent and useful definitions for the determinant of \(A \in \operatorname{Mat}_n(F)\), written \(\det(A) \in F\). In particular, the determinant is the only continuous map \(\operatorname{Mat}_n(\mathbb{R}) \to \mathbb{R}\) satisfying \(\det(\lambda \mathrm{id}) = \lambda^n\) and \[ \det(AB) = \det(A) \det(B) \] for all \(A, B \in \operatorname{Mat}_n(\mathbb{R})\) and \(\lambda \in \mathbb{R}\); there are similar characterizing properties over arbitrary fields. From this one can show, among other things, that \(\det(A) = 0\) if and only if \(A\) is not invertible and \(\det(A^{-1}) = \det(A)^{-1}\) for all invertible matrices.
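A numerical spot-check of the two characterizing properties just stated, \(\det(\lambda\,\mathrm{id}) = \lambda^n\) and \(\det(AB) = \det(A)\det(B)\) (numpy and the random matrices are assumptions of this sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
A, B = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))
# Multiplicativity (up to floating-point error):
print(np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B)))  # True
# Scaling the identity on R^3 scales volume by lambda^3:
lam = 2.5
print(np.isclose(np.linalg.det(lam * np.eye(3)), lam ** 3))                   # True
```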
Another, more geometric way of understanding determinants over \(F = \mathbb{R}\) is in terms of the signed volume of the parallelepiped determined by the columns of the matrix. In other words, the determinant measures how a matrix dilates or contracts unit volume. One can use this formulation to again see that \(\det(A) = 0\) if and only if \(\ker A\) is non-trivial, since a matrix with non-trivial kernel must collapse at least one direction in \(\mathbb{R}^n\) to zero.
Determinants can be computed in a number of ways, in particular the recursive Laplace (cofactor) expansion in terms of a weighted sum of minors (determinants of the submatrices obtained by deleting one row and one column) or via row reduction to triangular form. Lastly, we can speak of the determinant of any linear map \(L: V \to V\) by fixing an ordered basis \(\mathscr{A}\), and computing \(\det([L]_{\mathscr{A}})\). This is well-defined, since choosing a different basis \(\mathscr{B}\) for \(V\) amounts to conjugation and therefore \[ \det([L]_{\mathscr{B}}) = \det(S^{-1} \; [L]_{\mathscr{A}} \; S) = \det(S)^{-1} \det( [L]_{\mathscr{A}}) \det(S) = \det( [L]_{\mathscr{A}}). \]
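The Laplace expansion along the first row can be written out directly as a short recursion (pure Python; the encoding of matrices as nested lists is our own choice for illustration):

```python
# Recursive Laplace expansion along the first row:
# det(M) = sum_j (-1)^j * M[0][j] * det(minor obtained by deleting row 0, column j).
def det(M):
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        sub = [row[:j] + row[j + 1:] for row in M[1:]]  # delete row 0 and column j
        total += (-1) ** j * M[0][j] * det(sub)
    return total

print(det([[1, 2], [3, 4]]))                  # -2
print(det([[2, 1, 0], [1, 3, 1], [0, 1, 2]]))  # 8
```

The recursion costs \(O(n!)\) operations, which is why row reduction is preferred in practice.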
Definition 2.15 (Bretscher 2013, 194; Petersen 2012, 48) The trace of a matrix \(A \in \operatorname{Mat}_n(F)\) is given by the sum of its diagonal entries, i.e., writing \(a_{i,j}\) for the entry of \(A\) in the \(i\)th row and \(j\)th column, \(\operatorname{Tr}(A) \coloneqq a_{1,1} + \cdots + a_{n,n}\). The trace satisfies a cyclic invariance property: \[ \operatorname{Tr}(AB) = \operatorname{Tr}(BA) \] for all \(A, B \in \operatorname{Mat}_n(F)\), as well as linearity. Moreover, the trace is the unique linear map \(\operatorname{Mat}_n(F) \to F\) satisfying the cyclic invariance property and \(\operatorname{Tr}(\mathrm{id}) = n\). The trace of a linear map on a finite-dimensional vector space \(V\) is defined by fixing a basis and computing the trace of the associated matrix; as with determinants, this is independent of the basis choice.
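Both the cyclic property and the resulting basis-independence can be spot-checked numerically (numpy and the random matrices are assumptions of this sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
A, B = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))
# Cyclic invariance: Tr(AB) = Tr(BA).
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))                  # True
# Hence the trace survives conjugation: Tr(S^-1 A S) = Tr(A).
S = rng.normal(size=(3, 3))  # invertible with probability 1
print(np.isclose(np.trace(np.linalg.inv(S) @ A @ S), np.trace(A)))   # True
```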
Definition 2.16 The commutator of two linear operators \(A, B: V \to V\) is defined as \[ [A,B] \coloneqq AB - BA. \] This expression is so named because \(AB = BA\) if and only if \([A,B] = 0\).
Example 2.25 The operator \(T: C^\infty([0,1]) \to C^\infty([0,1])\) given by \((Tf)(t) = t f(t)\) is linear. This map does not commute with \(\mathrm{D}= \frac{\mathop{}\!\mathrm{d}}{\mathop{}\!\mathrm{d}t}\), since \[ (\mathrm{D}T f)(t) = f(t) + t f'(t) \not = t f'(t) = (T \mathrm{D}f)(t). \] Indeed, \([\mathrm{D},T] = \mathrm{id}\). Physicists like to make a big deal about this sort of thing.
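The identity \([\mathrm{D},T] = \mathrm{id}\) can be checked on a generic smooth function symbolically (sympy assumed):

```python
import sympy as sp

t = sp.symbols('t')
f = sp.Function('f')(t)        # an arbitrary (undetermined) smooth function
DT = sp.diff(t * f, t)         # D(Tf) = f + t f' by the product rule
TD = t * sp.diff(f, t)         # T(Df) = t f'
print(sp.simplify(DT - TD))    # f(t): the commutator acts as the identity
```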
Hermitian Inner Products
Coming soon
Orthogonal Complements
Coming soon
Orthonormal Bases
Coming soon
Functionals and Adjoints
Coming soon
Spectral Decomposition
Coming soon
Eigenvalues
Coming soon
Diagonalization
Coming soon
Spectral Theorem
Coming soon