Linear Algebra (3)
Polynomials
Complex Conjugate and Absolute Value
Definition 4.2: \(Re(z)\), \(Im(z)\)
Suppose \(z = a + bi\), where \(a\) and \(b\) are real numbers.
- The real part of \(z\), denoted \(Re(z)\), is defined by \(Re(z) = a\).
- The imaginary part of \(z\), denoted \(Im(z)\), is defined by \(Im(z) = b\).
Thus for every complex number \(z\), we have:
\[z = Re(z) + Im(z) i\]
Definition 4.3: Complex Conjugate, \(\bar{z}\), Absolute Value, \(|z|\)
Suppose \(z \in \mathbb{C}\):
- The complex conjugate of \(z \in \mathbb{C}\), denoted \(\bar{z}\), is defined by: \[\bar{z} = Re(z) - Im(z)i\]
- The absolute value of a complex number \(z\), denoted \(|z|\), is defined by: \[|z| = \sqrt{(Re (z))^2 + (Im (z))^2}\]
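As a quick numerical illustration (a minimal sketch using Python's built-in complex type; NumPy is assumed to be available):

```python
import numpy as np

z = 3 + 4j              # z = a + bi with a = 3, b = 4
print(z.real, z.imag)   # Re(z) = 3.0, Im(z) = 4.0
print(np.conj(z))       # conjugate: Re(z) - Im(z)i = (3-4j)
print(abs(z))           # |z| = sqrt(3^2 + 4^2) = 5.0
```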
Theorem 4.5: Properties of Complex Numbers
Suppose \(w, z \in \mathbb{C}\). Then:
- Sum of \(z\) and \(\bar{z}\): \[z + \bar{z} = 2 Re(z)\]
- Difference of \(z\) and \(\bar{z}\): \[z - \bar{z} = 2(Im (z)) i\]
- Product of \(z\) and \(\bar{z}\): \[z \bar{z} = |z|^2\]
- Additivity and Multiplicativity of the Complex Conjugate: \[\overline{w + z} = \bar{w} + \bar{z}, \quad \overline{wz} = \bar{w}\bar{z}\]
- Conjugate of Conjugate: \[\bar{\bar{z}} = z\]
- Real and Imaginary Parts are Bounded by \(|z|\): \[|Re (z)| \leq |z|, \quad |Im(z)| \leq |z|\]
- Absolute Value of the Complex Conjugate: \[|\bar{z}| = |z|\]
- Multiplicativity of Absolute Value: \[|wz| = |w||z|\]
- Triangle Inequality: \[|w + z| \leq |w| + |z|\]
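These identities are easy to spot-check numerically; a minimal sketch (assuming NumPy) with randomly chosen \(w, z \in \mathbb{C}\):

```python
import numpy as np

rng = np.random.default_rng(0)
w, z = rng.normal(size=2) + 1j * rng.normal(size=2)

assert np.isclose(z + np.conj(z), 2 * z.real)               # z + zbar = 2 Re(z)
assert np.isclose(z * np.conj(z), abs(z) ** 2)              # z zbar = |z|^2
assert np.isclose(np.conj(w * z), np.conj(w) * np.conj(z))  # conjugate is multiplicative
assert np.isclose(abs(np.conj(z)), abs(z))                  # |zbar| = |z|
assert abs(w + z) <= abs(w) + abs(z)                        # triangle inequality
```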
Uniqueness of Coefficients for Polynomials
A function \(p: \mathbb{F} \rightarrow \mathbb{F}\) is called a polynomial with coefficients in \(\mathbb{F}\) if there exist \(a_0, ..., a_m \in \mathbb{F}\) s.t:
\[p(z) = a_0 + a_1 z + ... + a_m z^m\]
for all \(z \in \mathbb{F}\)
Theorem 4.7: If a Polynomial is the Zero Function, Then All Coefficients are \(0\)
Suppose \(a_0, ..., a_m \in \mathbb{F}\). If
\[a_0 + a_1 z + ... + a_m z^m = 0\]
for every \(z \in \mathbb{F}\), then \(a_0 = ... = a_m = 0\)
This result shows that the coefficients of a polynomial are uniquely determined.
Recall that if a polynomial \(p\) can be written in the form above with \(a_m \neq 0\), then we say that \(p\) has degree \(m\), and we write \(\deg p = m\). The degree of the \(0\) polynomial is defined to be \(-\infty\).
Theorem 4.8: Division Algorithm for Polynomials
Suppose that \(p, s \in P(\mathbb{F})\), with \(s \neq 0\). Then there exist unique polynomials \(q, r \in P(\mathbb{F})\) s.t:
\[p = sq + r\]
and \(\text{deg } r < \text{deg } s\)
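A minimal sketch of the division algorithm using NumPy's polynomial division (`np.polydiv`, coefficients listed from highest degree down):

```python
import numpy as np

p = [1, 0, 2, 5]         # p(z) = z^3 + 2z + 5
s = [1, -1]              # s(z) = z - 1
q, r = np.polydiv(p, s)
print(q)                 # [1. 1. 3.]  -> q(z) = z^2 + z + 3
print(r)                 # [8.]        -> r(z) = 8, and deg r = 0 < deg s = 1
```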
Definition 4.9: Zero of a Polynomial
A number \(\lambda \in \mathbb{F}\) is called a zero or root of a polynomial \(p \in P(\mathbb{F})\) if:
\[p(\lambda) = 0\]
Definition 4.10: Factor
A polynomial \(s \in P(\mathbb{F})\) is called a factor of \(p \in P(\mathbb{F})\) if there exists a polynomial \(q \in P(\mathbb{F})\) s.t \(p = sq\).
Theorem 4.11: Each Zero of a Polynomial Corresponds to a Degree-1 Factor
Suppose \(p \in P(\mathbb{F})\) and \(\lambda \in \mathbb{F}\). Then \(p(\lambda) = 0\) if and only if there is a polynomial \(q \in P(\mathbb{F})\) s.t:
\[p(z) = (z - \lambda) q(z)\]
for every \(z \in \mathbb{F}\)
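Dividing out a known zero illustrates this correspondence; a small sketch (again assuming NumPy):

```python
import numpy as np

p = [1, -5, 6]                 # p(z) = z^2 - 5z + 6, and p(2) = 0
q, r = np.polydiv(p, [1, -2])  # divide by (z - 2)
print(q, r)                    # q(z) = z - 3, remainder 0, so p(z) = (z - 2)(z - 3)
```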
Theorem 4.12: A Polynomial has at most as many Zeros as Its Degree
Suppose \(p \in P(\mathbb{F})\) is a polynomial with degree \(m \geq 0\). Then \(p\) has at most \(m\) distinct zeros in \(\mathbb{F}\).
Theorem 4.13: Fundamental Theorem of Algebra
Every nonconstant polynomial with complex coefficients has a zero.
Theorem 4.14: Factorization of a Polynomial over \(\mathbb{C}\)
If \(p \in P(\mathbb{C})\) is a nonconstant polynomial, then \(p\) has a unique factorization (up to the order of the factors) of the form:
\[p(z) = c(z - \lambda_1) ... (z - \lambda_m)\]
where \(c, \lambda_1, ..., \lambda_m \in \mathbb{C}\) and \(\lambda_1, ..., \lambda_m\) are the zeros of \(p\).
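Numerically, `np.roots` recovers the \(\lambda_j\) and `np.poly` rebuilds the monic product, so the factorization can be checked directly (a sketch, assuming NumPy). Note that the zeros of this real polynomial also land in a conjugate pair, anticipating theorem 4.15:

```python
import numpy as np

p = [2, -6, 2, -6]        # p(z) = 2z^3 - 6z^2 + 2z - 6 = 2(z - 3)(z - i)(z + i)
lams = np.roots(p)
print(lams)               # zeros: 3, i, -i (a real zero plus a conjugate pair)
print(2 * np.poly(lams))  # c * (z - lam_1)...(z - lam_m) recovers [2, -6, 2, -6]
```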
The failure of the fundamental theorem of algebra for \(\mathbb{R}\) accounts for the differences between operators on real and complex vector spaces.
Theorem 4.15: Polynomials with Real Coefficients Have Zeros in Pairs
Suppose \(p \in P(\mathbb{C})\) is a polynomial with real coefficients. If \(\lambda \in \mathbb{C}\) is a zero of \(p\), then so is \(\bar{\lambda}\).
Proof of Theorem 4.15
Let:
\[p(z) = a_0 + a_1 z + ... + a_m z^m\]
If \(\lambda\) is a root of \(p\), we have:
\[p(\lambda) = a_0 + a_1 \lambda + ... + a_m \lambda^m = 0\]
Taking the conjugate of both sides, and using that the coefficients \(a_0, ..., a_m\) are real (so \(\bar{a_j} = a_j\)) together with the additivity and multiplicativity of the conjugate:
\[\overline{a_0 + a_1 \lambda + ... + a_m \lambda^m} = a_0 + a_1 \bar{\lambda} + ... + a_m \bar{\lambda}^m = 0\]
Hence \(p(\bar{\lambda}) = 0\).
Theorem 4.16: Factorization of a Quadratic Polynomial
Suppose \(b, c \in \mathbb{R}\). Then there is a polynomial factorization of the form:
\[x^2 + bx + c = (x - \lambda_1) (x - \lambda_2)\]
with \(\lambda_1, \lambda_2 \in \mathbb{R}\) if and only if \(b^2 \geq 4c\)
Theorem 4.17: Factorization of a Polynomial over \(\mathbb{R}\)
Suppose \(p \in P(\mathbb{R})\) is a nonconstant polynomial. Then \(p\) has a unique factorization of the form:
\[p(x) = c(x - \lambda_1) ... (x - \lambda_m) (x^2 + b_1 x + c_1 ) ... (x^2 + b_M x + c_M)\]
where \(c, \lambda_1, ..., \lambda_m, b_1, ..., b_M, c_1, ..., c_M \in \mathbb{R}\), with \(b_j^2 < 4c_j\) for each \(j\).
Eigenvalues, Eigenvectors, and Invariant Subspaces
Invariant Subspaces
Let \(T \in L(V)\) and let \(U \subseteq V\) be a subspace. Then \(T|_U\) denotes the restriction of \(T\) to \(U\), i.e. the map \(T\) with its domain restricted to \(U\).
Definition 5.2: Invariant Subspace
Suppose \(T \in L(V)\) is an operator on \(V\). A subspace \(U\) of \(V\) is called invariant under \(T\) if \(u \in U \implies T(u) \in U\).
In other words, \(U\) is invariant under \(T\) if \(T|_U\) is an operator on \(U\).
Suppose \(T \in L(V)\), then the following subspaces are invariant under \(T\):
- \(\{0\}\)
- \(V\)
- \(\text{null } T\)
- \(\text{range } T\)
Eigenvalues and Eigenvectors
Take any \(v \in V\) with \(v \neq 0\) and let \(U\) equal the set of all scalar multiples of \(v\):
\[U = \{\lambda v : \lambda \in \mathbb{F}\} = span(v)\]
Then \(U\) is a 1-dimensional subspace of \(V\). If \(T\) is a linear operator s.t \(T(v) \in U\), then:
\[T(v) = \lambda v\]
For some \(\lambda \in \mathbb{F}\). Conversely, if \(T(v) = \lambda v\) for some \(\lambda \in \mathbb{F}\), then \(U = span(v)\) is a 1-dimensional invariant subspace of \(V\) under \(T\).
Definition 5.5: Eigenvalue
Suppose \(T \in L(V)\). A number \(\lambda \in \mathbb{F}\) is called an eigenvalue of \(T\) if there exists \(v \in V\) s.t \(v \neq 0\) and \(T(v) = \lambda v\).
In other words, \(V\) has a 1-dimensional subspace invariant under \(T\) IFF \(T\) has an eigenvalue.
Theorem 5.6 Equivalent Conditions to be an Eigenvalue
Suppose \(V\) is finite-dimensional, \(T \in L(V)\), and \(\lambda \in \mathbb{F}\). Then the following are equivalent:
- \(\lambda\) is an eigenvalue of \(T\).
- \(T - \lambda I\) is not injective.
- \(T - \lambda I\) is not surjective.
- \(T - \lambda I\) is not invertible.
where \(I \in L(V)\) is the identity operator defined by \(I(v) = v, \; \forall v \in V\).
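Non-invertibility of \(T - \lambda I\) is easy to observe in coordinates; a minimal sketch (assuming NumPy) using the determinant as an invertibility test:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])                 # eigenvalues 2 and 3

print(np.linalg.det(A - 3.0 * np.eye(2)))  # 0.0: T - 3I is not invertible, so 3 is an eigenvalue
print(np.linalg.det(A - 4.0 * np.eye(2)))  # 2.0: T - 4I is invertible, so 4 is not
```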
Definition 5.7: Eigenvector
Suppose \(T \in L(V)\) and \(\lambda \in \mathbb{F}\) is an eigenvalue of \(T\). A vector \(v \in V\) is called an eigenvector of \(T\) corresponding to \(\lambda\) if \(v \neq 0\) and \(T(v) = \lambda v\).
In other words, a vector \(v \in V\) is an eigenvector of \(T\) corresponding to \(\lambda\) IFF \(v \in \text{null } (T - \lambda I)\)
Theorem 5.10: Linearly Independent Eigenvectors
Let \(T \in L(V)\). Suppose \(\lambda_1, ..., \lambda_m\) are distinct eigenvalues of \(T\) and \(v_1, ..., v_m\) are corresponding eigenvectors. Then \(v_1, ..., v_m\) is linearly independent.
Theorem 5.13: Number of Eigenvalues
Suppose \(V\) is finite-dimensional. Then each operator on \(V\) has at most \(\dim V\) distinct eigenvalues.
Definition 5.14: \(T|_U\) and \(T / U\)
Suppose \(T \in L(V)\) and \(U\) is a subspace of \(V\) invariant under \(T\).
The restriction operator \(T|_U \in L(U)\) is defined by: \[T|_U (u) = T(u)\]
for \(u \in U\).
The quotient operator \(T / U \in L(V / U)\) is defined by: \[(T / U) (v + U) = T(v) + U\]
for \(v \in V\)
Eigenvectors and Upper-Triangular Matrices
The main reason that a richer theory exists for operators (which map a vector space into itself) than for more general linear maps is that operators can be raised to powers.
Definition 5.16 \(T^m\)
Suppose \(T \in L(V)\) and \(m\) is a positive integer.
\(T^m\) is defined by: \[T^m = \underbrace{T \cdots T}_{m \text{ times}}\]
\(T^0\) is defined to be the identity operator \(I\) on \(V\).
If \(T\) is invertible with inverse \(T^{-1}\), then \(T^{-m}\) is defined by: \[T^{-m} = (T^{-1})^m\]
\(T^m T^n = T^{m + n}\), \((T^{m})^n = T^{mn}\)
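For a matrix representing \(T\), these identities translate to matrix powers; a quick sketch (assuming NumPy):

```python
import numpy as np

A = np.array([[1, 1],
              [0, 1]])
# T^m T^n = T^(m+n):
assert np.array_equal(np.linalg.matrix_power(A, 2) @ np.linalg.matrix_power(A, 3),
                      np.linalg.matrix_power(A, 5))
# (T^m)^n = T^(mn):
assert np.array_equal(np.linalg.matrix_power(np.linalg.matrix_power(A, 2), 3),
                      np.linalg.matrix_power(A, 6))
```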
Definition 5.17: \(p(T)\)
Suppose \(T \in L(V)\) and \(p \in P(\mathbb{F})\) is a polynomial given by:
\[p(z) = a_0 + a_1 z + a_2 z^2 + ... + a_m z^m\]
for \(z \in \mathbb{F}\). Then, \(p(T)\) is the operator defined by:
\[p(T) = a_0 I + a_1 T + ... + a_m T^m\]
Suppose \(D \in L(P(\mathbb{R}))\) is the differentiation operator defined by \(D(q) = q^{\prime}\) and \(p(x) = 16 - 3 x + 5x^3\). Then:
\[p(D) = 16I - 3 D + 5D^3\]
and
\[(p(D))(q) = 16 q - 3 q^{\prime} + 5 q^{\prime \prime \prime}\]
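This example can be reproduced with NumPy's `poly1d`, whose `deriv` method plays the role of \(D\) (a sketch; the choice \(q(x) = x^4\) is just for illustration):

```python
import numpy as np

q = np.poly1d([1, 0, 0, 0, 0])                     # q(x) = x^4
result = 16 * q - 3 * q.deriv(1) + 5 * q.deriv(3)  # (p(D))(q) = 16q - 3q' + 5q'''
print(result)                                      # 16 x^4 - 12 x^3 + 120 x
```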
If we fix an operator \(T \in L(V)\), then the function \(M: P(\mathbb{F}) \rightarrow L(V)\) defined by \(M(p) = p(T)\) is linear.
Definition 5.19: Product of Polynomials
If \(p, q \in P(\mathbb{F})\), then \(pq \in P(\mathbb{F})\) is the polynomial defined by:
\[(pq)(z) = p(z) q(z)\]
for \(z \in \mathbb{F}\)
Theorem 5.20: Multiplicative Properties of Polynomials
Suppose \(p, q \in P(\mathbb{F})\) and \(T \in L(V)\), then:
- \((pq)(T) = p(T) q(T)\)
- \(p(T)q(T) = q(T)p(T)\)
When expanding a product of polynomials using the distributive property, it does not matter whether the symbol is \(z\) or \(T\).
Theorem 5.21: Operators on Complex Vector Spaces Have an Eigenvalue
Every operator on a finite-dimensional, nonzero, complex vector space has an eigenvalue.
Definition 5.22 Matrix of an Operator, \(M(T)\)
Suppose \(T \in L(V)\) and \(v_1, ..., v_n\) is a basis of \(V\). The matrix of \(T\) w.r.t this basis is the \(n \times n\) matrix:
\[ M(T) = \begin{bmatrix} A_{1, 1} & \cdots & A_{1, n}\\ \vdots & & \vdots\\ A_{n, 1} & \cdots & A_{n, n} \end{bmatrix} \]
whose entries \(A_{j, k}\) are defined by:
\[T(v_k) = A_{1, k} v_1 + ... + A_{n, k} v_n\]
The \(k\)th column of the matrix is formed from the coefficients used to write \(T(v_k)\) as a linear combination of \(v_1, ..., v_n\). If the basis is not clear from the context, then the notation \(M(T, (v_1, ..., v_n))\) is used or standard bases are assumed.
Define \(T \in L(\mathbb{R}^3)\) by \(T(x, y, z) = (2x + y, 5y + 3z, 8z)\), then \[ M(T) = \begin{bmatrix} 2 & 1 & 0\\ 0 & 5 & 3\\ 0 & 0 & 8\\ \end{bmatrix} \]
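The columns of \(M(T)\) are just \(T\) applied to the basis vectors; a minimal sketch verifying the matrix above w.r.t the standard basis (assuming NumPy):

```python
import numpy as np

def T(v):
    x, y, z = v
    return np.array([2*x + y, 5*y + 3*z, 8*z])

# the k-th column of M(T) holds the coefficients of T(e_k) in the standard basis
M = np.column_stack([T(e) for e in np.eye(3)])
print(M)   # [[2. 1. 0.], [0. 5. 3.], [0. 0. 8.]]
```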
Definition 5.24: Diagonal of a Matrix
The diagonal of a square matrix consists of the entries along the line from the upper left corner to the bottom right corner.
Definition 5.25 Upper-Triangular Matrix
A matrix is called upper triangular if all the entries below the diagonal equal \(0\).
Theorem 5.26: Conditions for Upper-Triangular Matrix
Suppose \(T \in L(V)\) and \(v_1, ..., v_n\) is a basis of \(V\). Then the following are equivalent:
- The matrix of \(T\) w.r.t \(v_1, ..., v_n\) is upper triangular.
- \(T(v_j) \in span(v_1, ..., v_j)\) for each \(j = 1, ..., n\).
- \(span(v_1, ..., v_j)\) is invariant under \(T\) for each \(j = 1 , ..., n\).
Proof of Theorem 5.26:
\(1 \Longleftrightarrow 2\) follows directly from the definition of \(M(T)\), and \(3 \rightarrow 2\) is trivial, so we only need to prove \(2 \rightarrow 3\).
Suppose \(2\) holds and fix \(j \in \{1, ..., n\}\) then we have:
\[T(v_1) \in span(v_1) \subseteq span(v_1, ..., v_j)\] \[T(v_2) \in span(v_1, v_2) \subseteq span(v_1, ..., v_j)\] \[\vdots\] \[T(v_j) \in span(v_1, ..., v_j)\]
Let \(v \in span(v_1, ..., v_j)\), then \(v = a_1v_1 + ... + a_j v_j\):
\[T(v) = a_1 T(v_1) + ... + a_j T(v_j) \in span(v_1, ..., v_j)\]
Thus, we can conclude that \(span (v_1, ..., v_j)\) is invariant under \(T\).
Theorem 5.27: Over \(\mathbb{C}\), Every Operator has an Upper-Triangular Matrix
Suppose \(V\) is a finite-dimensional complex vector space and \(T \in L(V)\). Then \(T\) has an upper-triangular matrix w.r.t some basis of \(V\).
This result does not hold for real vector spaces, because an operator on a real vector space is not guaranteed to have an eigenvalue.
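Numerically, the complex Schur decomposition realizes this theorem: it produces an orthonormal basis w.r.t which any operator's matrix is upper triangular. A sketch assuming SciPy is available:

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3))
# complex Schur form: A = Q T Q^H with T upper triangular,
# so T is the matrix of the operator w.r.t the basis given by the columns of Q
T, Q = schur(A, output='complex')
print(np.allclose(T, np.triu(T)))          # True: T is upper triangular
print(np.allclose(Q @ T @ Q.conj().T, A))  # True: same operator, new basis
```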
Theorem 5.30: Determination of Invertibility from Upper-Triangular Matrix
Suppose \(T \in L(V)\) has an upper-triangular matrix w.r.t some basis of \(V\). Then \(T\) is invertible if and only if all the entries on the diagonal of that upper-triangular matrix are nonzero.
Theorem 5.32: Determination of Eigenvalues from Upper-Triangular Matrix
Suppose \(T \in L(V)\) has an upper-triangular matrix w.r.t some basis of \(V\). Then the eigenvalues of \(T\) are precisely the entries on the diagonal of that upper-triangular matrix.
Proof of Theorem 5.32:
Suppose \(v_1, ..., v_n\) is a basis of \(V\) w.r.t which \(T\) has an upper-triangular matrix:
\[ M(T) = \begin{bmatrix} \lambda_1 & & *\\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix} \]
Let \(\lambda \in \mathbb{F}\), then:
\[ M(T - \lambda I) = M(T) - \lambda M(I) = \begin{bmatrix} \lambda_1 - \lambda & & *\\ & \ddots & \\ 0 & & \lambda_n - \lambda \end{bmatrix} \]
\(\lambda\) is an eigenvalue of \(T\) IFF \(T - \lambda I\) is not invertible, which by theorem 5.30 happens IFF at least one of the diagonal entries of \(M(T - \lambda I)\) is zero, i.e. IFF \(\lambda = \lambda_j\) for some \(j\). So the eigenvalues of \(T\) are precisely the diagonal entries.
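A quick numerical check (assuming NumPy): for an upper-triangular matrix, the computed eigenvalues are exactly the diagonal entries, with repeats allowed:

```python
import numpy as np

U = np.array([[2.0, 5.0, -1.0],
              [0.0, 3.0,  4.0],
              [0.0, 0.0,  2.0]])
print(np.linalg.eigvals(U))   # [2. 3. 2.] -- the diagonal entries (up to ordering)
```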
Eigenspaces and Diagonal Matrices
Definition 5.34: Diagonal Matrix
A diagonal matrix is a square matrix that is \(0\) everywhere except possibly along the diagonal.
If an operator has a diagonal matrix with respect to some basis, then the entries along the diagonal are precisely the eigenvalues of the operator.
Definition 5.36: Eigenspace, \(E(\lambda, T)\)
Suppose \(T \in L(V)\) and \(\lambda \in \mathbb{F}\). The eigenspace of \(T\) corresponding to \(\lambda\), denoted \(E(\lambda, T)\), is defined by:
\[E(\lambda, T) = \text{null}(T - \lambda I)\]
In other words, \(E(\lambda, T)\) is the set of all eigenvectors of \(T\) corresponding to \(\lambda\), together with the \(0\) vector, and \(\lambda\) is an eigenvalue of \(T\) IFF \(E(\lambda, T) \neq \{0\}\).
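Computing an eigenspace is therefore a null-space computation; a minimal sketch using `scipy.linalg.null_space` (assuming SciPy):

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[2.0, 0.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])
# E(2, T) = null(T - 2I) is 2-dimensional here
basis = null_space(A - 2 * np.eye(3))
print(basis.shape)   # (3, 2): an orthonormal basis of the eigenspace
```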
Theorem 5.38: Sum of Eigenspaces is a Direct Sum
Suppose \(V\) is finite-dimensional and \(T \in L(V)\). Suppose also that \(\lambda_1, ..., \lambda_m\) are distinct eigenvalues of \(T\). Then:
\[E(\lambda_1, T) + ... + E(\lambda_m, T)\]
is a direct sum. Furthermore,
\[\dim E(\lambda_1, T) + ... + \dim E(\lambda_m, T) \leq \dim V\]
Definition 5.39: Diagonalizable
An operator \(T \in L(V)\) is called diagonalizable if the operator has a diagonal matrix with respect to some basis of \(V\).
Define \(T \in L(\mathbb{R}^2)\) by: \[T(x, y) = (41x + 7y, -20x + 74y)\]
\(T\) is diagonalizable w.r.t the basis \((1, 4), (7, 5)\)
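A quick check of this example (assuming NumPy): changing to the basis \((1, 4), (7, 5)\) does diagonalize \(T\):

```python
import numpy as np

A = np.array([[ 41.0,  7.0],
              [-20.0, 74.0]])
B = np.column_stack([(1.0, 4.0), (7.0, 5.0)])  # basis vectors as columns
print(np.linalg.inv(B) @ A @ B)                # diag(69, 46): T(1,4) = 69(1,4), T(7,5) = 46(7,5)
```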
Theorem 5.41: Conditions Equivalent to Diagonalizability
Suppose \(V\) is finite-dimensional and \(T \in L(V)\). Let \(\lambda_1, ..., \lambda_m\) denote the distinct eigenvalues of \(T\). Then the following are equivalent:
- \(T\) is diagonalizable.
- \(V\) has a basis consisting of eigenvectors of \(T\).
- There exist \(1\)-dimensional subspaces \(U_1, ..., U_n\) of \(V\), each invariant under \(T\), such that: \[V = U_1 \oplus ... \oplus U_n\]
- \(V = E(\lambda_1, T) \oplus ... \oplus E(\lambda_m, T)\)
- \(\dim V = \dim E(\lambda_1, T) + ... + \dim E(\lambda_m, T)\)
Proof of Theorem 5.41
\(1 \Longleftrightarrow 2\) is straightforward: if \(v_1, ..., v_n\) is a basis w.r.t which \(M(T)\) is diagonal, then \(T(v_j) = \lambda_j v_j\) for each \(j\), so \(v_1, ..., v_n\) is a basis of \(V\) consisting of eigenvectors of \(T\); conversely, such a basis gives a diagonal matrix.
Suppose \(2\) holds, so \(v_1, ..., v_n\) is a basis of \(V\) consisting of eigenvectors of \(T\). Let \(U_1 = span(v_1), ..., U_n = span(v_n)\); since \(v_1, ..., v_n\) are eigenvectors of \(T\), each \(U_j\) is a \(1\)-dimensional invariant subspace. Since \(v_1, ..., v_n\) is a basis of \(V\), each vector \(v \in V\) can be written uniquely as a sum of vectors \(u_j \in U_j\):
\[v = a_1v_1 + ... + a_n v_n = u_1 + ... + u_n\]
Thus, we have \(V = U_1 \oplus ... \oplus U_n\), \(2 \rightarrow 3\).
Suppose \(3\) holds. Each \(U_j\) is a \(1\)-dimensional subspace invariant under \(T\), so we can write \(U_j = span(v_j)\), where each \(v_j\) is an eigenvector of \(T\). Since \(V = U_1 \oplus ... \oplus U_n\), we have:
\[v = u_1 + ... + u_n = a_1 v_1 + ... + a_n v_n, \; \forall v \in V\]
Thus \(v_1, ..., v_n\) is a basis of \(V\) consisting of eigenvectors of \(T\), which implies \(2\).
Now suppose \(2\) holds again, so we have a basis \(v_1, ..., v_n\) consisting of eigenvectors of \(T\). Hence, every vector of \(V\) can be written as:
\[v = a_1 v_1 + ... + a_n v_n = u_1 + ... + u_n\]
where \(u_j \in \text{null}(T - \lambda_j I) = E(\lambda_j, T)\) collects the terms belonging to the eigenvalue \(\lambda_j\), \(j \in \{1, ..., m\}\). Thus, we have:
\[V = E(\lambda_1, T) + ... + E(\lambda_m, T)\]
By theorem 5.38, we have:
\[V = E(\lambda_1, T) \oplus ... \oplus E(\lambda_m, T)\]
Thus, \(2 \rightarrow 4\).
By theorem 5.38, we can easily see that \(4 \rightarrow 5\).
Finally, suppose \(5\) holds, so that:
\[\dim V = \dim E(\lambda_1, T) + ... + \dim E(\lambda_m, T)\]
Choose a basis of each \(E(\lambda_j, T)\) and put all these bases together to form a list \((v_1, ..., v_n)\) of eigenvectors of \(T\), where \(n = \dim V\). To show that they are linearly independent, suppose that:
\[a_1 v_1 + ... + a_n v_n = 0\]
For each \(j = 1, ..., m\), let \(u_j\) denote the sum of all the terms \(a_kv_k\) s.t \(v_k \in E(\lambda_j, T)\). Thus, each \(u_j\) is in \(E(\lambda_j, T)\), and:
\[u_1 + ... + u_m = 0\]
Since eigenvectors corresponding to distinct eigenvalues are linearly independent (theorem 5.10), each \(u_j = 0\). Each \(u_j\) is a linear combination of the chosen basis of \(E(\lambda_j, T)\), so all the corresponding \(a_k = 0\). Thus \(v_1, ..., v_n\) is linearly independent, hence a basis of \(V\) consisting of eigenvectors, and \(5 \rightarrow 2\).
Theorem 5.44: Enough Eigenvalues Implies Diagonalizability
If \(T \in L(V)\) has \(\dim V\) distinct eigenvalues, then \(T\) is diagonalizable.
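A minimal sketch (assuming NumPy): an operator on a \(2\)-dimensional space with \(2\) distinct eigenvalues, diagonalized by its eigenvector basis:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 3.0]])              # dim V = 2 distinct eigenvalues: 1 and 3
vals, vecs = np.linalg.eig(A)           # columns of vecs are eigenvectors
print(vals)                             # [1. 3.]
print(np.linalg.inv(vecs) @ A @ vecs)   # diag(1, 3): A is diagonalizable
```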