Vectors and Matrices
Definition
A linear transformation, represented by a square matrix $A$, acts on vectors by moving them in space. It can stretch, compress, rotate, or reflect them, and in general the image of a vector points in a different direction from the original. Among all vectors, however, there are some for which the action of $A$ is particularly simple: the transformation scales them by a constant factor, leaving their direction unchanged. Such vectors are called eigenvectors of $A$, and the corresponding scaling factors are called eigenvalues.
Eigenvectors reveal the intrinsic geometry of a linear transformation, and the collection of eigenvalues encodes information about the matrix that is invariant under a wide class of coordinate changes.
Let $A$ be a square matrix of order $n$ with entries in $\mathbb{R}$ or $\mathbb{C}$. A non-zero vector $\mathbf{v}$ is called an eigenvector of $A$ if there exists a scalar $\lambda$ such that the following equation holds:
\[A \mathbf{v} = \lambda \mathbf{v}\]
The scalar $\lambda$ is called the eigenvalue of $A$ associated with $\mathbf{v}$. The condition requires that $A$ maps $\mathbf{v}$ to a scalar multiple of itself: the vector $\mathbf{v}$ may be stretched or compressed, and its orientation may be reversed if $\lambda$ is negative, but it remains on the same line through the origin. Eigenvectors are the invariant directions of the transformation and eigenvalues are the scaling factors along those directions.
The zero vector is excluded by convention: the equation $A \mathbf{0} = \lambda \mathbf{0}$ is satisfied for every $\lambda$ and carries no information about the matrix.
The following example illustrates this idea for the square matrix:
\[A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}\]
The unit circle is mapped to an ellipse: most vectors change direction under the transformation. The two eigenvectors $\mathbf{v}_{1}$ and $\mathbf{v}_{2}$ are the exception. They remain on the same line through the origin, scaled by $\lambda_{1} = 3$ and $\lambda_{2} = 1$ respectively.
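As a concrete check, the following sketch (assuming NumPy is available) applies $A$ to the two invariant directions, which for this matrix are $(1, 1)$ and $(-1, 1)$:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# The invariant directions of this symmetric matrix.
v1 = np.array([1.0, 1.0])
v2 = np.array([-1.0, 1.0])

print(A @ v1)  # [3. 3.]  -> A v1 = 3 v1, so lambda_1 = 3
print(A @ v2)  # [-1. 1.] -> A v2 = 1 v2, so lambda_2 = 1

# np.linalg.eig recovers the same spectrum numerically.
eigenvalues, _ = np.linalg.eig(A)
print(eigenvalues)  # [3. 1.] (ordering is not guaranteed)
```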
The characteristic equation
Rewriting the eigenvalue equation as $( A - \lambda I ) \mathbf{v} = 0$, where $I$ is the identity matrix of order $n$, we see that a non-zero solution $\mathbf{v}$ exists precisely when the matrix $A - \lambda I$ is singular. The condition for singularity is that its determinant vanishes. The equation
\[\det ( A - \lambda I ) = 0\]
is called the characteristic equation of $A$. Expanding the determinant yields a polynomial of degree $n$ in $\lambda$, known as the characteristic polynomial of $A$. The eigenvalues of $A$ are the roots of this polynomial, and by the fundamental theorem of algebra there are exactly $n$ of them, counted with multiplicity, in $\mathbb{C}$.
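The characteristic polynomial can also be recovered numerically. The sketch below uses NumPy's `np.poly`, which returns the coefficients of the monic polynomial whose roots are the eigenvalues of the matrix; this agrees with $\det(A - \lambda I)$ up to an overall sign:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Coefficients, highest degree first, of the monic polynomial with the
# eigenvalues of A as roots: lambda^2 - 4 lambda + 3.
coeffs = np.poly(A)
print(coeffs)            # [ 1. -4.  3.]
print(np.roots(coeffs))  # [3. 1.] -- the eigenvalues
```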
A matrix with real entries has a characteristic polynomial with real coefficients, but this does not prevent complex roots. Complex eigenvalues of a real matrix always appear in conjugate pairs.
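A quarter-turn rotation matrix, chosen here purely as an illustration, exhibits such a conjugate pair:

```python
import numpy as np

# A quarter-turn rotation: every non-zero real vector changes direction,
# so no real eigenvector can exist and the eigenvalues must be complex.
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])

eigenvalues, _ = np.linalg.eig(R)
print(eigenvalues)  # [0.+1.j 0.-1.j] -- a conjugate pair
```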
Eigenspaces
For each eigenvalue $\lambda_{0}$, the set of all vectors satisfying $A \mathbf{v} = \lambda_{0} \mathbf{v}$ is a subspace of $\mathbb{R}^{n}$ or $\mathbb{C}^{n}$. It coincides with the kernel of $A - \lambda_{0} I$ and is called the eigenspace of $A$ associated with $\lambda_{0}$:
\[E_{\lambda_{0}} = \ker ( A - \lambda_{0} I ) = \{ \mathbf{v} : ( A - \lambda_{0} I ) \mathbf{v} = 0 \}\]
The dimension of $E_{\lambda_{0}}$ is called the geometric multiplicity of $\lambda_{0}$. Separately, the multiplicity of $\lambda_{0}$ as a root of the characteristic polynomial is called the algebraic multiplicity of $\lambda_{0}$. It can be shown that the geometric multiplicity never exceeds the algebraic one; as discussed below, the two coincide for every eigenvalue exactly when the matrix is diagonalizable.
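Numerically, an eigenspace can be computed as a kernel. The sketch below assumes SciPy is available and uses its `null_space` helper, applied to the matrix from the earlier example:

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# The eigenspace E_3 is ker(A - 3I); null_space returns an orthonormal
# basis for the kernel, computed via the SVD.
basis = null_space(A - 3.0 * np.eye(2))
print(basis)           # one column ~ (0.707, 0.707): the direction (1, 1)
print(basis.shape[1])  # 1 -> geometric multiplicity of lambda = 3
```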
Example 1
Consider the following matrix:
\[A = \begin{pmatrix} 3 & 1 \\ 0 & 2 \end{pmatrix}\]
We compute the characteristic polynomial by forming the matrix $A - \lambda I$ and computing its determinant. Since $A - \lambda I$ is upper triangular, its determinant is the product of the diagonal entries:
\[\det ( A - \lambda I ) = ( 3 - \lambda ) ( 2 - \lambda )\]
Setting this expression equal to zero gives $\lambda_{1} = 2$ and $\lambda_{2} = 3$. For $\lambda_{1} = 2$, we solve $( A - 2 I ) \mathbf{v} = 0$. The matrix $A - 2 I$ is:
\[A - 2 I = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}\]
The system yields the single condition $v_{1} + v_{2} = 0$, so $v_{1} = - v_{2}$. Taking $v_{2} = 1$, the eigenspace $E_{2}$ is spanned by:
\[\mathbf{v}_{1} = \begin{pmatrix} -1 \\ 1 \end{pmatrix}\]
For $\lambda_{2} = 3$, the matrix $A - 3 I$ is:
\[A - 3 I = \begin{pmatrix} 0 & 1 \\ 0 & -1 \end{pmatrix}\]
Both rows give the condition $v_{2} = 0$, leaving $v_{1}$ free. Taking $v_{1} = 1$, the eigenspace $E_{3}$ is spanned by:
\[\mathbf{v}_{2} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}\]
The matrix $A$ therefore has eigenvalue $\lambda_{1} = 2$ with eigenvector $( - 1 , 1 )^{T}$, and eigenvalue $\lambda_{2} = 3$ with eigenvector $( 1 , 0 )^{T}$.
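The hand computation can be verified numerically; a minimal sketch with NumPy:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [0.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)  # [3. 2.] (ordering is not guaranteed)

# Each column is a unit-norm eigenvector spanning the same line as the
# hand-computed (1, 0) and (-1, 1).
for lam, v in zip(eigenvalues, eigenvectors.T):
    print(np.allclose(A @ v, lam * v))  # True, True
```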
Example 2
Consider the matrix
\[A = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}\]
The matrix $A - \lambda I$ is block upper triangular, so its determinant is again the product of the diagonal entries. The characteristic polynomial is:
\[p ( \lambda ) = ( 2 - \lambda )^{2} ( 3 - \lambda )\]
Setting $p ( \lambda ) = 0$ gives two eigenvalues: $\lambda_{1} = 2$, with algebraic multiplicity two, and $\lambda_{2} = 3$, with algebraic multiplicity one.
For $\lambda_{2} = 3$, we solve $( A - 3 I ) \mathbf{v} = 0$. The matrix $A - 3 I$ is:
\[A - 3 I = \begin{pmatrix} -1 & 1 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}\]
The second row gives $v_{2} = 0$, and the first row then gives $v_{1} = 0$, leaving $v_{3}$ free. Taking $v_{3} = 1$, the eigenspace $E_{3}$ is spanned by:
\[\mathbf{v}_{1} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}\]
For $\lambda_{1} = 2$, we solve $( A - 2 I ) \mathbf{v} = 0$. The matrix $A - 2 I$ is:
\[A - 2 I = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}\]
The first row gives $v_{2} = 0$ and the third row gives $v_{3} = 0$, while $v_{1}$ remains free. Taking $v_{1} = 1$, the eigenspace $E_{2}$ is one-dimensional, spanned by:
\[\mathbf{v}_{2} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}\]
The geometric multiplicity of $\lambda_{1} = 2$ is therefore one, while its algebraic multiplicity is two. Since these two values differ, the matrix $A$ is not diagonalizable. It possesses only two linearly independent eigenvectors, which is insufficient to form a basis of $\mathbb{R}^{3}$.
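Numerically, the defect is visible in the output of `np.linalg.eig`; a sketch (floating-point details such as ordering and signs may vary):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)  # [2. 2. 3.]

# The columns returned for the repeated eigenvalue 2 are (numerically)
# parallel, so the eigenvector matrix is rank-deficient: A has only two
# independent eigenvectors and cannot be diagonalized.
print(np.linalg.matrix_rank(eigenvectors))  # 2
```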
Linear independence of eigenvectors
Eigenvectors corresponding to distinct eigenvalues are always linearly independent. More precisely, if $\lambda_{1} , \ldots , \lambda_{k}$ are pairwise distinct eigenvalues of $A$ with associated eigenvectors $\mathbf{v}_{1} , \ldots , \mathbf{v}_{k}$, then $\mathbf{v}_{1} , \ldots , \mathbf{v}_{k}$ are linearly independent. The proof proceeds by induction on $k$ and uses the distinctness of the eigenvalues to derive a contradiction from any supposed linear dependence relation.
As a consequence, a square matrix of order $n$ with $n$ distinct eigenvalues always possesses $n$ linearly independent eigenvectors, and therefore admits a basis of eigenvectors.
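A sketch with NumPy, using an illustrative triangular matrix chosen so that its three eigenvalues are visibly distinct:

```python
import numpy as np

# Upper triangular, so the eigenvalues 1, 4, 6 can be read off the
# diagonal; they are pairwise distinct.
A = np.array([[1.0, 2.0, 3.0],
              [0.0, 4.0, 5.0],
              [0.0, 0.0, 6.0]])

_, P = np.linalg.eig(A)

# Three distinct eigenvalues guarantee three independent eigenvectors:
# the matrix with the eigenvectors as columns has full rank.
print(np.linalg.matrix_rank(P))  # 3 -> a basis of R^3
```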
Diagonalization
A matrix $A$ of order $n$ is called diagonalizable if it can be written in the form
\[A = P D P^{-1}\]
where $P$ is an invertible matrix and $D$ is diagonal. The columns of $P$ are eigenvectors of $A$, and the corresponding diagonal entries of $D$ are the associated eigenvalues. This decomposition, when it exists, simplifies many computations substantially. In particular, the $k$-th power of $A$ takes the form:
\[A^{k} = P D^{k} P^{-1}\]
Since raising a diagonal matrix to a power amounts to raising each diagonal entry to that power, this avoids the need to perform $k$ successive matrix multiplications.
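A short NumPy sketch of this shortcut, using the diagonalizable matrix from Example 1:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [0.0, 2.0]])

# Distinct eigenvalues, so A is diagonalizable: A = P D P^{-1}.
eigenvalues, P = np.linalg.eig(A)

# A^10 via the decomposition: only the diagonal entries are powered.
k = 10
A_k = P @ np.diag(eigenvalues ** k) @ np.linalg.inv(P)

# Agrees with k successive multiplications.
print(np.allclose(A_k, np.linalg.matrix_power(A, k)))  # True
```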
A matrix is diagonalizable if and only if, for every eigenvalue, its geometric multiplicity equals its algebraic multiplicity. When this condition fails, the matrix cannot be diagonalized but can be reduced to Jordan canonical form, which is the closest diagonal-like structure available in the general case.
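For completeness, the Jordan form of the defective matrix from Example 2 can be computed symbolically; a sketch assuming SymPy is available:

```python
import sympy as sp

# The defective matrix from Example 2.
A = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 3]])

# jordan_form returns P and J with A = P J P^{-1}; here J contains a
# 2x2 Jordan block for the eigenvalue 2 and a 1x1 block for 3.
P, J = A.jordan_form()
sp.pprint(J)
print(P * J * P.inv() == A)  # True
```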
Trace, determinant and eigenvalues
Let $\lambda_{1} , \lambda_{2} , \ldots , \lambda_{n}$ be the eigenvalues of $A$ counted with algebraic multiplicity. Two classical identities relate them directly to entries of the matrix. The trace of $A$, defined as the sum of its diagonal entries, satisfies:
\[\text{tr} ( A ) = \lambda_{1} + \lambda_{2} + \cdots + \lambda_{n}\]
The determinant of $A$ satisfies:
\[\det ( A ) = \lambda_{1} \cdot \lambda_{2} \cdots \lambda_{n}\]
Both identities follow from the structure of the characteristic polynomial. The second has a notable consequence: a matrix is singular if and only if zero is one of its eigenvalues. Together, these two relations offer a quick consistency check when eigenvalues are computed by hand.
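Both identities are easy to confirm numerically; a sketch with NumPy, using the matrix from Example 2:

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])

eigenvalues = np.linalg.eigvals(A)

# Sum of eigenvalues = trace; product of eigenvalues = determinant.
print(np.isclose(eigenvalues.sum(), np.trace(A)))        # True: 7 = 7
print(np.isclose(eigenvalues.prod(), np.linalg.det(A)))  # True: 12 = 12
```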