2 Symmetric matrices

Symmetric matrices have a number of useful properties which we will investigate in this Section.

Task!

Consider the following four matrices

\[ A_1 = \begin{pmatrix} 3 & 1 \\ 4 & 5 \end{pmatrix} \qquad A_2 = \begin{pmatrix} 3 & 1 \\ 1 & 5 \end{pmatrix} \]

\[ A_3 = \begin{pmatrix} 5 & 8 & 7 \\ 1 & 6 & 8 \\ 3 & 4 & 0 \end{pmatrix} \qquad A_4 = \begin{pmatrix} 5 & 8 & 7 \\ 8 & 6 & 4 \\ 7 & 4 & 0 \end{pmatrix} \]

What property do the matrices $A_2$ and $A_4$ possess that $A_1$ and $A_3$ do not?

Matrices $A_2$ and $A_4$ are symmetric across the principal diagonal. In other words, transposing these matrices, i.e. interchanging their rows and columns, does not change them:

\[ A_2^T = A_2 \qquad A_4^T = A_4. \]

This property does not hold for matrices $A_1$ and $A_3$, which are non-symmetric.

Calculating the eigenvalues of an $n \times n$ matrix with real elements involves, in principle at least, solving an $n$th order polynomial equation: a quadratic equation if $n = 2$, a cubic equation if $n = 3$, and so on. As is well known, such equations sometimes have only real solutions, but complex solutions (occurring as complex conjugate pairs) can also arise. This situation can therefore arise with the eigenvalues of matrices.

Task!

Consider the non-symmetric matrix

\[ A = \begin{pmatrix} 2 & 1 \\ -5 & -2 \end{pmatrix} \]

Obtain the eigenvalues of $A$ and show that they form a complex conjugate pair.

The characteristic equation of A is

\[ \det(A - \lambda I) = \begin{vmatrix} 2-\lambda & 1 \\ -5 & -2-\lambda \end{vmatrix} = 0 \]

i.e.

\[ (2-\lambda)(-2-\lambda) + 5 = 0 \quad \text{leading to} \quad \lambda^2 + 1 = 0 \]

giving eigenvalues $\pm i$ which are of course complex conjugates.

In particular any 2 × 2 matrix of the form

\[ A = \begin{pmatrix} a & b \\ -b & a \end{pmatrix} \]

has complex conjugate eigenvalues $a \pm ib$.

A 3 × 3 example of a matrix with some complex eigenvalues is

\[ B = \begin{pmatrix} 1 & -1 & -1 \\ 1 & -1 & 0 \\ 1 & 0 & -1 \end{pmatrix} \]

A straightforward calculation shows that the eigenvalues of B are

$\lambda = -1$ (real), $\lambda = \pm i$ (complex conjugates).
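
As a quick numerical cross-check (an addition to this text, assuming NumPy is available), both examples can be fed to NumPy's general eigenvalue routine:

```python
# Numerical check of the two examples above (not part of the original
# text). numpy.linalg.eigvals handles general, possibly non-symmetric,
# matrices and may return complex eigenvalues.
import numpy as np

A = np.array([[2.0, 1.0],
              [-5.0, -2.0]])
B = np.array([[1.0, -1.0, -1.0],
              [1.0, -1.0, 0.0],
              [1.0, 0.0, -1.0]])

print(np.linalg.eigvals(A))  # expected: i and -i (up to rounding)
print(np.linalg.eigvals(B))  # expected: -1, i and -i, in some order
```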

With symmetric matrices, on the other hand, complex eigenvalues are not possible.

Key Point 6

The eigenvalues of a symmetric matrix with real elements are always real.

The general proof of the result in Key Point 6 is beyond our scope, but a simple proof for symmetric $2 \times 2$ matrices is straightforward.

Let $A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$ be any $2 \times 2$ symmetric matrix, $a$, $b$, $c$ being real numbers.

The characteristic equation for A is

\[ (a-\lambda)(c-\lambda) - b^2 = 0 \quad \text{or, expanding:} \quad \lambda^2 - (a+c)\lambda + ac - b^2 = 0 \]

from which

\[ \lambda = \frac{(a+c) \pm \sqrt{(a+c)^2 - 4ac + 4b^2}}{2} \]

The quantity under the square root sign can be treated as follows:

\[ (a+c)^2 - 4ac + 4b^2 = a^2 + c^2 + 2ac - 4ac + 4b^2 = (a-c)^2 + 4b^2 \]

which can never be negative, and hence $\lambda$ cannot be complex.
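
For readers who like to check algebra by machine, the identity used above can also be verified symbolically; this sketch (an addition to the text) assumes SymPy is available:

```python
# Symbolic check (an addition to the text) that the discriminant of the
# characteristic equation equals (a - c)^2 + 4b^2, which is non-negative
# for real a, b, c, so the square root is real and so are the eigenvalues.
import sympy as sp

a, b, c = sp.symbols('a b c', real=True)
discriminant = (a + c)**2 - 4*a*c + 4*b**2
assert sp.expand(discriminant - ((a - c)**2 + 4*b**2)) == 0
```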

Task!

Obtain the eigenvalues and the eigenvectors of the symmetric 2 × 2 matrix

\[ A = \begin{pmatrix} 4 & -2 \\ -2 & 1 \end{pmatrix} \]

The characteristic equation for A is

\[ (4-\lambda)(1-\lambda) - 4 = 0 \quad \text{or} \quad \lambda^2 - 5\lambda = 0 \]

giving $\lambda = 0$ and $\lambda = 5$, both of which are of course real and also unequal (i.e. distinct). For the larger eigenvalue $\lambda = 5$ the eigenvector $X = \begin{pmatrix} x \\ y \end{pmatrix}$ satisfies

\[ \begin{pmatrix} 4 & -2 \\ -2 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 5x \\ 5y \end{pmatrix} \quad \text{i.e.} \quad -x - 2y = 0, \quad -2x - 4y = 0 \]

Both equations tell us that $x = -2y$, so an eigenvector for $\lambda = 5$ is $X = \begin{pmatrix} -2 \\ 1 \end{pmatrix}$ or any multiple of this. For $\lambda = 0$ the associated eigenvectors satisfy

\[ 4x - 2y = 0 \qquad -2x + y = 0 \]

i.e. $y = 2x$ (from both equations), so an eigenvector is $Y = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$ or any multiple.
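
The same results can be reproduced numerically; the following sketch (an addition to the text) uses NumPy's `eigh`, which is designed for symmetric matrices:

```python
# Numerical cross-check (an addition to the text). numpy.linalg.eigh
# returns real eigenvalues in ascending order together with orthonormal
# eigenvectors (as columns) for a real symmetric matrix.
import numpy as np

A = np.array([[4.0, -2.0],
              [-2.0, 1.0]])
eigenvalues, eigenvectors = np.linalg.eigh(A)
print(eigenvalues)   # expected: [0. 5.]
print(eigenvectors)  # columns proportional to (1, 2) and (-2, 1), up to sign
```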

We now look more closely at the eigenvectors $X$ and $Y$ in the last task. In particular we consider the product $X^T Y$.

Task!

Evaluate $X^T Y$ from the previous task, i.e. where

\[ X = \begin{pmatrix} -2 \\ 1 \end{pmatrix} \qquad Y = \begin{pmatrix} 1 \\ 2 \end{pmatrix} \]

\[ X^T Y = \begin{pmatrix} -2 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \end{pmatrix} = -2 \times 1 + 1 \times 2 = -2 + 2 = 0 \]

$X^T Y = 0$ means that $X$ and $Y$ are orthogonal.

Key Point 7

Two $n \times 1$ column vectors $X$ and $Y$ are orthogonal if $X^T Y = 0$.

Task!

We obtained earlier in Section 22.1 Example 6 the eigenvalues of the matrix

\[ A = \begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix} \]

which, as we now emphasize, is symmetric. We found that the eigenvalues were $2$, $2+\sqrt{2}$, $2-\sqrt{2}$, which are real and distinct. The corresponding eigenvectors were, respectively,

\[ X = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} \qquad Y = \begin{pmatrix} 1 \\ -\sqrt{2} \\ 1 \end{pmatrix} \qquad Z = \begin{pmatrix} 1 \\ \sqrt{2} \\ 1 \end{pmatrix} \]

(or, as usual, any multiple of these).

Show that these three eigenvectors $X$, $Y$, $Z$ are mutually orthogonal.

\[ X^T Y = \begin{pmatrix} 1 & 0 & -1 \end{pmatrix} \begin{pmatrix} 1 \\ -\sqrt{2} \\ 1 \end{pmatrix} = 1 - 1 = 0 \]

\[ Y^T Z = \begin{pmatrix} 1 & -\sqrt{2} & 1 \end{pmatrix} \begin{pmatrix} 1 \\ \sqrt{2} \\ 1 \end{pmatrix} = 1 - 2 + 1 = 0 \]

\[ Z^T X = \begin{pmatrix} 1 & \sqrt{2} & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} = 1 - 1 = 0 \]

verifying the mutual orthogonality of these three eigenvectors.
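
A one-line numerical confirmation (an addition to the text) of these three dot products:

```python
# Check (an addition to the text) that X, Y, Z above are mutually
# orthogonal: all three pairwise dot products should vanish.
import numpy as np

X = np.array([1.0, 0.0, -1.0])
Y = np.array([1.0, -np.sqrt(2), 1.0])
Z = np.array([1.0, np.sqrt(2), 1.0])

print(X @ Y, Y @ Z, Z @ X)  # expected: 0.0 0.0 0.0 (up to rounding)
```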

2.1 General theory

The following proof that eigenvectors corresponding to distinct eigenvalues of a symmetric matrix are orthogonal is straightforward and you are encouraged to follow it through.

Let $A$ be a symmetric $n \times n$ matrix and let $\lambda_1$, $\lambda_2$ be two distinct eigenvalues of $A$, i.e. $\lambda_1 \neq \lambda_2$, with associated eigenvectors $X$, $Y$ respectively. We have seen that $\lambda_1$ and $\lambda_2$ must be real since $A$ is symmetric. Then

\[ A X = \lambda_1 X \qquad A Y = \lambda_2 Y \qquad (1) \]

Transposing the first of these results gives

\[ X^T A^T = \lambda_1 X^T \qquad (2) \]

(Remember that for any two matrices the transpose of a product is the product of the transposes in reverse order: $(AB)^T = B^T A^T$.)

We now multiply both sides of (2) on the right by $Y$ (as well as putting $A^T = A$, since $A$ is symmetric) to give:

\[ X^T A Y = \lambda_1 X^T Y \qquad (3) \]

But, using the second eigenvalue equation of (1), equation (3) becomes

\[ X^T \lambda_2 Y = \lambda_1 X^T Y \]

or, since $\lambda_2$ is just a number,

\[ \lambda_2 X^T Y = \lambda_1 X^T Y \]

Taking all terms to the same side and factorising gives

\[ (\lambda_2 - \lambda_1) X^T Y = 0 \]

from which, since by assumption $\lambda_1 \neq \lambda_2$, we obtain the result

\[ X^T Y = 0 \]

and the orthogonality has been proved.

Key Point 8

The eigenvectors associated with distinct eigenvalues of a symmetric matrix are mutually orthogonal.

The reader familiar with the algebra of vectors will recall that for two vectors whose Cartesian forms are

\[ \underline{a} = a_x \underline{i} + a_y \underline{j} + a_z \underline{k} \qquad \underline{b} = b_x \underline{i} + b_y \underline{j} + b_z \underline{k} \]

the scalar (or dot) product is

\[ \underline{a} \cdot \underline{b} = a_x b_x + a_y b_y + a_z b_z. \]

Furthermore, if $\underline{a}$ and $\underline{b}$ are mutually perpendicular then $\underline{a} \cdot \underline{b} = 0$. (The word 'orthogonal' is sometimes used instead of perpendicular in this case.) Our result, that two column vectors are orthogonal if $X^T Y = 0$, may thus be considered as a generalisation of the 3-dimensional result $\underline{a} \cdot \underline{b} = 0$.

2.2 Diagonalization of symmetric matrices

Recall from our earlier work that

  1. We can always diagonalize a matrix with distinct eigenvalues (whether these are real or complex).
  2. We can sometimes diagonalize a matrix with repeated eigenvalues. (The condition for this to be possible is that any eigenvalue of multiplicity $m$ must have associated with it $m$ linearly independent eigenvectors.)

The situation with symmetric matrices is simpler: any symmetric matrix can be diagonalized. To take the discussion further we first need the concept of an orthogonal matrix.

A square matrix A is said to be orthogonal if its inverse (if it exists) is equal to its transpose:

\[ A^{-1} = A^T \quad \text{or, equivalently,} \quad A A^T = A^T A = I. \]

Example

An important example of an orthogonal matrix is

\[ A = \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix} \]

which arises when we use matrices to describe rotations in a plane.

\[ A A^T = \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix} \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix} = \begin{pmatrix} \cos^2\phi + \sin^2\phi & 0 \\ 0 & \sin^2\phi + \cos^2\phi \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I \]

It is clear that $A^T A = I$ also, so $A$ is indeed orthogonal.
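
A quick numerical sanity check of this example (an addition to the text):

```python
# Check (an addition to the text) that the rotation matrix satisfies
# A A^T = A^T A = I for an arbitrary angle phi.
import numpy as np

phi = 0.7  # any angle will do
A = np.array([[np.cos(phi), np.sin(phi)],
              [-np.sin(phi), np.cos(phi)]])

print(np.allclose(A @ A.T, np.eye(2)))  # True
print(np.allclose(A.T @ A, np.eye(2)))  # True
```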

It can be shown, but we omit the details, that any $2 \times 2$ matrix which is orthogonal can be written in one of the two forms:

\[ \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} \cos\phi & \sin\phi \\ \sin\phi & -\cos\phi \end{pmatrix} \]

If we look closely at either of these matrices we can see that

  1. The two columns are mutually orthogonal, e.g. for the first matrix we have

    \[ \begin{pmatrix} \cos\phi & -\sin\phi \end{pmatrix} \begin{pmatrix} \sin\phi \\ \cos\phi \end{pmatrix} = \cos\phi \sin\phi - \sin\phi \cos\phi = 0 \]

  2. Each column has magnitude 1 (because $\cos^2\phi + \sin^2\phi = 1$)

Although we shall not prove it, these results are necessary and sufficient for a square matrix of any order to be orthogonal.

Key Point 9

A square matrix A is said to be orthogonal if its inverse (if it exists) is equal to its transpose:

\[ A^{-1} = A^T \quad \text{or, equivalently,} \quad A A^T = A^T A = I. \]

A square matrix is orthogonal if and only if its columns are mutually orthogonal and each column has unit magnitude.

Task!

For each of the following matrices verify that the two properties above are satisfied. Then check in both cases that $A A^T = A^T A = I$, i.e. that $A^T = A^{-1}$.

  1. $A = \begin{pmatrix} \sqrt{3}/2 & 1/2 \\ -1/2 & \sqrt{3}/2 \end{pmatrix}$
  2. $A = \begin{pmatrix} 1/\sqrt{2} & 0 & 1/\sqrt{2} \\ 0 & 1 & 0 \\ -1/\sqrt{2} & 0 & 1/\sqrt{2} \end{pmatrix}$
  1. Since $\begin{pmatrix} \sqrt{3}/2 & -1/2 \end{pmatrix} \begin{pmatrix} 1/2 \\ \sqrt{3}/2 \end{pmatrix} = \dfrac{\sqrt{3}}{4} - \dfrac{\sqrt{3}}{4} = 0$, the columns are orthogonal.

    Since $\left(\dfrac{\sqrt{3}}{2}\right)^2 + \left(-\dfrac{1}{2}\right)^2 = \dfrac{3}{4} + \dfrac{1}{4} = 1$ and $\left(\dfrac{1}{2}\right)^2 + \left(\dfrac{\sqrt{3}}{2}\right)^2 = \dfrac{1}{4} + \dfrac{3}{4} = 1$, each column has unit magnitude.

    Straightforward multiplication shows $A A^T = A^T A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I$.

  2. Proceed as in (1). (Both matrices are also checked numerically in the sketch below.)
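
The following NumPy sketch (an addition to the text) checks $A A^T = A^T A = I$ for both matrices at once:

```python
# Check (an addition to the text) that both matrices from this Task
# are orthogonal, i.e. A A^T = A^T A = I.
import numpy as np

A1 = np.array([[np.sqrt(3)/2, 1/2],
               [-1/2, np.sqrt(3)/2]])
A2 = np.array([[1/np.sqrt(2), 0.0, 1/np.sqrt(2)],
               [0.0, 1.0, 0.0],
               [-1/np.sqrt(2), 0.0, 1/np.sqrt(2)]])

for A in (A1, A2):
    n = A.shape[0]
    print(np.allclose(A @ A.T, np.eye(n)),
          np.allclose(A.T @ A, np.eye(n)))  # True True for both
```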

The following is the key result of this Section.

Key Point 10

Any symmetric matrix $A$ can be diagonalized using an orthogonal modal matrix $P$ via the transformation

\[ P^T A P = D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix} \]

It follows that any $n \times n$ symmetric matrix must possess $n$ mutually orthogonal eigenvectors even if some of the eigenvalues are repeated.

It should be clear to the reader that Key Point 10 is a very powerful result for any application that involves diagonalizing a symmetric matrix. Further, if we do need to find the inverse of $P$, then this is a trivial process since $P^{-1} = P^T$ (Key Point 9).
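
In NumPy this whole procedure is readily demonstrated; the sketch below (an addition to the text) applies `eigh` to the symmetric matrix met earlier and confirms both Key Point 10 and $P^{-1} = P^T$:

```python
# Demonstration (an addition to the text): eigh returns an orthogonal
# modal matrix P whose columns are orthonormal eigenvectors, so that
# P^T A P is the diagonal matrix of eigenvalues.
import numpy as np

A = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])  # the symmetric matrix used earlier

eigenvalues, P = np.linalg.eigh(A)
print(np.allclose(P.T @ A @ P, np.diag(eigenvalues)))  # True
print(np.allclose(np.linalg.inv(P), P.T))              # True: P^{-1} = P^T
```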

Task!

The symmetric matrix

\[ A = \begin{pmatrix} 1 & 0 & \sqrt{2} \\ 0 & 2 & 0 \\ \sqrt{2} & 0 & 0 \end{pmatrix} \]

has eigenvalues $2$, $2$, $-1$ (i.e. the eigenvalue 2 is repeated with multiplicity 2).

Associated with the non-repeated eigenvalue $-1$ is an eigenvector:

\[ X = \begin{pmatrix} 1 \\ 0 \\ -\sqrt{2} \end{pmatrix} \quad \text{(or any multiple)} \]

  1. Normalize the eigenvector X :

    Normalizing $X$, which has magnitude $\sqrt{1^2 + (-\sqrt{2})^2} = \sqrt{3}$, gives

    \[ \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 0 \\ -\sqrt{2} \end{pmatrix} = \begin{pmatrix} 1/\sqrt{3} \\ 0 \\ -\sqrt{2}/\sqrt{3} \end{pmatrix} \]

  2. Investigate the eigenvectors associated with the repeated eigenvalue 2:

    The eigenvectors associated with $\lambda = 2$ satisfy $A Y = 2 Y$

    which gives \[ \begin{pmatrix} -1 & 0 & \sqrt{2} \\ 0 & 0 & 0 \\ \sqrt{2} & 0 & -2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \]

    The first and third equations give

    \[ -x + \sqrt{2}\,z = 0 \]

    \[ \sqrt{2}\,x - 2z = 0 \quad \text{i.e.} \quad x = \sqrt{2}\,z \]

    The equations give us no information about $y$, so its value is arbitrary.

    Thus $Y$ has the form $Y = \begin{pmatrix} \sqrt{2}\,\beta \\ \alpha \\ \beta \end{pmatrix}$ where both $\alpha$ and $\beta$ are arbitrary.

A certain amount of care is now required in the choice of $\alpha$ and $\beta$ if we are to find an orthogonal modal matrix to diagonalize $A$.

For any choice

\[ X^T Y = \begin{pmatrix} 1 & 0 & -\sqrt{2} \end{pmatrix} \begin{pmatrix} \sqrt{2}\,\beta \\ \alpha \\ \beta \end{pmatrix} = \sqrt{2}\,\beta - \sqrt{2}\,\beta = 0. \]

So X and Y are orthogonal. (The normalization of X does not affect this.)

However, we also need two orthogonal eigenvectors of the form $\begin{pmatrix} \sqrt{2}\,\beta \\ \alpha \\ \beta \end{pmatrix}$. Two such are

\[ Y^{(1)} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \ \text{(choosing } \beta = 0,\ \alpha = 1 \text{)} \qquad Y^{(2)} = \begin{pmatrix} \sqrt{2} \\ 0 \\ 1 \end{pmatrix} \ \text{(choosing } \alpha = 0,\ \beta = 1 \text{)} \]

After normalization, these become
\[ Y^{(1)} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \qquad Y^{(2)} = \begin{pmatrix} \sqrt{2}/\sqrt{3} \\ 0 \\ 1/\sqrt{3} \end{pmatrix} \]

Hence the matrix
\[ P = \begin{pmatrix} X & Y^{(1)} & Y^{(2)} \end{pmatrix} = \begin{pmatrix} 1/\sqrt{3} & 0 & \sqrt{2}/\sqrt{3} \\ 0 & 1 & 0 \\ -\sqrt{2}/\sqrt{3} & 0 & 1/\sqrt{3} \end{pmatrix} \]

is orthogonal and diagonalizes $A$:

\[ P^T A P = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix} \]
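
The constructed modal matrix can be checked numerically; this sketch (an addition to the text) verifies both properties:

```python
# Check (an addition to the text) that the P built in this Task is
# orthogonal and diagonalizes A, even though the eigenvalue 2 is repeated.
import numpy as np

s2, s3 = np.sqrt(2), np.sqrt(3)
A = np.array([[1.0, 0.0, s2],
              [0.0, 2.0, 0.0],
              [s2, 0.0, 0.0]])
P = np.array([[1/s3, 0.0, s2/s3],
              [0.0, 1.0, 0.0],
              [-s2/s3, 0.0, 1/s3]])

print(np.allclose(P.T @ P, np.eye(3)))                      # P is orthogonal
print(np.allclose(P.T @ A @ P, np.diag([-1.0, 2.0, 2.0])))  # diagonal form
```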

2.3 Hermitian matrices

In some applications, of which quantum mechanics is one, matrices with complex elements arise.

If $A$ is such a matrix then the matrix $\bar{A}^T$ is the conjugate transpose of $A$, i.e. the complex conjugate of each element of $A$ is taken as well as $A$ being transposed. Thus if

\[ A = \begin{pmatrix} 2+i & 2 \\ 3i & 5-2i \end{pmatrix} \quad \text{then} \quad \bar{A}^T = \begin{pmatrix} 2-i & -3i \\ 2 & 5+2i \end{pmatrix} \]

An Hermitian matrix is one satisfying

\[ \bar{A}^T = A \]

The matrix $A$ above is clearly non-Hermitian. Indeed the most obvious feature of an Hermitian matrix is that its diagonal elements must be real. (Can you see why?) Thus

\[ A = \begin{pmatrix} 6 & 4+i \\ 4-i & -2 \end{pmatrix} \]

is Hermitian.

A 3 × 3 example of an Hermitian matrix is

\[ A = \begin{pmatrix} 1 & i & 5-2i \\ -i & 3 & 0 \\ 5+2i & 0 & 2 \end{pmatrix} \]

An Hermitian matrix is in fact a generalization of a symmetric matrix. The key property of an Hermitian matrix is the same as that of a real symmetric matrix: the eigenvalues are always real.
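
As a final numerical illustration (an addition to the text), NumPy's Hermitian eigenvalue routine confirms that the $3 \times 3$ Hermitian matrix above has only real eigenvalues:

```python
# Check (an addition to the text) that the Hermitian example above
# satisfies conj(A)^T = A and has purely real eigenvalues.
import numpy as np

A = np.array([[1, 1j, 5 - 2j],
              [-1j, 3, 0],
              [5 + 2j, 0, 2]])

print(np.allclose(A, A.conj().T))  # True: A is Hermitian
print(np.linalg.eigvalsh(A))       # three real eigenvalues
```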