I give up. I'm forced into a matrix review page.

Grammarian note: It is one matrix, two matrices.

So... Here is a quick review.

- Numbers
- Rows
- Columns
- Operations on rows and columns
- Matrices
- Matrix Multiplication
- Magnitudes
- Inverses
- Quadratic Forms
- Positive Definite

I have made a collection of Matrix Identities that I use. Reading them could be helpful in following the otherwise short derivations scattered around this paper.

A single number in matrix notation is called a **scalar**.
It can be looked at as a number, or as a 1 x 1 matrix, or as a one
element row or column.

A **row** (also called a row vector) is just an ordered
collection of elements. For example,

[ a b c ]

is a row.

If you have two rows of the same length, you can add them by adding the corresponding elements in each row. For example:

[ d e f ] + [ g h i ] = [ d+g e+h f+i ]

One can multiply a row by a scalar (number). For example,

2 * [ a b c ] = [ 2a 2b 2c ]

A row may have any number of elements, from one on up.

If Z is a row, Z(i) means the i'th element of that row.
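The row operations above can be sketched in Python with NumPy (my choice of tool here, not something this page uses). Note that NumPy indexes from 0, while the text's Z(i) counts elements from 1:

```python
import numpy as np

# Rows are 1-D arrays.
row1 = np.array([1.0, 2.0, 3.0])
row2 = np.array([4.0, 5.0, 6.0])

# Adding rows adds the corresponding elements.
print(row1 + row2)        # [5. 7. 9.]

# Multiplying a row by a scalar multiplies every element.
print(2 * row1)           # [2. 4. 6.]

# The text's Z(i) is Z[i - 1] here, since NumPy counts from 0.
Z = row1
print(Z[0])               # 1.0  (the text's Z(1))
```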

A **column** (also called a column vector) is just like a row,
except it is arranged vertically. For example:

[ a ]
[ b ]
[ c ]

Columns can be added, or multiplied by a scalar (number) the same way that rows can:

[ a ]   [ g ]   [ a+g ]
[ b ] + [ h ] = [ b+h ]
[ c ]   [ i ]   [ c+i ]

    [ a ]   [ 2a ]
2 * [ b ] = [ 2b ]
    [ c ]   [ 2c ]

A column may have any number of elements, from one on up.

If Y is a column, then Y(i) means the i'th element of the column, counting from the top.

If you have a row and a column of the same length, you can form what's called an
"**inner product**", or "vector product", or a
"**dot product**". The inner product is
the product of the first element of the row with the first element of the
column, plus the product of the second elements, and so on.
For example:

          [ d ]
[ a b c ] [ e ] = [ a*d + b*e + c*f ]   (or just the scalar a*d + b*e + c*f)
          [ f ]

A row and column that have a dot product of zero are called
**Orthogonal vectors**.

There is also another kind of product, the "**outer product**":

[ a ]             [ a*d a*e a*f ]
[ b ] [ d e f ] = [ b*d b*e b*f ]
[ c ]             [ c*d c*e c*f ]
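The two products above can be sketched with NumPy (again my choice, not the page's):

```python
import numpy as np

row = np.array([1.0, 2.0, 3.0])   # plays the role of [ a b c ]
col = np.array([4.0, 5.0, 6.0])   # plays the role of the column [ d e f ]'

# Inner (dot) product: a*d + b*e + c*f, a single scalar.
inner = row @ col
print(inner)                      # 32.0  (1*4 + 2*5 + 3*6)

# Outer product: a column times a row gives a full matrix,
# with element (i, j) equal to col[i] * row[j].
outer = np.outer(col, row)
print(outer.shape)                # (3, 3)
```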

A **matrix** can be looked at as a column of rows, or as
a row of columns.

Examples of matrices (with their sizes):

[ a b c ]   This is a 3 x 3 matrix.
[ d e f ]
[ g h i ]

[ a b c ]   This is a 2 x 3 matrix.
[ d e f ]

[ a b ]     This is a 3 x 2 matrix.
[ c d ]
[ e f ]

[ a ]       This is a 1 x 1 matrix. Or it is a number, your choice.
            (Or it is a scalar.)

[ a b c ]   This is a 1 x 3 matrix, or it is a row, also your choice.

[ a ]       This is a 3 x 1 matrix, or it is a column, whichever way
[ b ]       you want, they are the same thing.
[ c ]

If X is a matrix, X(i,j) means the element at the i'th row, and j'th column of the matrix. So, in the matrix X where X =

[ a b c ]   X(2, 3) = f
[ d e f ]   X(3, 2) = h
[ g h i ]
[ j k l ]

A matrix is called a **square matrix** if it has the same
number of rows as columns.

A **zero matrix** is a matrix where all the elements are
zeros. For example

[ 0 0 0 ]   is a zero matrix.
[ 0 0 0 ]
[ 0 0 0 ]

A zero matrix serves many of the same functions in matrix arithmetic that 0 does in regular arithmetic, and is the additive identity. It is often written as just "0" in matrix equations.

The **identity matrix** is the multiplicative identity, and
is a square matrix that has ones on the diagonal, and zeros everywhere
else. For example:

[ 1 0 0 ]   is an identity matrix.
[ 0 1 0 ]
[ 0 0 1 ]

The identity matrix is usually written "I".

An identity matrix serves many of the same functions in matrix arithmetic that 1 does in regular arithmetic.
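A quick NumPy check (my sketch, not the page's) that 0 and I really do act as the additive and multiplicative identities:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
Z = np.zeros((2, 2))   # the zero matrix "0"
I = np.eye(2)          # the identity matrix "I"

# 0 is the additive identity: A + 0 = A.
print(np.array_equal(A + Z, A))   # True

# I is the multiplicative identity: A*I = I*A = A.
print(np.array_equal(A @ I, A))   # True
print(np.array_equal(I @ A, A))   # True
```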

The **diagonal** of a matrix consists of the elements that have
identical row and column numbers (X(i,i)). For example:

The diagonal of [ a b c ] is a, e, and i.
                [ d e f ]
                [ g h i ]

A **diagonal matrix** is one that has non-zero elements
only on the diagonal.

[ * 0 0 0 ]   Diagonal matrix.
[ 0 * 0 0 ]
[ 0 0 * 0 ]
[ 0 0 0 * ]

A **block diagonal matrix** is like a diagonal matrix,
except that the non-zero elements are arranged in blocks along the
diagonal. Example: (Where * means a non-zero element.)

[ * 0 0 0 0 0 0 0 ]
[ 0 * * 0 0 0 0 0 ]
[ 0 * * 0 0 0 0 0 ]
[ 0 0 0 * * * 0 0 ]   This is a block diagonal matrix.
[ 0 0 0 * * * 0 0 ]
[ 0 0 0 * * * 0 0 ]
[ 0 0 0 0 0 0 * * ]
[ 0 0 0 0 0 0 * * ]

A **Band Matrix** has numbers near the diagonal of the
matrix, and nowhere else. The width of the band is called the
band width of the matrix.

[ * * * 0 0 0 0 0 ]
[ * * * * 0 0 0 0 ]
[ * * * * * 0 0 0 ]
[ 0 * * * * * 0 0 ]   This is a band matrix.
[ 0 0 * * * * * 0 ]
[ 0 0 0 * * * * * ]
[ 0 0 0 0 * * * * ]
[ 0 0 0 0 0 * * * ]

A matrix is called **sparse** if most of the elements in
it are zero.

[ 0 0 0 * 0 0 0 0 ]
[ 0 * 0 0 0 0 * 0 ]
[ 0 0 0 0 * 0 0 0 ]
[ 0 0 * 0 0 0 0 0 ]   This is a sparse matrix.
[ * 0 0 0 0 0 0 0 ]
[ 0 0 0 0 0 0 0 * ]
[ 0 0 * 0 0 * 0 0 ]
[ 0 0 0 * 0 0 0 0 ]

A matrix is called **dense** if it is not sparse.
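The point of noticing sparsity is that you need only store the non-zero elements. A minimal sketch in plain Python (real code would use a library such as scipy.sparse; the names here are mine):

```python
# Store a sparse matrix as a dictionary mapping
# (row, column) -> value, for the non-zero entries only.
sparse = {(0, 3): 5.0, (1, 1): 2.0, (4, 0): 7.0}   # hypothetical entries

def get(matrix, i, j):
    """Element (i, j); entries that were never stored are zero."""
    return matrix.get((i, j), 0.0)

print(get(sparse, 0, 3))   # 5.0
print(get(sparse, 2, 2))   # 0.0  (not stored, so it is zero)
```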

The **transpose of a matrix** "N" (Written N') is just a
matrix "P" such that N(i,j) = P(j,i). For example:

transpose ( [ a b c ] ) = [ a d ]   (The matrix has just been
            [ d e f ]     [ b e ]    reflected across the
                          [ c f ]    diagonal.)

Or with ' notation:

if A = [ a b c ]   then A' = [ a d ]
       [ d e f ]             [ b e ]
                             [ c f ]

If a matrix M has the property that

M'M = I

then the matrix is called an **Orthogonal matrix**.

The transpose of a product (AB)' is the product of the individual transposes in reverse order. (B'A')
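The transpose rules can be checked numerically with NumPy (a sketch of mine, not part of the original page):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # 2 x 3
B = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [0.0, 3.0]])        # 3 x 2

# The text's N' is N.T here: element (i, j) becomes element (j, i).
print(A.T.shape)                             # (3, 2)
print(A.T[2, 1] == A[1, 2])                  # True

# (AB)' = B'A': the transposes multiply in reverse order.
print(np.array_equal((A @ B).T, B.T @ A.T))  # True
```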

The transpose of a row is a column, and the transpose of a column is a row. Example:

[ a b c ]' = [ a ]
             [ b ]
             [ c ]

It is usually easier to write column matrices using transposes as

[a b c]'

than as

[a]
[b]
[c]

If a matrix is equal to its own transpose, it is called a
**symmetric** matrix. (See below)

A number is called a **scalar** when talking about matrices,
and can often be thought of as a 1x1 matrix.

A scalar multiplied by a matrix is a conventional way of indicating that EVERY element of the matrix is multiplied by that number. In more formal terms, the matrix is multiplied by a diagonal matrix where every diagonal element has been set to that scalar.

The transpose of a scalar is just that number.


A **symmetric matrix** is a square matrix equal to its
transpose. For example:

[ a b c d ]   is a symmetric matrix.
[ b e f g ]   (Note that it looks like it was reflected
[ c f h i ]    across the diagonal.)
[ d g i j ]

A 1x1 matrix (just a number) or a scalar is symmetric.

A symmetric matrix has the property that A = A'.

A matrix with entries only on and below the diagonal, or with entries
only on and above the diagonal, is called a (lower, upper)
**triangular matrix**.
If the diagonal in those cases consists only of 1's, then the
matrix is unit triangular. Examples:

upper triangular        lower triangular
[ * * * * * ]           [ * 0 0 0 0 ]
[ 0 * * * * ]           [ * * 0 0 0 ]
[ 0 0 * * * ]           [ * * * 0 0 ]
[ 0 0 0 * * ]           [ * * * * 0 ]
[ 0 0 0 0 * ]           [ * * * * * ]

upper unit triangular   lower unit triangular
[ 1 * * * * ]           [ 1 0 0 0 0 ]
[ 0 1 * * * ]           [ * 1 0 0 0 ]
[ 0 0 1 * * ]           [ * * 1 0 0 ]
[ 0 0 0 1 * ]           [ * * * 1 0 ]
[ 0 0 0 0 1 ]           [ * * * * 1 ]

Matrices of the same size may be added, by making a new matrix of the same size, with elements that just add the corresponding elements from the matrices being added. For example:

[ a b c ] + [ h i j ] = [ a+h b+i c+j ]
[ d e f ]   [ k l m ]   [ d+k e+l f+m ]

A matrix may be multiplied by a scalar. The result is the same as multiplying each element by that scalar. For example:

3 * [ a b c ] = [ 3 0 ] * [ a b c ] = [ 3*a 3*b 3*c ]
    [ d e f ]   [ 0 3 ]   [ d e f ]   [ 3*d 3*e 3*f ]

Multiplication of matrices is not that simple. The number of columns of the first matrix must be equal to the number of rows of the second. The element (I, J) of the new matrix is the dot product of the I'th row of the first matrix and the J'th column of the second. The result of the multiplication has as many rows as the first matrix, and as many columns as the second. Example:

[ a b c ] * [ g h ] = [ a*g+b*i+c*k a*h+b*j+c*l ]
[ d e f ]   [ i j ]   [ d*g+e*i+f*k d*h+e*j+f*l ]
            [ k l ]

which can be looked at in several ways. Element (I,J) can be looked at as the dot product of the I'th row and J'th column:

[a b c] * [g h] = [ [a b c]*[g] | [a b c]*[h] ] = [ a*g+b*i+c*k a*h+b*j+c*l ]
[d e f]   [i j]   [         [i] |         [j] ]   [ d*g+e*i+f*k d*h+e*j+f*l ]
          [k l]   [         [k] |         [l] ]
                  [ ____________+____________ ]
                  [ [d e f]*[g] | [d e f]*[h] ]
                  [         [i] |         [j] ]
                  [         [k] |         [l] ]

Or it can be looked at as a sum of outer products: the product is the sum, over I, of the outer product of the I'th column of the first matrix with the I'th row of the second. (This is not a typographical error... it really is column I of the first matrix outer-multiplied with row I of the second.) For example:

[a b c] * [g h]
[d e f]   [i j]
          [k l]

  = [a]*[g h] + [b]*[i j] + [c]*[k l]
    [d]         [e]         [f]

  = [ a*g a*h ] + [ b*i b*j ] + [ c*k c*l ]
    [ d*g d*h ]   [ e*i e*j ]   [ f*k f*l ]

  = [ a*g+b*i+c*k a*h+b*j+c*l ]
    [ d*g+e*i+f*k d*h+e*j+f*l ]

As you can see in this example, the two definitions give the same result.
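Both views of the multiplication can be checked with NumPy (my sketch, using the same 2x3 and 3x2 shapes as the example above):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])    # 2 x 3
B = np.array([[ 7.0,  8.0],
              [ 9.0, 10.0],
              [11.0, 12.0]])       # 3 x 2

# View 1: element (i, j) is the dot product of
# row i of A with column j of B.
dots = np.array([[A[i] @ B[:, j] for j in range(2)] for i in range(2)])

# View 2: the sum, over k, of the outer product of
# column k of A with row k of B.
outers = sum(np.outer(A[:, k], B[k]) for k in range(3))

# Both agree with the built-in matrix product.
print(np.array_equal(dots, A @ B))     # True
print(np.array_equal(outers, A @ B))   # True
```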

**MATRIX MULTIPLICATION DOES NOT COMMUTE** except in very
special cases. Unlike regular arithmetic, A*B with matrices is not
generally the same as B*A. (The proper term for this is that "matrix
multiplication does not commute.") Because of this,
we have to distinguish between **premultiplying** by M (as in
M*P) and **postmultiplying** by M (as in P*M).
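A small NumPy demonstration (mine, not the page's) that pre- and postmultiplying differ; here B is a permutation matrix, which makes the difference easy to see:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])   # swaps two things when multiplied

# Premultiplying by B swaps the ROWS of A;
# postmultiplying by B swaps the COLUMNS of A.
print(B @ A)   # [[3. 4.], [1. 2.]]  -- rows swapped
print(A @ B)   # [[2. 1.], [4. 3.]]  -- columns swapped
```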

Just as numbers have magnitudes, matrices have magnitudes.
The magnitude of a matrix is called the
"**determinant**". If the determinant of a matrix is 0,
then the matrix is **singular** and has no multiplicative
inverse. I won't even get into how determinants are computed...
it is ugly, and any good book on matrices goes into that in great detail.

What you need to know is that a determinant is positive, negative, or zero,
and if it is zero the matrix has no inverse. A matrix with a zero
determinant is called **singular**.

Geek note: A determinant near zero does not necessarily mean the matrix is near singular. [See Golub and Van Loan for example.]

Side note: Any matrix with two identical rows is singular. Any matrix that has one row that is a multiple of another is singular. Any matrix with a zero row or zero column is singular.

If a matrix called "A" is not singular, there is another matrix [ inverse (A) ] such that A*inverse(A) is an identity matrix. This is very much like regular multiplication, where a*inverse(a) = 1 unless a = 0. I won't go into where we get the inverse from. Any good book on matrices will go into that in great detail, and it is rather ugly.

Geek Note: Often the inverse of matrix A is represented by A', although this can be confused with the use of the single quote for transpose.

What you need to know is that they exist, and they can undo a matrix multiplication. With the matrices we will be dealing with, there are special fast ways of forming them.

The inverse(AB) = inverse(B)*inverse(A). (The inverse of a product is the product of the inverses in reverse order.)
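Both inverse facts can be verified with NumPy (my sketch, with small invertible matrices chosen for the purpose):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
B = np.array([[1.0, 2.0],
              [0.0, 1.0]])

Ainv = np.linalg.inv(A)

# A * inverse(A) = I, the matrix analogue of a * (1/a) = 1.
print(np.allclose(A @ Ainv, np.eye(2)))   # True

# inverse(AB) = inverse(B) * inverse(A): inverses in reverse order.
print(np.allclose(np.linalg.inv(A @ B),
                  np.linalg.inv(B) @ np.linalg.inv(A)))   # True
```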

In general, matrices can be factored into matrices of special forms that are easy to work with. For example, if a matrix factors to ABCD, and A, B, C, and D are easy to invert, then the inverse of the product is just the product of those inverses in reverse order.

A **quadratic form** is a very special multiplication
of matrices. For two matrices A, B, the quadratic form is

A * B * A'

or

A' * B * A

This form shows up all over the descriptions of least squares problems. It is a very important part of the discussion of weights and covariance matrices.

The result of this multiplication is always a symmetric matrix if B is symmetric.

Geek note:

A' * A

is always symmetric, since it can be looked at as

A' * I * A

and the identity matrix is symmetric.
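The symmetry of the quadratic form can be checked with NumPy (my sketch; A is deliberately non-square and non-symmetric, only B is symmetric):

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])   # 2 x 3
B = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])   # 3 x 3, symmetric

# The quadratic form A * B * A'.
Q = A @ B @ A.T

# Q is symmetric whenever B is: Q' = (A B A')' = A B' A' = A B A' = Q.
print(np.allclose(Q, Q.T))                    # True

# A'A is symmetric too (it can be read as A' * I * A).
print(np.allclose(A.T @ A, (A.T @ A).T))      # True
```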

There is a very special property of some matrices. If v is a (column) vector, then the quadratic form v'Bv is a single number. (The matrix multiplications are of sizes (1xN)(NxN)(Nx1), so the result is 1x1.)

If, for any non-zero vector v, v'Bv > 0, then the B matrix
is called **positive definite**.

Geek note: If B is positive definite, it must also be square. (Otherwise the v'Bv multiplication wouldn't have conforming matrices.)

Geek note: A 1 x 1 matrix is positive definite if the only element is positive.

There is a lot of very special mathematics that goes with this, but the short answer is that positive definite matrices are very easy to deal with, and show up all over the place when doing least squares methods.
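A NumPy sketch (mine) of the v'Bv number, plus one practical way to test positive definiteness that the page doesn't cover: a symmetric matrix is positive definite exactly when all its eigenvalues are positive.

```python
import numpy as np

B = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # symmetric

# v'Bv for a particular v is a single number (sizes (1x2)(2x2)(2x1)).
v = np.array([1.0, -1.0])
print(v @ B @ v)                       # 2.0, which is > 0

# All eigenvalues positive <=> positive definite (for symmetric B).
eigenvalues = np.linalg.eigvalsh(B)
print(np.all(eigenvalues > 0))         # True

# And, as promised, positive definite matrices have inverses:
print(np.linalg.det(B) != 0)           # True
```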

The most important properties of positive definite matrices
are that they **have inverses** and are easy to
deal with numerically.

**Unfamiliarity**

A goodly number of forms that one is familiar with from algebra in the real numbers look a lot different in matrix form, because of the limits imposed by not being able to commute multiplication.

Putting something over a common denominator:

With real numbers: a/b + c/d = (ad+cb)/bd

This has hidden assumptions of multiplication commuting:

(a/b)(d/d) equaling (ad/bd) assumes you can rearrange
things to your heart's content.

You do have rules like

/(a/b) = b/a

and

a/b*b/c = a/c

but not many of the other familiar rules.

Because of this some things look unfamiliar when we play with them in matrix form.

For example, with real numbers:

a/b + c/d = a/b ( /(a/b) + /(c/d) ) c/d    [ An identity above. ]
          = a/b ( b/a + d/c ) c/d          [ /(x/y) = //y*/x = y/x ]
          = a ( /b*b/a + /b*d/c ) c/d      [ Left distribution ]
          = a ( /b*b/a*c + /b*d/c*c ) /d   [ Right distribution ]
          = a ( /a*c + /b*d ) /d           [ Canceling ]
          = a ( c/a + d/b ) /d             [ ab = ba ]

But compare that with the matrix equivalent:

a/b + c/d = a/b ( /(a/b) + /(c/d) ) c/d    [ Identity listed above ]
          = a/b ( //b/a + //d/c ) c/d      [ /(x/y) = y/x ]
          = a/b ( b/a + d/c ) c/d          [ //x = x ]
          = a ( /b*b*/a + /b*d/c ) c/d     [ Left distribution ]
          = a ( /b*b*/a*c + /b*d/c*c ) /d  [ Right distribution ]
          = a ( /a*c + /b*d ) /d           [ Cancelation ]

Which doesn't look particularly familiar to most folk...

This page is http://www.cc.utah.edu/~nahaj/math/matrices.html
© Copyright 2003-2010 by John Halleck, All Rights Reserved.
This snapshot was last modified on August 19th, 2011, and the underlying file was last modified on January 21st, 2010.