Matrices

Review of Matrices

*** THIS PAGE IS DEFUNCT *** It has been moved to http://www.cc.utah.edu/~nahaj/math/matrices.html

THIS PAGE IS NOT BEING UPDATED ANY MORE; the version at http://www.cc.utah.edu/~nahaj/math/matrices.html *IS* still being maintained, corrected, and expanded. I recommend you update your bookmarks.


I give up. You can't read the paper without a knowledge of the basics of matrices. Most cavers have long since forgotten everything they knew about them, or have never heard of them to begin with.

Grammarian note: It is one Matrix, two matrices.

So... Here is a quick review.

I have made a collection of Matrix Identities that I use. Reading them could be helpful in following the otherwise short derivations scattered around this paper.

Numbers

A single number in matrix notation is called a scalar. It can be looked at as a number, or as a 1 x 1 matrix, or as a one element row or column.

Rows

A row (also called a row vector) is just an ordered collection of elements. For example,

[ a  b  c ]

is a row.

If you have two rows of the same length, you can add them by adding the corresponding elements in each row. For example:

[ d e f ] + [ g h i ] = [ d+g  e+h  f+i ]

One can multiply a row by a scalar (number). For example,

2 * [ a b c ] = [ 2a 2b 2c ]

A row may have any number of elements, from one on up.

If Z is a row, Z(i) means the i'th element of that row.

Columns

A column (also called a column vector) is just like a row, except it is arranged vertically. For example:

[ a ]
[ b ]
[ c ]

Columns can be added or multiplied by a scalar (number) in the same way that rows can:

[ a ]     [ g ]     [ a+g ]
[ b ]  +  [ h ]  =  [ b+h ]
[ c ]     [ i ]     [ c+i ]

    [ a ]    [ 2a ]
2 * [ b ] =  [ 2b ]
    [ c ]    [ 2c ]

A column may have any number of elements, from one on up.

If Y is a column, then Y(i) means the i'th element of the column, counting from the top.
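
If you want to play with these on a computer, here is a minimal sketch of the row and column operations above, using Python with NumPy. (The choice of NumPy is mine, not the paper's; any matrix package would do.)

    import numpy as np

    row = np.array([1.0, 2.0, 3.0])            # the row [ 1 2 3 ]
    col = np.array([[4.0], [5.0], [6.0]])      # the column [ 4 5 6 ]'

    print(row + np.array([7.0, 8.0, 9.0]))     # add corresponding elements
    print(2 * row)                             # multiply a row by a scalar
    print(row[1])                              # the 2nd element (NumPy counts from 0)
    print(col[2, 0])                           # the 3rd element of the column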

Operations on rows and columns

If you have a row and a column of the same length, you can form what's called an "inner product", "scalar product", or "dot product". The inner product is the product of the first element of the row with the first element of the column, plus the product of the second elements, and so on. For example:

[ a b c ] [ d ] =   a*d + b*e + c*f  (or)  [ a*d + b*e + c*f ]
          [ e ]
          [ f ]

A row and column that have a dot product of zero are called orthogonal vectors.

There is also another kind of product, the "outer product":

[ a ] [ d e f ] =  [ a*d  a*e  a*f ]
[ b ]              [ b*d  b*e  b*f ]
[ c ]              [ c*d  c*e  c*f ]
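
A sketch of both products in NumPy (again my choice of tool, not the paper's):

    import numpy as np

    row = np.array([1.0, 2.0, 3.0])
    col = np.array([4.0, 5.0, 6.0])

    print(np.dot(row, col))               # inner product: 1*4 + 2*5 + 3*6 = 32
    print(np.outer(col, row))             # outer product: a 3 x 3 matrix
    print(np.dot(row, [-2.0, 1.0, 0.0]))  # 0, so these two vectors are orthogonal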

Matrices

A matrix can be looked at as a column of rows, or as a row of columns.

Examples of matrices (with their sizes):

[ a b c ]   This is a 3 x 3 matrix.
[ d e f ]
[ g h i ]

[ a b c ]   This is a 2 x 3 matrix.
[ d e f ]

[ a b ]     This is a 3 x 2 matrix.
[ c d ]
[ e f ]

[ a ]      This is a 1 x 1 matrix.  Or it is a number, your choice.
           (Or it is a scalar)

[ a b c ]  This is a 1 x 3 matrix, or it is a row, also your choice.

[ a ]      This is a 3 x 1 matrix, or it is a column, whichever way you
[ b ]      want; they are the same thing.
[ c ]

If X is a matrix, X(i,j) means the element at the i'th row, and j'th column of the matrix.

A matrix is called a square matrix if it has the same number of rows as columns.

A zero matrix is a matrix where all the elements are zeros. For example:

   [ 0 0 0 ] is a zero matrix.
   [ 0 0 0 ]
   [ 0 0 0 ]

A zero matrix serves many of the same functions in matrix arithmetic that 0 does in regular arithmetic.

The identity matrix is a matrix that has ones on the diagonal, and zeros everywhere else. For example:

[ 1 0 0 ]  is an identity matrix.
[ 0 1 0 ]
[ 0 0 1 ]

The identity matrix is usually written "I".

An identity matrix serves many of the same functions in matrix arithmetic that 1 does in regular arithmetic.
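
In NumPy (my assumed tool, as before) the zero and identity matrices behave just as described:

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
    Z = np.zeros((2, 2))    # a zero matrix
    I = np.eye(2)           # an identity matrix

    print(A + Z)            # adding a zero matrix changes nothing
    print(I @ A)            # multiplying by the identity changes nothing
    print(A @ I)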

The diagonal of a matrix consists of the elements that have identical row and column numbers (X(i,i)). For example:

The diagonal of [ a b c ] is a, e, and i.
                [ d e f ]
                [ g h i ]

A diagonal matrix is one that has non-zero elements only on the diagonal. (In the examples below, * marks a non-zero element.)

[ * 0 0 0 ]  Diagonal matrix.
[ 0 * 0 0 ]
[ 0 0 * 0 ]
[ 0 0 0 * ]

A block diagonal matrix is like a diagonal matrix, except that the non-zero elements are arranged in square blocks along the diagonal. For example:

[ * 0 0 0 0 0 0 0 ]
[ 0 * * 0 0 0 0 0 ]
[ 0 * * 0 0 0 0 0 ]
[ 0 0 0 * * * 0 0 ]  This is a block diagonal matrix.
[ 0 0 0 * * * 0 0 ]
[ 0 0 0 * * * 0 0 ]
[ 0 0 0 0 0 0 * * ]
[ 0 0 0 0 0 0 * * ]

A band matrix has non-zero elements only near the diagonal of the matrix, and nowhere else. The width of the band is called the bandwidth of the matrix.

[ * * * 0 0 0 0 0 ]
[ * * * * 0 0 0 0 ]
[ * * * * * 0 0 0 ]
[ 0 * * * * * 0 0 ]  This is a band matrix.
[ 0 0 * * * * * 0 ]
[ 0 0 0 * * * * * ]
[ 0 0 0 0 * * * * ]
[ 0 0 0 0 0 * * * ]

A matrix is called sparse if most of the elements in it are zero.

[ 0 0 0 * 0 0 0 0 ]
[ 0 * 0 0 0 0 * 0 ]
[ 0 0 0 0 * 0 0 0 ]
[ 0 0 * 0 0 0 0 0 ]  This is a sparse matrix.
[ * 0 0 0 0 0 0 0 ]
[ 0 0 0 0 0 0 0 * ]
[ 0 0 * 0 0 * 0 0 ]
[ 0 0 0 * 0 0 0 0 ]

A matrix is called dense if it is not sparse.

The transpose of a matrix "N" (Written N') is just a matrix "P" such that N(i,j) = P(j,i). For example:

transpose ( [ a b c ] ) =  [ a  d ]   (The matrix has just been
            [ d e f ]      [ b  e ]   reflected across the
                           [ c  f ]   diagonal.)
Or with ' notation:
if A = [ a b c ]  then A' = [ a d ]
       [ d e f ]            [ b e ]
                            [ c f ]

If a matrix M has the property that

M'M = I

then the matrix is called an orthogonal matrix.

The transpose of a product (AB)' is the product of the individual transposes in reverse order. (B'A')

The transpose of a row is a column, and the transpose of a column is a row. Example:

[ a  b  c]' = [ a ]
              [ b ]
              [ c ]

It is usually easier to write column matrices using transposes as

[a b c]'

than as

[a]
[b]
[c]

If a matrix is equal to its own transpose, it is called a symmetric matrix. (See below)

The transpose of a number is just a number.

(AB)' = B'A' in general. (A transpose of a product is the product of the transposes in reverse order)
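
A sketch of the transpose rules in NumPy (my assumption, as before); ".T" plays the role of the single quote:

    import numpy as np

    A = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])   # 2 x 3
    B = np.array([[1.0, 0.0],
                  [2.0, 1.0],
                  [0.0, 3.0]])        # 3 x 2

    print(A.T)                                 # 3 x 2: reflected across the diagonal
    print(np.allclose((A @ B).T, B.T @ A.T))   # True: (AB)' = B'A'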

A symmetric matrix is a square matrix equal to its transpose. For example:

[ a b c d ]  is a symmetric matrix.
[ b e f g ]     (Note that it looks like it was reflected
[ c f h i ]     across the diagonal.)
[ d g i j ]

A 1x1 matrix (just a number) or a scalar is symmetric.

A symmetric matrix has the property that A = A'.
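
A quick NumPy check (my assumed tool) that a symmetric matrix equals its transpose:

    import numpy as np

    S = np.array([[1.0, 2.0, 3.0],
                  [2.0, 4.0, 5.0],
                  [3.0, 5.0, 6.0]])

    print(np.allclose(S, S.T))    # True: S equals its own transpose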

A matrix with non-zero entries only on and below the diagonal is called a lower triangular matrix; one with non-zero entries only on and above the diagonal is called an upper triangular matrix. If the diagonal in those cases consists only of 1's, then the matrix is unit triangular. Examples:

upper triangular            lower triangular
 [ * * * * * ]               [ * 0 0 0 0 ]
 [ 0 * * * * ]               [ * * 0 0 0 ]
 [ 0 0 * * * ]               [ * * * 0 0 ]
 [ 0 0 0 * * ]               [ * * * * 0 ]
 [ 0 0 0 0 * ]               [ * * * * * ]

upper unit triangular     lower unit triangular
 [ 1 * * * * ]               [ 1 0 0 0 0 ]
 [ 0 1 * * * ]               [ * 1 0 0 0 ]
 [ 0 0 1 * * ]               [ * * 1 0 0 ]
 [ 0 0 0 1 * ]               [ * * * 1 0 ]
 [ 0 0 0 0 1 ]               [ * * * * 1 ]
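
NumPy (still my assumption) can pull out these triangular shapes directly:

    import numpy as np

    M = np.arange(1.0, 17.0).reshape(4, 4)
    print(np.triu(M))   # keep the diagonal and everything above: upper triangular
    print(np.tril(M))   # keep the diagonal and everything below: lower triangular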

Matrices of the same size may be added by forming a new matrix of the same size, each element of which is the sum of the corresponding elements of the matrices being added. For example:

[ a b c ] + [ h i j ] = [ a+h b+i c+j ]
[ d e f ]   [ k l m ]   [ d+k e+l f+m ]

A matrix may be multiplied by a scalar. The result is the same as multiplying each element by that scalar. For example:

3 * [ a b c ] = [ 3*a 3*b 3*c ]
    [ d e f ]   [ 3*d 3*e 3*f ]

Matrix Multiplication

Multiplication of matrices is not that simple. The number of columns of the first matrix must be equal to the number of rows of the second. The element (I, J) of the new matrix is the dot product of the I'th row of the first matrix and the J'th column of the second. The result of the multiplication has as many rows as the first matrix, and as many columns as the second. Example:

[ a b c ]  *  [ g h ] = [ a*g+b*i+c*k   a*h+b*j+c*l ]
[ d e f ]     [ i j ]   [ d*g+e*i+f*k   d*h+e*j+f*l ]
              [ k l ]

which can be looked at in several ways. Element (I,J) can be looked at as the dot product of the I'th row and J'th column:

[a b c] * [g h] = [ [a b c]*[g]   [a b c]*[h] ] = [ a*g+b*i+c*k   a*h+b*j+c*l ]
[d e f]   [i j]   [         [i]           [j] ]   [ d*g+e*i+f*k   d*h+e*j+f*l ]
          [k l]   [         [k]           [l] ]
                  [                           ]
                  [ [d e f]*[g]   [d e f]*[h] ]
                  [         [i]           [j] ]
                  [         [k]           [l] ]

Or it can be looked at as a sum of outer products: for each I, form the outer product of the I'th column of the first matrix with the I'th row of the second matrix, and add these up. (This is not a typographical error... it really is column I of the first matrix outer-producted with row I of the second.) For example:

[a b c] * [g h]
[d e f]   [i j]
          [k l]
=
[a]*[g h]   + [b]*[i j]   + [c]*[k l]
[d]           [e]           [f]
 =
[ a*g a*h ] + [ b*i b*j ] + [ c*k c*l ]  =  [ a*g+b*i+c*k   a*h+b*j+c*l ]
[ d*g d*h ]   [ e*i e*j ]   [ f*k f*l ]     [ d*g+e*i+f*k   d*h+e*j+f*l ]

As you can see in this example, the two definitions give the same result.

MATRIX MULTIPLICATION DOES NOT COMMUTE except in very special cases. Unlike regular arithmetic, A*B with matrices is not the same as B*A. (The proper term for this is that "matrix multiplication does not commute".) Because of this, we have to distinguish between premultiplying by M (as in M*P) and postmultiplying by M (as in P*M).
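
A sketch of both views of matrix multiplication, and of non-commutativity, in NumPy (my assumed tool):

    import numpy as np

    A = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])   # 2 x 3
    B = np.array([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0]])        # 3 x 2

    print(A @ B)   # 2 x 2: rows of A dotted with columns of B
    print(B @ A)   # 3 x 3: a completely different result, so A*B is not B*A

    # The "sum of outer products" view gives the same 2 x 2 answer:
    print(sum(np.outer(A[:, i], B[i, :]) for i in range(3)))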

Magnitudes

Just as numbers can have magnitudes independent of sign (absolute value), matrices can have magnitudes. The magnitude of a matrix is called the "determinant". I won't even get into how they are computed here... any good book on matrices goes into that in great detail.

What you need to know is that they are positive, negative, or zero, and if they are zero there is no matrix inverse. If the determinant is zero, the matrix is called singular.

Geek note: That doesn't actually mean that if the determinant is near zero the matrix is near singular. [See Golub and Van Loan for example.]

Side note: Any matrix with two identical rows is singular. Any matrix that has one row that is a multiple of another is singular. Any matrix with a zero row or zero column is singular.
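
For example, in NumPy (my assumed tool):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 3.0]])
    S = np.array([[1.0, 2.0],
                  [2.0, 4.0]])   # the second row is twice the first

    print(np.linalg.det(A))   # 5.0: non-zero, so A has an inverse
    print(np.linalg.det(S))   # 0.0 (up to roundoff): S is singular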

Inverses

If a matrix called "A" is not singular, there is another matrix [ inverse (A) ] such that A*inverse(A) is an identity matrix. This is very much like regular multiplication, where a*inverse(a) = 1 unless a = 0. I won't go into where we get the inverse from. Any good book on matrices will go into that in great detail, and it is rather ugly.

Geek Note: The inverse of a matrix A is usually written A^-1; this paper writes it as inverse(A), since the single quote has already been used for the transpose.

What you need to know is that they exist, and they can undo a matrix multiplication. With the matrices we will be dealing with in this paper there are special fast ways of forming them.

The inverse(AB) = inverse(B)*inverse(A). (The inverse of a product is the product of the inverses in reverse order.)

In general, matrices can be factored into matrices of special forms that are easy to invert. If a matrix factors to ABCD, and A, B, C, and D are easy to invert, then the inverse of the product is just the product of those inverses in reverse order.
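
A sketch of the inverse rules in NumPy (my assumed tool):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 3.0]])
    B = np.array([[1.0, 1.0],
                  [0.0, 2.0]])

    print(A @ np.linalg.inv(A))       # the identity matrix (up to roundoff)
    print(np.allclose(np.linalg.inv(A @ B),
                      np.linalg.inv(B) @ np.linalg.inv(A)))   # True: reverse order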

Quadratic Forms

A quadratic form is a very special multiplication of matrices. For two matrices A, B, the quadratic form is

A' * B * A

This form shows up all over the descriptions of the least squares problem. It is a very important part of the discussion of weights and covariance matrices.

This multiplication always yields a symmetric matrix if B is symmetric.

Geek note:

A' * A

is always symmetric, since it can be looked at as

A' * I * A

and the identity matrix is symmetric.
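
A quick NumPy check (my assumed tool) of the quadratic form:

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0]])        # 3 x 2
    B = np.array([[2.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])   # symmetric 3 x 3

    Q = A.T @ B @ A                   # the quadratic form A' * B * A
    print(np.allclose(Q, Q.T))        # True: the result is symmetric
    print(np.allclose(A.T @ A, (A.T @ A).T))   # True: A'A is always symmetric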

Positive Definite

There is a very special property of some matrices. If v is a (column) vector, then the quadratic form v'Bv is a single number. (The matrix multiplications are of sizes (1xN)(NxN)(Nx1), so the result is 1x1.) If, for any non-zero vector v, v'Bv > 0, then the B matrix is called positive definite.

Geek note: If B is positive definite, it must also be square. (Otherwise the v'Bv multiplication wouldn't have conforming matrices.)

Geek note: A 1 x 1 matrix is positive definite if the element is positive.

There is a lot of very special mathematics that goes with this, but the short answer is that positive definite matrices are very easy to find the inverses of, and are very easy to deal with, and show up all over the place when doing least squares methods.

The most important properties of positive definite matrices are that they are symmetric and have inverses. Both of these properties are VERY important below.
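
A small NumPy sketch (my assumed tool); for a symmetric matrix, one practical test of positive definiteness is that all of its eigenvalues are positive:

    import numpy as np

    B = np.array([[4.0, 1.0],
                  [1.0, 3.0]])    # symmetric
    v = np.array([2.0, -1.0])

    print(v @ B @ v)                           # 15.0: a single positive number
    print(np.all(np.linalg.eigvalsh(B) > 0))   # True: B is positive definite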

Unfamiliarity

A goodly number of forms that one is familiar with from algebra on real numbers look a lot different in matrix form, because of the limits imposed by not being able to commute multiplication.

(In this section \x means the inverse of x, and a\b is shorthand for a*\b, that is, a times the inverse of b.)

Putting something over a common denominator:
  With real numbers:  a\b + c\d = (ad+cb)\bd
  This has hidden assumptions of multiplication commuting:
       (a\b)(d\d) equaling (ad\bd) assumes you can rearrange
                                   things to your heart's content.
You do have rules like
\(a\b) = b\a
and
a\b*b\c = a\c
and
x + y = x( \x + \y )y        (when x and y have inverses)
but not many of the other familiar rules.

Because of this some things look unfamiliar when we play with them
in matrix form.

For example:
a\b + c\d =        a\b   +     c\d
          = a\b( \(a\b)  +   \(c\d) )c\d   [ An identity above ]
          = a\b(   b\a   +     d\c  )c\d   [ \(x\y) = \\y*\x = y\x ]
          = a  (\b*b\a   + \b*d\c   )c\d   [ Left distribution ]
          = a  (\b*b\a*c + \b*d\c*c ) \d   [ Right distribution ]
          = a  (    \a*c + \b*d     ) \d   [ Canceling ]

But compare that with the form:
a\b + \d*c = a\b ( \(a\b)    +  \(\d*c)  ) \d*c     [ Identity listed above ]
           = a\b (   b\a     +    \c*d   ) \d*c     [ \(x\y) = y\x ]
           = a   (\b*b*\a    + \b*\c*d   ) \d*c     [ Left distribution ]
           = a   (\b*b*\a*\d + \b*\c*d*\d)    c     [ Right distribution ]
           = a   (     \a*\d + \b*\c     )    c     [ Cancellation ]
           = a   (    \(d*a) + \(c*b)    )    c     [ \x*\y = \(yx) ]

None of which look particularly familiar to most folk...
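
A quick numerical sanity check of the first derivation above, using NumPy (my assumed tool) with random matrices, which are almost surely invertible:

    import numpy as np
    from numpy.linalg import inv

    rng = np.random.default_rng(0)
    a, b, c, d = (rng.standard_normal((3, 3)) for _ in range(4))

    left  = a @ inv(b) + c @ inv(d)                  # a\b + c\d
    right = a @ (inv(a) @ c + inv(b) @ d) @ inv(d)   # a( \a*c + \b*d )\d
    print(np.allclose(left, right))                  # True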

This page is http://www.cc.utah.edu/~nahaj/cave/survey/intro/matrices.html
© Copyright 2003 by John Halleck, All Rights Reserved.
This snapshot was last modified on November 6th, 2003