Block Matrix Givens Rotation


This page just covers some basics of a specific generalization of the Givens Rotation.




The obvious restrictions on inverses existing, and on matrices being conformable, apply to all of the following...


Consider the block matrix

G = [  A   AB' ] and the block vector V = [ X ]
    [ -CB  C   ]                          [ Y ]

Part One: Can the matrix zero out a component of the vector?

[  A   AB' ] [ X ] =  [  AX+AB'Y ]
[ -CB  C   ] [ Y ]    [ -CBX+CY  ]

For this to be equal to

[  AX+AB'Y ]
[     0    ]

requires that -CBX+CY = 0, i.e. C(Y-BX) = 0, so an obvious choice of constraint is: B = Y /X

So the answer is yes, it can, if we set B to Y /X.
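As a quick numeric check (a NumPy sketch; the sizes and values are only illustrative, and C is deliberately arbitrary, since the choice of B alone does the zeroing):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2
X = rng.standard_normal((n, n)) + 4 * np.eye(n)  # keep X well conditioned (invertible)
Y = rng.standard_normal((m, n))
C = rng.standard_normal((m, m))                  # C is arbitrary for this check

B = Y @ np.linalg.inv(X)                         # B = Y /X

# The lower block of the product is -CBX + CY, which should vanish:
lower = -C @ B @ X + C @ Y
print(np.allclose(lower, 0))  # True
```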

Part Two: Is the matrix orthogonal? (That is, is GG' the identity?)

G' = [ A'    (-CB)' ] = [ A'   -B'C' ]
     [ (AB')'  C'   ]   [ BA'    C'  ]

GG' =
[  A   AB'] [ A'   -B'C' ]
[ -CB  C  ] [ BA'    C'  ]
  =  [ AA'+AB'BA'  -AB'C'+AB'C' ] = [ AA'+AB'BA'      0     ]
     [ -CBA'+CBA'   CBB'C'+CC'  ]   [     0      CC'+CBB'C' ]

So... G *IS* orthogonal *if* AA' + AB'BA' = I *and* CC' + CBB'C' = I

If  AA'+AB'BA' = I
then A(I+B'B)A' = I  
and  I+B'B = /A /A' = /(A'A)
so /(I+B'B) = A'A

So we could find an acceptable A by grabbing a factor A with A'A = /(I + B'B) (the upper triangular Cholesky factor, for example).

Similar reasoning applies to C, but with BB' in place of B'B.

So, it is orthogonal if we set A so that A'A = /(I+B'B) and C so that C'C = /(I+BB').

(The traditional Cholesky factor of each, for example.)

Final result

If we have [ X ] and want to zero the Y component with
           [ Y ] 

an orthogonal block matrix transformation, then we:

set B to  Y * /X
set A to the right hand factor of some Z'Z factorization of /(I + B'B ) (Such as the Cholesky factor)
set C to the right hand factor of some Z'Z factorization of /(I + B B') (Such as the Cholesky factor)

The matrix we want is:
[  A   AB' ]
[ -CB   C  ]

Note that Y * /X has to be conformable, but Y need not be square. (I.e., we can zero a whole column block if need be.)
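Here is the whole recipe as a NumPy sketch (the sizes are illustrative, Y is deliberately non-square, `/M` is written with `np.linalg.inv`, and the "right hand factor of a Z'Z factorization" is taken as the transpose of NumPy's lower triangular Cholesky factor):

```python
import numpy as np

def right_cholesky(M):
    """Return Z with Z'Z = M (the right hand factor of a Z'Z factorization)."""
    return np.linalg.cholesky(M).T        # L L' = M, so (L')' (L') = M

rng = np.random.default_rng(1)
n, m = 3, 4                               # Y is m-by-n: non-square, zeroing a column block
X = rng.standard_normal((n, n)) + 4 * np.eye(n)
Y = rng.standard_normal((m, n))

B = Y @ np.linalg.inv(X)                                 # B = Y /X
A = right_cholesky(np.linalg.inv(np.eye(n) + B.T @ B))   # A'A = /(I + B'B)
C = right_cholesky(np.linalg.inv(np.eye(m) + B @ B.T))   # C'C = /(I + BB')

G = np.block([[A, A @ B.T],
              [-C @ B, C]])

V = np.vstack([X, Y])
GV = G @ V
print(np.allclose(G @ G.T, np.eye(n + m)))   # True: G is orthogonal
print(np.allclose(GV[n:], 0))                # True: the Y block is zeroed
```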


My environment has a lot of upper triangular submatrices, which was the reason for choosing the upper triangular Cholesky factor as opposed to other matrix square roots.

Note that while the "Cholesky" factorization traditionally means factoring into U'U (or equivalently LL'), my requirement is only that the transpose of one factor times that factor reproduce the original. So I can use L'L or UU' forms, among others. This allows one to pick whichever sparsity pattern suits the problem at hand. Note that there are TWO factorizations in the setup (A and C), and the form of each may be chosen independently of the other.

Note that Y does not have to be square. Therefore this can be used (just like Householder transformations) to zero entire columns at a time, and the fraction of zeros in the resulting transformation tends to 1/2 as the problem gets large.

Putting it into traditional form

(A quick summary of all the identities needed is at:

Consider A = Cholesky( /(I+B'B) ) 
[We don't much care WHICH factorization in the family we grab...
 as long as [Cholesky (X)]'[Cholesky (X)] = X; ]

Since B = Y / X, this is

   A =    Cholesky( /(I + /X' Y' Y /X) )

which is [by the identities I+/A B C /D  <=> /A (AD + BC) /D
                        and     (X Y)'   <=>    Y' X'
                        and      /X X    <=>    I          ]

   A =  Cholesky( /( /X' (X' X + Y' Y) /X ) )

and distributing in the inverse gives
     [ by  /(ABC) = /C /B /A, and //A = A ]:

   A =  Cholesky( X /( X' X + Y' Y) X' )

This form should be familiar to those who do traditional Givens rotations.
In the same position, the scalar Givens has x/sqrt(x*x+y*y), which is
sqrt( xx / (xx + yy) ), which compares well in form with:
chol( X /(X'X + Y'Y) X' )
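The end result of the chain of identities can be checked numerically; a NumPy sketch with illustrative X and Y:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 3, 2
X = rng.standard_normal((n, n)) + 4 * np.eye(n)  # well conditioned, invertible
Y = rng.standard_normal((m, n))
B = Y @ np.linalg.inv(X)                         # B = Y /X

lhs = np.linalg.inv(np.eye(n) + B.T @ B)          # /(I + B'B)
rhs = X @ np.linalg.inv(X.T @ X + Y.T @ Y) @ X.T  # X /(X'X + Y'Y) X'
print(np.allclose(lhs, rhs))  # True
```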

Exploiting Sparsity of the transform

The Y matrix that we are zeroing above need not be square. This allows the zeroing of a column... When used this way, the resulting block Givens matrix has a good deal of sparsity.

If Y is a column vector and X is a scalar, the resulting (partitioned) block Givens transform looks like (using "|" and "_" to show the partitioning):

[  * | * * * * * * * ... ]
[ _______________________]
[  * | * * * * * * * ... ]
[  * | 0 * * * * * * ... ]
[  * | 0 0 * * * * * ... ]
[  * | 0 0 0 * * * * ... ]
[  * | 0 0 0 0 * * * ... ]
[  * | 0 0 0 0 0 * * ... ]
[  * | 0 0 0 0 0 0 * ... ]
[  . | . . . . . . . ... ]
[  . | . . . . . . . ... ]
[  . | . . . . . . . ... ]

The corresponding Householder matrix has, of course, no zero elements if Y has none.

But this is not that useful an observation. In exchange for this sparsity you end up having to do a massive Cholesky when forming the transformation. And the corresponding Householder matrix, while full, is nearly trivial to compute.

The obvious numerical stability issue is that forming bb' or b'b is [famously] numerically troublesome. However, we need never actually compute them in order to compute Cholesky(I+bb') or Cholesky(I+b'b). If, instead, one starts by doing a QR factorization of b' or b, the resulting R factor is already a Cholesky factor of bb' or b'b. And there are algorithms to stably update a Cholesky factor to what it would be after adding the identity and refactoring. [TODO: Need references.]
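One standard trick along these lines (a NumPy sketch; this is one option, not necessarily the update algorithm alluded to above) is to QR-factor B stacked on top of an identity block: since [B; I]'[B; I] = B'B + I, the R factor is a Cholesky-family factor of (I + B'B), and B'B is never formed.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 4, 3
B = rng.standard_normal((m, n))

# Stack B over I: the R factor of the QR of the stack satisfies
# R'R = B'B + I, so R factors (I + B'B) without B'B ever being formed.
R = np.linalg.qr(np.vstack([B, np.eye(n)]), mode='r')

print(np.allclose(R.T @ R, np.eye(n) + B.T @ B))  # True
```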

One step QR factorization for Least Squares (for some problems)

There are some classes of problems where the block Givens allows a one-step reduction of the problem to a QR factorization. I'll include my favorite example here...

For example, surveying problems that have neither triangulation nor trilateration [i.e. the average survey a land surveyor would do] have the property [assuming the survey is connected and contains at least one control point] that block Givens can directly produce the QR factorization in one step.

For this class of problem you have N points, and M observations of the [linearized] problem such that each observation is of the form "From point A to point B the delta is D" (where D is typically a column 3-vector for 3d). I'll ignore the weighted problem; my QR paper shows how to cast the weighted problem into this form. [TODO: This duplicates some introduction from my QR factorization paper... which is probably a better place to get it than here.]

Geometric constraints of a typical survey require that some point be tied down. (Sometimes just assumed to be zero, sometimes an explicit control point.) Surveys are also constrained to be connected. Together these two constraints mean that the survey has a spanning tree. [Usually a very large number of different spanning trees, in fact... choice of which observations are included in the spanning tree can make a *big* difference in both final accuracy and amount of work when solving the problem.] [TODO: characterize my best heuristics in a more formal form.]

Note that there are LOTS of special cases in the redundant rows that can be reduced by orthogonalization methods near trivially before hitting the problem with the full transformation. [Example: any two observations that tie together the same two points.] [TODO: and ordering the redundant rows also affects final accuracy and work load]

One can number the points in a "spanning tree order": Start at a control point, and then pick a point not yet numbered but reachable in one step from a numbered point. That's your next numbered point. Add it and repeat until all points are numbered. A reverse spanning tree order is obtained by reversing this numbering. (Or just forming it reversed.)

If you consider the links that form the spanning tree (with the points numbered in a reverse spanning tree order), and sort them into descending order, you have an upper triangular matrix. And all the remaining observations form a collection of redundant rows.
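A small sketch of that ordering (Python; the survey graph, point names, and 1-D coordinates are made up for illustration). Each non-control point gets the row of its spanning-tree observation; with reverse spanning-tree numbering every parent has a higher number than its child, so the tree rows form an upper triangular matrix once the control point's column is dropped:

```python
import numpy as np

# A made-up survey: 'P0' is the control point; parent[] gives the spanning
# tree, one tree observation "from parent to child" per non-control point.
parent  = {'P1': 'P0', 'P2': 'P1', 'P3': 'P1', 'P4': 'P3'}

# Forward spanning tree order: control point first, then points as reached.
forward = ['P0', 'P1', 'P2', 'P3', 'P4']
# Reverse spanning tree order: reverse the numbering (control numbered last).
number  = {p: i for i, p in enumerate(reversed(forward))}   # P4->0 ... P0->4

n = len(forward) - 1         # columns: all points except the control point
M = np.zeros((n, n))
for child, par in parent.items():
    row = number[child]      # row i belongs to the tree edge into point i
    M[row, number[child]] = 1.0       # +1 at the child...
    if par != 'P0':                   # ...the control point's column is dropped
        M[row, number[par]] = -1.0    # ...-1 at the parent (a higher column)

print(np.allclose(M, np.triu(M)))  # True: upper triangular
```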

Now that the problem is reduced to an upper triangular matrix and redundant rows, the left side of the matrix has a form like:

[ X ]
[ Y ]

with X upper triangular and Y the redundant rows.
Note: For a typical land survey problem, the redundant rows generally have only two elements. This rather extreme sparseness can be exploited when forming B'B below.

Block Givens can be directly applied to this to zero out the Y, in one step, and the Least Squares problem then reduces to solving the resulting triangular system.

Since the second block row of the Block Givens just zeros out the Y, we need not actually compute it. We can just compute the first block row, apply it, drop the Y block [which we know is zero], and solve the resulting exactly determined system.
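Concretely (a NumPy sketch with illustrative sizes; a full matrix stands in for the triangular part): applying just the top block row [ A  AB' ] to [ X ; Y ] gives the reduced system R = AX + AB'Y, and since the full transform is orthogonal, R'R = X'X + Y'Y, i.e. R carries all the least-squares information.

```python
import numpy as np

def right_cholesky(M):
    """Return Z with Z'Z = M."""
    return np.linalg.cholesky(M).T

rng = np.random.default_rng(4)
n, m = 3, 5
X = rng.standard_normal((n, n)) + 4 * np.eye(n)   # stands in for the triangular part
Y = rng.standard_normal((m, n))                   # the redundant rows

B = Y @ np.linalg.inv(X)                                 # B = Y /X
A = right_cholesky(np.linalg.inv(np.eye(n) + B.T @ B))   # A'A = /(I + B'B)

# Only the first block row of the transform is applied; the second block
# row's output is known to be zero and is simply ignored.
R = A @ X + A @ B.T @ Y        # = [ A  AB' ] [ X ; Y ]

# Orthogonality of the full transform means R preserves the normal matrix:
print(np.allclose(R.T @ R, X.T @ X + Y.T @ Y))   # True
```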

Since X is upper triangular (truly, and not just block triangular, in the survey case), computing B is fairly easy. And instead of computing (I+B'B) directly, we can observe that using the Cholesky to compute B makes B start out upper triangular, and the computation of (I+B'B) can be viewed as a Cholesky factor update [a somewhat degenerate update by Carlson's algorithm] and can be computed with relatively low overhead.

I believe that there are transformations in the block Givens family that can clear Y and preserve X as upper triangular if it starts upper triangular. [TODO: I need to add my notes that there is a family in general.] [TODO: I need to finish proofs to show that this property is preserved with the right restrictions.] For those transformations, the transformation *IS* the Q needed for a QR transformation, and the QR factorization of the usual Least Squares solution by orthogonalization methods is done in one step.

Disclaimer: Of course, the individual Cholesky factorizations of expressions (of the form I+bb' or I+b'b) needed to form the rotation probably suffer from some of the kinds of numerical problems that a Normal Equations solution does... [TODO: One should probably be able to use F. Incertus's block matrix generalization of the Moler-Morrison algorithm (or maybe even mine) to stably compute the needed factors of I+bb' or I+b'b without actually having to form bb' or b'b, after doing a QR factorization of b' or b]


This page is
© Copyright 2003-2010 by John Halleck, All Rights Reserved.
This snapshot was last modified on August 19th, 2011