Eigenvector algorithms for symmetric matrices: Introduction

My main aim in this series of posts is to describe the kernel — or the essential idea — behind some of the simple (and not-so-simple) eigenvector algorithms. If you’re manipulating or mining datasets, chances are you’ll be dealing with matrices a lot. In fact, if you’re starting out with matrix operations of any sort, I highly recommend following Professor Gilbert Strang’s lectures on Linear Algebra, particularly if your math is a bit rusty.

I have several reasons for writing this series. My chief motivation for trying to understand these algorithms stemmed from trying to do PCA (Principal Components Analysis) on a medium-sized dataset (20,000 samples, 56 dimensions). I felt (and still feel) pretty uncomfortable about calling LAPACK routines and walking away with the output without trying to understand what goes on inside the code I just called. Of course, one cannot really dive into the thick of things without understanding some of the basics: in my case, after watching a couple of the lectures, I began to wish that I had better mathematics teachers in school.
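To make the connection concrete, here is a minimal sketch of what that PCA computation looks like in NumPy. The `pca` function and its parameters are my own illustration, not any library's API; the key point is that `numpy.linalg.eigh` hands the symmetric covariance matrix off to a LAPACK symmetric eigensolver — exactly the kind of black box this series sets out to open up.

```python
import numpy as np

def pca(data, n_components):
    """Project `data` onto its top principal components.

    The heavy lifting happens inside np.linalg.eigh, which calls a
    LAPACK symmetric eigensolver under the hood.
    """
    # Centre the data so the covariance is taken about the mean.
    centred = data - data.mean(axis=0)
    # Symmetric (d x d) covariance matrix of the columns.
    cov = np.cov(centred, rowvar=False)
    # eigh returns eigenvalues of a symmetric matrix in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Keep the eigenvectors belonging to the largest eigenvalues.
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]
    # Project the centred data onto those principal directions.
    return centred @ components
```

Calling `np.linalg.eigh` is the "walking away with the output" step: the projections it produces are uncorrelated, but nothing about the call tells you *how* the eigenvectors were found.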

There are generally a few hurdles that someone returning to a half-remembered subject must overcome. The first of these is notation. If you’re reading a paper, or any piece of mathematical text, then unless you are somewhat familiar with the notation, your eyes tend to sweep right past the formulae and identities; yet those form the very heart of what you’re reading!

However, notation can be learned. A more vexing issue at times (not always) is the fact that most texts are written with a very ‘technical’ audience in mind. By ‘technical’, I mean mathematicians, graduate/undergraduate/PhD math students, and so on. The result is that a lot of things are either taken for granted, or are introduced in a fashion which might not make sense to a newcomer at all. The most frequent symptom of this (at least for me) is the phrase “It is elementary to show that:” followed by some identity. Hello! How is this elementary?
In many such cases, there are two alternatives.

One: keep browsing for more papers or expositions on the same subject, hoping to find one that sheds more light on the derivation of the aforementioned ‘elementary’ identity. Two, the more rewarding route: try to prove it yourself. There are no shortcuts in mathematics, indeed.

These are the very reasons I intend to introduce the core ideas behind some of these eigenvector algorithms, in the hope that it helps someone who may be in a situation similar to mine…now or later. The algorithms I intend to look at include:

  • Jacobi transformations
  • Householder reflections and QR/QL Factorisation
  • Cuppen’s Divide and Conquer