My main aim in this series of posts is to describe the kernel — or the essential idea — behind some of the simple (and not-so-simple) eigenvector algorithms. If you’re manipulating or mining datasets, chances are you’ll be dealing with matrices a lot. In fact, if you’re starting out with matrix operations of any sort, I highly recommend following Professor Gilbert Strang’s lectures on Linear Algebra, particularly if your math is a bit rusty.
I have several reasons for writing this series. My chief motivation behind trying to understand these algorithms has stemmed from trying to do PCA (Principal Components Analysis) on a medium size dataset (20000 samples, 56 dimensional). I felt (and still feel) pretty uncomfortable about calling LAPACK routines and walking away with the output without trying to understand what goes on inside the code that I just called. Of course, one cannot really dive into the thick of things without understanding some of the basics: in my case, after watching a couple of the lectures, I began to wish that I had better mathematics teachers in school.