There are multiple ways to skin matrix multiplication. If you begin to think about it, there are probably 4 or 5 ways in which you could approach matrix multiplication. In this post, we look at another, easier, way of multiplying two matrices, and attempt to build a MapReduce version of the algorithm. Before we dive into the code itself, we’ll quickly review the actual algebraic process we’re trying to parallelise. Continue reading Parallelisation : Writing a linear matrix algorithm for Map-Reduce
I’ve recently gotten interested in the parallelisation of algorithms in general; specifically, the type of algorithm design compatible with the MapReduce model of programming. Given that I’ll probably be dealing with bigger quantities of data in the near future, it behooves me to start think about parallelisation, actively. In this post, I will look at the matrix multiplication algorithm which uses block decomposition, to recursively compute the product of two matrices. I have spoken of the general idea here; you may want to read that first for the linear algebra groundwork, before continuing on with this post. Continue reading Parallelisation : Refactoring a recursive block matrix algorithm for Map-Reduce
I return to the first example about basis vectors, when I spoke of linear transformations. The linear transformation we had was this:
The operation it performed on basis vectors of the standard basis S was one of scaling, and scaling only. When operated on by a linear transformation matrix, if a vector is only scaled as a consequence, that vector is an eigenvector of the matrix, and the scalar is the corresponding eigenvalue. This is just the definition of an eigenvector, which I rewrite below:
My main aim in this series of posts is to describe the kernel — or the essential idea — behind some of the simple (and not-so-simple) eigenvector algorithms. If you’re manipulating or mining datasets, chances are you’ll be dealing with matrices a lot. In fact, if you’re starting out with matrix operations of any sort, I highly recommend following Professor Gilbert Strang’s lectures on Linear Algebra, particularly if your math is a bit rusty.
I have several reasons for writing this series. My chief motivation behind trying to understand these algorithms has stemmed from trying to do PCA (Principal Components Analysis) on a medium size dataset (20000 samples, 56 dimensional). I felt (and still feel) pretty uncomfortable about calling LAPACK routines and walking away with the output without trying to understand what goes on inside the code that I just called. Of course, one cannot really dive into the thick of things without understanding some of the basics: in my case, after watching a couple of the lectures, I began to wish that I had better mathematics teachers in school.
Continue reading Eigenvector algorithms for symmetric matrices: Introduction