# Parallel Coordinate visualisation of 28k, 5-dimensional data

This is the visualisation of the same dataset that I’ve been working on for a while, for exploring different data mining and visualisation techniques. Currently, the axes aren’t labelled, and the color coding is for different categories. Looks like a really interesting way to explore the data.

# A detour through data visualisation

I should have seen it coming. Text communicates well – up to a point. All the current analyses I’ve been working on, starting from Self-Organising Maps to Decision Trees, are very well served by good, solid visualisation. My current need is a way to visualise data structures effectively, even if it is merely a bunch of nodes which can be expanded/collapsed to show more information. Additionally, it would be nice (not necessary) for this to happen interactively, but I don’t mind a command line-driven approach. In fact, I prefer the command line; makes it easier to drive it through a scripting language like Ruby.
So far, I’ve looked at Processing, d3.js and ProtoVis. I like the idea of d3.js: the data-driven approach makes a lot of sense, but I think I need to refresh quite a bit of CSS-fu to take advantage of its capabilities. Apart from visualisation derived from data mining algorithms, showing the data as-is in an aesthetic manner is also a worthy goal at this point. In particular, the parallel coordinates visualisation caught my eye.
Oh well, at least I know what I’ll be doing for the next few days.

# Decision Trees

Continuing on with my exploration of the data mining landscape, I extracted out a decision tree out of the data under scrutiny. It is the same data (20,000 samples, 56 samples), but the dimensions I’ve chosen for partitioning are a bit different from the raw attributes. I’ve conflated the 56 dimensions into a single number, since this is a test score we’re talking about, and I’m not sure modelling the indivdual responses for constructing the tree would be the best idea. I’m not really looking for close fits, buckets or bins should be adequate for discretising the response space.
Accordingly, I’ve partitioned the pre-scores as EXCELLENT, GOOD, AVERAGE, etc. The attribute that I’m attempting to predict is also a score, but this is the post-score. Well, not really the score itself, the improvement in score is a more sensible metric to attempt to guess.

I had a problem with trying to visualise the data, but I’ve been able to make do with indenting the different levels of decision nodes; this should be fine till I really need to use a visualisation library. I think the code could use a bit of work – probability notation does not lend itself easy to elegant variable naming. I’ll probably write a lot more on this topic once I’m into Bayesian Nets.

# Matrix Theory: An essential proof for eigenvector computations

I’ve avoided proofs unless absolutely necessary, but the relation between the same eigenvector expressed in two different bases, is important.
Given that AS is the linear transformation matrix in standard basis S, and AB is its counterpart in basis B, we can write the relation between them as:

$A_B = C^{-1}A_SC\\* A_S = CA_BC^{-1}$

where C is the similarity transformation. We’ve seen this relation already; check here if you’ve forgotten about it.
Continue reading Matrix Theory: An essential proof for eigenvector computations

# Matrix Theory: Diagonalisation and Eigenvector Computation

$A= \left( \begin{array}{cc} 2 & 0 \\ 0 & 5 \end{array} \right)\$

The operation it performed on basis vectors of the standard basis S was one of scaling, and scaling only. When operated on by a linear transformation matrix, if a vector is only scaled as a consequence, that vector is an eigenvector of the matrix, and the scalar is the corresponding eigenvalue. This is just the definition of an eigenvector, which I rewrite below:

$A.x = \lambda x$

# Matrix Theory: Basis change and Similarity transformations

### Basis Change

Understand that there is nothing extremely special about the standard basis vectors [1,0] and [0,1]. All 2D vectors may be represented as linear combinations of these vectors. Thus, the vector [7,24] may be written as:

$\left( \begin{array}{cccc} 7 \\ 24 \end{array} \right)\ = 7. \left( \begin{array}{cccc} 1 \\ 0 \end{array} \right)\ + 24. \left( \begin{array}{cccc} 0 \\ 1 \end{array} \right)\$

# Matrix Theory: Linear transformations and Basis vectors

### Symmetric Matrices

A symmetric matrix looks like this:

$A= \left( \begin{array}{cccc} a & d & n & w \\ d & b & h & e \\ n & h & c & i \\ w & e & i & d \end{array} \right)\$

Notice how the values are reflected across the diagonal a-b-c-d; this holds true for any symmetric matrix.
Continue reading Matrix Theory: Linear transformations and Basis vectors

# Eigenvector algorithms for symmetric matrices: Introduction

My main aim in this series of posts is to describe the kernel — or the essential idea — behind some of the simple (and not-so-simple) eigenvector algorithms. If you’re manipulating or mining datasets, chances are you’ll be dealing with matrices a lot. In fact, if you’re starting out with matrix operations of any sort, I highly recommend following Professor Gilbert Strang’s lectures on Linear Algebra, particularly if your math is a bit rusty.

I have several reasons for writing this series. My chief motivation behind trying to understand these algorithms has stemmed from trying to do PCA (Principal Components Analysis) on a medium size dataset (20000 samples, 56 dimensional). I felt (and still feel) pretty uncomfortable about calling LAPACK routines and walking away with the output without trying to understand what goes on inside the code that I just called. Of course, one cannot really dive into the thick of things without understanding some of the basics: in my case, after watching a couple of the lectures, I began to wish that I had better mathematics teachers in school.
Continue reading Eigenvector algorithms for symmetric matrices: Introduction

# Guiding MapReduce-based matrix multiplications with Quadtree Segmentation

I’ve been following the Linear Algebra series of lectures from MIT’s OpenCourseWare site. While watching Lecture 3 (I’m at Lecture 6 now), Professor Strang enumerates 5 methods of matrix multiplication. Two of those provided insights I wish my school teachers had provided me, but it was the fifth method which got me thinking.
The method is really a meta-method, and is a way of breaking down multiplication of large matrices in a recursive fashion. To demonstrate the idea, here’s a simple multiplication between two 2×2 matrices.