Technology and Art
This is the easiest way I’ve found to explain the orthogonality of matrix spaces to myself. The argument is based on the geometry of planes, which extends naturally to hyperplanes.
Some quick definitions first:
Terminology Note: Math textbooks will usually define the column space as the image of \(A\), and the null space as the kernel of \(A\). In such contexts, \(A\) is treated as a linear mapping: a linear transformation which operates on a vector \(x\) to give a new vector, possibly of a different dimensionality and therefore in a different vector space.
Thus, you may essentially view a matrix multiplication of \(A\) (\(m\times n\)) with a vector \(x\) (\(n\times 1\)) in two ways:
The important point is that any argument we make around the column space and null space of \(A\) applies exactly to the row space and left null space of \(A^T\), and vice versa.
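Two common ways to read the product \(Ax\) are row by row (each entry of the result is the dot product of one row of \(A\) with \(x\)) and column by column (the result is a linear combination of \(A\)’s columns, weighted by the entries of \(x\)). Here is a minimal sketch of both readings, assuming these are the two views meant above. The matrix `A`, the vector `x`, and the helper `dot` are hypothetical and chosen only for illustration.

```python
# Hypothetical 2x3 matrix and 3-vector, chosen only for illustration.
A = [[1, 2, 3],
     [4, 5, 6]]
x = [1, 0, -1]


def dot(u, v):
    """Dot product of two equal-length lists."""
    return sum(u_i * v_i for u_i, v_i in zip(u, v))


# Row view: entry i of Ax is the dot product of row i of A with x.
row_view = [dot(row, x) for row in A]

# Column view: Ax is the sum of the columns of A, each scaled by x_j.
columns = [list(col) for col in zip(*A)]
column_view = [0] * len(A)
for x_j, col in zip(x, columns):
    column_view = [acc + x_j * c for acc, c in zip(column_view, col)]

print(row_view)
print(column_view)
```

Both lists hold the same numbers. The row view is the one the orthogonality argument relies on, since it turns \(Ax=0\) into one dot-product equation per row.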
For purposes of this discussion, I’ll pick a matrix which already has linearly independent column and row vectors.
\[A= \begin{bmatrix} a_{11} & a_{12} & ... & a_{1N} \\ a_{21} & a_{22} & ... & a_{2N} \\ a_{31} & a_{32} & ... & a_{3N} \\ \vdots & \vdots & \vdots & \vdots \\ a_{M1} & a_{M2} & ... & a_{MN} \\ \end{bmatrix}\]
Let’s consider the non-zero null space of \(A\) and pick a vector from that space. Let that vector be \(x_O=(x_{O1}, x_{O2}, x_{O3}, ..., x_{ON})\).
\[Ax_O= \begin{bmatrix} a_{11} & a_{12} & ... & a_{1N} \\ a_{21} & a_{22} & ... & a_{2N} \\ a_{31} & a_{32} & ... & a_{3N} \\ \vdots & \vdots & \vdots & \vdots \\ a_{M1} & a_{M2} & ... & a_{MN} \\ \end{bmatrix} \begin{bmatrix} x_{O1} \\ x_{O2} \\ x_{O3} \\ \vdots \\ x_{ON} \\ \end{bmatrix} = \begin{bmatrix} a_{11}x_{O1} + a_{12}x_{O2} + a_{13}x_{O3} + ... + a_{1N}x_{ON} \\ a_{21}x_{O1} + a_{22}x_{O2} + a_{23}x_{O3} + ... + a_{2N}x_{ON} \\ a_{31}x_{O1} + a_{32}x_{O2} + a_{33}x_{O3} + ... + a_{3N}x_{ON} \\ \vdots \\ a_{M1}x_{O1} + a_{M2}x_{O2} + a_{M3}x_{O3} + ... + a_{MN}x_{ON} \\ \end{bmatrix} = 0\]
Let’s take the equation of the first row:
\[a_{11}x_{O1} + a_{12}x_{O2} + a_{13}x_{O3} + ... + a_{1N}x_{ON}=0\]
This represents a hyperplane: \(\mathbf{a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + ... + a_{1N}x_N=0}\) with the normal vector \(\mathbf{\hat{n}=(a_{11}, a_{12}, a_{13}, ..., a_{1N})}\). Note that \(\hat{n}\) is also one of the row vectors which span \(A\)’s row space.
By the basic definition of hyperplanes and normal vectors (for a quick refresher, see Vectors, Normals, and Hyperplanes), we can say that:
This argument can be extended to all row vectors in \(A\), proving that \(A\)’s null space is orthogonal to every row vector in \(A\). Since the dot product is linear, any vector orthogonal to every row is also orthogonal to every linear combination of the rows. This implies that \(A\)’s null space is orthogonal to \(A\)’s row space, i.e., \(\mathbf{N(A)\perp R(A)}\).
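As a quick numerical check of this step, here is a minimal sketch using only the Python standard library. The matrix `A`, the null-space vector `x_0`, the coefficients `c`, and the helper `dot` are hypothetical values picked for illustration. It checks that `x_0` is orthogonal to each row of `A`, and then to an arbitrary linear combination of the rows.

```python
def dot(u, v):
    """Dot product of two equal-length lists."""
    return sum(u_i * v_i for u_i, v_i in zip(u, v))


# Hypothetical 2x3 matrix; its null space is spanned by (1, 1, -1).
A = [[1, 2, 3],
     [0, 1, 1]]
x_0 = [1, 1, -1]

# Each entry of A x_0 is a dot product of one row with x_0.
# All entries are zero, so x_0 is orthogonal to every row.
row_dots = [dot(row, x_0) for row in A]

# An arbitrary vector in the row space: 3 * row_1 - 2 * row_2.
c = [3, -2]
combo = [sum(c_i * row[j] for c_i, row in zip(c, A)) for j in range(len(x_0))]

# By linearity of the dot product, this is also zero.
combo_dot = dot(combo, x_0)

print(row_dots, combo, combo_dot)
```

The same check passes for any choice of `c`, because the dot product with the combination is just the same combination of the (zero) row dot products.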
Now, apply the same argument for \(A^T\), i.e., the null space of \(A^T\) is orthogonal to \(A^T\)’s row space, i.e., \(\mathbf{N(A^T)\perp R(A^T)}\). But, we already know that:
\(\mathbf{R(A^T)=C(A)}\): The row space of \(A^T\) is the column space of \(A\).
\(\mathbf{N(A^T)=LN(A)}\): The null space of \(A^T\) is the left null space of \(A\).
Thus, the left null space of \(A\) is orthogonal to the column space of \(A\).
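The transposed version of the argument can be checked the same way. The sketch below is a hedged illustration in plain Python. The matrix `A`, the left-null-space vector `y`, the coefficients `c`, and the helper `dot` are hypothetical values chosen so that \(A^Ty=0\). It confirms that `y` is orthogonal to each column of `A` and to one arbitrary vector in the column space.

```python
def dot(u, v):
    """Dot product of two equal-length lists."""
    return sum(u_i * v_i for u_i, v_i in zip(u, v))


# Hypothetical 3x2 matrix; its left null space is spanned by (1, 1, -1).
A = [[1, 0],
     [0, 1],
     [1, 1]]
y = [1, 1, -1]

# The rows of A^T are the columns of A.
A_T = [list(col) for col in zip(*A)]

# A^T y = 0 means y is orthogonal to every column of A.
col_dots = [dot(col, y) for col in A_T]

# An arbitrary vector in the column space: A c, with c = (2, 5).
c = [2, 5]
Ac = [dot(row, c) for row in A]

# y is orthogonal to it as well.
Ac_dot = dot(Ac, y)

print(col_dots, Ac, Ac_dot)
```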
To summarise: