Linear Transformations

This is a page dedicated to helping people understand what a linear transformation is.


It provides information regarding three specific things: the historical development of the linear transformation (i.e., when it first showed up), an explanation of the mathematics of, and underlying, linear transformations, and a discussion of their significance and applications in linear algebra, differential geometry, and the mathematical foundations of quantum mechanics. Scroll down to learn more!

Linear transformations are a central component of modern linear algebra, but they are a relatively new development in the long history of the discipline. Linear algebra began merely as the study of solutions to linear equations, and its history extends all the way back to ancient Mesopotamia. Today, linear transformations have a number of important applications, ranging from computer graphics to quantum mechanics (though in the case of the latter, they are often referred to as linear operators, but more on that later). But first, here's a definition!

Let \(V\) and \(W\) be vector spaces over the field \(\mathbb{F}\) of dimensions \(n\) and \(m\), respectively, and let \(f:V\longrightarrow W\) be a function with the following properties for all \(\mathbf{u}, \mathbf{v}\in V\) and all \(\alpha\in\mathbb{F}\): \begin{align*} f(\mathbf{u}+\mathbf{v})&=f(\mathbf{u})+f(\mathbf{v})\\ f(\alpha\mathbf{v})&=\alpha f(\mathbf{v}). \end{align*} Such a function \(f\) is called a linear transformation.

To the readers gaping in horror at this immediate appearance of the abstract mathematical formulation I say, "Worry not!" This site will help you develop some intuition regarding linear transformations that will greatly illuminate the definition you just read. But before we start to build our linear algebraic intuition, allow me to introduce, in brief, the content included on this site.

In the content that follows you will find a brief overview of the history of this fascinating area of Linear Algebra, an explanation of some of the more elementary mathematics (a more in-depth overview would require a great deal of abstract algebra, differential geometry, and functional analysis), and a look at some of the most interesting and far-reaching applications of linear transformations.

The study of solutions to linear equations is an ancient endeavor. Linear systems of two equations in two unknowns can be traced back to the Fertile Crescent. Babylonian scholars developed an interesting solution algorithm that involved a guess that both unknowns had the same value, followed by a correction (informed readers may notice similarities between this kind of guess-then-correct method and the Babylonians' other famous algorithm, the False-Position method).

Chinese scholars made much greater strides in developing the theory of linear systems of equations, going so far as to develop the “Counting Board”, the first matrix-like representation of a linear system. An example can be found in an ancient text whose title translates to Book on Numbers and Computation.

Fangcheng: here's a (slow-changing) gif that shows a counting board in action. Can you figure out how this translates to a system of linear equations?

For this reason, Linear Algebra is frequently cited as originating in 3rd century BCE China.

When Descartes introduced coordinate geometry, he laid the groundwork for what I consider to be the most beautiful interpretation of solutions to systems of linear equations, namely, the geometric. This is what we are going to spend the bulk of our time discussing right now, as it is an extremely helpful perspective for understanding linear transformations.

Let's take a moment to look at what Descartes provided us with. A system of two equations in two unknowns (sometimes called a two-dimensional system) can be interpreted entirely in terms of the graphs of the lines defined by the equations. Two-dimensional systems with a single solution appear as lines in the plane with a single point of intersection (whose coordinates solve the system), those with infinitely many solutions appear as lines superimposed on one another, and systems with no solution appear as parallel lines.
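For instance (a little example I've made up), the system \begin{align*} x+y&=2\\ x-y&=0 \end{align*} describes two lines crossing at the single point \((1,1)\), so \((1,1)\) is the unique solution, while \begin{align*} x+y&=1\\ x+y&=2 \end{align*} describes two parallel lines, so there is no solution at all.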

Systems of three equations and three unknowns (three-dimensional linear systems) are a bit more complicated, but still quite simple to understand when viewed geometrically. A system with a single solution appears as three planes intersecting at a single point (whose coordinates satisfy the system). One with no solutions manifests as three parallel planes, as two parallel planes cut by a third, or as planes that intersect each other only in pairs (which creates an interesting enclosure shaped like an infinite triangular prism). One with infinitely many solutions appears as three planes sharing a common line, or as three planes superimposed on one another.

There are, however, other systems that elementary linear algebra students encounter which can still be understood in geometric terms. For example, a system of two equations in three unknowns appears either as parallel planes (when no solution exists) or as a line (or even a whole plane) of solutions, since such a system has infinitely many solutions whenever it has at least one. Or perhaps students are given three or more equations in only two unknowns. If the corresponding lines don't all intersect at a single point (which will usually be the case), the system has no solution. An analogous situation arises with four or more equations involving only three unknowns.

That should be sufficient for now. We will return to this geometric interpretation in another section, but for now, we will move on.

But there isn't actually a lot to move on to, unless we leave the confines of linear algebra!

The modern conception of linear algebra, with its vector spaces, eigenvalues and eigenvectors, and, most importantly for the purposes of this project, the idea of a linear transformation, is a fairly recent development. In fact, the modern conception of a vector space, a necessary precursor to linear transformations, only came to be in 1888, when it was defined by Giuseppe Peano (scroll down to see his photo). By 1900, the idea of an abstract linear transformation of a finite-dimensional vector space over the real numbers, with its accompanying matrix representation, had finally entered the realm of discourse among mathematicians (in the next section we will expound on how these new abstractions served to revolutionize the study of linear systems).

Following the establishment of these unifying principles (the abstract vector space and the concept of a linear transformation), mathematicians sought to generalize the kind of structure they found in vector spaces. Ring theory was connected to linear algebra by way of vector spaces, as was functional analysis.

A Brief History of Linear Algebra (Podcast)

Before we get into the nitty-gritty, let's take a moment to try to understand why we use the term transformation. To do so, it helps to consider an alternate interpretation of solutions to a system of linear equations.

Students frequently think of linear systems of equations as problems that ask how much of each unknown will simultaneously satisfy every equation. This makes sense, given that students are generally introduced to the topic using problems that involve determining the number of *insert object name (e.g. shirt)* with different *insert descriptive characteristic (e.g. color)*.

Unfortunately, while understandable, this approach to linear systems robs linear algebra of much of the kind of mathematical beauty we saw in the geometric solutions afforded us by Descartes' analytic geometry, and it probably doesn't even seem to connect to our weird definition from the introduction!

First, let's start with a linear system of equations and, for the sake of simplicity, keep it to three equations and three unknowns. We are going to have to assume that readers know at least a little bit about how to multiply matrices and vectors, otherwise we will be at this all day. Let \(a_{ij}, b_i\in\mathbb{R}, 1\leq i,j\leq 3\) (meaning \(i\) and \(j\) are indices that each range from \(1\) to \(3\)): \begin{align*} a_{11}x_1+a_{12}x_2+a_{13}x_3 &= b_1\\ a_{21}x_1+a_{22}x_2+a_{23}x_3&=b_2\\ a_{31}x_1+a_{32}x_2+a_{33}x_3&=b_3 \end{align*}

We can write this system as an equation involving matrix-vector multiplication: \[ A\mathbf{x}=\mathbf{b} \] by writing, \[ \Bigg( \begin{array}{ccc} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{array}\Bigg) \Bigg(\begin{array}{c} x_1\\ x_2\\ x_3 \end{array}\Bigg) = \Bigg( \begin{array}{c} b_1\\ b_2\\ b_3 \end{array}\Bigg) \]
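If you'd like to see this on a computer, here is a minimal sketch using Python and NumPy (the particular matrix and right-hand side are made up purely for illustration, and np.linalg.solve assumes the system has exactly one solution):

import numpy as np

# A made-up 3-by-3 system: coefficient matrix A and right-hand side b
A = np.array([[2.0, 1.0, -1.0],
              [1.0, 3.0,  2.0],
              [3.0, 0.0,  1.0]])
b = np.array([1.0, 12.0, 5.0])

# Solve A x = b (this assumes A is invertible, i.e. there is a unique solution)
x = np.linalg.solve(A, b)
print(x)                      # the solution vector
print(np.allclose(A @ x, b))  # sanity check: prints True

Behind the scenes, solve is doing a more efficient version of the elimination you would carry out by hand.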

Here is where things get interesting, but it requires us to recall (or learn for the first time) a theorem from linear algebra.

One of the most tragically under-emphasized theorems in linear algebra relates matrices to linear transformations of finite-dimensional vector spaces. It states that for every \(n\) by \(m\) matrix (let's call it \(A\)), there is a linear transformation \(f\) with an \(m\)-dimensional real vector space (say \(V\)) as its domain and an \(n\)-dimensional real vector space (say \(W\)) as its codomain, such that for every \(\mathbf{v}\in V\), \(f(\mathbf{v})=A\mathbf{v}\)! In other words, you get the exact same vector as output from \(f\) as you get when you perform matrix-vector multiplication using \(A\)! In still other words, every matrix is a linear transformation between vector spaces over the real numbers (i.e., vector spaces where the scalars are all real numbers).
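In case you don't want to take my word for it, here is a quick numerical check of exactly this claim (a minimal sketch; the matrix and vectors are chosen at random, so nothing about them is special):

import numpy as np

rng = np.random.default_rng(0)

# A random 3-by-3 matrix playing the role of A, plus two random vectors and a scalar
A = rng.standard_normal((3, 3))
u = rng.standard_normal(3)
v = rng.standard_normal(3)
alpha = 2.5

f = lambda w: A @ w  # the map f(v) = A v from the theorem

# The two linearity properties hold (up to floating-point rounding)
print(np.allclose(f(u + v), f(u) + f(v)))       # True
print(np.allclose(f(alpha * v), alpha * f(v)))  # True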

Do you see what this means!?

Every linear system of equations can be thought of as tasking us with the challenge of finding which vector in one space gets mapped to the one provided on the right-hand side of the given equation! Things become even more interesting when we drop the equation and just consider the matrix form of a linear transformation (or just make sure we are viewing matrices as linear transformations). Linear transformations can be thought of as matrix-vector multiplication that takes every vector in one vector space to one in another vector space. In some sense, you could say they transform vector spaces. This terminology becomes even more ``on the nose'' when the vector spaces are the same, that is, when \(V=W\). In this case, the linear transformation quite literally transforms the vector space. And it does it in a linear way!

What do I mean by that?

Well, the "linearity" requirements, those equations \begin{align*} f(\mathbf{u}+\mathbf{v})&=f(\mathbf{u})+f(\mathbf{v})\\ f(\alpha\mathbf{v})&=\alpha f(\mathbf{v}), \end{align*} guarantee that in real vector spaces (like the one represented by the Cartesian coordinate plane!) every single line of points (or point vectors if you like) is transformed into another line (although in the case of the identity matrix it's the same line)! The transformations are "linear" in a very literal sense!
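Here's the quick calculation behind that claim: a line through a point \(\mathbf{p}\) in the direction \(\mathbf{d}\) consists of the vectors \(\mathbf{p}+t\mathbf{d}\) as \(t\) ranges over the real numbers, and by the two properties above, \[ f(\mathbf{p}+t\mathbf{d})=f(\mathbf{p})+f(t\mathbf{d})=f(\mathbf{p})+t\,f(\mathbf{d}), \] which is again a line, this time through \(f(\mathbf{p})\) in the direction \(f(\mathbf{d})\) (or a single point, in the degenerate case where \(f(\mathbf{d})=\mathbf{0}\)).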

An additional property of linear transformations is that the image of the zero vector under a linear map is always the zero vector.

Here, let's prove it!

We will use our already defined vector spaces \(V\) and \(W\) and our linear transformation \(f\). Let \(\mathbf{0_V}\) and \(\mathbf{0_W}\) be the zero vectors in \(V\) and \(W\), respectively. Pick any vector \(\mathbf{v}\in V\). Since \(0\cdot\mathbf{v}=\mathbf{0_V}\), linearity tells us that \(f(\mathbf{0_V})=f(0\cdot \mathbf{v})=0\cdot f(\mathbf{v})=\mathbf{0_W}\), which is exactly what we wanted to show.

But going back to the idea that we can represent linear transformations as matrices: this idea actually works with any finite-dimensional vector space over the real numbers, including some much stranger ones. For example, functions behave a lot like vectors in the plane! We can add and subtract them and scale them, and we still end up with functions. In fact, the set of polynomial functions of degree at most \(n\) is actually a finite-dimensional vector space that is isomorphic (i.e., structurally the same) to the usual geometric vector space of dimension \(n+1\), one greater than the maximum degree. This means that any function on this space of functions that possesses the linearity properties can be represented as a matrix!
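If you want to see this function-as-vector behavior on a computer, NumPy's polynomial objects store exactly these coefficient vectors (a minimal sketch; the particular polynomials are made up, and coefficients are listed from the constant term upward):

from numpy.polynomial import Polynomial as P

p = P([1.0, 2.0, 3.0])   # 1 + 2x + 3x^2, i.e. the coefficient vector (1, 2, 3)
q = P([4.0, 0.0, -1.0])  # 4 - x^2,       i.e. the coefficient vector (4, 0, -1)

print((p + q).coef)      # [5. 2. 2.]  -- the same as adding the vectors entrywise
print((2 * p).coef)      # [2. 4. 6.]  -- the same as scaling the vector by 2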

This is kind of bizarre!

For example, a commonly referenced linear transformation on a space of polynomial functions is the derivative from elementary calculus. We can represent the derivative as a matrix! The same goes for any antiderivative we choose! However, due to the changes in degree, it is safest (to avoid leaving our vector space) to use the space of polynomial functions of arbitrary degree. This, however, produces an additional complication: the matrix for the derivative ends up being infinite...

Weird, right?

The following video provides a nice explanation of what we are talking about, along with some truly beautiful graphical representations.

So, we are going to come up with matrices for the derivative and for the antiderivative whose constant term is zero. For the sake of simplicity, we will produce the matrix of the derivative for the case when our vector space is the space of polynomial functions of degree at most two! We won't do the antiderivative for degree-two polynomials, as it would make us (geometrically) enter a four-dimensional space that we can't actually visualize, but we can do the case of degree-one polynomials.

Before we do this, let's briefly discuss how you come up with the matrix form of a linear transformation between finite-dimensional vector spaces. The "linearity" requirements uniquely determine where every vector goes, so long as we can compute where a basis of our vector space goes. We will stick with standard bases (though that isn't technically necessary).

For the case of geometric vector spaces, like three-dimensional Euclidean space, we use the standard basis vectors, which are the length-one vectors parallel to the coordinate axes, each pointing in the positive direction of the axis to which it is parallel. It turns out that the column vectors of a given matrix are exactly the vectors to which the chosen basis vectors are mapped!

For example, the matrix \[ \Bigg( \begin{array}{ccc} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{array}\Bigg) \] can be viewed as taking the vector \((1,0,0)\) to \((a_{11},a_{21},a_{31})\), \((0,1,0)\) to \((a_{12},a_{22},a_{32})\), and \((0,0,1)\) to \((a_{13},a_{23},a_{33})\).
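Here is that recipe as a small sketch in NumPy (the helper name matrix_of is my own invention, and the shear map is chosen purely for illustration): feed each standard basis vector to the linear map and use the outputs as the columns.

import numpy as np

def matrix_of(f, n):
    # Build the matrix of a linear map f on R^n: column j is f(e_j),
    # where e_j is the j-th standard basis vector.
    basis = np.eye(n)
    return np.column_stack([f(basis[:, j]) for j in range(n)])

# Example: a shear of the plane, f(x, y) = (x + 2y, y)
shear = lambda v: np.array([v[0] + 2 * v[1], v[1]])

M = matrix_of(shear, 2)
print(M)                                                 # [[1. 2.] [0. 1.]]
print(np.allclose(M @ [3, 4], shear(np.array([3, 4]))))  # True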

For the case of our strange polynomial space, we need a way to think of polynomials as geometric vectors. We can do this by associating the basis \(\{1,x,x^2\}\) with the standard basis in three-dimensional Euclidean space. This means the degree-two polynomial \(c+bx+ax^2\) can be thought of as the vector \((c, b, a)\) in three-dimensional Euclidean space. So if we want to figure out a matrix for the derivative and the antiderivative, we just need to figure out, in terms of the basis we chose for our polynomial space, where each basis vector is mapped by the derivative operator. This isn't too bad if you know elementary calculus.

Since \(\frac{d}{dx}(1)=0\), \(\frac{d}{dx}(x)=1\), and \(\frac{d}{dx}(x^2)=2x\), we can write the matrix for the standard derivative as \[ \Bigg( \begin{array}{ccc} 0 & 1 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{array}\Bigg). \]

So for a general degree two polynomial \(c+bx+ax^2\), with corresponding geometric vector \((c,b,a)\), the derivative (which we should expect to be \(b+2ax\)) is computed as follows: \begin{align*} \Bigg( \begin{array}{ccc} 0 & 1 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{array}\Bigg) \Bigg(\begin{array}{c} c \\ b \\ a \end{array}\Bigg)&= \Bigg(\begin{array}{c} 0\cdot c +1\cdot b + 0\cdot a \\ 0 \cdot c + 0 \cdot b + 2\cdot a \\ 0 \cdot c +0\cdot b +0\cdot a \end{array}\Bigg)\\ &=\Bigg(\begin{array}{c} b \\ 2a \\ 0 \end{array}\Bigg) \end{align*} which we convert back to the polynomial \(b\cdot 1+2a\cdot x + 0\cdot x^2=b+2ax\), as expected.
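If you'd like to double-check this with a computer, here is a quick NumPy sketch using the made-up polynomial \(5+3x+2x^2\), i.e. the coefficient vector \((5,3,2)\):

import numpy as np
from numpy.polynomial import Polynomial as P

# The derivative matrix from above, acting on coefficient vectors (c, b, a)
D = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0],
              [0.0, 0.0, 0.0]])

coeffs = np.array([5.0, 3.0, 2.0])   # the polynomial 5 + 3x + 2x^2
print(D @ coeffs)                    # [3. 4. 0.]  i.e. 3 + 4x + 0x^2

# NumPy's own derivative agrees (it simply drops the trailing zero coefficient)
print(P(coeffs).deriv().coef)        # [3. 4.]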

For an antiderivative, as we stated earlier, we won't antidifferentiate \(x^2\) because that takes us into a higher-dimensional space. But we can still make this work. We know that an (notice an instead of the, as there are infinitely many and we will only be producing one of them) antiderivative of \(1\) is \(x\) and an antiderivative of \(x\) is \(\frac{1}{2}x^2\). Therefore we can express the antiderivative (where the constant we add is 0) as \begin{align*} \Bigg( \begin{array}{cc} 0 & 0\\ 1 & 0\\ 0 & \frac{1}{2} \end{array}\Bigg) \end{align*} Then for our polynomial \(b+2ax\), we end up with the matrix-vector product: \begin{align*} \Bigg( \begin{array}{cc} 0 & 0 \\ 1 & 0 \\ 0 & \frac{1}{2} \end{array}\Bigg) \bigg(\begin{array}{c} b \\ 2a \\ \end{array}\bigg)&= \Bigg(\begin{array}{c} 0\cdot b +0\cdot 2a \\ 1 \cdot b + 0 \cdot 2a\\ 0 \cdot b +\frac{1}{2}\cdot 2a \end{array}\Bigg)\\ &=\Bigg(\begin{array}{c} 0 \\ b \\ a \end{array}\Bigg) \end{align*} which we then write as the polynomial \(bx+ax^2\).
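And here is the matching check for the antiderivative matrix, using the made-up polynomial \(3+4x\) (NumPy's integ method also fixes the constant of integration at zero by default):

import numpy as np
from numpy.polynomial import Polynomial as P

# The antiderivative matrix from above: it takes coefficient vectors of
# degree-one polynomials to coefficient vectors of degree-two polynomials
A_int = np.array([[0.0, 0.0],
                  [1.0, 0.0],
                  [0.0, 0.5]])

coeffs = np.array([3.0, 4.0])   # the polynomial 3 + 4x
print(A_int @ coeffs)           # [0. 3. 2.]  i.e. 3x + 2x^2
print(P(coeffs).integ().coef)   # [0. 3. 2.]  -- NumPy agrees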

This may not be the most practical way of finding derivatives and antiderivatives, but it has at least helped me connect my geometric understanding of Euclidean vector spaces with the more abstract vector spaces.

We could go on for a considerable amount of time, discussing the meaning of things like eigenvalues and eigenvectors, but we will leave that for another day.

Here's a nice recap of what we just discussed (and even a little bit more!):

Linear transformations undergird computer graphics almost entirely. This should make sense given the more obvious geometric interpretation of transforming a vector space. This video provides a nice explanation:

But wait, there's more! Special relativity in physics makes use of linear transformations for things like the famous Lorentz boost. In fact, lots of physics relies on linear transformations. For example, ``metrics'', which are used for incorporating the notion of distance into curved spaces, are often represented as matrices (a.k.a. linear transformations). One of the most famous metrics is called the Minkowski metric. The following video is, admittedly, a bit high-level, but you can still appreciate the use of matrices (which we can now basically always think of as linear transformations):
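Just to give a small taste of what such a matrix looks like (in one common sign convention): a Lorentz boost with speed \(v\) along the \(x\)-axis acts on the coordinate vector \((ct, x)\) via the matrix \[ \Bigg(\begin{array}{cc} \gamma & -\gamma\beta\\ -\gamma\beta & \gamma \end{array}\Bigg), \qquad \beta=\frac{v}{c},\quad \gamma=\frac{1}{\sqrt{1-\beta^2}}, \] and the two-dimensional version of the Minkowski metric can be written as the matrix \(\mathrm{diag}(-1,1)\) (or \(\mathrm{diag}(1,-1)\), depending on the convention in use).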

If we extend our notion of vector spaces to the more general concept of a Hilbert space, we start to develop the mathematics that underlies quantum mechanics. In quantum mechanics, observation plays a central role (e.g. Schrödinger's cat comes to mind... scroll down to see an image). Observables (i.e., energy, position, momentum, or other things we can observe) are just the linear operators (another term for linear transformations) on a Hilbert space that satisfy a special condition qualifying them to be called ``self-adjoint''. In finite dimensions, self-adjoint linear operators can be viewed as a class of complex-valued matrices called ``Hermitian'', which possess some desirable properties (see spectral theory for more information).
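As one tiny, concrete example (the matrix below is the Pauli-X matrix, a standard textbook example rather than anything specific to the discussion above), you can check the Hermitian condition, and the realness of the eigenvalues, numerically:

import numpy as np

# The Pauli-X matrix, a classic example of a Hermitian (self-adjoint) matrix
X = np.array([[0, 1],
              [1, 0]], dtype=complex)

print(np.allclose(X, X.conj().T))   # True: X equals its conjugate transpose
print(np.linalg.eigvalsh(X))        # [-1.  1.] -- the eigenvalues are real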

We could go on, and on, and on. Even though in many cases the geometric interpretation is no longer sensible, linear transformations permeate topics across a huge number of scientific disciplines!
