Vectors, Linear Transformations, and Matrix Forms

A two-dimensional vector, $\bar{v} = \begin{pmatrix}x\\y\end{pmatrix}$, can be thought of as representing the displacement that, when starting at the origin, would move something to the point $(x,y)$.

Consequently, a natural interpretation for the sum of two vectors is the total displacement that results from applying them one after the other. This can be seen in the picture below, where the red vector represents the total displacement resulting from the application of the blue and green vectors. As the picture shows, whether we follow the displacement arising from the blue vector with that from the green vector (the solid segments), or vice versa (the dashed segments), the overall displacement is the same. (Hence, the addition of vectors is commutative.)

As indicated in the picture, this suggests how we might find the sum of two vectors algebraically as well:
$$\begin{pmatrix}x_1\\y_1\end{pmatrix} + \begin{pmatrix}x_2\\y_2\end{pmatrix} = \begin{pmatrix}x_1+x_2\\y_1+y_2\end{pmatrix}$$
Further, understanding how vectors get added together suggests a definition for a constant (or scalar) multiple of a vector. For example, it would be natural to assume that $3\begin{pmatrix}x\\y\end{pmatrix}$ is a displacement three times as large as $\begin{pmatrix}x\\y\end{pmatrix}$. Consequently, $$3\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}x\\y\end{pmatrix} + \begin{pmatrix}x\\y\end{pmatrix} + \begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}x+x+x\\y+y+y\end{pmatrix} = \begin{pmatrix}3x\\3y\end{pmatrix}$$
In keeping with this example, we define the scalar multiple of a two-dimensional vector, for any real value $c$ (which is called the "scalar" in question), to be
$$c\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}cx\\cy\end{pmatrix}$$
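
The two definitions above can be tried out numerically. The following is only a minimal sketch: it assumes a 2D vector is modeled as a plain `(x, y)` tuple, and the helper names `add` and `scale`, as well as the sample "blue" and "green" values, are hypothetical choices made for illustration.

```python
# Assumption: a 2D vector is represented as an (x, y) tuple.

def add(v, w):
    """Component-wise sum of two 2D vectors."""
    return (v[0] + w[0], v[1] + w[1])

def scale(c, v):
    """Scalar multiple c*v of a 2D vector."""
    return (c * v[0], c * v[1])

blue = (2, 1)    # illustrative values, not taken from the picture
green = (1, 3)

total_blue_first = add(blue, green)    # blue displacement, then green
total_green_first = add(green, blue)   # green displacement, then blue

tripled = scale(3, blue)               # 3 * blue
repeated = add(add(blue, blue), blue)  # blue + blue + blue
```

Here `total_blue_first` and `total_green_first` agree, mirroring commutativity, and `tripled` matches `repeated`, mirroring the motivation for the scalar-multiple definition.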

Linear Transformations (applied to, and producing 2D vectors)

$F$ is a linear transformation (operating on vectors) if and only if, for all scalars $c$ and all vectors $\bar{x}$ and $\bar{y}$, we have:

  1. $F(c\bar{x}) = cF(\bar{x})$
  2. $F(\bar{x}+\bar{y}) = F(\bar{x}) + F(\bar{y})$

An amazing consequence of this definition is that knowledge of what a specific linear transformation (operating on two-dimensional vectors) does to the vectors $\begin{pmatrix}1\\0\end{pmatrix}$ and $\begin{pmatrix}0\\1\end{pmatrix}$ is sufficient to determine what that transformation does to any (two-dimensional) vector. As an example, suppose we know that for a given linear transformation, $M$, we have
$$M\begin{pmatrix}1\\0\end{pmatrix} = \begin{pmatrix}4\\5\end{pmatrix}$$
$$M\begin{pmatrix}0\\1\end{pmatrix} = \begin{pmatrix}-3\\7\end{pmatrix}$$
Now suppose we wish to know what $M$ does to an arbitrary vector, say $\begin{pmatrix}-6\\8\end{pmatrix}$:
$$\begin{align*}
M\begin{pmatrix}-6\\8\end{pmatrix} &= M \left( \begin{pmatrix}-6\\0\end{pmatrix} + \begin{pmatrix}0\\8\end{pmatrix} \right) \\
&= M\begin{pmatrix}-6\\0\end{pmatrix} + M\begin{pmatrix}0\\8\end{pmatrix} \quad \textrm{...by property (2)}\\
&= M\left( -6 \begin{pmatrix}1\\0\end{pmatrix} \right) + M \left( 8 \begin{pmatrix}0\\1\end{pmatrix} \right) \\
&= -6 M\begin{pmatrix}1\\0\end{pmatrix} + 8 M \begin{pmatrix}0\\1\end{pmatrix} \quad \textrm{...by property (1)}\\
&= -6 \begin{pmatrix}4\\5\end{pmatrix} + 8 \begin{pmatrix}-3\\7\end{pmatrix} \\
&= \begin{pmatrix}(-6) \cdot 4\\ (-6) \cdot 5\end{pmatrix} + \begin{pmatrix} 8 \cdot (-3)\\8 \cdot 7\end{pmatrix} \\
&= \begin{pmatrix}(-6) \cdot 4 + 8 \cdot (-3)\\ (-6) \cdot 5 + 8 \cdot 7\end{pmatrix} \\
&= \begin{pmatrix}-48\\26\end{pmatrix}
\end{align*}$$
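
The derivation above can be mirrored in a few lines of code. This is a sketch under stated assumptions: vectors are `(x, y)` tuples, and the names `add`, `scale`, `M_E1`, `M_E2`, and `apply_M` are hypothetical. Only the two known outputs of $M$ are used, together with properties (1) and (2).

```python
# Assumption: a 2D vector is represented as an (x, y) tuple.

def add(v, w):
    """Component-wise sum of two 2D vectors."""
    return (v[0] + w[0], v[1] + w[1])

def scale(c, v):
    """Scalar multiple c*v of a 2D vector."""
    return (c * v[0], c * v[1])

M_E1 = (4, 5)    # where M sends (1, 0)
M_E2 = (-3, 7)   # where M sends (0, 1)

def apply_M(v):
    """Evaluate M(v) as x*M(1,0) + y*M(0,1), relying only on linearity."""
    x, y = v
    return add(scale(x, M_E1), scale(y, M_E2))

result = apply_M((-6, 8))   # (-48, 26), matching the derivation
```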

Because we can completely characterize a linear transformation in this way -- through knowledge of where it sends $\begin{pmatrix}1\\0\end{pmatrix}$ and $\begin{pmatrix}0\\1\end{pmatrix}$ -- we "name" a linear transformation (one that both takes and produces two-dimensional vectors) with a matrix whose columns are the output vectors corresponding to the inputs $\begin{pmatrix}1\\0\end{pmatrix}$ and $\begin{pmatrix}0\\1\end{pmatrix}$, respectively.

Notice how much more efficiently we can perform the same evaluation found a moment ago by using the matrix form:
$$M\begin{pmatrix}-6\\8\end{pmatrix} = \begin{bmatrix}4 & -3\\5 & 7\end{bmatrix} \begin{pmatrix}-6\\8\end{pmatrix} = \begin{pmatrix}4 \cdot (-6) + (-3) \cdot 8\\5 \cdot (-6) + 7 \cdot 8\end{pmatrix} = \begin{pmatrix}-48\\26\end{pmatrix}$$
In general, if we have a linear transformation, $M=\begin{bmatrix}a & b\\c & d\end{bmatrix}$, then
$$M\begin{pmatrix}x\\y\end{pmatrix} = \begin{bmatrix}a & b\\c & d\end{bmatrix} \begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}ax+by \\ cx + dy\end{pmatrix}$$
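
The general formula translates directly into a short function. In this sketch the matrix is assumed to be stored row by row as `((a, b), (c, d))` and vectors as `(x, y)` tuples; the name `mat_vec` is a hypothetical helper.

```python
# Assumptions: a 2x2 matrix is stored by rows as ((a, b), (c, d));
# a 2D vector is an (x, y) tuple.

def mat_vec(m, v):
    """Apply the 2x2 matrix m to the vector v: (ax + by, cx + dy)."""
    (a, b), (c, d) = m
    x, y = v
    return (a * x + b * y, c * x + d * y)

M = ((4, -3), (5, 7))

image = mat_vec(M, (-6, 8))          # (-48, 26)
first_column = mat_vec(M, (1, 0))    # (4, 5), the first column of M
second_column = mat_vec(M, (0, 1))   # (-3, 7), the second column of M
```

Applying `mat_vec` to $\begin{pmatrix}1\\0\end{pmatrix}$ and $\begin{pmatrix}0\\1\end{pmatrix}$ recovers the columns of the matrix, which is exactly the naming convention described earlier.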

When are Combinations of Linear Transformations also Linear Transformations?

Suppose $F$ and $G$ are linear transformations.

Is the sum, $F+G$, a linear transformation? To answer this question, let us check the two defining properties of a linear transformation.

  1. Is $(F+G)(c \bar{x}) = c(F+G)(\bar{x})$ ?

    Yes! Consider the following:
    $$\begin{align*}
    (F+G)(c \bar{x}) &= F(c \bar{x}) + G(c \bar{x})\\
    &= cF(\bar{x}) + cG(\bar{x})\\
    &= c(F(\bar{x}) + G(\bar{x}))\\
    &= c(F+G)(\bar{x})
    \end{align*}$$

  2. Is $(F+G)(\bar{x} + \bar{y}) = (F+G)(\bar{x}) + (F+G)(\bar{y})$ ?

    Yes! Consider the following:
    $$\begin{align*}
    (F+G)(\bar{x}+\bar{y}) &= F(\bar{x}+\bar{y})+G(\bar{x}+\bar{y})\\
    &= F(\bar{x}) + F(\bar{y}) + (G(\bar{x}) + G(\bar{y}))\\
    &= F(\bar{x}) + G(\bar{x}) + F(\bar{y}) + G(\bar{y})\\
    &= (F+G)(\bar{x}) + (F+G)(\bar{y})
    \end{align*}$$

So the sum of two linear transformations is itself a linear transformation!
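
A numerical spot check can make the argument above concrete, although it proves nothing on its own. In this sketch, `F` reuses the matrix from the earlier example, while the matrix behind `G` and the test values `c`, `x`, and `y` are arbitrary choices for illustration; all helper names are hypothetical.

```python
# Assumptions: vectors are (x, y) tuples; matrices are ((a, b), (c, d)).

def mat_vec(m, v):
    """Apply the 2x2 matrix m to the vector v."""
    (a, b), (c, d) = m
    x, y = v
    return (a * x + b * y, c * x + d * y)

def add(v, w):
    return (v[0] + w[0], v[1] + w[1])

def scale(c, v):
    return (c * v[0], c * v[1])

def F(v):
    return mat_vec(((4, -3), (5, 7)), v)

def G(v):
    return mat_vec(((1, 2), (0, -1)), v)   # arbitrary example matrix

def F_plus_G(v):
    """The sum transformation: (F+G)(v) = F(v) + G(v)."""
    return add(F(v), G(v))

c, x, y = 3, (-6, 8), (2, 5)

# Property 1: (F+G)(c x) == c (F+G)(x)
homogeneous = F_plus_G(scale(c, x)) == scale(c, F_plus_G(x))
# Property 2: (F+G)(x + y) == (F+G)(x) + (F+G)(y)
additive = F_plus_G(add(x, y)) == add(F_plus_G(x), F_plus_G(y))
```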


What about the difference, $F-G$? Is this a linear transformation? Let us again check the two defining properties of a linear transformation:

  1. Is $(F-G)(c \bar{x}) = c(F-G)(\bar{x})$ ?

    Yes! Consider the following:
    $$\begin{align*}
    (F-G)(c \bar{x}) &= F(c \bar{x}) - G(c \bar{x})\\
    &= cF(\bar{x}) - cG(\bar{x})\\
    &= c(F(\bar{x}) - G(\bar{x}))\\
    &= c(F-G)(\bar{x})
    \end{align*}$$

  2. Is $(F-G)(\bar{x} + \bar{y}) = (F-G)(\bar{x}) + (F-G)(\bar{y})$ ?

    Yes! Consider the following:
    $$\begin{align*}
    (F-G)(\bar{x}+\bar{y}) &= F(\bar{x}+\bar{y})-G(\bar{x}+\bar{y})\\
    &= F(\bar{x}) + F(\bar{y}) - (G(\bar{x}) + G(\bar{y}))\\
    &= F(\bar{x}) - G(\bar{x}) + F(\bar{y}) - G(\bar{y})\\
    &= (F-G)(\bar{x}) + (F-G)(\bar{y})
    \end{align*}$$

So the difference of two linear transformations is itself a linear transformation!


What about the product, $FG$, defined by $(FG)(\bar{x}) = F(\bar{x}) \cdot G(\bar{x})$? Is this a linear transformation? Let us again check the two defining properties of a linear transformation:

  1. Is $(FG)(c \bar{x}) = c(FG)(\bar{x})$ ?

    No, not in general. Consider the following:
    $$\begin{align*}
    (FG)(c \bar{x}) &= F(c \bar{x}) \cdot G(c \bar{x})\\
    &= cF(\bar{x}) \cdot cG(\bar{x})\\
    &= c^2 (F(\bar{x}) \cdot G(\bar{x}))\\
    &= c^2 (FG)(\bar{x})\\
    &\neq c (FG)(\bar{x}) \textrm{ whenever $c \notin \{0, 1\}$ and $(FG)(\bar{x}) \neq 0$}
    \end{align*}$$

So the product of two linear transformations is not, in general, a linear transformation! (Since the first property already fails, there is no need to check the second.)
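
A concrete counterexample may help here. The text does not say how two output vectors are multiplied, so this sketch assumes the "$\cdot$" is the dot product; that interpretation, the matrix behind `G`, and all helper names are assumptions made only for illustration.

```python
# Assumptions: vectors are (x, y) tuples; matrices are ((a, b), (c, d));
# the product of two output vectors is taken to be their dot product.

def mat_vec(m, v):
    """Apply the 2x2 matrix m to the vector v."""
    (a, b), (c, d) = m
    x, y = v
    return (a * x + b * y, c * x + d * y)

def dot(v, w):
    return v[0] * w[0] + v[1] * w[1]

def F(v):
    return mat_vec(((4, -3), (5, 7)), v)

def G(v):
    return mat_vec(((1, 2), (0, -1)), v)   # arbitrary example matrix

def FG(v):
    """Pointwise product: (FG)(v) = F(v) . G(v)."""
    return dot(F(v), G(v))

c = 2
x = (1, 0)

scaled_input = FG((c * x[0], c * x[1]))   # (FG)(c x)  = 16
scaled_output = c * FG(x)                 # c (FG)(x)  = 8
```

Doubling the input multiplies the output by $c^2 = 4$ rather than by $c = 2$, so the first defining property fails for this choice of $F$, $G$, and $\bar{x}$.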


What about the composition, $F \circ G$? Is this a linear transformation? Let us again check the two defining properties of a linear transformation:

  1. Is $(F \circ G)(c \bar{x}) = c(F \circ G)(\bar{x})$ ?

    Yes! Consider the following:
    $$\begin{align*}
    (F \circ G)(c \bar{x}) &= F(G(c\bar{x}))\\
    &= F(cG(\bar{x}))\\
    &= cF(G(\bar{x}))\\
    &= c(F \circ G)(\bar{x})
    \end{align*}$$

  2. Is $(F \circ G)(\bar{x} + \bar{y}) = (F \circ G)(\bar{x}) + (F \circ G)(\bar{y})$ ?

    Yes! Consider the following:
    $$\begin{align*}
    (F \circ G)(\bar{x} + \bar{y}) &= F(G(\bar{x} + \bar{y}))\\
    &= F(G(\bar{x}) + G(\bar{y}))\\
    &= F(G(\bar{x})) + F(G(\bar{y}))\\
    &= (F \circ G)(\bar{x}) + (F \circ G)(\bar{y})
    \end{align*}$$

So the composition of two linear transformations is itself a linear transformation!
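
As with the sum, a quick numerical spot check (not a proof) can illustrate the result. The matrix behind `G` and the test values below are arbitrary, and the helper names are hypothetical.

```python
# Assumptions: vectors are (x, y) tuples; matrices are ((a, b), (c, d)).

def mat_vec(m, v):
    """Apply the 2x2 matrix m to the vector v."""
    (a, b), (c, d) = m
    x, y = v
    return (a * x + b * y, c * x + d * y)

def add(v, w):
    return (v[0] + w[0], v[1] + w[1])

def scale(c, v):
    return (c * v[0], c * v[1])

def F(v):
    return mat_vec(((4, -3), (5, 7)), v)

def G(v):
    return mat_vec(((1, 2), (0, -1)), v)   # arbitrary example matrix

def F_after_G(v):
    """The composition: apply G first, then F."""
    return F(G(v))

c, x, y = 3, (-6, 8), (2, 5)

# Property 1: (F o G)(c x) == c (F o G)(x)
homogeneous = F_after_G(scale(c, x)) == scale(c, F_after_G(x))
# Property 2: (F o G)(x + y) == (F o G)(x) + (F o G)(y)
additive = F_after_G(add(x, y)) == add(F_after_G(x), F_after_G(y))
```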


So the sum, difference, and composition of two linear transformations are themselves linear transformations. Consequently, if we are talking about linear transformations operating on two-dimensional vectors, then the sum, difference, and composition of two linear transformations can each be written as a matrix, whose first and second columns are determined by where the combined transformation takes the vectors $\begin{pmatrix}1\\0\end{pmatrix}$ and $\begin{pmatrix}0\\1\end{pmatrix}$, respectively.

This gives us a way to find the matrix form for the sum, difference, and composition of two linear transformations (operating on two-dimensional vectors) directly from the matrix forms for the linear transformations being combined.


Consider the sum of two $2 \times 2$ matrices:

$$\begin{align*}
\left( \begin{bmatrix}a & b\\c & d\end{bmatrix}+ \begin{bmatrix}e & f\\g & h\end{bmatrix} \right) \begin{pmatrix}1\\0\end{pmatrix} &= \begin{bmatrix}a & b\\c & d\end{bmatrix}\begin{pmatrix}1\\0\end{pmatrix} + \begin{bmatrix}e & f\\g & h\end{bmatrix}\begin{pmatrix}1\\0\end{pmatrix} \\
&= \begin{pmatrix}a\\c\end{pmatrix} + \begin{pmatrix}e\\g\end{pmatrix} \\
&= \begin{pmatrix}a+e\\c+g\end{pmatrix}
\end{align*}$$
while,
$$\begin{align*}
\left( \begin{bmatrix}a & b\\c & d\end{bmatrix}+ \begin{bmatrix}e & f\\g & h\end{bmatrix} \right) \begin{pmatrix}0\\1\end{pmatrix} &= \begin{bmatrix}a & b\\c & d\end{bmatrix}\begin{pmatrix}0\\1\end{pmatrix} + \begin{bmatrix}e & f\\g & h\end{bmatrix}\begin{pmatrix}0\\1\end{pmatrix} \\
&= \begin{pmatrix}b\\d\end{pmatrix} + \begin{pmatrix}f\\h\end{pmatrix} \\
&= \begin{pmatrix}b+f\\d+h\end{pmatrix}
\end{align*}$$
Thus, we can describe the sum of the two linear transformations (operating on two-dimensional vectors) in matrix form as:
$$\begin{bmatrix}a & b\\c & d\end{bmatrix}+ \begin{bmatrix}e & f\\g & h\end{bmatrix} = \begin{bmatrix}a+e & b+f\\c+g & d+h\end{bmatrix}$$


The difference of two linear transformations (operating on two-dimensional vectors) can be found in an almost identical way, yielding:
$$\begin{bmatrix}a & b\\c & d\end{bmatrix}- \begin{bmatrix}e & f\\g & h\end{bmatrix} = \begin{bmatrix}a-e & b-f\\c-g & d-h\end{bmatrix}$$
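
The entry-wise formulas for the sum and difference can be checked against the definition "apply each transformation, then add or subtract the outputs". This is a sketch only: matrices are assumed to be stored by rows as nested tuples, the second matrix `B` is an arbitrary example, and the names `mat_add`, `mat_sub`, and `mat_vec` are hypothetical.

```python
# Assumptions: vectors are (x, y) tuples; matrices are ((a, b), (c, d)).

def mat_vec(m, v):
    """Apply the 2x2 matrix m to the vector v."""
    (a, b), (c, d) = m
    x, y = v
    return (a * x + b * y, c * x + d * y)

def mat_add(m, n):
    """Entry-wise sum of two 2x2 matrices."""
    return tuple(tuple(p + q for p, q in zip(r, s)) for r, s in zip(m, n))

def mat_sub(m, n):
    """Entry-wise difference of two 2x2 matrices."""
    return tuple(tuple(p - q for p, q in zip(r, s)) for r, s in zip(m, n))

A = ((4, -3), (5, 7))
B = ((1, 2), (0, -1))    # arbitrary example matrix
v = (-6, 8)

Av, Bv = mat_vec(A, v), mat_vec(B, v)

sum_via_matrix = mat_vec(mat_add(A, B), v)      # (A + B) v
sum_via_outputs = (Av[0] + Bv[0], Av[1] + Bv[1])   # A v + B v

diff_via_matrix = mat_vec(mat_sub(A, B), v)     # (A - B) v
diff_via_outputs = (Av[0] - Bv[0], Av[1] - Bv[1])  # A v - B v
```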


The composition of two such transformations proves a bit more interesting...

$$\begin{align*}
\left( \begin{bmatrix}a & b\\c & d\end{bmatrix} \circ \begin{bmatrix}e & f\\g & h\end{bmatrix} \right) \begin{pmatrix}1\\0\end{pmatrix} &= \begin{bmatrix}a & b\\c & d\end{bmatrix} \left( \begin{bmatrix}e & f\\g & h\end{bmatrix} \begin{pmatrix}1\\0\end{pmatrix} \right) \\
&= \begin{bmatrix}a & b\\c & d\end{bmatrix} \begin{pmatrix}e\\g\end{pmatrix}\\
&= \begin{pmatrix}ae + bg\\ce + dg\end{pmatrix}
\end{align*}$$
while,
$$\begin{align*}
\left( \begin{bmatrix}a & b\\c & d\end{bmatrix} \circ \begin{bmatrix}e & f\\g & h\end{bmatrix} \right) \begin{pmatrix}0\\1\end{pmatrix} &= \begin{bmatrix}a & b\\c & d\end{bmatrix} \left( \begin{bmatrix}e & f\\g & h\end{bmatrix} \begin{pmatrix}0\\1\end{pmatrix} \right) \\
&= \begin{bmatrix}a & b\\c & d\end{bmatrix} \begin{pmatrix}f\\h\end{pmatrix}\\
&= \begin{pmatrix}af+bh\\cf+dh\end{pmatrix}
\end{align*}$$
Thus, we can describe the composition of the two linear transformations (operating on two-dimensional vectors) in matrix form as:
$$\begin{bmatrix}a & b\\c & d\end{bmatrix}\circ \begin{bmatrix}e & f\\g & h\end{bmatrix} = \begin{bmatrix}ae+bg & af+bh\\ce+dg & cf+dh\end{bmatrix}$$
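
The composition formula can likewise be checked in code: applying the combined matrix once should agree with applying the second matrix and then the first. In this sketch, matrices are assumed to be stored by rows as nested tuples, `B` is an arbitrary example, and the names `mat_mul` and `mat_vec` are hypothetical.

```python
# Assumptions: vectors are (x, y) tuples; matrices are ((a, b), (c, d)).

def mat_vec(m, v):
    """Apply the 2x2 matrix m to the vector v."""
    (a, b), (c, d) = m
    x, y = v
    return (a * x + b * y, c * x + d * y)

def mat_mul(m, n):
    """Matrix for 'apply n first, then m', using the formula above."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))

A = ((4, -3), (5, 7))
B = ((1, 2), (0, -1))    # arbitrary example matrix
v = (-6, 8)

AB = mat_mul(A, B)                       # ((4, 11), (5, 3))
one_step = mat_vec(AB, v)                # (A o B) v via the combined matrix
two_steps = mat_vec(A, mat_vec(B, v))    # A (B v), applying B and then A
```

Both routes give the same vector, which is the sense in which the combined matrix "is" the composition.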
Notice that, somewhat unexpectedly, the method of combining matrices above is typically described as matrix "multiplication", even though a "composition" lies at its heart!

◆ ◆ ◆