In Section 3.1 we learned to multiply matrices together. In this section, we learn to “divide” by a matrix. This allows us to solve the matrix equation \(Ax=b\) in an elegant way:
\[ Ax = b \quad\iff\quad x = A^{-1} b. \nonumber \]
One has to take care when “dividing by matrices”, however, because not every matrix has an inverse, and the order of matrix multiplication is important.
The reciprocal or inverse of a nonzero number \(a\) is the number \(b\) which is characterized by the property that \(ab = 1\). For instance, the inverse of \(7\) is \(1/7\). We use this formulation to define the inverse of a matrix.
Let \(A\) be an \(n\times n\) (square) matrix. We say that \(A\) is invertible if there is an \(n\times n\) matrix \(B\) such that
\[ AB = I_n \quad\text{and}\quad BA = I_n. \nonumber \]
In this case, the matrix \(B\) is called the inverse of \(A\text{,}\) and we write \(B = A^{-1}\).
We have to require both \(AB = I_n\) and \(BA = I_n\) because matrix multiplication is not commutative in general. However, we will show in Corollary 3.6.1 in Section 3.6 that if \(A\) and \(B\) are \(n\times n\) matrices such that \(AB = I_n\text{,}\) then automatically \(BA = I_n\).
Verify that the matrices
We will check that \(AB = I_2\) and that \(BA = I_2\).
Therefore, \(A\) is invertible, with inverse \(B\).
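The two-sided check in the definition can be sketched in a few lines of Python. The matrices `A` and `B` below are illustrative choices made for this sketch (they are not taken from the text), and `matmul`, `identity` and `are_inverses` are hypothetical helper names.

```python
def matmul(P, Q):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

def identity(n):
    """Return the n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def are_inverses(A, B):
    """Check both conditions of the definition: AB = I_n and BA = I_n."""
    n = len(A)
    return matmul(A, B) == identity(n) and matmul(B, A) == identity(n)

# Illustrative 2 x 2 matrices (an assumption of this sketch).
A = [[2, 1], [1, 1]]
B = [[1, -1], [-1, 2]]

print(are_inverses(A, B))  # True
```

Both products are computed on purpose, mirroring the definition, even though for square matrices one of them would suffice.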
There exist non-square matrices whose product is the identity. Indeed, if
then \(AB = I_2.\) However, \(BA\neq I_3\text{,}\) so \(B\) does not deserve to be called the inverse of \(A\).
One can show using the ideas later in this section that if \(A\) is an \(n\times m\) matrix for \(n\neq m\text{,}\) then there is no \(m\times n\) matrix \(B\) such that \(AB = I_n\) and \(BA = I_m\). For this reason, we restrict ourselves to square matrices when we discuss matrix invertibility.
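As a small sanity check of the non-square phenomenon, the following sketch uses one possible pair of matrices, chosen for this illustration and not necessarily the pair intended in the text. The helper names are hypothetical.

```python
def matmul(P, Q):
    """Multiply matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# Illustrative non-square matrices (an assumption of this sketch).
A = [[1, 0, 0],
     [0, 1, 0]]   # 2 x 3
B = [[1, 0],
     [0, 1],
     [0, 0]]      # 3 x 2

print(matmul(A, B) == identity(2))  # True:  AB is the 2 x 2 identity
print(matmul(B, A) == identity(3))  # False: BA has a row of zeros
```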
Let \(A\) and \(B\) be invertible \(n\times n\) matrices.
Why is the inverse of \(AB\) not equal to \(A^{-1}B^{-1}\text{?}\) If it were, then we would have
But there is no reason for \(ABA^{-1}B^{-1}\) to equal the identity matrix: one cannot switch the order of \(A^{-1}\) and \(B\text{,}\) so there is nothing to cancel in this expression. In fact, if \(I_n = (AB)(A^{-1}B^{-1})\text{,}\) then we can multiply both sides on the right by \(BA\) to conclude that \(AB = BA\). In other words, \((AB)^{-1} = A^{-1}B^{-1}\) if and only if \(AB=BA\).
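The order reversal can be checked numerically. This sketch assumes two illustrative invertible matrices that do not commute, and it inverts them with the standard \(2\times 2\) determinant formula; `inverse_2x2` and `matmul` are hypothetical helper names. Exact fractions are used so that equality tests are reliable.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Inverse of an invertible 2 x 2 matrix via the determinant formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)  # assumed nonzero in this sketch
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Illustrative matrices with AB != BA (an assumption of this sketch).
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]

inv_AB = inverse_2x2(matmul(A, B))
reversed_order = matmul(inverse_2x2(B), inverse_2x2(A))  # B^{-1} A^{-1}
same_order = matmul(inverse_2x2(A), inverse_2x2(B))      # A^{-1} B^{-1}

print(inv_AB == reversed_order)  # True
print(inv_AB == same_order)      # False
```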
More generally, the inverse of a product of several invertible matrices is the product of the inverses, in the opposite order; the proof is the same. For instance,
\[ (ABC)^{-1} = C^{-1}B^{-1}A^{-1}. \nonumber \]
So far we have defined the inverse matrix without giving any strategy for computing it. We do so now, beginning with the special case of \(2\times 2\) matrices. Then we will give a recipe for the \(n\times n\) case.
The determinant of a \(2\times 2\) matrix is the number
\[ \det\left(\begin{array}{cc}a&b\\c&d\end{array}\right) = ad - bc. \nonumber \]
There is an analogous formula for the inverse of an \(n\times n\) matrix, but it is not as simple, and it is computationally intensive. The interested reader can find it in Subsection Cramer's Rule and Matrix Inverses in Section 4.2.
Then \(\det(A) = 1\cdot 4 - 2\cdot 3 = -2.\) By the preceding proposition, the matrix \(A\) is invertible with inverse
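A minimal sketch of the \(2\times 2\) computation follows. It assumes the matrix with rows \((1,2)\) and \((3,4)\), which is consistent with the determinant \(1\cdot 4 - 2\cdot 3\) computed above, and it uses the standard formula \(A^{-1} = \frac 1{\det(A)}\left(\begin{array}{cc}d&-b\\-c&a\end{array}\right)\). The function names are hypothetical.

```python
from fractions import Fraction

def det_2x2(M):
    """Determinant ad - bc of the matrix with rows (a, b) and (c, d)."""
    (a, b), (c, d) = M
    return a * d - b * c

def inverse_2x2(M):
    """Return the inverse of a 2 x 2 matrix, or None when det(M) = 0."""
    (a, b), (c, d) = M
    det = Fraction(det_2x2(M))
    if det == 0:
        return None
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1, 2], [3, 4]]  # assumed example matrix
print(det_2x2(A))                  # -2
print(inverse_2x2(A))              # entries -2, 1, 3/2, -1/2 as Fractions
print(inverse_2x2([[1, 2], [2, 4]]))  # None: determinant is zero
```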
The following theorem gives a procedure for computing \(A^{-1}\) in general.
Let \(A\) be an \(n\times n\) matrix, and let \((\,A\mid I_n\,)\) be the matrix obtained by augmenting \(A\) by the identity matrix. If the reduced row echelon form of \((\,A\mid I_n\,)\) has the form \((\,I_n\mid B\,)\text{,}\) then \(A\) is invertible and \(B = A^{-1}\). Otherwise, \(A\) is not invertible.
Proof
First suppose that the reduced row echelon form of \((\,A\mid I_n\,)\) does not have the form \((\,I_n\mid B\,)\). This means that fewer than \(n\) pivots are contained in the first \(n\) columns (the non-augmented part), so \(A\) has fewer than \(n\) pivots. It follows that \(\text{Nul}(A)\neq\{0\}\) (the equation \(Ax=0\) has a free variable), so there exists a nonzero vector \(v\) in \(\text{Nul}(A)\). Suppose that there were a matrix \(B\) such that \(BA=I_n\). Then
\[ v = I_nv = BAv = B0 = 0, \nonumber \]
which is impossible as \(v\neq 0\). Therefore, \(A\) is not invertible.
Now suppose that the reduced row echelon form of \((\,A\mid I_n\,)\) has the form \((\,I_n\mid B\,)\). In this case, all pivots are contained in the non-augmented part of the matrix, so the augmented part plays no role in the row reduction: the entries of the augmented part do not influence the choice of row operations used. Hence, row reducing \((\,A\mid I_n\,)\) is equivalent to solving the \(n\) systems of linear equations \(Ax_1 = e_1,\,Ax_2=e_2,\,\ldots,\,Ax_n=e_n\text{,}\) where \(e_1,e_2,\ldots,e_n\) are the standard coordinate vectors (Note 3.3.2 in Section 3.3):
The columns \(x_1,x_2,\ldots,x_n\) of the matrix \(B\) in the row reduced form are the solutions to these equations:
By Fact 3.3.2 in Section 3.3, the product \(Be_i\) is just the \(i\)th column \(x_i\) of \(B\text{,}\) so
\[ e_i = Ax_i = ABe_i \nonumber \]
for all \(i\). By the same fact, the \(i\)th column of \(AB\) is \(e_i\text{,}\) which means that \(AB\) is the identity matrix. Thus \(B\) is the inverse of \(A\).
Find the inverse of the matrix
We augment by the identity and row reduce:
By the theorem above, the inverse matrix is
Is the following matrix invertible?
We augment by the identity and row reduce:
At this point we can stop, because it is clear that the reduced row echelon form will not have \(I_3\) in the non-augmented part: it will have a row of zeros. By the theorem above, the matrix is not invertible.
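The procedure of augmenting by the identity and row reducing can be sketched as follows. This is one straightforward Gauss-Jordan implementation with exact fractions, not necessarily the sequence of row operations used in the examples above. The function name and the two test matrices are assumptions made for illustration.

```python
from fractions import Fraction

def inverse_by_row_reduction(A):
    """Row reduce (A | I_n); return the inverse, or None if A is not invertible."""
    n = len(A)
    # Build the augmented matrix (A | I_n) with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Look for a nonzero entry in this column, at or below the diagonal.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None  # fewer than n pivots in the non-augmented part
        M[col], M[pivot] = M[pivot], M[col]   # row swap
        p = M[col][col]
        M[col] = [x / p for x in M[col]]      # scale the pivot to 1
        for r in range(n):                    # clear the rest of the column
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    # The left block is now I_n, so the right block is the inverse.
    return [row[n:] for row in M]

# Illustrative matrices (assumptions of this sketch).
invertible = [[1, 0, 4], [0, 1, 2], [0, -3, -4]]
singular = [[1, 0, 4], [0, 1, 2], [0, -3, -6]]

print(inverse_by_row_reduction(invertible))
print(inverse_by_row_reduction(singular))  # None
```

Returning `None` corresponds to the "otherwise" clause of the theorem: the reduced form does not have the identity in the non-augmented part.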
In this subsection, we learn to solve \(Ax=b\) by “dividing by \(A\).”
Let \(A\) be an invertible \(n\times n\) matrix, and let \(b\) be a vector in \(\mathbb{R}^n\). Then the matrix equation \(Ax=b\) has exactly one solution:
\[ x = A^{-1} b. \nonumber \]
Proof
\[ \begin{split} Ax = b \quad&\implies\quad A^{-1}(Ax) = A^{-1} b \\ &\implies\quad (A^{-1} A)x = A^{-1} b \\ &\implies\quad I_n x = A^{-1} b \\ &\implies\quad x = A^{-1} b. \end{split} \nonumber \]
Here we used associativity of matrix multiplication, and the fact that \(I_n x = x\) for any vector \(x\).
Solve the matrix equation
By the theorem above, the only solution of our linear system is
Solve the system of equations
\[\left\{\begin{array}{rrrrrrr} 2x_1 &+& 3x_2 &+& 2x_3 &=& 1\\ x_1 && &+& 3x_3 &=& 1\\ 2x_1 &+& 2x_2 &+& 3x_3 &=& 1.\end{array}\right.\nonumber\]
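First we write our system as a matrix equation \(Ax = b\text{,}\) where
\[ A = \left(\begin{array}{ccc}2&3&2\\1&0&3\\2&2&3\end{array}\right) \quad\text{and}\quad b = \left(\begin{array}{c}1\\1\\1\end{array}\right). \nonumber \]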
Next we find the inverse of \(A\) by augmenting and row reducing:
By the theorem above, the only solution of our linear system is
The advantage of solving a linear system using inverses is that it becomes much faster to solve the matrix equation \(Ax=b\) for other, or even unknown, values of \(b\). For instance, in the above example, the solution of the system of equations
\[\left\{\begin{array}{rrrrrrr} 2x_1 &+& 3x_2 &+& 2x_3 &=& b_1\\ x_1 && &+& 3x_3 &=& b_2\\ 2x_1 &+& 2x_2 &+& 3x_3 &=& b_3,\end{array}\right.\nonumber\]
where \(b_1,b_2,b_3\) are unknowns, is
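The benefit of reusing an inverse for many right-hand sides can be sketched in Python. The coefficient matrix `A` is read off from the system above. The candidate inverse `A_inv` was computed separately for this sketch, so the code verifies it against the definition before using it. The helper names are hypothetical.

```python
A = [[2, 3, 2],
     [1, 0, 3],
     [2, 2, 3]]

# Candidate inverse, computed separately; verified below rather than trusted.
A_inv = [[-6, -5, 9],
         [3, 2, -4],
         [2, 2, -3]]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(M, v):
    """Multiply the matrix M by the column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
is_inverse = matmul(A, A_inv) == I3 and matmul(A_inv, A) == I3
print(is_inverse)  # True

def solve(b):
    """Solve Ax = b as x = A^{-1} b, with no further row reduction."""
    return matvec(A_inv, b)

print(solve([1, 1, 1]))  # [-2, 1, 1]
print(solve([1, 0, 0]))  # [-6, 3, 2]
```

Once the inverse is known, each new vector \(b\) costs only one matrix-vector product.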
As with matrix multiplication, it is helpful to understand matrix inversion as an operation on linear transformations. Recall that the identity transformation on \(\mathbb{R}^n\) (Definition 3.1.2 in Section 3.1) is denoted \(\text{Id}_{\mathbb{R}^n}\).
A transformation \(T\colon\mathbb{R}^n\to\mathbb{R}^n\) is invertible if there exists a transformation \(U\colon\mathbb{R}^n\to\mathbb{R}^n\) such that \(T\circ U = \text{Id}_{\mathbb{R}^n}\) and \(U\circ T = \text{Id}_{\mathbb{R}^n}\). In this case, the transformation \(U\) is called the inverse of \(T\text{,}\) and we write \(U = T^{-1}\).
The inverse \(U\) of \(T\) “undoes” whatever \(T\) did. We have
\[ T\circ U(x) = x \quad\text{and}\quad U\circ T(x) = x \nonumber \]
for all vectors \(x\). This means that if you apply \(T\) to \(x\text{,}\) then you apply \(U\text{,}\) you get the vector \(x\) back, and likewise in the other order.
Define \(f\colon \mathbb{R}\to\mathbb{R}\) by \(f(x) = 2x\). This is an invertible transformation, with inverse \(g(x) = x/2\). Indeed,
\[ f\circ g(x) = f(g(x)) = f\biggl(\frac x2\biggr) = 2\biggl(\frac x2\biggr) = x \nonumber \]
\[ g\circ f(x) = g(f(x)) = g(2x) = \frac{2x}2 = x. \nonumber \]
In other words, dividing by \(2\) undoes the transformation that multiplies by \(2\).
Define \(f\colon \mathbb{R}\to\mathbb{R}\) by \(f(x) = x^3\). This is an invertible transformation, with inverse \(g(x) = \sqrt[3]x\). Indeed,
\[ f\circ g(x) = f(g(x)) = f(\sqrt[3]x) = \bigl(\sqrt[3]x\bigr)^3 = x \nonumber \]
\[ g\circ f(x) = g(f(x)) = g(x^3) = \sqrt[3]{x^3} = x. \nonumber \]
In other words, taking the cube root undoes the transformation that takes a number to its cube.
Define \(f\colon \mathbb{R}\to\mathbb{R}\) by \(f(x) = x^2\). This is not an invertible function. Indeed, we have \(f(2) = 4 = f(-2)\text{,}\) so there is no way to undo \(f\text{:}\) the inverse transformation would not know if it should send \(4\) to \(2\) or \(-2\). More formally, if \(g\colon \mathbb{R}\to\mathbb{R}\) satisfies \(g(f(x)) = x\text{,}\) then
\[ 2 = g(f(2)) = g(4) \quad\text{and}\quad -2 = g(f(-2)) = g(4), \nonumber \]
which is impossible: \(g(4)\) is a number, so it cannot be equal to \(2\) and \(-2\) at the same time.
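These one-variable examples can be spot-checked in Python. The function names `double`, `halve` and `square` are labels chosen for this sketch. Checking finitely many sample points illustrates the idea but is of course not a proof.

```python
def double(x):
    return 2 * x

def halve(x):
    return x / 2

def square(x):
    return x * x

samples = [-3, -1, 0, 2, 5]

# halve undoes double, and double undoes halve, on every sample.
round_trip = all(halve(double(x)) == x and double(halve(x)) == x
                 for x in samples)
print(round_trip)               # True

# square sends two different inputs to the same output,
# so no function g can recover the input from the output.
print(square(2) == square(-2))  # True
```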
Define \(f\colon \mathbb{R}\to\mathbb{R}\) by \(f(x) = e^x\). This is not an invertible function. Indeed, if there were a function \(g\colon \mathbb{R}\to\mathbb{R}\) such that \(f\circ g = \text{Id}_{\mathbb{R}}\text{,}\) then we would have
\[ -1 = f\circ g(-1) = f(g(-1)) = e^{g(-1)}. \nonumber \]
But \(e^x\) is a positive number for every \(x\text{,}\) so this is impossible.
Let \(T\colon\mathbb{R}^2\to\mathbb{R}^2\) be dilation by a factor of \(3/2\text{:}\) that is, \(T(x) = \frac 32x\). Is \(T\) invertible? If so, what is \(T^{-1}\text{?}\)
Let \(U\colon\mathbb{R}^2\to\mathbb{R}^2\) be dilation by a factor of \(2/3\text{:}\) that is, \(U(x) = \frac 23x\). Then
\[ T\circ U(x) = T\biggl(\frac 23x\biggr) = \frac 32\cdot\frac 23x = x \nonumber \]
\[ U\circ T(x) = U\biggl(\frac 32x\biggr) = \frac 23\cdot\frac 32x = x. \nonumber \]
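Hence \(T\circ U = \text{Id}_{\mathbb{R}^2}\) and \(U\circ T = \text{Id}_{\mathbb{R}^2}\text{,}\) so \(T\) is invertible, with inverse \(U\).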
Let \(T\colon\mathbb{R}^2\to\mathbb{R}^2\) be counterclockwise rotation by \(45^\circ\). Is \(T\) invertible? If so, what is \(T^{-1}\text{?}\)
Let \(U\colon\mathbb{R}^2\to\mathbb{R}^2\) be clockwise rotation by \(45^\circ\). Then \(T\circ U\) first rotates clockwise by \(45^\circ\text{,}\) then counterclockwise by \(45^\circ\text{,}\) so the composition rotates by zero degrees: it is the identity transformation. Likewise, \(U\circ T\) first rotates counterclockwise, then clockwise by the same amount, so it is the identity transformation. In other words, clockwise rotation by \(45^\circ\) undoes counterclockwise rotation by \(45^\circ\).
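The rotation example can be checked numerically. The sketch below uses the usual coordinate formula for rotating a vector in the plane. Since it works in floating point, results are compared up to a small tolerance, and the names `rotate`, `T`, `U` and `close` are chosen for this illustration.

```python
import math

def rotate(v, degrees):
    """Rotate the vector v counterclockwise by the given angle in degrees."""
    t = math.radians(degrees)
    x, y = v
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t))

def T(v):
    return rotate(v, 45)    # counterclockwise by 45 degrees

def U(v):
    return rotate(v, -45)   # clockwise by 45 degrees

def close(u, v, tol=1e-12):
    """Compare two vectors entrywise up to a floating-point tolerance."""
    return all(abs(a - b) <= tol for a, b in zip(u, v))

samples = [(1.0, 0.0), (0.0, 1.0), (2.0, -3.0)]
print(all(close(T(U(v)), v) and close(U(T(v)), v) for v in samples))  # True
```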
Let \(T\colon\mathbb{R}^2\to\mathbb{R}^2\) be the reflection over the \(y\)-axis. Is \(T\) invertible? If so, what is \(T^{-1}\text{?}\)
The transformation \(T\) is invertible; in fact, it is equal to its own inverse. Reflecting a vector \(x\) over the \(y\)-axis twice brings the vector back to where it started, so \(T\circ T = \text{Id}_{\mathbb{R}^2}\).
To say that \(T\) is one-to-one and onto means that \(T(x)=b\) has exactly one solution for every \(b\) in \(\mathbb{R}^n\).
Suppose that \(T\) is invertible. Then \(T(x)=b\) always has the unique solution \(x = T^{-1}(b)\text{:}\) indeed, applying \(T^{-1}\) to both sides of \(T(x)=b\) gives
\[ x = T^{-1}(T(x)) = T^{-1}(b), \nonumber \]
and applying \(T\) to both sides of \(x = T^{-1}(b)\) gives
\[ T(x) = T(T^{-1}(b)) = b. \nonumber \]
Conversely, suppose that \(T\) is one-to-one and onto. Let \(b\) be a vector in \(\mathbb{R}^n\text{,}\) and let \(x = U(b)\) be the unique solution of \(T(x)=b\). Then \(U\) defines a transformation from \(\mathbb{R}^n\) to \(\mathbb{R}^n\). For any \(x\) in \(\mathbb{R}^n\text{,}\) we have \(U(T(x)) = x\text{,}\) because \(x\) is the unique solution of the equation \(T(x) = b\) for \(b = T(x)\). For any \(b\) in \(\mathbb{R}^n\text{,}\) we have \(T(U(b)) = b\text{,}\) because \(x = U(b)\) is the unique solution of \(T(x)=b\). Therefore, \(U\) is the inverse of \(T\text{,}\) and \(T\) is invertible.
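Suppose now that \(T\) is an invertible transformation, and that \(U\) is another transformation such that \(T\circ U = \text{Id}_{\mathbb{R}^n}\). We must show that \(U = T^{-1}\text{,}\) i.e., that \(U\circ T = \text{Id}_{\mathbb{R}^n}\). We compose both sides of the equality \(T\circ U = \text{Id}_{\mathbb{R}^n}\) with \(T^{-1}\) on the left and with \(T\) on the right to obtain
\[ T^{-1}\circ T\circ U\circ T = T^{-1}\circ\text{Id}_{\mathbb{R}^n}\circ T. \nonumber \]
We have \(T^{-1}\circ T = \text{Id}_{\mathbb{R}^n}\) and \(\text{Id}_{\mathbb{R}^n}\circ U = U\text{,}\) so the left side of the above equation is \(U\circ T\). Likewise, \(\text{Id}_{\mathbb{R}^n}\circ T = T\) and \(T^{-1}\circ T = \text{Id}_{\mathbb{R}^n}\text{,}\) so the right side is \(\text{Id}_{\mathbb{R}^n}\). Hence \(U\circ T = \text{Id}_{\mathbb{R}^n}\text{,}\) as desired.
If instead we had assumed only that \(U\circ T = \text{Id}_{\mathbb{R}^n}\text{,}\) then the proof that \(T\circ U = \text{Id}_{\mathbb{R}^n}\) proceeds similarly.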
It makes sense in the above definition to define the inverse of a transformation \(T\colon\mathbb{R}^n\to\mathbb{R}^m\text{,}\) for \(m\neq n\text{,}\) to be a transformation \(U\colon\mathbb{R}^m\to\mathbb{R}^n\) such that \(T\circ U = \text{Id}_{\mathbb{R}^m}\) and \(U\circ T = \text{Id}_{\mathbb{R}^n}\). In fact, there exist invertible transformations \(T\colon\mathbb{R}^n\to\mathbb{R}^m\) for any \(m\) and \(n\text{,}\) but they are not linear, or even continuous.
If \(T\) is a linear transformation, then it can only be invertible when \(m = n\text{,}\) i.e., when its domain is equal to its codomain. Indeed, if \(T\colon\mathbb{R}^n\to\mathbb{R}^m\) is one-to-one, then \(n\leq m\) by Note 3.2.1 in Section 3.2, and if \(T\) is onto, then \(m\leq n\) by Note 3.2.2 in Section 3.2. Therefore, when discussing invertibility we restrict ourselves to the case \(m=n\).
Find an invertible (non-linear) transformation \(T\colon\mathbb{R}^2\to\mathbb{R}\).
As you might expect, the matrix for the inverse of a linear transformation is the inverse of the matrix for the transformation, as the following theorem asserts.
Let \(T\colon\mathbb{R}^n\to\mathbb{R}^n\) be a linear transformation with standard matrix \(A\). Then \(T\) is invertible if and only if \(A\) is invertible, in which case \(T^{-1}\) is linear with standard matrix \(A^{-1}\).
Proof
Suppose that \(T\) is invertible. Let \(U\colon\mathbb{R}^n\to\mathbb{R}^n\) be the inverse of \(T\). We claim that \(U\) is linear. We need to check the defining properties (Definition 3.3.1 in Section 3.3). Let \(u,v\) be vectors in \(\mathbb{R}^n\). Then
\[ u + v = T(U(u)) + T(U(v)) = T(U(u) + U(v)) \nonumber \]
by linearity of \(T\). Applying \(U\) to both sides gives
\[ U(u + v) = U\bigl(T(U(u) + U(v))\bigr) = U(u) + U(v). \nonumber \]
Let \(c\) be a scalar. Then
\[ cu = cT(U(u)) = T(cU(u)) \nonumber \]
by linearity of \(T\). Applying \(U\) to both sides gives
\[ U(cu) = U\bigl(T(cU(u))\bigr) = cU(u). \nonumber \]
Since \(U\) satisfies the defining properties, Definition 3.3.1, in Section 3.3, it is a linear transformation.
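Now that we know that \(U\) is linear, we know that it has a standard matrix \(B\). By the compatibility of matrix multiplication and composition, Theorem 3.4.1 in Section 3.4, the matrix for \(T\circ U\) is \(AB\). But \(T\circ U\) is the identity transformation \(\text{Id}_{\mathbb{R}^n}\text{,}\) whose standard matrix is \(I_n\text{,}\) so \(AB = I_n\). One shows similarly that \(BA = I_n\). Hence \(A\) is invertible and \(B = A^{-1}\).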
Conversely, suppose that \(A\) is invertible. Let \(B = A^{-1}\text{,}\) and define \(U\colon\mathbb{R}^n\to\mathbb{R}^n\) by \(U(x) = Bx\). By the compatibility of matrix multiplication and composition, Theorem 3.4.1 in Section 3.4, the matrix for \(T\circ U\) is \(AB = I_n\text{,}\) and the matrix for \(U\circ T\) is \(BA = I_n\). Therefore,
\[ T\circ U(x) = ABx = I_nx = x \quad\text{and}\quad U\circ T(x) = BAx = I_nx = x, \nonumber \]
which shows that \(T\) is invertible with inverse transformation \(U\).
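The correspondence between inverse transformations and inverse matrices can be illustrated with a small sketch. The matrix `A` and its inverse `A_inv` are illustrative choices for this sketch, and `matrix_transformation` is a hypothetical helper that turns a matrix \(M\) into the map \(x\mapsto Mx\).

```python
from fractions import Fraction

# Illustrative matrix and its inverse (assumptions of this sketch).
A = [[1, 2], [3, 4]]
A_inv = [[Fraction(-2), Fraction(1)],
         [Fraction(3, 2), Fraction(-1, 2)]]

def matrix_transformation(M):
    """Return the transformation x -> Mx associated with the matrix M."""
    def apply(x):
        return [sum(m * xi for m, xi in zip(row, x)) for row in M]
    return apply

T = matrix_transformation(A)      # T(x) = Ax
U = matrix_transformation(A_inv)  # U(x) = A^{-1} x

samples = [[1, 0], [0, 1], [5, -7]]
print(all(U(T(x)) == x and T(U(x)) == x for x in samples))  # True
```

Checking the standard coordinate vectors already determines a linear map, so the first two samples carry the essential information; the third is an extra sanity check.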
Let \(T\colon\mathbb{R}^2\to\mathbb{R}^2\) be dilation by a factor of \(3/2\text{:}\) that is, \(T(x) = \frac 32x\). Is \(T\) invertible? If so, what is \(T^{-1}\text{?}\)
In Example 3.1.5 in Section 3.1 we showed that the matrix for \(T\) is
The determinant of \(A\) is \(9/4\neq 0\text{,}\) so \(A\) is invertible with inverse
By the theorem above, \(T\) is invertible, and its inverse is the matrix transformation for \(A^{-1}\text{:}\)
We recognize this as a dilation by a factor of \(2/3\).
Let \(T\colon\mathbb{R}^2\to\mathbb{R}^2\) be counterclockwise rotation by \(45^\circ\). Is \(T\) invertible? If so, what is \(T^{-1}\text{?}\)
In Example 3.3.8 in Section 3.3, we showed that the standard matrix for the counterclockwise rotation of the plane by an angle of \(\theta\) is
\[ \left(\begin{array}{cc}\cos\theta &-\sin\theta \\ \sin\theta &\cos\theta\end{array}\right). \nonumber \]
Therefore, the standard matrix \(A\) for \(T\) is
where we have used the trigonometric identities
\[ \cos(45^\circ) = \frac 1{\sqrt 2} \qquad \sin(45^\circ) = \frac 1{\sqrt 2}. \nonumber \]
The determinant of \(A\) is
\[ \det(A) = \frac 1{\sqrt 2}\cdot\frac 1{\sqrt 2} - \frac 1{\sqrt 2}\cdot\frac{-1}{\sqrt 2} = \frac 12 + \frac 12 = 1, \nonumber \]
so the inverse is
By the theorem above, \(T\) is invertible, and its inverse is the matrix transformation for \(A^{-1}\text{:}\)
We recognize this as a clockwise rotation by \(45^\circ\text{,}\) using the trigonometric identities
\[ \cos(-45^\circ) = \frac 1{\sqrt 2} \qquad \sin(-45^\circ) = -\frac 1{\sqrt 2}. \nonumber \]
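The claim that inverting the \(45^\circ\) rotation matrix gives the \(-45^\circ\) rotation matrix can be checked numerically. The sketch below builds the matrix from the general rotation formula and inverts it with the \(2\times 2\) determinant formula. Entries are floating-point numbers, so they are compared with `math.isclose`, and the function names are chosen for this illustration.

```python
import math

def rotation_matrix(degrees):
    """Standard matrix of counterclockwise rotation by the given angle."""
    t = math.radians(degrees)
    return [[math.cos(t), -math.sin(t)],
            [math.sin(t), math.cos(t)]]

def inverse_2x2(M):
    """Inverse via the determinant formula; assumes det(M) is nonzero."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = rotation_matrix(45)
A_inv = inverse_2x2(A)
clockwise = rotation_matrix(-45)

matches = all(math.isclose(A_inv[i][j], clockwise[i][j], abs_tol=1e-12)
              for i in range(2) for j in range(2))
print(matches)  # True
```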
Let \(T\colon\mathbb{R}^2\to\mathbb{R}^2\) be the reflection over the \(y\)-axis. Is \(T\) invertible? If so, what is \(T^{-1}\text{?}\)
In Example 3.1.4 in Section 3.1 we showed that the matrix for \(T\) is
This matrix has determinant \(-1\text{,}\) so it is invertible, with inverse
By the theorem above, \(T\) is invertible, and it is equal to its own inverse: \(T^{-1} = T\). This is another way of saying that a reflection “undoes” itself.
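A short sketch confirms that the reflection is its own inverse at the matrix level. It assumes the standard matrix whose columns are the images of the standard coordinate vectors under reflection over the \(y\)-axis, namely \((-1,0)\) and \((0,1)\). The name `reflect` is chosen for this illustration.

```python
def reflect(v):
    """Reflect the vector v = (x, y) over the y-axis."""
    x, y = v
    return (-x, y)

# Standard matrix: its columns are reflect((1, 0)) and reflect((0, 1)).
A = [[-1, 0],
     [0, 1]]

det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
A_squared = [[sum(A[i][k] * A[k][j] for k in range(2)) for j in range(2)]
             for i in range(2)]

print(det)                                    # -1
print(A_squared == [[1, 0], [0, 1]])          # True: A is its own inverse
print(reflect(reflect((3, -4))) == (3, -4))   # True
```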
This page titled 3.5: Matrix Inverses is shared under a GNU Free Documentation License 1.3 license and was authored, remixed, and/or curated by Dan Margalit & Joseph Rabinoff via source content that was edited to the style and standards of the LibreTexts platform.