The concept of an inverse matrix is somewhat analogous to that of the reciprocal of a number. If a is a nonzero number, then 1/a is its reciprocal. The fraction 1/a is often written as a⁻¹. Aside from the fact that only nonzero numbers have reciprocals, the key property of a nonzero number and its reciprocal is that their product is 1, that is, a · a⁻¹ = 1. This makes a⁻¹ the multiplicative inverse of the nonzero number a.
Only nonsingular square matrices A have inverses. (A square matrix is nonsingular if and only if its determinant is nonzero.) When A is nonsingular, its inverse, denoted A⁻¹, is unique and has the key property that A · A⁻¹ = I = A⁻¹ · A, where I denotes the n × n identity matrix. The determinant of a square matrix A (of any order) is a single scalar (number), say a = det(A). If this number is nonzero, the matrix is nonsingular, and the number a accordingly has a reciprocal. Moreover, when det(A) ≠ 0, the inverse of A exists and its determinant is the reciprocal of det(A); that is, det(A⁻¹) = (det(A))⁻¹.
A tiny example will illustrate these concepts, albeit somewhat too simplistically. Let

\[ A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}. \]

Then the determinant of A is the number det(A) = a₁₁a₂₂ − a₁₂a₂₁. If det(A) ≠ 0, then

\[ A^{-1} = \frac{1}{\det(A)} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}. \]

As a check, one can see that

\[ A \, A^{-1} = \frac{1}{\det(A)} \begin{bmatrix} a_{11}a_{22} - a_{12}a_{21} & 0 \\ 0 & a_{11}a_{22} - a_{12}a_{21} \end{bmatrix} = I. \]
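The 2 × 2 case is small enough to verify directly. The sketch below, in plain Python with matrix entries invented for illustration, applies the explicit formula and checks that the product A · A⁻¹ is the identity.

```python
def inv2(A):
    """Inverse of a 2x2 matrix via the explicit formula."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("matrix is singular")
    return [[ a22 / det, -a12 / det],
            [-a21 / det,  a11 / det]]

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[4.0, 7.0], [2.0, 6.0]]   # illustrative values; det(A) = 10
Ainv = inv2(A)
# matmul(A, Ainv) is (numerically) the 2 x 2 identity matrix
```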
This formula for the inverse of a 2 × 2 matrix is useful for hand calculations, but its generalization to matrices of larger order is far more difficult conceptually and computationally. Indeed, the formula is

\[ A^{-1} = \frac{1}{\det(A)}\,\operatorname{adj}(A), \]

where adj(A) is the so-called adjoint (or adjugate) of A. The adjoint of A is the “transposed matrix of cofactors” of A, that is, the matrix B with elements bᵢⱼ = (−1)^(i+j) det(A(j│i)), where A(j│i) denotes the submatrix of A obtained by deleting row j and column i.
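The adjugate formula can be turned into a direct (if very inefficient) inversion routine. The sketch below is a plain-Python illustration, computing determinants by cofactor expansion; it is meant to mirror the formula, not to be a practical algorithm.

```python
def minor(A, i, j):
    """Submatrix A(i|j): A with row i and column j deleted."""
    return [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor (Laplace) expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(n))

def inverse(A):
    """Inverse via the adjugate: A^{-1} = adj(A) / det(A).
    Entry (i, j) of adj(A) is the (j, i) cofactor of A."""
    d = det(A)
    if d == 0:
        raise ValueError("matrix is singular")
    n = len(A)
    return [[(-1) ** (i + j) * det(minor(A, j, i)) / d for j in range(n)]
            for i in range(n)]
```

The cofactor expansion costs on the order of n! operations, which is one concrete sense in which the general formula is “far more difficult computationally” than the 2 × 2 case.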
One way to carry out the inversion of a nonsingular matrix A is to consider the matrix equation A · X = I, where X stands for A⁻¹. If A is n × n, then this equation can be viewed as a set of n separate equations of the form Ax = b, where x is successively taken as the j th column of the unknown matrix X and b is taken as the j th column of I (j = 1, …, n). These equations can then be solved by Cramer’s rule.
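This column-by-column scheme can be sketched in a few lines of plain Python. Here Cramer’s rule gives each unknown as a ratio of determinants, and the inverse is assembled one column of X at a time; as with the adjugate formula, this is illustrative rather than efficient.

```python
def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j]
               * det([r[:j] + r[j+1:] for r in A[1:]])
               for j in range(len(A)))

def cramer_solve(A, b):
    """Solve A x = b by Cramer's rule: x_j = det(A_j) / det(A),
    where A_j is A with column j replaced by b."""
    d = det(A)
    n = len(A)
    return [det([row[:j] + [b[i]] + row[j+1:]
                 for i, row in enumerate(A)]) / d
            for j in range(n)]

def inverse(A):
    """Invert A by solving A x = e_j for each column e_j of I."""
    n = len(A)
    cols = [cramer_solve(A, [1.0 if i == j else 0.0 for i in range(n)])
            for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]  # transpose
```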
The concept of the inverse of a matrix is of great theoretical value, but as may be appreciated from the above discussion, its computation can be problematic, just from the standpoint of sheer labor, not to mention issues of numerical reliability. Fortunately, there are circumstances in which it is not necessary to know the inverse of an n × n matrix A in order to solve an equation like Ax = b. One such circumstance is where the nonsingular matrix A is lower (or upper) triangular and all its diagonal elements are nonzero. In the case of lower triangular matrices, this means (i) aᵢᵢ ≠ 0 for all i = 1, …, n, and (ii) aᵢⱼ = 0 for all i = 1, …, n − 1 and j > i. Thus, for instance, a 3 × 3 lower triangular matrix with diagonal elements 4, −1, and 5 is nonsingular, since all of its diagonal elements are nonzero. When A is nonsingular and lower triangular, solving the equation Ax = b is done by starting with the top equation a₁₁x₁ = b₁ and solving it for x₁; in particular, x₁ = b₁/a₁₁. This value is substituted into all the remaining equations. Then the process is repeated for the next equation, which gives x₂ = [b₂ − a₂₁(b₁/a₁₁)]/a₂₂. This sort of process is repeated until the last component of x is computed. This technique is called forward substitution. There is an analogous procedure called back substitution for nonsingular upper triangular matrices. Transforming a system of linear equations to triangular form makes its solution fairly uncomplicated.
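Forward substitution translates almost line for line into code. In the plain-Python sketch below, the example matrix uses the diagonal 4, −1, 5 mentioned in the text; its below-diagonal entries are invented for illustration.

```python
def forward_substitution(L, b):
    """Solve L x = b where L is lower triangular with nonzero diagonal."""
    n = len(L)
    x = []
    for i in range(n):
        # Substitute the components of x found so far, then divide
        # by the (nonzero) diagonal element.
        known = sum(L[i][j] * x[j] for j in range(i))
        x.append((b[i] - known) / L[i][i])
    return x

# Diagonal 4, -1, 5 as in the text; the entries below the diagonal
# are made up for this example.
L = [[4.0,  0.0, 0.0],
     [2.0, -1.0, 0.0],
     [1.0,  3.0, 5.0]]
```

For instance, `forward_substitution(L, [4.0, 0.0, 22.0])` recovers the solution one component at a time, top to bottom.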
Matrix inversion is thought by some to be a methodological cornerstone of regression analysis. The desire to invert a matrix typically arises in solving the normal equations generated by applying the method of ordinary least squares (OLS) to the estimation of parameters in a linear regression model. It might be postulated that the linear relationship

\[ Y = \beta_1 + \beta_2 X_2 + \cdots + \beta_k X_k \tag{1} \]

holds for some set of parameters β₁, …, βₖ. To determine these unknown parameters, one runs a set of, say, n experiments by first choosing values Xᵢ₂, …, Xᵢₖ and then recording the outcome Yᵢ for i = 1, …, n. In doing so, one uses an error term Uᵢ for the i th experiment. This is needed because for a specific set of parameter values (estimates), there may be no solution to the set of simultaneous equations induced by (1). Thus, one writes

\[ Y_i = \beta_1 + \beta_2 X_{i2} + \cdots + \beta_k X_{ik} + U_i, \qquad i = 1, \ldots, n. \tag{2} \]
The OLS method seeks values of β₁, …, βₖ that minimize the sum of the squared errors, that is, \( \sum_{i=1}^{n} U_i^2 \). With

\[ Y = \begin{bmatrix} Y_1 \\ \vdots \\ Y_n \end{bmatrix}, \quad X = \begin{bmatrix} 1 & X_{12} & \cdots & X_{1k} \\ \vdots & \vdots & & \vdots \\ 1 & X_{n2} & \cdots & X_{nk} \end{bmatrix}, \quad \beta = \begin{bmatrix} \beta_1 \\ \vdots \\ \beta_k \end{bmatrix}, \]

this leads to the OLS problem of minimizing Y′Y − 2Y′Xβ + β′X′Xβ. The first-order necessary and sufficient conditions for the minimizing vector β are the so-called normal equations X′Xβ = X′Y.
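The step from the quadratic objective to the normal equations can be made explicit. Writing U = Y − Xβ, expanding the sum of squares in matrix form and setting its gradient with respect to β to zero gives:

```latex
U'U = (Y - X\beta)'(Y - X\beta) = Y'Y - 2Y'X\beta + \beta'X'X\beta,
\qquad
\nabla_{\beta}\, U'U = -2X'Y + 2X'X\beta = 0
\;\Longrightarrow\; X'X\beta = X'Y.
```

Since X′X is positive semidefinite, the quadratic objective is convex, which is why this first-order condition is sufficient as well as necessary.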
If the matrix X′X is nonsingular, then

\[ \beta = (X'X)^{-1} X'Y. \tag{3} \]
Care needs to be taken in solving the normal equations. It can happen that X’X is singular. In that case, its inverse does not exist. Yet even when X’X is invertible, it is not always advisable to solve for β as in (3). For numerical reasons, this is particularly so when the order of the matrix is very large.
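In practice one therefore solves the normal equations by elimination rather than by forming (X′X)⁻¹ explicitly. The plain-Python sketch below uses Gaussian elimination with partial pivoting followed by back substitution; the data in the usage test are invented for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting,
    avoiding an explicit matrix inverse."""
    n = len(A)
    # Work on an augmented copy [A | b].
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))  # pivot row
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (M[i][n] - sum(M[i][j] * x[j]
                              for j in range(i + 1, n))) / M[i][i]
    return x

def ols(X, Y):
    """OLS estimates: form and solve the normal equations X'X beta = X'Y."""
    n, k = len(X), len(X[0])
    XtX = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
           for i in range(k)]
    XtY = [sum(X[r][i] * Y[r] for r in range(n)) for i in range(k)]
    return solve(XtX, XtY)
```

For badly conditioned or very large X, even this is inadvisable, since forming X′X squares the conditioning of the problem; orthogonalization methods applied directly to X are then preferred.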
SEE ALSO Determinants; Hessian Matrix; Jacobian Matrix; Matrix Algebra; Regression Analysis
BIBLIOGRAPHY

Marcus, Marvin, and Henryk Minc. 1964. A Survey of Matrix Theory and Matrix Inequalities. Boston: Allyn and Bacon.
Strang, Gilbert. 1976. Linear Algebra and Its Applications. New York: Academic Press.
Richard W. Cottle