Consider a system of n linear equations in n unknowns, written in matrix form as

$$A\mathbf{x} = \mathbf{b},$$

where the n × n matrix A has a nonzero determinant and $\mathbf{x} = (x_1, \ldots, x_n)^{\mathsf{T}}$ is the column vector of the unknowns. The theorem states that in this case the system has a unique solution, whose individual values for the unknowns are given by

$$x_i = \frac{\det(A_i)}{\det(A)}, \qquad i = 1, \ldots, n,$$

where $A_i$ is the matrix formed by replacing the i-th column of A by the column vector $\mathbf{b}$.
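As a rough illustration of the statement, the following Python sketch applies the rule literally: each unknown is the ratio of the determinant of A with one column replaced by b to the determinant of A. The helper names `det` and `cramer_solve` and the sample system are illustrative assumptions, not part of the source. Exact arithmetic with `fractions.Fraction` avoids rounding, and the determinant is computed by cofactor expansion, which is only practical for small n.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def cramer_solve(a, b):
    """Solve a x = b by Cramer's rule; a is a list of rows, b a list."""
    d = det(a)
    if d == 0:
        raise ValueError("Cramer's rule requires a nonzero determinant")
    solution = []
    for i in range(len(a)):
        # A_i: copy of a with column i replaced by b
        a_i = [row[:i] + [b[r]] + row[i + 1:] for r, row in enumerate(a)]
        solution.append(Fraction(det(a_i), d))
    return solution


# Sample system (an assumption for illustration):
#    2x +  y -  z =   8
#   -3x -  y + 2z = -11
#   -2x +  y + 2z =  -3
a = [[2, 1, -1], [-3, -1, 2], [-2, 1, 2]]
b = [8, -11, -3]
x = cramer_solve(a, b)
print(x)
```

For systems of more than a few unknowns, elimination-based methods are generally preferred; this sketch is meant only to mirror the formula.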
A more general version of Cramer's rule [10] considers the matrix equation

$$AX = B,$$

where the n × n matrix A has a nonzero determinant, and X, B are n × m matrices. Given sequences $1 \leq i_1 < i_2 < \cdots < i_k \leq n$ and $1 \leq j_1 < j_2 < \cdots < j_k \leq m$, let $X_{I,J}$ be the k × k submatrix of X with rows in $I := (i_1, \ldots, i_k)$ and columns in $J := (j_1, \ldots, j_k)$. Let $A_B(I,J)$ be the n × n matrix formed by replacing the $i_s$-th column of A by the $j_s$-th column of B, for all $s = 1, \ldots, k$. Then

$$\det X_{I,J} = \frac{\det(A_B(I,J))}{\det(A)}.$$

In the case $k = 1$, this reduces to the usual Cramer's rule.
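The generalized statement can be checked numerically on a small case. The sketch below is an illustration under assumed data: the matrices `a` and `x`, the zero-based index tuples `rows` and `cols` (standing for I and J), and the helper names are all hypothetical. It builds B = AX, forms the submatrix of X and the column-replaced copy of A, and compares the two sides after multiplying through by det(A), so that only integer arithmetic is needed.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def matmul(p, q):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
        for i in range(len(p))
    ]


a = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]   # n = 3, det(a) != 0
x = [[1, 2], [0, -1], [3, 5]]           # n x m with m = 2
b = matmul(a, x)                        # so that a X = b holds by construction

rows, cols = (0, 2), (0, 1)             # I and J, zero-based, k = 2

# X_{I,J}: rows of X in I, columns of X in J
x_sub = [[x[i][j] for j in cols] for i in rows]

# A_B(I,J): replace column i_s of A by column j_s of B for each s
a_b = [row[:] for row in a]
for i_s, j_s in zip(rows, cols):
    for r in range(len(a)):
        a_b[r][i_s] = b[r][j_s]

lhs = det(x_sub) * det(a)
rhs = det(a_b)
print(lhs == rhs)
```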
The rule holds for systems of equations with coefficients and unknowns in any field, not just in the real numbers.
The proof for Cramer's rule uses the following properties of the determinants: linearity with respect to any given column and the fact that the determinant is zero whenever two columns are equal, which is implied by the property that the sign of the determinant flips if you switch two columns.
Fix the index j of a column, and regard the entries of the other columns as fixed. This makes the determinant a function of the entries of the j-th column. Linearity with respect to this column means that this function has the form

$$D_j(a_{1,j}, \ldots, a_{n,j}) = C_{1,j}a_{1,j} + \cdots + C_{n,j}a_{n,j},$$

where the $C_{i,j}$ are coefficients that depend on the entries of A that are not in column j. So, one has

$$\det(A) = D_j(a_{1,j}, \ldots, a_{n,j}) = \sum_{i=1}^{n} a_{i,j}C_{i,j}.$$

(Laplace expansion provides a formula for computing the $C_{i,j}$, but their expression is not important here.)

If the function $D_j$ is applied to any other column k of A, then the result is the determinant of the matrix obtained from A by replacing column j by a copy of column k, so the resulting determinant is 0 (the case of two equal columns).
Now consider a system of n linear equations in n unknowns $x_1, \ldots, x_n$, whose coefficient matrix is A, with det(A) assumed to be nonzero:

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n.
\end{aligned}$$

If one combines these equations by taking $C_{1,j}$ times the first equation, plus $C_{2,j}$ times the second, and so forth until $C_{n,j}$ times the last, then for every k the resulting coefficient of $x_k$ becomes

$$D_j(a_{1,k}, \ldots, a_{n,k}).$$

So, all coefficients become zero, except the coefficient of $x_j$, which becomes $\det(A)$. Similarly, the constant coefficient becomes $D_j(b_1, \ldots, b_n)$, and the resulting equation is thus

$$\det(A)\,x_j = D_j(b_1, \ldots, b_n),$$

which gives the value of $x_j$ as

$$x_j = \frac{1}{\det(A)} D_j(b_1, \ldots, b_n).$$
As, by construction, the numerator is the determinant of the matrix obtained from A by replacing column j by b, we get the expression of Cramer's rule as a necessary condition for a solution.
It remains to prove that these values for the unknowns form a solution. Let M be the n × n matrix that has the coefficients of $D_j$ as its j-th row, for $j = 1, \ldots, n$ (this is the adjugate matrix of A). Expressed in matrix terms, we thus have to prove that

$$\mathbf{x} = \frac{1}{\det(A)} M\mathbf{b}$$

is a solution; that is, that

$$A\left(\frac{1}{\det(A)} M\right)\mathbf{b} = \mathbf{b}.$$

For that, it suffices to prove that

$$A\left(\frac{1}{\det(A)} M\right) = I_n,$$

where $I_n$ is the identity matrix.

The above properties of the functions $D_j$ show that $MA = \det(A)I_n$, and therefore,

$$\frac{1}{\det(A)} MA = I_n.$$

This completes the proof, since a left inverse of a square matrix is also a right inverse (see Invertible matrix theorem).
For other proofs, see below.
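Let A be an n × n matrix with entries in a field F. Then

$$A\,\operatorname{adj}(A) = \operatorname{adj}(A)\,A = \det(A)\,I,$$

where adj(A) denotes the adjugate matrix, det(A) is the determinant, and I is the identity matrix. If det(A) is nonzero, then the inverse matrix of A is

$$A^{-1} = \frac{1}{\det(A)}\operatorname{adj}(A).$$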
This gives a formula for the inverse of A, provided det(A) ≠ 0. In fact, this formula works whenever F is a commutative ring, provided that det(A) is a unit. If det(A) is not a unit, then A is not invertible over the ring (it may be invertible over a larger ring in which some non-unit elements of F may be invertible).
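A small Python sketch can make the adjugate identity concrete. It is only an illustration: the function names `minor`, `adjugate` and `matmul` and the sample matrix are assumptions, and the cofactor-based determinant is suitable for small matrices only. The code forms adj(A) from cofactors, checks that adj(A) A equals det(A) times the identity, and then divides by det(A) to obtain the inverse in exact rational arithmetic.

```python
from fractions import Fraction


def det(m):
    """Cofactor expansion along the first row; the empty matrix has determinant 1."""
    if not m:
        return 1
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def minor(m, i, j):
    """Matrix m with row i and column j deleted."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]


def adjugate(m):
    """Transpose of the cofactor matrix: entry (j, i) is the (i, j) cofactor."""
    n = len(m)
    return [
        [(-1) ** (i + j) * det(minor(m, i, j)) for i in range(n)]
        for j in range(n)
    ]


def matmul(p, q):
    return [
        [sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
        for i in range(len(p))
    ]


a = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
d = det(a)
adj = adjugate(a)
identity = [[int(r == c) for c in range(3)] for r in range(3)]

# adj(A) A = det(A) I
scaled_identity = [[d * e for e in row] for row in identity]
print(matmul(adj, a) == scaled_identity)

# A^{-1} = adj(A) / det(A), valid because det(A) is nonzero
inverse = [[Fraction(e, d) for e in row] for row in adj]
print(matmul(a, inverse) == identity)
```

Over the integers, the same division would stay within the ring only when det(A) is ±1, which is the unit condition mentioned above.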
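Consider the linear system

$$\begin{aligned}
a_1x + b_1y &= c_1 \\
a_2x + b_2y &= c_2,
\end{aligned}$$

which in matrix format is

$$\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}.$$

Assume $a_1b_2 - b_1a_2$ is nonzero. Then, with the help of determinants, x and y can be found with Cramer's rule as

$$x = \frac{\begin{vmatrix} c_1 & b_1 \\ c_2 & b_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}} = \frac{c_1b_2 - b_1c_2}{a_1b_2 - b_1a_2}, \qquad y = \frac{\begin{vmatrix} a_1 & c_1 \\ a_2 & c_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}} = \frac{a_1c_2 - c_1a_2}{a_1b_2 - b_1a_2}.$$

The rules for 3 × 3 matrices are similar. Given

$$\begin{aligned}
a_1x + b_1y + c_1z &= d_1 \\
a_2x + b_2y + c_2z &= d_2 \\
a_3x + b_3y + c_3z &= d_3,
\end{aligned}$$

which in matrix format is

$$\begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} d_1 \\ d_2 \\ d_3 \end{bmatrix},$$

the values of x, y and z can be found as follows:

$$x = \frac{\begin{vmatrix} d_1 & b_1 & c_1 \\ d_2 & b_2 & c_2 \\ d_3 & b_3 & c_3 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}}, \quad y = \frac{\begin{vmatrix} a_1 & d_1 & c_1 \\ a_2 & d_2 & c_2 \\ a_3 & d_3 & c_3 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}}, \quad z = \frac{\begin{vmatrix} a_1 & b_1 & d_1 \\ a_2 & b_2 & d_2 \\ a_3 & b_3 & d_3 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}}.$$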
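For small systems the determinants can be written out explicitly. The following sketch is one possible transcription into Python, assuming the coefficient naming a, b, c (and d for the right-hand side in the 3 × 3 case); the function names and the two sample systems are illustrative choices, not taken from the source.

```python
from fractions import Fraction


def det2(a, b, c, d):
    """Determinant of [[a, b], [c, d]]."""
    return a * d - b * c


def det3(m):
    """Determinant of a 3 x 3 matrix, expanded along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * det2(e, f, h, i) - b * det2(d, f, g, i) + c * det2(d, e, g, h)


def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1 x + b1 y = c1, a2 x + b2 y = c2."""
    d = det2(a1, b1, a2, b2)            # a1*b2 - b1*a2
    if d == 0:
        raise ValueError("coefficient determinant is zero")
    x = Fraction(det2(c1, b1, c2, b2), d)
    y = Fraction(det2(a1, c1, a2, c2), d)
    return x, y


def solve_3x3(rows):
    """rows: three tuples (a, b, c, d) meaning a x + b y + c z = d."""
    d = det3([(a, b, c) for a, b, c, _ in rows])
    if d == 0:
        raise ValueError("coefficient determinant is zero")
    x = Fraction(det3([(k, b, c) for _, b, c, k in rows]), d)
    y = Fraction(det3([(a, k, c) for a, _, c, k in rows]), d)
    z = Fraction(det3([(a, b, k) for a, b, _, k in rows]), d)
    return x, y, z


# x + 2y = 5, 3x + 4y = 6
print(solve_2x2(1, 2, 5, 3, 4, 6))

# x + y + z = 6, 2y + 5z = -4, 2x + 5y - z = 27
print(solve_3x3([(1, 1, 1, 6), (0, 2, 5, -4), (2, 5, -1, 27)]))
```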
Cramer's rule is used in the Ricci calculus in various calculations involving the Christoffel symbols of the first and second kind. [11]
In particular, Cramer's rule can be used to prove that the divergence operator on a Riemannian manifold is invariant with respect to change of coordinates. We give a direct proof, suppressing the role of the Christoffel symbols. Let $(M, g)$ be a Riemannian manifold equipped with local coordinates $(x^1, x^2, \dots, x^n)$. Let $A = A^i \frac{\partial}{\partial x^i}$ be a vector field. We use the summation convention throughout.

Let $(x^1, x^2, \ldots, x^n) \mapsto (\bar{x}^1, \ldots, \bar{x}^n)$ be a coordinate transformation with non-singular Jacobian. Then the classical transformation laws imply that $A = \bar{A}^k \frac{\partial}{\partial \bar{x}^k}$, where $\bar{A}^k = \frac{\partial \bar{x}^k}{\partial x^j} A^j$. Similarly, if $g = g_{mk}\,dx^m \otimes dx^k = \bar{g}_{ij}\,d\bar{x}^i \otimes d\bar{x}^j$, then $\bar{g}_{ij} = \frac{\partial x^m}{\partial \bar{x}^i}\frac{\partial x^k}{\partial \bar{x}^j} g_{mk}$. Writing this transformation law in terms of matrices yields $\bar{g} = \left(\frac{\partial x}{\partial \bar{x}}\right)^{\mathsf{T}} g \left(\frac{\partial x}{\partial \bar{x}}\right)$, which implies $\det \bar{g} = \left(\det\left(\frac{\partial x}{\partial \bar{x}}\right)\right)^2 \det g$.
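Now one computes

$$\begin{aligned}
\operatorname{div} A &= \frac{1}{\sqrt{\det g}} \frac{\partial}{\partial x^i}\left(A^i \sqrt{\det g}\right) \\
&= \det\left(\frac{\partial x}{\partial \bar{x}}\right) \frac{1}{\sqrt{\det \bar{g}}} \frac{\partial \bar{x}^k}{\partial x^i} \frac{\partial}{\partial \bar{x}^k}\left(\frac{\partial x^i}{\partial \bar{x}^\ell} \bar{A}^\ell \det\left(\frac{\partial x}{\partial \bar{x}}\right)^{-1} \sqrt{\det \bar{g}}\right).
\end{aligned}$$

In order to show that this equals $\frac{1}{\sqrt{\det \bar{g}}} \frac{\partial}{\partial \bar{x}^k}\left(\bar{A}^k \sqrt{\det \bar{g}}\right)$, it is necessary and sufficient to show that

$$\frac{\partial \bar{x}^k}{\partial x^i} \frac{\partial}{\partial \bar{x}^k}\left(\frac{\partial x^i}{\partial \bar{x}^\ell} \det\left(\frac{\partial x}{\partial \bar{x}}\right)^{-1}\right) = 0 \quad \text{for all } \ell,$$

which is equivalent to

$$\frac{\partial}{\partial \bar{x}^\ell} \det\left(\frac{\partial x}{\partial \bar{x}}\right) = \det\left(\frac{\partial x}{\partial \bar{x}}\right) \frac{\partial \bar{x}^k}{\partial x^i} \frac{\partial^2 x^i}{\partial \bar{x}^k \partial \bar{x}^\ell}.$$

Carrying out the differentiation on the left-hand side, we get:

$$\begin{aligned}
\frac{\partial}{\partial \bar{x}^\ell} \det\left(\frac{\partial x}{\partial \bar{x}}\right) &= (-1)^{i+j} \frac{\partial^2 x^i}{\partial \bar{x}^\ell \partial \bar{x}^j} \det M(i|j) \\
&= \frac{\partial^2 x^i}{\partial \bar{x}^\ell \partial \bar{x}^j} \det\left(\frac{\partial x}{\partial \bar{x}}\right) \frac{(-1)^{i+j}}{\det\left(\frac{\partial x}{\partial \bar{x}}\right)} \det M(i|j) = (\ast),
\end{aligned}$$

where $M(i|j)$ denotes the matrix obtained from $\left(\frac{\partial x}{\partial \bar{x}}\right)$ by deleting the i-th row and j-th column. But Cramer's rule says that

$$\frac{(-1)^{i+j}}{\det\left(\frac{\partial x}{\partial \bar{x}}\right)} \det M(i|j)$$

is the $(j, i)$-th entry of the matrix $\left(\frac{\partial \bar{x}}{\partial x}\right)$. Thus

$$(\ast) = \det\left(\frac{\partial x}{\partial \bar{x}}\right) \frac{\partial^2 x^i}{\partial \bar{x}^\ell \partial \bar{x}^j} \frac{\partial \bar{x}^j}{\partial x^i},$$

completing the proof.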
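Consider the two equations $F(x, y, u, v) = 0$ and $G(x, y, u, v) = 0$. When u and v are independent variables, we can define $x = X(u, v)$ and $y = Y(u, v)$.

An equation for $\dfrac{\partial x}{\partial u}$ can be found by applying Cramer's rule.

First, calculate the first derivatives of F, G, x, and y:

$$\begin{aligned}
dF &= \frac{\partial F}{\partial x}dx + \frac{\partial F}{\partial y}dy + \frac{\partial F}{\partial u}du + \frac{\partial F}{\partial v}dv = 0 \\
dG &= \frac{\partial G}{\partial x}dx + \frac{\partial G}{\partial y}dy + \frac{\partial G}{\partial u}du + \frac{\partial G}{\partial v}dv = 0 \\
dx &= \frac{\partial X}{\partial u}du + \frac{\partial X}{\partial v}dv \\
dy &= \frac{\partial Y}{\partial u}du + \frac{\partial Y}{\partial v}dv.
\end{aligned}$$

Substituting dx, dy into dF and dG, we have:

$$\begin{aligned}
dF &= \left(\frac{\partial F}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial u} + \frac{\partial F}{\partial u}\right)du + \left(\frac{\partial F}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial v} + \frac{\partial F}{\partial v}\right)dv = 0 \\
dG &= \left(\frac{\partial G}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial G}{\partial y}\frac{\partial y}{\partial u} + \frac{\partial G}{\partial u}\right)du + \left(\frac{\partial G}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial G}{\partial y}\frac{\partial y}{\partial v} + \frac{\partial G}{\partial v}\right)dv = 0.
\end{aligned}$$

Since u and v are independent, the coefficients of du and dv must be zero. So we can write out equations for the coefficients:

$$\begin{aligned}
\frac{\partial F}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial u} &= -\frac{\partial F}{\partial u} \\
\frac{\partial G}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial G}{\partial y}\frac{\partial y}{\partial u} &= -\frac{\partial G}{\partial u} \\
\frac{\partial F}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial v} &= -\frac{\partial F}{\partial v} \\
\frac{\partial G}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial G}{\partial y}\frac{\partial y}{\partial v} &= -\frac{\partial G}{\partial v}.
\end{aligned}$$

Now, by Cramer's rule, we see that:

$$\frac{\partial x}{\partial u} = \frac{\begin{vmatrix} -\frac{\partial F}{\partial u} & \frac{\partial F}{\partial y} \\ -\frac{\partial G}{\partial u} & \frac{\partial G}{\partial y} \end{vmatrix}}{\begin{vmatrix} \frac{\partial F}{\partial x} & \frac{\partial F}{\partial y} \\ \frac{\partial G}{\partial x} & \frac{\partial G}{\partial y} \end{vmatrix}}.$$

This is now a formula in terms of two Jacobians:

$$\frac{\partial x}{\partial u} = -\frac{\left(\frac{\partial (F, G)}{\partial (u, y)}\right)}{\left(\frac{\partial (F, G)}{\partial (x, y)}\right)}.$$

Similar formulas can be derived for $\frac{\partial x}{\partial v}, \frac{\partial y}{\partial u}, \frac{\partial y}{\partial v}$.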
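The Jacobian formula can be sanity-checked on a case where x and y are known explicitly. The sketch below assumes the hypothetical pair F = xy − u and G = x/y − v (with x, y, u, v positive), for which x = √(uv) and y = √(u/v); these functions and all names in the code are chosen only for illustration. It evaluates the ratio of the two 2 × 2 determinants and compares it with a central finite difference of the explicit solution.

```python
import math


def dx_du(x, y):
    """dx/du from the Jacobian formula for F = x*y - u, G = x/y - v."""
    f_x, f_y, f_u = y, x, -1.0
    g_x, g_y, g_u = 1.0 / y, -x / y**2, 0.0
    numerator = f_u * g_y - f_y * g_u      # Jacobian of (F, G) w.r.t. (u, y)
    denominator = f_x * g_y - f_y * g_x    # Jacobian of (F, G) w.r.t. (x, y)
    return -numerator / denominator


def solve_xy(u, v):
    """Explicit solution of F = 0, G = 0 for positive u, v."""
    return math.sqrt(u * v), math.sqrt(u / v)


u, v = 4.0, 1.0
x, y = solve_xy(u, v)
formula = dx_du(x, y)

# Central finite difference of the explicit x(u, v) in the u direction
h = 1e-6
numeric = (solve_xy(u + h, v)[0] - solve_xy(u - h, v)[0]) / (2 * h)

print(formula, numeric)
```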
Cramer's rule can be used to prove that an integer programming problem whose constraint matrix is totally unimodular and whose right-hand side is integer has integer basic solutions. This makes the integer program substantially easier to solve.
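The argument can be sketched in one line. The notation here is an assumption introduced for illustration: B denotes a nonsingular square submatrix of the constraint matrix selected by a basis, b the integer right-hand side, and $B_i$ the matrix B with its i-th column replaced by b. Total unimodularity forces det(B) = ±1, so each basic variable is an integer determinant divided by ±1:

```latex
% Assumed notation: B is a basis submatrix (nonsingular, so \det(B) = \pm 1),
% B_i is B with column i replaced by the integer vector b.
x_i = \frac{\det(B_i)}{\det(B)} = \pm \det(B_i) \in \mathbb{Z},
\qquad i = 1, \ldots, n .
```

The non-basic variables are zero, so the whole basic solution is integral.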
Cramer's rule is used to derive the general solution to an inhomogeneous linear differential equation by the method of variation of parameters.
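For the second-order case the role of Cramer's rule can be written out briefly. The setting below is an assumed standard one, not spelled out in the source: the equation is $y'' + p(x)y' + q(x)y = f(x)$, the functions $y_1, y_2$ form a fundamental system of the homogeneous equation, and a particular solution is sought in the form $y_p = u_1y_1 + u_2y_2$. The usual conditions on $u_1', u_2'$ form a 2 × 2 linear system whose determinant is the Wronskian W:

```latex
% Assumed setting: y'' + p y' + q y = f, with y_p = u_1 y_1 + u_2 y_2.
\begin{aligned}
u_1' y_1 + u_2' y_2 &= 0, \\
u_1' y_1' + u_2' y_2' &= f,
\end{aligned}
\qquad
W = \begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix} \neq 0 .

% Cramer's rule applied to the system above:
u_1' = \frac{1}{W}\begin{vmatrix} 0 & y_2 \\ f & y_2' \end{vmatrix} = -\frac{y_2 f}{W},
\qquad
u_2' = \frac{1}{W}\begin{vmatrix} y_1 & 0 \\ y_1' & f \end{vmatrix} = \frac{y_1 f}{W}.
```

Integrating $u_1'$ and $u_2'$ then gives the particular solution.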
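Consider the linear system

$$\begin{aligned}
12x + 3y &= 15 \\
2x - 3y &= 13.
\end{aligned}$$

Applying Cramer's rule gives

$$x = \frac{\begin{vmatrix} 15 & 3 \\ 13 & -3 \end{vmatrix}}{\begin{vmatrix} 12 & 3 \\ 2 & -3 \end{vmatrix}} = \frac{-45 - 39}{-36 - 6} = \frac{-84}{-42} = 2, \qquad y = \frac{\begin{vmatrix} 12 & 15 \\ 2 & 13 \end{vmatrix}}{\begin{vmatrix} 12 & 3 \\ 2 & -3 \end{vmatrix}} = \frac{156 - 30}{-36 - 6} = \frac{126}{-42} = -3.$$

These values can be verified by substituting back into the original equations: $12x + 3y = (12 \times 2) + (3 \times (-3)) = 24 - 9 = 15$ and $2x - 3y = (2 \times 2) - (3 \times (-3)) = 4 - (-9) = 13$, as required.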
Cramer's rule has a geometric interpretation that can also be regarded as a proof, or simply as giving insight into its geometric nature. These geometric arguments work in general, not only in the case of two equations in two unknowns presented here.
Given the system of equations

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 &= b_1 \\
a_{21}x_1 + a_{22}x_2 &= b_2,
\end{aligned}$$

it can be considered as an equation between vectors

$$x_1\binom{a_{11}}{a_{21}} + x_2\binom{a_{12}}{a_{22}} = \binom{b_1}{b_2}.$$

The area of the parallelogram determined by $\binom{a_{11}}{a_{21}}$ and $\binom{a_{12}}{a_{22}}$ is given by the determinant of the system of equations:

$$\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}.$$

In general, when there are more variables and equations, the determinant of n vectors of length n gives the volume of the parallelepiped determined by those vectors in n-dimensional Euclidean space.

Therefore, the area of the parallelogram determined by $x_1\binom{a_{11}}{a_{21}}$ and $\binom{a_{12}}{a_{22}}$ has to be $x_1$ times the area of the first one, since one of the sides has been multiplied by this factor. Now, this last parallelogram, by Cavalieri's principle, has the same area as the parallelogram determined by $\binom{b_1}{b_2} = x_1\binom{a_{11}}{a_{21}} + x_2\binom{a_{12}}{a_{22}}$ and $\binom{a_{12}}{a_{22}}$.

Equating the areas of this last and the second parallelogram gives the equation

$$\begin{vmatrix} b_1 & a_{12} \\ b_2 & a_{22} \end{vmatrix} = \begin{vmatrix} a_{11}x_1 & a_{12} \\ a_{21}x_1 & a_{22} \end{vmatrix} = x_1\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix},$$

from which Cramer's rule follows.
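The area argument is easy to try numerically. In the sketch below, the vectors `a1` and `a2` and the coefficients `x1` and `x2` are made-up sample values, and `area` is a hypothetical helper returning the signed area of the parallelogram spanned by two plane vectors (the 2 × 2 determinant with those vectors as columns).

```python
def area(p, q):
    """Signed area of the parallelogram spanned by plane vectors p and q."""
    return p[0] * q[1] - p[1] * q[0]


a1, a2 = (3, 1), (1, 2)      # columns of the coefficient matrix
x1, x2 = 2, -1               # chosen solution

# Right-hand side b = x1 * a1 + x2 * a2
b = (x1 * a1[0] + x2 * a2[0], x1 * a1[1] + x2 * a2[1])

# Parallelogram (b, a2) has x1 times the area of parallelogram (a1, a2)
print(area(b, a2), x1 * area(a1, a2))

# Recovering the unknowns as ratios of areas
recovered_x1 = area(b, a2) / area(a1, a2)
recovered_x2 = area(a1, b) / area(a1, a2)
print(recovered_x1, recovered_x2)
```

Adding the multiple `x2 * a2` to the first side shears the parallelogram along `a2` without changing its area, which is the step attributed to Cavalieri's principle.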
This is a restatement of the proof above in abstract language.
Consider the map $\mathbf{x} = (x_1, \ldots, x_n) \mapsto \frac{1}{\det A}\left(\det(A_1), \ldots, \det(A_n)\right)$, where $A_i$ is the matrix $A$ with $\mathbf{x}$ substituted in the i-th column, as in Cramer's rule. Because the determinant is linear in every column, this map is linear. Observe that it sends the i-th column of $A$ to the i-th basis vector $\mathbf{e}_i = (0, \ldots, 1, \ldots, 0)$ (with 1 in the i-th place), because the determinant of a matrix with a repeated column is 0. So we have a linear map which agrees with the inverse of $A$ on the columns of $A$; hence it agrees with $A^{-1}$ on the span of the columns, that is, on the column space. Since $A$ is invertible, the column vectors span all of $\mathbb{R}^n$, so our map really is the inverse of $A$. Cramer's rule follows.
A short proof of Cramer's rule [12] can be given by noticing that $x_1$ is the determinant of the matrix

$$X_1 = \begin{bmatrix} x_1 & 0 & 0 & \cdots & 0 \\ x_2 & 1 & 0 & \cdots & 0 \\ x_3 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_n & 0 & 0 & \cdots & 1 \end{bmatrix}.$$

On the other hand, assuming that our original matrix A is invertible, this matrix $X_1$ has columns $A^{-1}\mathbf{b}, A^{-1}\mathbf{v}_2, \ldots, A^{-1}\mathbf{v}_n$, where $\mathbf{v}_k$ is the k-th column of the matrix A. Recall that the matrix $A_1$ has columns $\mathbf{b}, \mathbf{v}_2, \ldots, \mathbf{v}_n$, and therefore $X_1 = A^{-1}A_1$. Hence, by using that the determinant of the product of two matrices is the product of the determinants, we have

$$x_1 = \det(X_1) = \det(A^{-1})\det(A_1) = \frac{\det(A_1)}{\det(A)}.$$

The proof for other $x_j$ is similar.
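The factorization used in this short proof can be verified on a concrete system. The following sketch uses an assumed sample matrix `a` and solution `x`, builds the right-hand side from them, and checks that A times $X_1$ equals $A_1$ (equivalently $X_1 = A^{-1}A_1$) and that the determinants multiply as claimed. The helper names are illustrative.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def matmul(p, q):
    return [
        [sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
        for i in range(len(p))
    ]


n = 3
a = [[2, 1, -1], [-3, -1, 2], [-2, 1, 2]]
x = [2, 3, -1]
b = [sum(a[r][c] * x[c] for c in range(n)) for r in range(n)]

# X_1: identity matrix with its first column replaced by x
x1 = [[x[r] if c == 0 else int(r == c) for c in range(n)] for r in range(n)]

# A_1: A with its first column replaced by b
a1 = [[b[r]] + a[r][1:] for r in range(n)]

print(matmul(a, x1) == a1)             # A X_1 = A_1
print(det(x1) == x[0])                 # det(X_1) = x_1
print(det(a1) == det(a) * x[0])        # det(A_1) = det(A) x_1
```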
A system of equations is said to be inconsistent when there are no solutions and it is called indeterminate when there is more than one solution. For linear equations, an indeterminate system will have infinitely many solutions (if it is over an infinite field), since the solutions can be expressed in terms of one or more parameters that can take arbitrary values.
Cramer's rule applies to the case where the coefficient determinant is nonzero. In the 2×2 case, if the coefficient determinant is zero, then the system is inconsistent if at least one numerator determinant is nonzero, and indeterminate if both numerator determinants are zero (provided the coefficients are not all zero).
For 3×3 or higher systems, the only thing one can say when the coefficient determinant equals zero is that if any of the numerator determinants are nonzero, then the system must be inconsistent. However, having all determinants zero does not imply that the system is indeterminate. A simple example where all determinants vanish (equal zero) but the system is still inconsistent is the 3×3 system x+y+z=1, x+y+z=2, x+y+z=3.
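The 3×3 example can be checked directly. This sketch (helper names are illustrative) computes the coefficient determinant and the three numerator determinants for the system x+y+z=1, x+y+z=2, x+y+z=3; all four vanish, even though subtracting the first equation from the second gives 0 = 1, so no solution exists.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def replace_column(m, i, col):
    """Copy of m with column i replaced by the vector col."""
    return [row[:i] + [col[r]] + row[i + 1:] for r, row in enumerate(m)]


a = [[1, 1, 1] for _ in range(3)]   # coefficients of x + y + z in each equation
b = [1, 2, 3]                       # three different right-hand sides

coefficient_det = det(a)
numerator_dets = [det(replace_column(a, i, b)) for i in range(3)]

print(coefficient_det, numerator_dets)
```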
[1] Cramer, Gabriel (1750). "Introduction à l'Analyse des lignes Courbes algébriques" (in French). Geneva: Europeana. pp. 656–659. Retrieved 2012-05-18. https://www.europeana.eu/resolve/record/03486/E71FE3799CEC1F8E2B76962513829D2E36B63015
[2] Kosinski, A. A. (2001). "Cramer's Rule is due to Cramer". Mathematics Magazine. 74 (4): 310–312. doi:10.2307/2691101. JSTOR 2691101.
[3] MacLaurin, Colin (1748). A Treatise of Algebra, in Three Parts. Printed for A. Millar & J. Nourse. https://archive.org/details/atreatisealgebr03maclgoog
[4] Boyer, Carl B. (1968). A History of Mathematics (2nd ed.). Wiley. p. 431.
[5] Katz, Victor (2004). A History of Mathematics (Brief ed.). Pearson Education. pp. 378–379.
[6] Hedman, Bruce A. (1999). "An Earlier Date for "Cramer's Rule"" (PDF). Historia Mathematica. 26 (4): 365–368. doi:10.1006/hmat.1999.2247. S2CID 121056843. http://professorhedman.com/Cramers.Rule.pdf
[7] Poole, David (2014). Linear Algebra: A Modern Introduction. Cengage Learning. p. 276. ISBN 978-1-285-98283-0.
[8] Hoffman, Joe D.; Frankel, Steven (2001). Numerical Methods for Engineers and Scientists, Second Edition. CRC Press. p. 30. ISBN 978-0-8247-0443-8.
[9] Shores, Thomas S. (2007). Applied Linear Algebra and Matrix Analysis. Springer Science & Business Media. p. 132. ISBN 978-0-387-48947-6.
[10] Gong, Zhiming; Aldeen, M.; Elsner, L. (2002). "A note on a generalized Cramer's rule". Linear Algebra and Its Applications. 340 (1–3): 253–254. doi:10.1016/S0024-3795(01)00469-4. https://doi.org/10.1016%2FS0024-3795%2801%2900469-4
[11] Levi-Civita, Tullio (1926). The Absolute Differential Calculus (Calculus of Tensors). Dover. pp. 111–112. ISBN 9780486634012.
[12] Robinson, Stephen M. (1970). "A Short Proof of Cramer's Rule". Mathematics Magazine. 43 (2): 94–95. doi:10.1080/0025570X.1970.11976018.