Main article: Boolean satisfiability problem
The satisfiability problem consists in finding a satisfying assignment for a given formula in conjunctive normal form (CNF).
An example of such a formula is:
((not A) or (not C)) and (B or C)
or, using a common notation:[4]
$$(\lnot A \lor \lnot C) \land (B \lor C)$$
where $A$, $B$, $C$ are Boolean variables, $\lnot A$, $\lnot C$, $B$, and $C$ are literals, and $\lnot A \lor \lnot C$ and $B \lor C$ are clauses.
A satisfying assignment for this formula is, for example, $A = \mathrm{False}$ and $C = \mathrm{True}$ (with $B$ set to either value), since it makes the first clause true (since $\lnot A$ is true) as well as the second one (since $C$ is true).
This example uses three variables (A, B, C), and there are two possible values (True and False) for each of them, so there are $2^{3} = 8$ possible assignments. In this small example, one can use brute-force search to try all possible assignments and check whether they satisfy the formula. In realistic applications with millions of variables and clauses, however, brute-force search is impractical. The task of a SAT solver is to find a satisfying assignment efficiently by applying various heuristics to complex CNF formulas.
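The brute-force search described above can be sketched in a few lines of Python. The encoding of a literal as a `(variable, polarity)` pair and the names `satisfies` and `brute_force` are assumptions made for this illustration only.

```python
from itertools import product

# Assumed encoding: a clause is a list of (variable, polarity) pairs,
# where polarity False stands for a negated variable.
formula = [
    [("A", False), ("C", False)],  # (not A) or (not C)
    [("B", True), ("C", True)],    # B or C
]

def satisfies(assignment, cnf):
    """True if every clause contains at least one literal made true."""
    return all(
        any(assignment[var] == polarity for var, polarity in clause)
        for clause in cnf
    )

def brute_force(cnf, variables):
    """Try all 2^n assignments and collect the satisfying ones."""
    models = []
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if satisfies(assignment, cnf):
            models.append(assignment)
    return models

models = brute_force(formula, ["A", "B", "C"])
for model in models:
    print(model)
```

The loop visits all eight assignments, which is exactly the cost that becomes prohibitive as the number of variables grows.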
Main article: Unit propagation
If all but one of the literals of a clause are evaluated to False, then the remaining free literal must be True in order for the clause to be True. For example, if the clause $(A \lor B \lor C)$ is evaluated with $A = \mathrm{False}$ and $B = \mathrm{False}$, we must have $C = \mathrm{True}$ in order for the clause to be true.
The iterated application of the unit clause rule is referred to as unit propagation or Boolean constraint propagation (BCP).
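A minimal sketch of unit propagation follows. It assumes DIMACS-style literals (nonzero integers, with a negative number denoting a negated variable) and rescans every clause on each pass; the function name `unit_propagate` is hypothetical, and practical solvers use far more efficient data structures.

```python
def unit_propagate(clauses, assignment):
    """Apply the unit clause rule until nothing changes.

    clauses: list of lists of nonzero ints (negative = negated variable)
    assignment: partial assignment as a dict {variable: bool}
    Returns (extended assignment, falsified clause or None).
    """
    assignment = dict(assignment)  # do not modify the caller's dict
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned = []
            satisfied = False
            for lit in clause:
                var = abs(lit)
                if var not in assignment:
                    unassigned.append(lit)
                elif assignment[var] == (lit > 0):
                    satisfied = True
                    break
            if satisfied:
                continue
            if not unassigned:
                return assignment, clause  # every literal is false
            if len(unassigned) == 1:
                lit = unassigned[0]        # the clause is unit: force lit
                assignment[abs(lit)] = lit > 0
                changed = True
    return assignment, None

# With A=1, B=2, C=3: the clause (A or B or C) under A=False, B=False forces C.
result, conflict = unit_propagate([[1, 2, 3]], {1: False, 2: False})
print(result, conflict)
```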
Main article: Resolution (logic)
Consider two clauses $(A \lor B \lor C)$ and $(\neg C \lor D \lor \neg E)$. The clause $(A \lor B \lor D \lor \neg E)$, obtained by merging the two clauses and removing both $\neg C$ and $C$, is called the resolvent of the two clauses.
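The resolution step can be written as a small set operation. This sketch assumes clauses are frozensets of integer literals (negative meaning negated); the variable numbering and the function name `resolve` are chosen for illustration.

```python
def resolve(c1, c2, var):
    """Resolvent of c1 (containing var) and c2 (containing -var)."""
    if var not in c1 or -var not in c2:
        raise ValueError("clauses cannot be resolved on this variable")
    return (c1 - {var}) | (c2 - {-var})

A, B, C, D, E = 1, 2, 3, 4, 5
resolvent = resolve(frozenset({A, B, C}), frozenset({-C, D, -E}), C)
print(sorted(resolvent, key=abs))  # the literals A, B, D, not E
```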
A notation similar to that of sequent calculus can be used to formalize many rewriting algorithms, including CDCL. The following are the rules a CDCL solver can apply in order to either find a satisfying assignment or fail to find one. The rules operate on an assignment $A = (l_{1}, \neg l_{2}, l_{3}, \dots)$ and a conflict clause $C$.
Propagate: If a clause in the formula $\Phi$ has exactly one unassigned literal $l$ in $A$, with all other literals in the clause assigned false in $A$, extend $A$ with $l$. This rule represents the idea that a clause that is not yet satisfied and has only one unassigned variable left forces that variable to be set so as to make the entire clause true; otherwise the formula will not be satisfied.
$$\frac{\{l_{1},\dots,l_{n},l\}\in \Phi \quad \neg l_{1},\dots,\neg l_{n}\in A \quad l,\neg l\notin A}{A:=A\;l}\ \text{(Propagate)}$$
Decide: If a literal $l$ is in the set of literals of $\Phi$ and neither $l$ nor $\neg l$ is in $A$, then decide on the truth value of $l$ and extend $A$ with the decision literal $\bullet l$. This rule represents the idea that if no assignment is forced, the solver must choose a variable to assign and record that the assignment was a choice, so that it can go back if the choice does not lead to a satisfying assignment.
$$\frac{l\in \text{Lits}(\Phi) \quad l,\neg l\notin A}{A:=A\;\bullet\;l}\ \text{(Decide)}$$
Conflict: If there is a conflicting clause $\{l_{1},\dots,l_{n}\}\in \Phi$ such that the negations $\neg l_{1},\dots,\neg l_{n}$ of its literals are in $A$, set the conflict clause $C$ to $\{l_{1},\dots,l_{n}\}$. This rule represents detecting a conflict when all literals in a clause are assigned false under the current assignment.
$$\frac{C=\text{NONE} \quad \{l_{1},\dots,l_{n}\}\in \Phi \quad \neg l_{1},\dots,\neg l_{n}\in A}{C:=\{l_{1},\dots,l_{n}\}}\ \text{(Conflict)}$$
Explain: If the conflict clause $C$ is of the form $\{l\}\cup D$, there is an antecedent clause $\{l_{1},\dots,l_{n},\neg l\}\in \Phi$, and $\neg l_{1},\dots,\neg l_{n}$ are assigned before $\neg l$ in $A$, then explain the conflict by resolving $C$ with the antecedent clause. This rule explains the conflict by deriving a new conflict clause that is implied by the current conflict clause and a clause that caused the assignment of a literal in the conflict clause.
$$\frac{C=\{l\}\cup D \quad \{l_{1},\dots,l_{n},\neg l\}\in \Phi \quad \neg l_{1},\dots,\neg l_{n},\neg l\in A \quad \neg l_{1},\dots,\neg l_{n}\text{ assigned before }\neg l}{C:=\{l_{1},\dots,l_{n}\}\cup D}\ \text{(Explain)}$$
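To illustrate, the Explain rule can be applied repeatedly by walking the assignment backwards. The sketch below assumes integer literals and a trail of `(literal, antecedent)` pairs in assignment order, with `None` as the antecedent of a decision literal; the name `explain_all` and the small example formula are hypothetical.

```python
def explain_all(conflict, trail):
    """Apply Explain for every propagated literal, newest first.

    conflict: iterable of literals, all false under the assignment
    trail: list of (literal, antecedent clause or None) in assignment order
    Returns a clause mentioning only negated decision literals.
    """
    c = set(conflict)
    for lit, antecedent in reversed(trail):
        # lit is true in A, so the conflict clause may contain -lit.
        if antecedent is not None and -lit in c:
            # Resolve the conflict clause with the antecedent on lit.
            c = (c - {-lit}) | (set(antecedent) - {lit})
    return frozenset(c)

# Formula: (not x1 or x2), (not x1 or x3), (not x2 or not x3)
trail = [
    (1, None),     # decide x1
    (2, [-1, 2]),  # x2 propagated by (not x1 or x2)
    (3, [-1, 3]),  # x3 propagated by (not x1 or x3)
]
learned = explain_all([-2, -3], trail)  # (not x2 or not x3) is falsified
print(learned)
```

In this example two Explain steps turn the conflict clause into the unit clause (not x1), which records that the decision x1 cannot be part of any satisfying assignment.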
Backjump: If the conflict clause $C$ is of the form $\{l,l_{1},\dots,l_{n}\}$ where $\text{lev}(\neg l_{1})\leq \dots \leq \text{lev}(\neg l_{n})=i<\text{lev}(\neg l)$, then backjump to decision level $i$, assign $A:=A^{[i]}\;\neg l$, and set $C:=\text{NONE}$. Here $\text{lev}$ gives the decision level at which a literal was assigned, and $A^{[i]}$ is the assignment $A$ restricted to decision levels up to $i$. This rule performs non-chronological backtracking by jumping back to a decision level implied by the conflict clause and asserting the negation of the literal that caused the conflict at a lower decision level.
$$\frac{C=\{l,l_{1},\dots,l_{n}\} \quad \text{lev}(\neg l_{1})\leq \dots \leq \text{lev}(\neg l_{n})=i<\text{lev}(\neg l)}{C:=\text{NONE} \quad A:=A^{[i]}\;\neg l}\ \text{(Backjump)}$$
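The level computation in the Backjump rule can be sketched as follows. The code assumes integer literals and a dictionary `lev` mapping each assigned literal to its decision level; both function names are hypothetical, and the treatment of a unit conflict clause (jumping to level 0) is an assumption of this sketch. It only computes the level $i$ and the truncated assignment $A^{[i]}$, leaving the final assertion step to the rule itself.

```python
def backjump_level(conflict, lev):
    """Return the level i from the Backjump rule.

    conflict: iterable of literals, all false under the assignment,
              so -lit is an assigned literal for every lit in conflict
    lev: dict {assigned literal: decision level}
    """
    levels = sorted((lev[-lit] for lit in conflict), reverse=True)
    if len(levels) > 1 and levels[0] == levels[1]:
        raise ValueError("several literals at the highest level: apply Explain first")
    # Assumption: a conflict clause with a single literal jumps to level 0.
    return levels[1] if len(levels) > 1 else 0

def truncate(trail, lev, i):
    """A^[i]: keep only the literals assigned at decision level <= i."""
    return [lit for lit in trail if lev[lit] <= i]

trail = [1, 2, 3, 4, 5]               # assigned literals, in order
lev = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}  # decision level of each literal
conflict = [-1, -3, -5]               # literal levels are 1, 2 and 3

i = backjump_level(conflict, lev)
kept = truncate(trail, lev, i)
print(i, kept)
```

Because the second-highest level in the conflict clause is 2, the solver can skip level 3 entirely instead of undoing only the most recent decision.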
Learn: Learned clauses can be added to the formula $\Phi$. This rule represents the clause learning mechanism of CDCL solvers, where conflict clauses are added back to the clause database to prevent the solver from making the same mistake again in other branches of the search tree.
$$\frac{C\neq \text{NONE} \quad C\notin \Phi}{\Phi :=\Phi \cup \{C\}}\ \text{(Learn)}$$
These six rules are sufficient for basic CDCL, but modern SAT solver implementations usually add further heuristic-controlled rules in order to traverse the search space more efficiently and solve SAT problems faster.
Forget: Learned clauses can be removed from the formula $\Phi$ to save memory. This rule represents the clause forgetting mechanism, where less useful learned clauses are removed to control the size of the clause database. $\Phi' \models C$ denotes that the formula $\Phi'$ without the clause $C$ still implies $C$, meaning $C$ is redundant.
$$\frac{C=\text{NONE} \quad \Phi =\Phi' \cup \{C\} \quad \Phi' \models C}{\Phi :=\Phi'}\ \text{(Forget)}$$
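The side condition $\Phi' \models C$ can be made concrete with a brute-force entailment check. This is only an illustration of what the condition means, under the assumption of integer literals and a hypothetical function name `entails`; it enumerates all assignments and is not how a solver would decide which clauses to forget.

```python
from itertools import product

def entails(clauses, clause):
    """Check clauses |= clause by enumerating every assignment."""
    variables = sorted({abs(l) for c in list(clauses) + [clause] for l in c})
    for values in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, values))

        def holds(c):
            return any(model[abs(l)] == (l > 0) for l in c)

        if all(holds(c) for c in clauses) and not holds(clause):
            return False  # a model of the clauses falsifies the clause
    return True

phi_prime = [[1, 2, 3], [-3, 4, -5]]
redundant = entails(phi_prime, [1, 2, 4, -5])  # a resolvent of the two clauses
not_redundant = entails(phi_prime, [1, 2])
print(redundant, not_redundant)
```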
Restart: The solver can be restarted by resetting the assignment $A$ to the empty assignment $A^{[0]}$ and setting the conflict clause $C$ to $\text{NONE}$. This rule represents the restart mechanism, which allows the solver to jump out of a potentially unproductive part of the search space and start over, often guided by the learned clauses. Note that learned clauses are still remembered through restarts, ensuring termination of the algorithm.
$$\frac{\;}{A:=A^{[0]} \quad C:=\text{NONE}}\ \text{(Restart)}$$
Conflict-driven clause learning works as follows.
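One way the rules above can be combined into a loop is sketched here. This is a deliberately naive illustration under several assumptions: integer literals, a full scan of all clauses instead of efficient data structures, the first unassigned variable as the decision, Explain applied until a single literal of the current decision level remains, and no Forget or Restart. After a backjump, the learned clause has exactly one unassigned literal, so the next round of propagation asserts it. The function name `cdcl` and all internal names are hypothetical.

```python
def cdcl(cnf):
    """Return a satisfying assignment {variable: bool}, or None if there is none."""
    clauses = [frozenset(c) for c in cnf]
    variables = sorted({abs(l) for c in clauses for l in c})
    trail = []   # assigned literals in order (the assignment A)
    info = {}    # assigned literal -> (decision level, antecedent or None)
    level = 0

    def propagate():
        """Apply Propagate to a fixpoint; return a falsified clause or None."""
        changed = True
        while changed:
            changed = False
            for c in clauses:
                if any(l in info for l in c):
                    continue                      # clause is satisfied
                free = [l for l in c if -l not in info]
                if not free:
                    return c                      # Conflict: all literals false
                if len(free) == 1:                # Propagate the last literal
                    trail.append(free[0])
                    info[free[0]] = (level, c)
                    changed = True
        return None

    while True:
        conflict = propagate()
        if conflict is None:
            free_vars = [v for v in variables if v not in info and -v not in info]
            if not free_vars:
                return {v: v in info for v in variables}
            level += 1                            # Decide
            trail.append(free_vars[0])
            info[free_vars[0]] = (level, None)
            continue
        if level == 0:
            return None                           # conflict with no decisions
        # Explain: resolve with antecedents, newest assignment first, until
        # only one literal of the current decision level is left.
        learned = set(conflict)
        for lit in reversed(trail):
            if sum(1 for l in learned if info[-l][0] == level) <= 1:
                break
            antecedent = info[lit][1]
            if -lit in learned and antecedent is not None:
                learned = (learned - {-lit}) | (antecedent - {lit})
        # Learn the clause, then Backjump to the second-highest level in it.
        levels = sorted(info[-l][0] for l in learned)
        back = levels[-2] if len(levels) > 1 else 0
        clauses.append(frozenset(learned))
        while trail and info[trail[-1]][0] > back:
            del info[trail.pop()]
        level = back

print(cdcl([[-1, -3], [2, 3]]))                     # a model of the example formula
print(cdcl([[1, 2], [1, -2], [-1, 2], [-1, -2]]))   # unsatisfiable: None
```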
[Figure: a visual example of the CDCL algorithm.][5]
DPLL is a sound and complete algorithm for SAT. CDCL SAT solvers implement DPLL, but can learn new clauses and backtrack non-chronologically. Clause learning with conflict analysis affects neither soundness nor completeness. Conflict analysis identifies new clauses using the resolution operation. Therefore, each learnt clause can be inferred from the original clauses and other learnt clauses by a sequence of resolution steps. If $c_N$ is the new learnt clause, then $\phi$ is satisfiable if and only if $\phi \cup \{c_N\}$ is also satisfiable. Moreover, the modified backtracking step does not affect soundness or completeness either, since the backtracking information is obtained from each new learnt clause.[6]
The main application of the CDCL algorithm is in SAT solvers, including:
The CDCL algorithm has made SAT solvers so powerful that they are used effectively in several real-world application areas, such as AI planning, bioinformatics, software test pattern generation, software package dependencies, hardware and software model checking, and cryptography.
Algorithms related to CDCL include the Davis–Putnam (DP) algorithm and the DPLL algorithm. The DP algorithm uses resolution refutation and can run into memory access problems. The DPLL algorithm performs acceptably on randomly generated instances but poorly on instances arising in practical applications. CDCL is a more powerful approach for such problems, in that it searches less of the state space than DPLL.
J.P. Marques-Silva; Karem A. Sakallah (November 1996). "GRASP-A New Search Algorithm for Satisfiability". Digest of IEEE International Conference on Computer-Aided Design (ICCAD). pp. 220–227. CiteSeerX 10.1.1.49.2075. doi:10.1109/ICCAD.1996.569607. ISBN 978-0-8186-7597-3.
J.P. Marques-Silva; Karem A. Sakallah (May 1999). "GRASP: A Search Algorithm for Propositional Satisfiability" (PDF). IEEE Transactions on Computers. 48 (5): 506–521. doi:10.1109/12.769433. Archived from the original (PDF) on 2016-03-04. Retrieved 2014-11-29. https://web.archive.org/web/20160304113135/http://www.broadinstitute.org/~ilya/area/grasp.pdf
Roberto J. Bayardo Jr.; Robert C. Schrag (1997). "Using CSP look-back techniques to solve real world SAT instances" (PDF). Proc. 14th Nat. Conf. on Artificial Intelligence (AAAI). pp. 203–208. http://cse-wiki.unl.edu/wiki/images/0/06/Using_CSP_Look-Back_Techniques_to_Solve_Real-World_SAT_Instances.pdf
In the pictures below, "+" is used to denote "or", multiplication to denote "and", and a postfix "′" to denote "not".
Marques-Silva, Joao; Lynce, Ines; Malik, Sharad (February 2009). Handbook of Satisfiability (PDF). IOS Press. p. 138. ISBN 978-1-60750-376-7.
"Glucose's home page". https://www.labri.fr/perso/lsimon/research/glucose/