Predicate Calculus: Conjunctive Normal Form

Representation of Predicate Calculus Formulas

In order to consider symbolic computing applications on the domain of the predicate calculus (first-order logic), we need a grammar for the textual representation and a compatible Hasekll representation.

A Grammar for Predicate Calculus Formulas

Consider the following grammar rules defining a textual representation for formulas of the predicate calculus. Overall, we have eight types of formula.

<formula> ::= <predicate-expression> | <negation>
              | <conjunction> | <disjunction>
              | <implication> | <equivalence>
              | <universal> | <existential>

Predicate expressions such as \(p(X, f(Y, Z))\) consist of a named predicate applied to terms involving functions and variables.

<predicate-expression> ::= <predicate-name> ["(" <term> {"," <term>} ")"]
<term> ::= <function-expression> | <variable>
<function-expression> ::= <function-name> ["(" <term> {"," <term>} ")"]

In this syntax, the number of terms in a <predicate-expression> or <function-expression> may be zero, in which case the parentheses are omitted. A predicate expression having zero terms is essentially a proposition. A function expression having zero terms is essentially a constant.

One convention for distinguishing predicate and function names from variables is to use symbols beginning with a capital letter for variables, and symbols beginning with a lower case letter for predicate or function names. (Prolog implementations often follow this convention.)

predicate-name = [a-z][A-Za-z_0-9]+
function-name = [a-z][a-z_0-9]+
variable = [A-Z][A-Za-z_0-9]+

For simplicity of parsing, the structure of the compound formulas can use the operator keywords with a prefix-parenthesized notation as follows.

<negation> ::= "not" "(" <formula> ")"
<conjunction> ::= "and" "(" [<formula> {"," <formula>}] ")"
<disjunction> ::= "or" "(" [<formula> {"," <formula>}] ")"
<implication> ::= "implies" "(" <formula> <formula> ")"
<equivalence> ::= "equiv" "(" <formula> <formula> ")"
<universal> ::= "forall "(" <variable> <formula> ")"
<existential> ::= "exists "(" <variable> <formula> ")"

How does this make parsing simpler?

Representation in Haskell

Given the textual representation of of formulas of the predicate calculus as defined by the grammar above, it is straightforward to define suitable Haskell data representations.

data Formula = Predicate Name [Term] | Negation Formula
               | Conj [Formula] | Disj [Formula]
               | Implies Formula Formula | Equiv Formula Formula
               | ForAll Variable Formula | Exists Variable Formula

data Term = Function Name [Term] | Var [Char]
type Name = [Char]
type Variable = [Char]

Conjunctive Normal Form

Conjunctive Normal Form is a canonical representation of formulas that is very simple to work with in theorem proving. It is a very flat structure consisting of logical connectives in 4 layers.

At the inner most layer, we simply have predicate expressions.
At the next layer, we have literals, which are either predicate expressions or negations of predicate expressions.
Literals are combined together in disjunctions to form clauses.
Finally, clauses are combined together to form a single conjunction that represents the entire original formula.

The CNF representation can then be converted to a clause-form equivalent representation reflected by the following Haskell data types.

type ClauseFormEquivalent = [Clause]
type Clause = [Literal]
data Literal = Pos Name [Term] | Neg Name [Term]

The process of conversion to canonical form involves the following steps.

Elimination of implications and equivalences
Inward propagation of negations
Variable standardization
Elimination of existential quantifiers
Elimination of universal quantifiers

Elimination of Implications and Equivalences

The first step in conversion to CNF is to apply the following reduction rules to eliminate implications and equivalences.

implies(f1, f2) reduces to or(not(f1), f2))
equiv(f1, f2) reduces to and(or(not(f1), f2)), or(not(f2), f1))

Let us write a Haskell function to do this. A standard inside-out approach does the job.

eliminate1 :: Formula -> Formula

eliminate1 f@(Predicate n terms) = f
eliminate1 (Negation f) = (Negation (eliminate1 f))
eliminate1 (Conj formulas) = Conj (map eliminate1 formulas)
eliminate1 (Disj formulas) = Disj (map eliminate1 formulas)
eliminate1 (ForAll v f) = ForAll v (eliminate1 f)
eliminate1 (Exists v f) = Exists v (eliminate1 f)
eliminate1 (Implies f1 f2) = Disj [Negation (eliminate1 f1), eliminate1 f2]
eliminate1 (Equiv f1 f2) = 
   let (e1, e2) = (eliminate1 f1, eliminate1 f2)
       in Conj [Disj [Negation e1, e2], Disj [Negation e2, e1]]

Inward Propagation of Negations

In this step, the goal is to transform a formula so that any negations remaining in the formula are innermost, i.e., they apply to predicate expressions.

The following reduction rules apply.

not(not(f)) \(\quad \text{reduces to} \quad\) f
not(and(f1, f2, .., fn)) \(\quad \text{reduces to} \quad\) or(not(f1), not(f2), ..., not(fn))
not(or(f1, f2, .., fn)) \(\quad \text{reduces to} \quad\)and(not(f1), not(f2), ..., not(fn))
not(forall(v, f)) \(\quad \text{reduces to} \quad\) exists(v, not(f))
not(exists(v, f)) \(\quad \text{reduces to} \quad\) forall(v, not(f))

One way to do this would be to use a standard inside-out approach, with a simplify and a make_negation routine. But if we have simplified subexpressions and then see an outer negation, our various negation rules will create more negations that have to be propagated inwards.

A simpler, more-efficient approach is perform the task outside-in.

propagate :: Formula -> Formula
negatef :: Formula -> Formula

propagate f@(Predicate n terms) = f
propagate (Negation f) = negate f
propagate (Conj formulas) = Conj (map propagate formulas)
propagate (Disj formulas) = Disj (map propagate formulas)
propagate (ForAll v f) = ForAll v (propagate f)
propagate (Exists v f) = Exists v (propagate f)

negatef f@(Predicate n terms) = (Negation f)
negatef (Negation f) = propagate f
negatef (Conj formulas) = Disj (map negatef formulas)
negatef (Disj formulas) = Conj (map negatef formulas)
negatef (ForAll v f) = Exists v (negatef f)
negatef (Exists v f) = ForAll v (negatef f)

Standardization of Variables

The next step prepares for the elimination of quantifiers by renaming variables to be globally unique.

We construct an environment data structure to map from the existing variable names in a formula to new unique names.
Each time we encounter a bound variable, we use a gensym routine to generate a new unique name that is not already bound in the environment.
- If the name does not exist in the environment, just use it.
- If the name is already in use, try appending a number.
- If a given formula used X in four places, the generated symbols could be X, X1, X2 and X3.

Exercise: write a recursive Haskell function to implement this process.

Eliminating Existential Quantifiers

The next step in conversion to CNF is to eliminate existential quantifiers by introducing Skolem functions.

Consider the variable \(Z\) in: \(\forall X \forall Y \exists Z p(X, Y, Z)))\)

If this is to be true, there must exist some \(Z\) (possibly dependent on \(X\) and \(Y\)) that makes it true.

We introduce a Skolem function \(g(X, Y)\) to represent a suitable value of \(Z\).

The formula can then be translated to \(\forall X \forall Y p(X, Y, g(X, Y)))\).

Of course, each existentially quantified variable requires a separate Skolem function.

Elimination of Universal Quantifiers

Next we can propagate universal quantifiers outwards, so that the formula is in prenex form, i.e., all the quantifiers are at the outermost levels, and apply to a single quantifier-free subformula called the matrix.

We then discard the outer quantifiers and use the matrix to represent the overall formula, with all variables implicitly universally quantified.

Conjunctive Normal Form

The final steps produce conjunctive normal form by flattening propagating conjunctions outwards and flattening conjunctions and disjunctions.

or(f1, and(f2, f3)) \(\quad \text{reduces to} \quad\)and(or(f1, f2), or(f1, f3))
or(f1, or(f2, f3)) \(\quad \text{reduces to} \quad\) or(f1, f2, f3)
and(f1, and(f2, f3)) \(\quad \text{reduces to} \quad\) and(f1, f2, f3)

Clause-Form Equivalent

The CNF representation is converted to a clause-form equivalent by eliminating explicit conjunction and disjunction operators. This involves a change of representation from the Formula data type that we have defined previously. We can define a suitable ClauseFormEquivalent type in Haskell as follows.

type ClauseFormEquivalent = [Clause]
type Clause = [Literal]
data Literal = Pos Name [Term] | Neg Name [Term]

Here, a formula is represented by a list of Clauses, all of which must be true for the formula to be true. In other words, the clauses are implicitly joined together in a conjunction.

Each clause is a list of Literals. The literals for any one clause are implicitly joined together in a disjunction.

Finally, literals consist of predicate expressions, possibly negated.

For example, given the CNF formula: and(or(f1, not(f2)), or(f1, f3))

What is the parsed Formula?
Conj [Disj [Predicate "f1" [], Negation (Predicate "f2" [])], Disj [Predicate "f1" [], Predicate "f3" []]]
What is the ClauseFormEquivalent?
[[Pos "f1" [], Neg "f2" []], [Pos "f1" [], Pos "f2" []]]

Updated Thu Nov. 22 2018, 13:58 by cameron.

Simon Fraser University
Engaging the World

CourSys

Predicate Calculus: Conjunctive Normal Form

Representation of Predicate Calculus Formulas

A Grammar for Predicate Calculus Formulas

Representation in Haskell

Conjunctive Normal Form

Elimination of Implications and Equivalences

Inward Propagation of Negations

Standardization of Variables

Eliminating Existential Quantifiers

Elimination of Universal Quantifiers

Conjunctive Normal Form

Clause-Form Equivalent