Chapter 13
Unification

13.1 Introduction

Unification is the solving of equations, either in free algebras of the form TΣ(X), or in relatively free algebras modulo a set E of equations, that is, in algebras of the form TΣ∕E(X). The first case is sometimes called syntactic unification. The second case is sometimes called E-unification or unification modulo E; if E is not explicitly mentioned, then it is called equational unification or semantic unification.

In solving any equation, such as, for example,

f(x,h(y)) = f (g(y),z)

we look for instances of the variables appearing in the equation that make both sides equal. Variables can of course be instantiated by substitutions. A substitution that makes both sides of the equation equal, that is, a solution of the equation, is called a unifier. For example, if we are solving the above equation in the free algebra TΣ(X) with X a countable set of variables and with Σ having a single sort (unsorted unification), the substitution σ = {x↦→g(y),z↦→h(y)} is a unifier, and indeed the so-called most general unifier, so that for any other unifier ρ there exists a substitution μ such that ρ = σ;μ, where σ;μ denotes composition of substitutions in diagrammatic order. That is, any other solution of the equation is an instance of the most general solution provided by σ.

Of course, some equations may not have syntactic unifiers, but may have semantic unifiers modulo some equations E. Consider, for example, the equation

f(h(y),x) = f (g(y),z)

which obviously does not have any solution in TΣ(X). It does, however, have a solution in TΣ∕C(X), where C is the commutativity axiom f(x,y) = f(y,x). Indeed, the exact same substitution σ solving the first equation f(x,h(y)) = f(g(y),z) in a syntactic way, is now a unifier solving the second equation f(h(y),x) = f(g(y),z) modulo C, because we have f(h(y),g(y)) = Cf(g(y),h(y)).

Unification is a fundamental deductive mechanism used in many automated deduction tasks (see Section 13.5 for a discussion of some of them). It is also very important in combining the paradigms of functional programming and logic programming (in the Prolog sense of “logic programming”). Furthermore, in the context of Maude, unification can be very useful to reason not only about equational theories (functional modules or theories), but also, as explained in Section 13.5.2, about rewrite theories (system modules or theories).

Therefore, it is very useful to have an efficient implementation of unification available in Core Maude, which is what this chapter describes. Specifically, we explain both how order-sorted unification modulo frequently occurring equational axioms and variant-based unification modulo convergent equational theories are supported in Maude.

13.2 Order-sorted unification

Although the most general equational theories supported by Maude are membership equational theories, to obtain practical unification algorithms, allowing us to effectively compute the solutions of an equational unification problem, it is important to restrict ourselves to order-sorted equational theories. We can define the basic concepts of order-sorted E-unification in full generality.

Given an order-sorted equational theory (Σ,E), an E-unification problem consists of a nonempty set of unification equations of the form t = ?t, written in the notation

 ? ′ ? ′ t1 = t1 ∧ ... ∧tn = tn

where n 1 and the “conjunction” operator is assumed to be associative and commutative.

Given such an E-unification problem, an E-unifier for it is an order-sorted substitution1 σ : Vars(t1,t1,,tn,tn)TΣ(X) (where we assume that the set X of variables contains a countable number of variables for each sort) such that, for all i = 1,,n,

E ⊢ (∀Yi)σ(ti) = σ(t′i),

where Y i = Vars(σ(ti)(ti)), that is, all the equations (Y i)σ(ti) = σ(ti) can be deduced in (membership) equational logic from the set of equations E.

A set of unifiers 𝒰 is called a complete set of E-unifiers for a given E-unification problem t1 = ?t1 tn = ?tn iff for any other E-unifier ρ of the same E-unification problem there exists a substitution μ and a unifier σ ∈𝒰 such that ρ = Eσ;μ, that is, for each variable x in the domain of ρ we have E ρ(x) = μ(σ(x)). A complete set of E-unifiers 𝒰 is called minimal if any proper subset of 𝒰 fails to be complete.

For an order-sorted equational theory (Σ,E), unification is said to be finitary if for any E-unification problem there is always a finite complete set of unifiers 𝒰. Similarly, unification for (Σ,E) is called unitary if it is finitary and for any E-unification problem a minimal complete set of unifiers is always either empty or a singleton set. We say that (Σ,E) has a unification algorithm if there is an algorithm generating a complete set of E-unifiers for any E-unification problem in (Σ,E).

Unlike unsorted syntactic unification, which always either fails or has a single most general unifier, order-sorted syntactic unification is not necessarily unitary, that is, there is in general no single most general unifier. What exists (if Σ is finite) is a finite minimal complete set of syntactic unifiers. For some commonly occurring theories having a unification algorithm, such as the theory A of associativity of a binary function symbol, it is well-known that unification is not finitary: in general an infinite number of solutions may exist. However, for other theories, such as commutativity or associativity-commutativity, unification is finitary, both when Σ is unsorted and when Σ is order-sorted (and finite).

13.2.1 A hybrid approach to equational order-sorted unification

We define a hybrid approach, in which we take advantage of Maude’s structuring of a module’s equations into equational axioms Ax, such as associativity, and/or commutativity, and/or identity, and equations E, which are assumed to be coherent,2 confluent, and terminating modulo Ax. We can then consider order-sorted equational theories of the form (Σ,E Ax) and decompose the E Ax-unification problem into two problems: one of Ax-unification, and another of E Ax-unification that uses an Ax-unification algorithm as a subroutine. This decomposition, as well as a similar one for membership equational theories, is explained in Section 13.5. The point of this decomposition is that Ax-unification needs to be built-in at the level of Core Maude’s C++ implementation for efficiency purposes and E Ax-unification can then be built on top of Ax-unification. Since the axioms Ax are well-known and unification algorithms exist for them, the task of building in efficient Ax-unification algorithms, although highly nontrivial, becomes manageable. Maude 2.6 implemented E Ax-unification in Maude itself, but E Ax-unification is also implemented in Core Maude’s C++ level since Maude 2.7.

13.3 Theories currently supported

As mentioned in Section 13.2.1, a practical way of dealing with order-sorted equational unification is to consider order-sorted theories of the form (Σ,E Ax), with Ax a set of commonly occurring axioms, declared in Maude as equational attributes (see Section 4.4.1), and E the remaining equations specified with the eq or ceq keywords. We can then decompose the E Ax-unification problem into an Ax-unification problem and an E Ax-unification problem that uses an Ax-unification algorithm as a subroutine. In such a decomposition, the efficiency of the Ax-unification algorithm becomes crucial.

Maude currently provides an order-sorted Ax-unification algorithm for all order-sorted theories (Σ,E Ax) such that the order-sorted signature Σ is preregular modulo Ax (see Sections 3.8 and 20.3.5) and the axioms Ax associated to function symbols are as follows:

Explicitly excluded are theories with a binary symbol declared with the idem attribute. However, both the lack of the id: attribute for associativity, and of the idem attribute in general can be avoided by observing that both these axioms have the finite variant property, so that, used as variant equations, they can be combined with all the other just-mentioned attributes by means of variant unification algorithms (see Section 14.7).

If we give to Maude a unification problem in a functional module of the form fmod,Ax) endfm where Σ and Ax satisfy the above requirements, we get a complete4 set of order-sorted unifiers modulo the theory (Σ,Ax). If, instead, we give the same problem to Maude in the functional module fmod,E Ax) endfm, then the equations E are ignored, and we get the same unifiers as for the module or theory obtained by omitting the equations E. Similarly, if we provide the same unification problem in a functional theory fth,E Ax) endfth, a system module mod,E Ax,R) endm or a system theory th,E Ax,R) endth, we again get the same set of unifiers as for the theory (Σ,Ax). All this is consistent with the decomposition idea mentioned above: to deal with order-sorted E Ax-unification, other methods, that use the Ax-unification algorithm as a component, can later be defined as we explain in Section 13.5.

Maude is even more tolerant than this. The user can give to Maude a unification problem in a functional module (or functional theory, or system module, or system theory) of the form fmod,E M Ax Ax) endfm (or the analogous specification in the other cases), where (Σ,Ax) satisfies the conditions mentioned above, but M is an optional set of membership axioms (that is, (Σ,E M AxAx) can be a membership equational theory and not just an order-sorted theory), and the axioms Ax violate those requirements (as explained in conditions (i)–(iii) above). Then what will happen is:

1.
As before, the additional equations E (or rules R) are completely ignored, and the membership axioms M are likewise ignored.
2.
If a unification problem involves the occurrence of a symbol satisfying axioms Ax at the root position of a non-ground subterm, the unification process will fail and a warning will be printed.
3.
If a unification problem involves the occurrence of symbols satisfying axioms Ax, but all such occurrences are always in ground subterms of the problem, then this special case of Ax Ax-unification is handled by Maude and the corresponding Ax Ax-unifiers are returned.

Furthermore, the functional module fmod,E M Ax Ax) endfm (or the analogous functional theory or system module or theory) may import predefined modules such as BOOL or NAT, so that function symbols in such predefined modules can also be used in unification problems.

13.4 The unify command

Given a functional module or theory, or a system module or theory, ModId, the user can give to Maude a unification command of the form:

 
unify [ n ] in ModId :
Term-1 =? Term’-1 /\ ... /\ Term-k =? Term’-k .

where k 1; n is an optional argument providing a bound on the number of unifiers requested, so that if the cardinality of the set of unifiers is greater than the specified bound, the unifiers beyond that bound are omitted; and ModId can be any module or theory declared in the current Maude session (as usual, if no module is mentioned, the current module is used).

For a simple example of syntactic order-sorted unification problem illustrating:

we can define the module

 
fmod UNIFICATION-EX1 is
protecting NAT .
op f : Nat Nat -> Nat .
op f : NzNat Nat -> NzNat .
op f : Nat NzNat -> NzNat .
endfm

and then give to Maude the following command:

 
Maude> unify f(X:Nat, Y:Nat) ^ B:NzNat =? A:NzNat ^ f(Y:Nat, Z:Nat) .

Solution 1
X:Nat --> #1:Nat
Y:Nat --> #2:NzNat
B:NzNat --> f(#2:NzNat, #3:Nat)
A:NzNat --> f(#1:Nat, #2:NzNat)
Z:Nat --> #3:Nat

Solution 2
X:Nat --> #1:NzNat
Y:Nat --> #2:Nat
B:NzNat --> f(#2:Nat, #3:NzNat)
A:NzNat --> f(#1:NzNat, #2:Nat)
Z:Nat --> #3:NzNat

The next example in the same module illustrates the use of the unify command with a unification problem consisting of two equations:

 
Maude> unify f(X:Nat, Y:NzNat) =? f(Z:NzNat, U:Nat)
/\ V:NzNat =? f(X:Nat, U:Nat) .

Solution 1
X:Nat --> #1:NzNat
Y:NzNat --> #2:NzNat
Z:NzNat --> #1:NzNat
U:Nat --> #2:NzNat
V:NzNat --> f(#1:NzNat, #2:NzNat)

Note that, as already mentioned, we could instead invoke the unify command in a functional or system module or theory having additional equations, memberships, or rules, which are always ignored. For example, we could have instead declared the system theory

 
th UNIFICATION-EX2 is
protecting NAT .
op f : Nat Nat -> Nat .
op f : NzNat Nat -> NzNat .
op f : Nat NzNat -> NzNat .
eq f(f(N:Nat, M:Nat), K:Nat) = f(N:Nat, M:Nat) .
rl f(N:Nat, M:Nat) => 0 .
endth

so that, if we give again the same unify commands above, we will obtain exactly the same sets of order-sorted unifiers as for the UNIFICATION-EX1 module.

The above examples illustrate a further point about the form of the returned unifiers, namely, that in each assignment X --> t in a unifier, the variables appearing in the term t are always fresh variables of the form #n:Sort. The user is required not to use variables of this form in the submitted unification problem. A warning is printed if this requirement is violated:

 
Maude> unify in NAT : X:Nat ^ #1:Nat =? #2:Nat .
Warning: unsafe variable name #1:Nat in unification problem.

13.4.1 Non-supported unification examples

The handling of unification problems in modules with operators whose equational axioms are excluded from the current unification algorithm can be illustrated by means of the following module:

 
fmod UNIFICATION-EX3 is
protecting NAT .
op f : Nat Nat -> Nat [idem] .
endfm

As already mentioned, a unification problem using such an idempotent function symbol f in a non-ground subterm will mean that the unification attempt fails and a warning is issued:

 
Maude> unify f(f(X:Nat, Y:Nat), Z:Nat) =? f(A:Nat, B:Nat) .
Warning: Term f(X:Nat, Y:Nat, Z:Nat) is non-ground and unification
for its top symbol is not currently supported.

Instead, if all symbols satisfying unsupported equational axioms Ax only appear in ground subterms of the unification problem, the unification attempt succeeds with the correct set of order-sorted Ax Ax-unifiers:

 
Maude> unify X:Nat + f(f(41, 42),43) =? Y:Nat + f(41,f(42,43)) .

Solution 1
X:Nat --> #1:Nat + f(41, f(42, 43))
Y:Nat --> #1:Nat + f(f(41, 42), 43)

Solution 2
X:Nat --> f(41, f(42, 43))
Y:Nat --> f(f(41, 42), 43)

Note, however, that, as already mentioned, unification modulo the idem attribute can be achieved by variant-based unification using the explicit equation f(N:Nat,N:Nat) = N:Nat (see Section 14.7).

13.4.2 Associative-commutative (AC) unification examples

The use of a bound on the number of unifiers, as well as the use of the associative-commutative (AC) operator + in the predefined NAT module (see Section 7.2), plus the fact that even small AC-unification problems can generate a large number of unifiers are all illustrated by the following command:

 
Maude> unify [100] in NAT :
X:Nat + X:Nat + Y:Nat =? A:Nat + B:Nat + C:Nat .
Solution 1
X:Nat --> #1:Nat + #2:Nat + #3:Nat + #5:Nat + #6:Nat + #8:Nat
Y:Nat --> #4:Nat + #7:Nat + #9:Nat
A:Nat --> #1:Nat + #1:Nat + #2:Nat + #3:Nat + #4:Nat
B:Nat --> #2:Nat + #5:Nat + #5:Nat + #6:Nat + #7:Nat
C:Nat --> #3:Nat + #6:Nat + #8:Nat + #8:Nat + #9:Nat
...

Solution 100
X:Nat --> #1:Nat + #2:Nat + #3:Nat + #4:Nat
Y:Nat --> #5:Nat
A:Nat --> #1:Nat + #1:Nat + #2:Nat
B:Nat --> #2:Nat + #3:Nat
C:Nat --> #3:Nat + #4:Nat + #4:Nat + #5:Nat

The following unification command in the predefined CONVERSION module (see Section 7.9) illustrates a further point on the handling of built-in constants and functions. Built-in constants, even though infinite in number (all strings, all quoted identifiers, all natural numbers, and so on), are handled and can be used in unification problems. But built-in functions are not considered built-in for unification purposes; therefore, built-in evaluation of such functions is not performed during the unification.

 
Maude> unify in CONVERSION :
X:String < "foo" + Y:Char =?
Z:String + string(pi) < "foo" + Z:String .

Solution 1
X:String --> #1:Char + string(pi)
Y:Char --> #1:Char
Z:String --> #1:Char

13.4.3 Unification examples with the iter attribute

The following example illustrates the efficiency of order-sorted unification modulo the iter theory (in this example in combination with the comm theory). Consider the following module:

 
fmod ITER-EXAMPLE is
sorts NzEvenNat EvenNat OddNat NzNat Nat EvenInt OddInt NzInt Int .
subsorts OddNat < OddInt NzNat < NzInt < Int .
subsorts EvenNat < EvenInt Nat < Int .
subsorts NzEvenNat < NzNat EvenNat < Nat .

op 0 : -> EvenNat .

op s : EvenNat -> OddNat [iter] .
op s : OddNat -> NzEvenNat [iter] .
op s : Nat -> NzNat [iter] .
op s : EvenInt -> OddInt [iter] .
op s : OddInt -> EvenInt [iter] .
op s : Int -> Int [iter] .

op _+_ : Int Int -> Int [comm gather (E e)] .
op _+_ : OddInt OddInt -> EvenInt [ditto] .
op _+_ : EvenInt EvenInt -> EvenInt [ditto] .
op _+_ : OddInt EvenInt -> OddInt [ditto] .
op _+_ : Nat Nat -> Nat [ditto] .
op _+_ : Nat NzNat -> NzNat [ditto] .
op _+_ : OddNat OddNat -> NzEvenNat [ditto] .
op _+_ : NzEvenNat EvenNat -> NzEvenNat [ditto] .
op _+_ : EvenNat EvenNat -> EvenNat [ditto] .
op _+_ : OddNat EvenNat -> OddNat [ditto] .
endfm

We can then give the unification commands:

 
Maude> unify in ITER-EXAMPLE :
s^1000000(X:OddNat) =? s^100000000001(Y:Int) .
Decision time: 1ms cpu (1ms real)

Solution 1
X:OddNat --> s^99999000001(#1:EvenNat)
Y:Int --> #1:EvenNat

and

 
Maude> unify in ITER-EXAMPLE :
s^1000000(X:OddNat) =? s^100000000001(Y:Int + Z:Int + W:Int) .
Decision time: 2ms cpu (5ms real)

Solution 1
X:OddNat --> s^99999000001(#1:OddNat + (#2:OddNat + #3:EvenNat))
W:Int --> #1:OddNat
Z:Int --> #2:OddNat
Y:Int --> #3:EvenNat

Solution 2
X:OddNat --> s^99999000001(#1:OddNat + (#2:EvenNat + #3:OddNat))
W:Int --> #1:OddNat
Z:Int --> #2:EvenNat
Y:Int --> #3:OddNat

Solution 3
X:OddNat --> s^99999000001(#1:EvenNat + (#2:OddNat + #3:OddNat))
W:Int --> #1:EvenNat
Z:Int --> #2:OddNat
Y:Int --> #3:OddNat

Solution 4
X:OddNat --> s^99999000001(#1:EvenNat + (#2:EvenNat + #3:EvenNat))
W:Int --> #1:EvenNat
Z:Int --> #2:EvenNat
Y:Int --> #3:EvenNat

As already mentioned, assuming that no bound on the number of unifiers is specified by the user, Maude will always compute a complete set of order-sorted unifiers modulo Ax, for Ax the supported equational axioms described in Section 13.3. However, there is no guarantee that the computed set of unifiers is minimal, that is, some of the unifiers in the computed set may be redundant, since they could be obtained as instances (modulo Ax) of other unifiers in the set.

13.4.4 Associative-commutative with identity (ACU) unification examples

To illustrate the use of the unification command in the presence of ACU operators, let us consider yet another version of the vending machine example (first presented in Section 5.1 and in other sections of this document in different forms):

 
mod UNIF-VENDING-MACHINE is
sorts Coin Item Marking Money State .
subsort Coin < Money .
op empty : -> Money .
op __ : Money Money -> Money [assoc comm id: empty] .
subsort Money Item < Marking .
op __ : Marking Marking -> Marking [assoc comm id: empty] .
op <_> : Marking -> State .
ops $ q : -> Coin .
ops a c : -> Item .
var M : Marking .
rl [buy-c] : < M $ > => < M c > .
rl [buy-a] : < M $ > => < M a q > .
eq [change]: q q q q = $ .
endm

We can ask whether there is an equational unifier of two configurations, one containing at least two quarters, and another containing at least one dollar.

 
Maude> unify in UNIF-VENDING-MACHINE :
< q q X:Marking > =? < $ Y:Marking > .

Solution 1
X:Marking --> $
Y:Marking --> q q

Solution 2
X:Marking --> $ #1:Marking
Y:Marking --> q q #1:Marking

Notice that the computed set of unifiers is not minimal because the first solution is the instance of the second obtained by substituting the variable #1:Marking with the constant empty. The Maude implementation tries to eliminate redundant unifiers when this is easy to detect; but for efficiency reasons it does not always return a minimal set of such unifiers. However, Maude computes a minimal set of unifiers modulo axioms when invoked via the variant-based unification command of Section 14.8 below; for instance, take the previous command above and add the word “variant” at the beginning to obtain the minimal set (see Section 14.8 for details on how to invoke the command):

 
Maude> variant unify in UNIF-VENDING-MACHINE :
< q q X:Marking > =? < $ Y:Marking > .

Solution 1
X:Marking --> $ #1:Marking
Y:Marking --> q q #1:Marking

Recall that memberships are discarded completely. For instance, if we modify the previous example to include a membership definition for a new sort Quarter, any unification call with that sort may not succeed.

 
mod UNIF-VENDING-MACHINE-MB is
sorts Coin Item Marking Money State .
subsort Coin < Money .
op empty : -> Money .
op __ : Money Money -> Money [assoc comm id: empty] .
subsort Money Item < Marking .
op __ : Marking Marking -> Marking [assoc comm id: empty] .
op <_> : Marking -> State .
ops $ q : -> Coin .
ops a c : -> Item .

sort Quarter .
subsort Quarter < Coin .
mb q : Quarter .

var M : Marking .
rl [buy-c] : < M $ > => < M c > .
rl [buy-a] : < M $ > => < M a q > .
eq [change]: q q q q = $ .
endm

We can ask whether there is an equational unifier of two configurations, one containing at least two quarters, and another containing two quarters and a dollar, but it fails:

 
Maude> unify in UNIF-VENDING-MACHINE-MB :
< q q X:Marking > =? < $ Y:Quarter Z:Quarter > .

No unifier.

despite the fact that instantiating both Y and Z to q is part of a solution in the unification call above. The reason is that the membership is not used during ACU unification and therefore the algorithm unification treats the sort Quarter as empty.

13.4.5 Unification examples with an identity symbol

Let us illustrate the use of the different combinations of the identity attribute for unification. Let us consider first a module using the left-id attribute.

 
mod LEFTID-UNIFICATION-EX is
sorts Magma Elem .
subsorts Elem < Magma .
op __ : Magma Magma -> Magma [gather (E e) left id: e] .
ops a b c d e : -> Elem .
endm

Then the following two unification problems have a different meaning, where we have swapped the position of the variables. First, when we unify two terms where variables of sort Magma are at the left of the terms, we have both a syntactic unifier and a unifier modulo identity.

 
Maude> unify in LEFTID-UNIFICATION-EX : X:Magma a =? Y:Magma a a .

Solution 1
X:Magma --> a
Y:Magma --> e

Solution 2
X:Magma --> #1:Magma a
Y:Magma --> #1:Magma

When the variables are instead at the right side of the terms of sort Magma, there is clearly no unifier, since the term a a Y:Magma is parsed as (a a) Y:Magma in module LEFTID-UNIFICATION-EX due to the attribute gather (E e) (see Section 3.9).

 
Maude> unify in LEFTID-UNIFICATION-EX : a X:Magma =? a a Y:Magma .
No unifier.

Consider now a similar module but for the right identity.

 
mod RIGHTID-UNIFICATION-EX is
sorts Magma Elem .
subsorts Elem < Magma .
op __ : Magma Magma -> Magma [gather (e E) right id: e] .
ops a b c d e : -> Elem .
endm

When we unify two terms where variables of sort Magma are at the left of the terms, there is clearly no unifier, since the term Y:Magma a a is parsed this time as Y:Magma (a a) in module RIGHTID-UNIFICATION-EX due to the attribute gather (e E) (see Section 3.9).

 
Maude> unify in RIGTHID-UNIFICATION-EX : X:Magma a =? Y:Magma a a .
No unifier.

When the variables are instead at the right side of the terms of sort Magma, we have both a syntactic unifier and a unifier modulo identity:

 
Maude> unify in RIGTHID-UNIFICATION-EX : a X:Magma =? a a Y:Magma .

Solution 1
X:Magma --> a
Y:Magma --> e

Solution 2
X:Magma --> #1:Magma a
Y:Magma --> #1:Magma

Consider now a similar module but with the identity attribute.

 
mod ID-UNIFICATION-EX is
sorts Magma Elem .
subsorts Elem < Magma .
op __ : Magma Magma -> Magma [gather (E e) id: e] .
ops a b c d e : -> Elem .
endm

When we unify two terms where variables of sort Magma are at the left of the terms, we have both a syntactic unifier and a unifier modulo identity:

 
Maude> unify in ID-UNIFICATION-EX : X:Magma a =? Y:Magma a a .

Solution 1
X:Magma --> a
Y:Magma --> e

Solution 2
X:Magma --> #1:Magma a
Y:Magma --> #1:Magma

When the variables of sort Magma are instead at the right side of the terms of sort Magma, we only have a unifier modulo identity:

 
Maude> unify in ID-UNIFICATION-EX : a X:Magma =? a a Y:Magma .

Solution 1
X:Magma --> a
Y:Magma --> e

And finally, when we add commutativity, we obtain slightly different results.

 
mod COMM-ID-UNIFICATION-EX is
sorts Magma Elem .
subsorts Elem < Magma .
op __ : Magma Magma -> Magma [gather (E e) comm id: e] .
ops a b c d e : -> Elem .
endm

When we unify two terms where variables of sort Magma are at the left of the terms, we have both a syntactic unifier and a unifier modulo identity and commutativity, but the latter is duplicated:

 
Maude> unify in COMM-ID-UNIFICATION-EX : X:Magma a =? Y:Magma a a .

Solution 1
X:Magma --> a
Y:Magma --> e

Solution 2
X:Magma --> a #1:Magma
Y:Magma --> #1:Magma

Solution 3
X:Magma --> a
Y:Magma --> e

When the variables of sort Magma are instead at the right side of the terms of sort Magma, we have several unifiers modulo identity and commutativity:

 
Maude> unify in COMMID-UNIFICATION-EX : a X:Magma =? a a Y:Magma .

Solution 1
X:Magma --> a a
Y:Magma --> a

Solution 2
X:Magma --> a
Y:Magma --> e

Solution 3
X:Magma --> a
Y:Magma --> e

Note that the first solution is intriguing and is obtained by unifying terms (X:Magma a) and ((a a) Y:Magma).

13.4.6 Associative (A) unification examples

In general, unification modulo associativity is not finitary. However the Maude implementation provides a finitary and complete set of unifiers for a substantial class of unification problems where associative unification happens to be finitary. In problems outside of this class, a finite but in general incomplete set of unifiers is returned, with an explicit warning that the set may be incomplete. Let us illustrate the use of the unification command in the presence of A function symbols with the different capabilities and limitations. One of the key ideas for having finitary unification modulo associativity is to have variables over associative symbols being linear, i.e., having only one occurrence among all the terms in a unification problem.

Consider a very simple module with a binary associative function symbol:

 
fmod UNIFICATION-EX4 is
protecting NAT .
sort NList .
subsort Nat < NList .
op _:_ : NList NList -> NList [assoc] .
endfm

A unification problem using such an associative function symbol returns five unifiers without any problem because the unification problem is linear.

 
Maude> unify in UNIFICATION-EX4 : X:NList : Y:NList : Z:NList =? P:NList : Q:NList .

Solution 1
X:NList --> #1:NList : #2:NList
Y:NList --> #3:NList
Z:NList --> #4:NList
P:NList --> #1:NList
Q:NList --> #2:NList : #3:NList : #4:NList

Solution 2
X:NList --> #1:NList
Y:NList --> #2:NList : #3:NList
Z:NList --> #4:NList
P:NList --> #1:NList : #2:NList
Q:NList --> #3:NList : #4:NList

Solution 3
X:NList --> #1:NList
Y:NList --> #2:NList
Z:NList --> #3:NList : #4:NList
P:NList --> #1:NList : #2:NList : #3:NList
Q:NList --> #4:NList

Solution 4
X:NList --> #1:NList
Y:NList --> #2:NList
Z:NList --> #3:NList
P:NList --> #1:NList : #2:NList
Q:NList --> #3:NList

Solution 5
X:NList --> #1:NList
Y:NList --> #2:NList
Z:NList --> #3:NList
P:NList --> #1:NList
Q:NList --> #2:NList : #3:NList

The unification problem may not be linear but it may be easy to detect that there is no unifier, e.g. it is impossible to unify a list X concatenated with itself with another list Y concatenated also with itself but with a natural number, e.g. 1, in between.

 
Maude> unify in UNIFICATION-EX4 : X:NList : X:NList =? Y:NList : 1 : Y:NList .
No unifier.

When associative variables are non-linear, Maude has implemented a cycle detection that may have different outcomes:

1.
There are non-linear variables but only on one side of the problem. This implies that there is no risk of infinitary behavior and Maude will be complete with no need for cycle detection.
 
Maude> unify in UNIFICATION-EX4 : P:NList : P:NList =? 1 : Q:NList : 2 .

Solution 1
P:NList --> 1 : #1:NList : 2
Q:NList --> #1:NList : 2 : 1 : #1:NList

Solution 2
P:NList --> 1 : 2
Q:NList --> 2 : 1

2.
Cycle detection is used when non-linear variables appear on both sides but no variable occurs more than twice. It may detect spurious cycles, i.e., a cycle that does not correspond to unifiers, but then nothing is reported. If there is at least one unifier associated to the cycle, a warning will be printed and the acyclic solutions are returned.

 
Maude> unify in UNIFICATION-EX4 : 0 : X:NList =? X:NList : 0 .
Warning: Unification modulo the theory of operator _:_ has encountered
an instance for which it may not be complete.

Solution 1
X:NList --> 0
Warning: Some unifiers may have been missed due to incomplete
unification algorithm(s).

Note that the unification problem 0:X = ?X:0 has an infinite family of most general unifiers {X↦→0n} for 0n being a list of n consecutive 0 elements.

3.
If neither of the two above cases apply, then Maude forces termination with an internal depth bound and, therefore, no cycle detection is activated. In this case, a warning will be printed and the solutions up to the internal depth bound are returned.

 
Maude> unify in UNIFICATION-EX4 :
X:NList : X:NList : X:NList =? Y:NList : Y:NList : Z:NList : Y:NList .
Warning: Unification modulo the theory of operator _:_ has encountered
an instance for which it may not be complete.

Solution 1
X:NList --> #1:NList : #1:NList : #1:NList : #1:NList
Y:NList --> #1:NList : #1:NList : #1:NList
Z:NList --> #1:NList : #1:NList : #1:NList

Solution 2
X:NList --> #1:NList : #1:NList : #1:NList
Y:NList --> #1:NList : #1:NList
Z:NList --> #1:NList : #1:NList : #1:NList

Solution 3
X:NList --> #1:NList : #1:NList
Y:NList --> #1:NList
Z:NList --> #1:NList : #1:NList : #1:NList
Warning: Some unifiers may have been missed due to incomplete
unification algorithm(s).

If the verbose mode is activated (see Section 23.15), Maude will print different internal messages associated to the situations above:

13.5 Some applications of unification

In this section we review briefly some applications that can be developed using a unification infrastructure like the one described in this chapter. We begin in Section 13.5.1 by discussing narrowing and narrowing-based unification algorithms. These algorithms are actually implemented in Core Maude’s C++ and described in Chapter 14. We then explain in Section 13.5.2 how narrowing modulo an equational theory can be used for reachability analysis of concurrent systems described by rewrite theories, and, more generally, for symbolic temporal logic model checking of such systems. This narrowing-based reachability analysis is implemented in Core Maude’s C++ and described in Chapter 15. Finally, we discuss briefly other automated deduction applications, including theorem proving ones.

13.5.1 Narrowing-based unification

If we have a dedicated algorithm (as the one supported by Maude) to solve unification problems in an order-sorted theory (Σ,Ax), then we can use it as a component to obtain a unification algorithm for theories of the form (Σ,E Ax), provided the equations E are unconditional, coherent, confluent and terminating modulo Ax [85].

The technique used under such conditions to obtain an E Ax-unification algorithm from an Ax-unification algorithm is called narrowing, and is the obvious generalization of term rewriting to handle logical variables and perform a kind of symbolic execution. In ordinary term rewriting, if we want to apply a rewrite rule, say l r, to a term t at position p, the subterm t|p must be an instance of the lefthand side l, that is, there must be a substitution σ such that t|p = σ(l). Instead, in narrowing we can apply the rule l r at a non-variable position p in t, provided the unification problem t|p = ?l (where the variables of l and t are assumed disjoint) has a nonempty set of unifiers. For any such unifier 𝜃 we then narrow the original term t to the substitution instance under 𝜃 of t[r]p. We then write

t ↝ 𝜃(t[r]) p

for such a narrowing step. For example, in the standard, unsorted specification of the natural numbers, we can use the equation x + s(y) = s(x + y) as a rewrite rule to narrow the term x′∗ (y + z) at position 2 with substitution 𝜃 = {x↦→y′′,y↦→z′′,y↦→y′′,z↦→s(z′′)} to get the narrowing step

x′ ∗ (y′ + z′) ↝ x′ ∗ s(y′′ + z′′).

In this example, 𝜃 is the most general unifier for the syntactic unification problem y + z = ?x + s(y). However, in the same way as we can perform rewriting modulo a set of axioms Ax if we have an Ax-matching algorithm, we can likewise perform narrowing modulo a set Ax of axioms if we have an Ax-unification algorithm. That is, the unification problems t|p = ?l are now solved, not by syntactic unification, but by Ax-unification.

If a theory (Σ,E Ax) satisfies the above coherence, confluence, and termination modulo Ax requirements, we can systematically reduce E Ax-unification problems to narrowing problems as follows:

1.
we add a fresh new sort Truth to Σ with a constant tt;
2.
for each top sort of each connected component of sorts we add a binary predicate eq of sort Truth and add to E the equation eq(x,x) = tt, where x has such a top sort;
3.
we then reduce an E Ax-unification problem t = ?t to the narrowing reachability problem
eq(t,t′) ↝∗ tt

modulo Ax in the theory extending (Σ,E Ax) with these new sorts, operators, and equations, where E and the new equations are used as rewrite rules.

That is, we search for all narrowing paths modulo Ax from eq(t,t) to tt. Each such path then gives us a unifier of the equation t = ?t, just by composing the unifiers of each narrowing step in the path. As explained in [17], narrowing can be performed not only in order-sorted equational theories, but also in membership equational theories.

The just-described computation of E Ax-unifiers by narrowing modulo Ax yields a complete but in general infinite set of E Ax-unifiers. For the case when Ax = , some sufficient conditions are known ensuring termination of the basic narrowing strategy (see, e.g., [8453]), and therefore ensuring that the complete set of E Ax-unifiers computed by basic narrowing is finite. However, for commonly occurring sets of axioms Ax, such as associativity-commutativity (AC), it is well-known that narrowing modulo AC “almost never terminates” and, furthermore, that narrowing strategies facilitating termination such as basic narrowing are incomplete [13536]. Based on the idea of “variants” in [36], a complete, yet quite efficient in terms of its search space, narrowing strategy modulo Ax called folding variant narrowing has been proposed in [7071]. Furthermore, in [697123] sufficient checkable conditions on (Σ,E Ax) have been given ensuring that the E Ax-unification algorithm provided by folding variant narrowing modulo Ax is finitary.

In Maude 2.6, a narrowing library developed by Santiago Escobar implemented the folding variant narrowing as a component, making E Ax-unification available as part of Full Maude. Instead, E Ax-unification is implemented in Core Maude’s C++ level starting from Maude 2.7 (see Section 14.7).

13.5.2 Symbolic reachability analysis in rewrite theories

A rewrite theory5 , say = (Σ,E Ax,R), specified in Maude as a system module, describes a concurrent system whose states are E Ax-equivalence classes of ground terms, and whose local concurrent transitions are specified by the rules R. When formally analyzing the properties of , an important problem is ascertaining for specific patterns t and t the following symbolic reachability problem:

∃Xt −→ ∗ t′

with X the set of variables appearing in t and t, which for this discussion we may assume are a disjoint union of those in t and those in t. That is, t and t symbolically describe sets of concurrent states [[t]] and [[t]] (namely, all the ground substitution instances of t, resp. t, or, more precisely, the E Ax-equivalence classes associated to such ground instances). And we are asking: is there a state in [[t]] from which we can reach a state in [[t]] after a finite number of rewriting steps?

For example, may specify a cryptographic protocol, t may symbolically describe a set of initial states, and t may likewise describe a set of attack states. Then, if the above reachability question can be answered in the affirmative, the protocol is insecure against the kinds of attacks described by t. Furthermore, if the way of answering the reachability question is somehow constructive, we should be able to exhibit a concrete attack as a rewrite sequence violating the security of the protocol.

As explained in [112], provided the rewrite theory = (Σ,E Ax,R) is topmost (that is, all rewrites take place at the root of a term), or, as in the case of AC rewriting of object-oriented systems, is “essentially topmost,” and the rules R are coherent with E modulo Ax, narrowing with the rules R modulo the equations E Ax gives a constructive, sound, and complete method to solve reachability problems of the form Xtt, that is, such a problem has an affirmative answer if and only if we can find a finite narrowing sequence modulo E Ax of the form t 𝜃(t) for some 𝜃. The method is constructive, because instantiating t with the composition of the unifiers for each step in the narrowing sequence gives us a concrete rewrite sequence witnessing the existential formula.

Of course, narrowing with R modulo E Ax requires performing E Ax-unification at each narrowing step. As explained in Section 13.5.1, E Ax-unification can itself be performed by narrowing with the equations E modulo Ax, provided E is coherent, confluent, and terminating modulo Ax. Therefore, in performing symbolic reachability analysis in a rewrite theory = (Σ,E Ax,R) there are usually two levels of narrowing and two levels of unification: narrowing with R modulo E Ax for reachability, and narrowing with E modulo Ax for unification purposes. Similarly, unification modulo E Ax is performed by narrowing, while unification modulo Ax is usually performed in a built-in way.

This is exactly the approach taken in the Maude-NPA protocol analyzer [6667], where cryptographic protocols are formally specified as rewrite theories of the form = (Σ,E Ax,R), and the formal reachability analysis is performed in a backwards way, from an attack state to an initial state. This just means that we perform standard (forwards) reachability analysis with the rewrite theory 1 = (Σ,E Ax,R1), where R1 = {rl(lr) R}. The equational theory E Ax typically specifies the algebraic properties of the cryptographic functions used in the given protocol, for example, public key encryption and decryption, exclusive or, modular exponentiation, and so on. Reasoning modulo such algebraic properties is very important to gain high levels of assurance, since it is well-known that some cryptographic protocols that can be proved secure under the standard Dolev-Yao model, in which the cryptographic functions are treated as a “black box,” can actually be broken by an attacker that makes clever use of the algebraic properties of the cryptographic functions of the protocol. Besides using narrowing with rules modulo equations, the Maude-NPA tool uses several state space reduction techniques, including grammars that can describe sets of unreachable states that need not be explored [66], to drastically reduce the narrowing search space, often from an infinite set of states to a finite set of them, so that finite failure to find an attack becomes an actual proof of security.

Given a rewrite theory = (Σ,E Ax,R), we may be interested in verifying properties more general than existential questions of the form Xtt. Note that we can view such questions as questions about the violation of an invariant, because we can regard the set of states [[t]] as the complement of an invariant set of states, say I, which can be easily specified by an equationally-defined predicate. That is, proving the existential formula Xtt is the same thing as finding a counterexample for the assertion ,t|=I. This is just a temporal logic satisfaction assertion (see Chapter 12 ), but with the following nonstandard features: (i) the term t does not describe a single initial state, but a possibly infinite set [[t]] of initial states; and (ii) there is no guarantee that the set of reachable states is finite. Therefore, standard model-checking techniques may not be usable, because of a possible double infinity: in the number of initial states, and in the number of states reachable for each of those initial states. One can also generalize the above reachability question ,t|=I to questions of the form ,t|=φ, with φ a temporal logic formula. The papers [688] show how narrowing can be used (again, both at the level of transitions with rules R and at the level of equations E) to perform logical model checking to verify such temporal logic formulas; this is a a kind of symbolic model checking not in the binary decision diagram sense of “symbolic,” which still remains finite-state, but in a much more general sense in which possibly infinite sets of states are finitely described by patterns with logical variables.

In Maude 2.7, a narrowing-based reachability analysis developed by Santiago Escobar was implemented as part of Full Maude. In Maude 3.0, a narrowing-based reachability analysis has been directly implemented in Core Maude’s C++ (see Section 15).

13.5.3 Other automated deduction applications

The automated deduction application par excellence, and the one that historically, thanks to Alan Robinson, gave rise to the unification notion is resolution-based theorem proving [125]. Subsequent work by Gordon Plotkin [122] made it clear that not just syntactic unification, but unification modulo a set of equational axioms Ax is a very useful mechanism supporting theorem proving. Indeed, state-of-the-art resolution-based theorem provers routinely support unification modulo commonly occurring equational theories such as AC. Of course, the use of equational unification need not be restricted to resolution-based theorem provers. For example, the paper [127] shows how narrowing with sequent rules and equational unification can be used in a sequent-based theorem prover in which one can reason modulo both the equivalences given by the structural rules for sequents and also Boolean equivalences between formulas.

Yet another important application area is that of formal reasoning methods such as Knuth-Bendix equational completion (and its associated “inductionless induction” theorem-proving methods), checking local confluence of rewrite rules, and checking coherence of a set of rewrite rules with respect to a set of equations [136]. In all these formal reasoning methods one needs to compute critical pairs by unification of a term with a subterm of another term. In particular, tools such as the Maude Church-Rosser Checker (CRC) and Coherence Checker (ChC) were before restricted to theories where the only equational axiom supported was commutativity. The present built-in support for unification modulo a wide set Ax of axioms (further extensible to identity axioms by narrowing as explained in Section 13.5.1) has made it possible to have much more general versions of the Maude Church-Rosser Checker and Coherence Checker tools to reason about the confluence and coherence of Maude specifications modulo equational axioms specified as equational attributes in Maude modules and theories [555456].

13.6 Endogenous vs. exogenous order-sorted unification algorithms

The current Maude order-sorted unification algorithm modulo axioms Ax is what we might call an endogenous algorithm, in the sense that the computation of order-sorted unifiers is intimately integrated with the order-sorted reasoning process, so that unifiers that do not type under the order-sorted typing restrictions are never generated. This makes such an algorithm typically more efficient, because the order-sorted typing restrictions may drastically cut the number of generated unifiers, particularly modulo axioms such as AC where the number of unsorted unifiers can be very large. That is, order-sorted unification, even though it lacks the unitary property of unsorted syntactic unification and is in general more expensive than unsorted unification in the syntactic case, can often be more efficient in the modulo Ax case because of the drastic reductions that can be achieved by order-sorted typing restrictions in the number of Ax-unifiers. Moreover, even in the syntactic case, the efficiency of deductive processes that use order-sorted unification can substantially increase, because order-sorted unification will fail more often than unsorted unification, leading to smaller search spaces.

However, from the early papers on order-sorted unification such as, e.g., [128110130] a more modular, although typically less efficient, approach to order-sorted unification, which we might call exogenous has been known. The basic idea is to reuse an existing unsorted unification algorithm modulo some axioms Ax (under some conditions on Ax) to compute order-sorted Ax-unifiers in the following way:

1.
type information is removed from the order-sorted Ax-unification problem to convert it into an unsorted Ax-unification problem;
2.
a complete set of unsorted Ax-unifiers is computed; and
3.
the order-sorted Ax-unifiers of the original problem are obtained from the unsorted ones by a process of filtering the unsorted unifiers through an order-sorted reasoning process, in which the sorts of the variables in the original problem are taken into account. Each order-sorted unifier thus obtained is always a specialization of a corresponding unsorted one, where the unsorted variables have been specialized to given sorts; however, some unsorted unifiers, perhaps many, may be filtered out by this process and have no corresponding order-sorted unifiers.

For a state-of-the art study of the exogenous approach, allowing very general axioms Ax and proving the correctness of an order-sorted inference system to generate the order-sorted unifiers from the unsorted ones, see [82].

Both the endogenous and the exogenous approaches have their own advantages and disadvantages. The endogenous approach is more efficient, but it requires dedicated algorithms and implementations, so that unsorted unification algorithms and tools cannot be reused. The exogenous algorithms are less efficient because: (i) they can generate many unifiers that may later be discarded; (ii) a separate order-sorted filtering process is needed; and (iii) changes of representation, and even parsing, are required between unsorted and order-sorted representations (particularly when existing unsorted algorithms are reused). However, they are more modular and flexible, so that one can with relatively little effort obtain an order-sorted unification algorithm from an unsorted one.

In Maude we have experimented with, and benefited from, both an exogenous algorithm and the current endogenous one. The exogenous algorithm was developed in collaboration with Evelyn Contejean and Claude Marché from Université Paris-Sud, and involved also the efforts of Prasanna Thati and Joe Hendrix at UIUC. It reused the rich library of unsorted unification algorithms modulo axioms of the CiME system [38], which could be called from Maude in an experimental version. Inside Maude, it used the order-sorted inference system to compute order-sorted unifiers developed by Joe Hendrix and described in [82].

This exogenous algorithm has been extensively used in a previous version of the Maude-NPA tool, and has been shown effective in finding attacks to cryptographic protocols modulo nontrivial equational theories of the form E Ax [65]. The exogenous algorithm has also been extremely useful in testing the endogenous one. Because of the large number of unifiers generated and the complex nature of semantic unification algorithms, their testing is a nontrivial matter, and the automation of such testing is quite difficult. Thanks to the exogenous algorithm, and through the efforts of Ralf Sasse and Santiago Escobar, it has been possible to generate large numbers of random unification problems of different sizes in which the sets of unifiers generated by the exogenous and endogenous order-sorted unification algorithms have been automatically compared. This testing uncovered several bugs in an earlier alpha version of the Maude endogenous algorithm, and has also served to evaluate in practice the greater efficiency of the endogenous algorithm developed by Steven Eker.

13.7 Some notes on the implementation of unification

Order-sorted unification is NP-complete in general because Boolean algebra can be encoded as an order-sorted free theory signature and hence satisfiability can be reduced to an order-sorted free theory unification problem. In practice, reasonable performance can be obtained using a binary decision diagram technique to compute sorts for free variables occurring in unsorted unifiers. Furthermore in the AC case, sort information can be pushed into the unsorted unification algorithm and used to prune the Diophantine basis and the choice of subsets drawn from such a basis [60].

The unification theory combination framework and AC-unification algorithm are based on [16], while the Diophantine system solver used by the AC algorithm is based on [37]. The unification algorithm has been thoroughly tested by Santiago Escobar and Ralf Sasse using CiME [38] as an oracle, and has shown better average performance than CiME on the same problems.

The addition of ACU to the theories handled by the dedicated unification algorithm in Maude required substantial changes to the unification infrastructure implemented in previous versions of Maude for C and AC theories because of the problems associated with collapse theories. In this section we give an overview of the techniques used and highlight a novel algorithm for selecting sets of Diophantine basis elements during the computation of ACU unifiers.

13.7.1 Combining unification algorithms

The basic approach to solving unification problems where function symbols are drawn from more than one theory is variable abstraction where alien subterms, i.e., subterms headed by a symbol from a theory different from that of the top symbol of the parent term, are replaced by fresh variables to form pure unification subproblems which only involve variables and function symbols from a single theory and which can be passed to a unification algorithm for such a theory. Proving termination of combinations of algorithms is nontrivial, as variables are necessarily shared between theories and the unification of variables in one theory can create new unification subproblems in another theory, potentially ad infinitum. Stickel’s algorithm [131], which combined the AC and free theories, required an elaborate termination proof by Fages [72]. Boudet et al. [16] proposed a much simpler approach where all unification subproblems and variable bindings in a given theory are solved (and re-solved if another subproblem in that theory is created) simultaneously. This method requires a simultaneous E-unification algorithm for each theory E and was the method implemented in Maude for C, AC, and prior to the addition of ACU.

Collapse theories add two major complications to the combination of unification algorithms. Firstly, theory clashes where two terms with top symbols from different theories are required to unify can no longer be treated as a failure, since if one of the top symbols belongs to a collapse theory, a collapse may occur, yielding solutions. Secondly, compound cycles, that is, problems of the form x1 = ?t1(,x2,),x2 = ?t2(,x3,),,xn = ?tn(,x1,) where the terms ti are pure in different theories, can no longer be treated as failure, since solutions may be possible via collapse.

Several authors have proposed combination schemes that can handle collapse theories. We use a simplified version of an algorithm due to Boudet [15]. The original algorithm also handles nonregular theories but we omit that capability to simplify the implementation. The key idea is that each theory E needs a restricted simultaneous E-unification algorithm which solves the simultaneous unification problem for pure equations that are pure in E but where certain variables may be marked as only being allowed to unify with other variables. A theory clash subproblem f() = ?g(), is split into a disjunction of two subproblems each of which is a conjunction x = ?f() x = ?g() where x is a fresh variable. In one subproblem x is marked in the f equation and in the other subproblem x is marked in the g equation; either or both branches of the search may return solutions. Restricted unification is also used to break compound cycles. Because we do not handle nonregular theories, Boudet-style variable-elimination algorithms are unnecessary.

Boudet’s algorithm assumes that theories are disjoint, i.e., that they do not share function symbols. Because in Maude this is not quite true — identities can contain symbols from other theories — we need to handle a special kind of variable elimination. We illustrate the issue with the following example:

 
fmod UNIFICATION-CYCLE is
sort S .
vars X Y : S .
ops a b c d : -> S .
op f : S S -> S [assoc comm id: g(c, d)] .
op g : S S -> S [assoc comm id: f(a, b)] .
endfm

 
Maude> unify X =? f(Y, a, b) /\ Y =? g(X, c, d) .

Here the unification problem would already be in solved form but for the compound cycle formed by the X and Y variables. Restricted unification cannot break this cycle, since neither of the righthand sides can collapse out of their theory. However, putting Y = g(c, d) eliminates Y from the first equation yielding X = f(a, b) which eliminates X from the second equation, thus yielding a solution. This situation is somewhat pathological in Maude programs, and we do not really care about performance in its handling. Maude handles it by looking for this kind of cyclic dependency between theories when the signature is preprocessed and setting a flag so that a brute force variable elimination algorithm will be used to try and break compound cycles at unification time.

13.7.2 Combining incomplete unification algorithms

The unification infrastructure now supports the notion of incomplete unification algorithms (see Section 13.4.6). This has several major components. Firstly the algorithm for combining unification algorithms now prioritizes theories that have complete algorithms and amongst those, prioritizes theories that are expected to have fewer partial solutions. The idea of this heuristic is to minimize branching and when a unification problem with an incomplete unification algorithm has to be solved, to have more information on shared variables (ideally they would be bound to terms in collapse-free theories), in order to minimize the likelihood of encountering an incomplete case. Secondly, when different equations on the same theory are given as a unification problem, equations with a lower likelihood of encountering an incomplete case are prioritized. Thirdly, when incompleteness does occur it is tracked through the various levels of unification calls so that appropriate warnings can be issued.

13.7.3 Diophantine basis element selection

We solve restricted simultaneous ACU unification using an extension of the simultaneous AC unification algorithm in [16]. For an ACU function symbol f we are presented with a set of flattened pure equations that take the form f(x1p1,,xnpn) = ?f(y1q1,,ymqm) or x1 = ?f(y1q1,,ymqm). Each f-equation yields a Diophantine equation p1X1 + ⋅⋅⋅ + pnXn = q1Y 1 + ⋅⋅⋅ + qmY m or respectively, X1 = q1Y 1 + ⋅⋅⋅ + qmY m where the Xi’s and Y i’s are non-negative Diophantine variables. If an original variable is marked in some equation, the corresponding Diophantine variable receives an upper-bound of 1. Also, we may be able to obtain an upper-bound from order-sorting information, using the signature analysis technique in [57].

The general solution to a set of non-negative Diophantine equations is a set of basis elements from which all solutions can be obtained by linear combination. Upper-bound information may trivially eliminate some basis elements from consideration and can be used by the Diophantine solver to terminate the search for basis elements early.

A fresh variable zk is allocated for each basis element αk and unifiers are formed by finding sets of basis elements that satisfy certain properties and constructing assignments xi f(,zkαk,i,) where k ranges over the indices of the selected basis elements and αk,i is the value of Xi in the basis element αk.

The criteria for choosing the sets of basis elements is the key difference between AC unification, ACU unification, and restricted ACU unification. With AC unification, every selection of basis elements whose sum yields a nonzero value for each Xi and Y i must be considered. With ACU unification that requirement is lifted because of the availability of an identity element. The identity element also means that any assignment including basis element αk generalizes the same assignment with αk removed by assigning the identity element to zk and thus there is a single most general solution, formed by selecting all the basis elements.

In the case of restricted ACU unification, we may have upper-bounds on variables because they are marked. In Maude, order-sorted considerations may place upper-bounds on variables, and may also place a lower-bound of 1 on variables where the corresponding original variable has a sort that cannot take the identity element. In order to find a complete set of unifiers we need to find all maximal sets of basis elements whose sum satisfies the upper and lower-bounds on the variables.

Several explicit schemes for searching the subsets of basis elements were tried but the search was typically the dominant cost for ACU unification, often rendering the solution of quite modest unification problems impractical. In the current implementation this search is performed symbolically using a Binary Decision Diagram (BDD) [20] based algorithm. A BDD variable is allocated for each basis element, whose value, true or false, denotes whether the basis element is included in the subset. A BDD, called legal, is constructed, which evaluates to true on exactly those valuations that correspond to selections of basis elements that satisfy the upper- and lower-bound constraints on each Diophantine variable. Enforcement of the upper-bounds on the sum is done using dynamic programming and the BDD ite operation. Using the BDD legal, a second BDD, called maximal, is constructed which is true on exactly those valuations where legal is true, and changing a false into a true makes legal false. These valuations of the BDD variables and thus the subsets of basis elements they encode are then recovered by tracing the paths from the root to the true terminal in maximal. This method yielded a dramatic speed up (from hours to milliseconds) on problems of useful size.