## Chapter 12Unification and Variant Generation

### 12.1 Introduction

Unification is the solving of equations, either in free algebras of the form TΣ(X), or in relatively free algebras modulo a set E of equations, that is, in algebras of the form TΣ∕E(X). The first case is sometimes called syntactic unification. The second case is sometimes called E-unification or unification modulo E; if E is not explicitly mentioned, then it is called equational unification or semantic unification.

In solving any equation, such as, for example, we look for instances of the variables appearing in the equation that make both sides equal. Variables can of course be instantiated by substitutions. A substitution that makes both sides of the equation equal, that is, a solution of the equation, is called a unifier. For example, if we are solving the above equation in the free algebra TΣ(X) with X a countable set of variables and with Σ having a single sort (unsorted unification), the substitution σ = {x g(y),z h(y)} is a unifier, and indeed the so-called most general unifier, so that for any other unifier ρ there exists a substitution μ such that ρ = σ;μ, where σ;μ denotes composition of substitutions in diagrammatic order. That is, any other solution of the equation is an instance of the most general solution provided by σ.

Of course, some equations may not have syntactic unifiers, but may have semantic unifiers modulo some equations E. Consider, for example, the equation which obviously does not have any solution in TΣ(X). It does, however, have a solution in TΣ∕C(X), where C is the commutativity axiom f(x,y) = f(y,x). Indeed, the exact same substitution σ solving the first equation f(x,h(y)) = f(g(y),z) in a syntactic way, is now a unifier solving the second equation f(h(y),x) = f(g(y),z) modulo C, because we have f(h(y),g(y)) = Cf(g(y),h(y)).

Unification is a fundamental deductive mechanism used in many automated deduction tasks (see Section 12.6 for a discussion of some of them). It is also very important in combining the paradigms of functional programming and logic programming (in the Prolog sense of “logic programming”). Furthermore, in the context of Maude, unification can be very useful to reason not only about equational theories (functional modules or theories), but also, as explained in Section 12.6.2, about rewrite theories (system modules or theories).

Therefore, it is very useful to have an efficient implementation of unification available in Core Maude, which is what this chapter describes. Specifically, we explain both how order-sorted unification modulo frequently occurring equational axioms and variant-based unification modulo convergent equational theories are supported in Maude.

### 12.2 Order-sorted unification

Although the most general equational theories supported by Maude are membership equational theories, to obtain practical unification algorithms, allowing us to effectively compute the solutions of an equational unification problem, it is important to restrict ourselves to order-sorted equational theories. We can define the basic concepts of order-sorted E-unification in full generality.

Given an order-sorted equational theory (Σ,E), an E-unification problem consists of a nonempty set of unification equations of the form t = ?t, written in the notation where n 1 and the “conjunction” operator is assumed to be associative and commutative.

Given such an E-unification problem, an E-unifier for it is an order-sorted substitution1 σ : Vars(t1,t1,,tn,tn)-→TΣ(X) (where we assume that the set X of variables contains a countable number of variables for each sort) such that, for all i = 1,,n, where Y i = Vars(σ(ti)(ti)), that is, all the equations (Y i) σ(ti) = σ(ti) can be deduced in (membership) equational logic from the set of equations E.

A set of unifiers is called a complete set of E-unifiers for a given E-unification problem t1 = ?t1 tn = ?tn iff for any other E-unifier ρ of the same E-unification problem there exists a substitution μ and a unifier σ such that ρ = Eσ;μ, that is, for each variable x in the domain of ρ we have E ρ(x) = μ(σ(x)). A complete set of E-unifiers is called minimal if any proper subset of fails to be complete.

For an order-sorted equational theory (Σ,E), unification is said to be finitary if for any E-unification problem there is always a finite complete set of unifiers . Similarly, unification for (Σ,E) is called unitary if it is finitary and for any E-unification problem a minimal complete set of unifiers is always either empty or a singleton set. We say that (Σ,E) has a unification algorithm if there is an algorithm generating a complete set of E-unifiers for any E-unification problem in (Σ,E).

Unlike unsorted syntactic unification, which always either fails or has a single most general unifier, order-sorted syntactic unification is not necessarily unitary, that is, there is in general no single most general unifier. What exists (if Σ is finite) is a finite minimal complete set of syntactic unifiers. For some commonly occurring theories having a unification algorithm, such as the theory A of associativity of a binary function symbol, it is well-known that unification is not finitary: in general an infinite number of solutions may exist. However, for other theories, such as commutativity or associativity-commutativity, unification is finitary, both when Σ is unsorted and when Σ is order-sorted (and finite).

#### 12.2.1 A hybrid approach to equational order-sorted unification

We define a hybrid approach, in which we take advantage of Maude’s structuring of a module’s equations into equational axioms Ax, such as associativity, and/or commutativity, and/or identity, and equations E, which are assumed to be coherent,2 confluent, and terminating modulo Ax. We can then consider order-sorted equational theories of the form (Σ,E Ax) and decompose the E Ax-unification problem into two problems: one of Ax-unification, and another of E Ax-unification that uses an Ax-unification algorithm as a subroutine. This decomposition, as well as a similar one for membership equational theories, is explained in Section 12.6. The point of this decomposition is that Ax-unification needs to be built-in at the level of Core Maude’s C++ implementation for efficiency purposes and E Ax-unification can then be built on top of Ax-unification. Since the axioms Ax are well-known and unification algorithms exist for them, the task of building in efficient Ax-unification algorithms, although highly nontrivial, becomes manageable. Maude 2.6 implemented E Ax-unification in Maude itself, but Maude 2.7 implements E Ax-unification also in Core Maude’s C++ level.

### 12.3 Theories currently supported

As mentioned in Section 12.2.1, a practical way of dealing with order-sorted equational unification is to consider order-sorted theories of the form (Σ,E Ax), with Ax a set of commonly occurring axioms, declared in Maude as equational attributes (see Section 4.4.1), and E the remaining equations specified with the eq or ceq keywords. We can then decompose the E Ax-unification problem into an Ax-unification problem and an E Ax-unification problem that uses an Ax-unification algorithm as a subroutine. In such a decomposition, the efficiency of the Ax-unification algorithm becomes crucial.

Maude currently provides an order-sorted Ax-unification algorithm for all order-sorted theories (Σ,E Ax) such that the order-sorted signature Σ is preregular modulo Ax (see Sections 3.8 and 14.2.5) and the axioms Ax associated to function symbols are as follows:

• there can be arbitrary function symbols and constants with no equational attributes;
• the iter equational attribute can be declared for some unary symbols;
• different equational attributes can be declared for some binary function symbols:
• the assoc attribute3 ; usually referred to as A (for associative),
• the comm attribute; usually referred to as C (for commutative),
• the assoc comm attributes; usually referred to as AC (for associative and commutative),
• the assoc comm id: attributes; usually referred to as ACU (for associative and commutative with identity, or unit),
• the comm id: attributes; usually referred to as CU (for commutative and identity, or unit),
• the id: attribute; usually referred to as U (for identity, or unit),
• the left id: attribute; usually referred to as Ul (for left identity, or unit), and
• the right id: attribute; usually referred to as Ur (for right identity, or unit),

but no other equational attributes must be given for such symbols.

Explicitly excluded are theories with a binary symbol declared with the idem attribute. However, both the lack of the id: attribute for associativity, and of the idem attribute in general can be avoided by observing that both these axioms have the finite variant property, so that, used as variant equations, they can be combined with all the other just-mentioned attributes by means of variant unification algorithms (see Section 12.10).

If we give to Maude a unification problem in a functional module of the form fmod ,Ax) endfm where Σ and Ax satisfy the above requirements, we get a complete4 set of order-sorted unifiers modulo the theory (Σ,Ax). If, instead, we give the same problem to Maude in the functional module fmod ,E Ax) endfm, then the equations E are ignored, and we get the same unifiers as for the module or theory obtained by omitting the equations E. Similarly, if we provide the same unification problem in a functional theory fth ,E Ax) endfth, a system module mod ,E Ax,R) endm or a system theory th ,E Ax,R) endth, we again get the same set of unifiers as for the theory (Σ,Ax). All this is consistent with the decomposition idea mentioned above: to deal with order-sorted E Ax-unification, other methods, that use the Ax-unification algorithm as a component, can later be defined as we explain in Section 12.6.

Maude is even more tolerant than this. The user can give to Maude a unification problem in a functional module (or functional theory, or system module, or system theory) of the form fmod,E M Ax Ax) endfm (or the analogous specification in the other cases), where (Σ,Ax) satisfies the conditions mentioned above, but M is an optional set of membership axioms (that is, (Σ,E M AxAx) can be a membership equational theory and not just an order-sorted theory), and the axioms Axviolate those requirements (as explained in conditions (i)–(iii) above). Then what will happen is:

1.
As before, the additional equations E (or rules R) are completely ignored, and the membership axioms M are likewise ignored.
2.
If a unification problem involves the occurrence of a symbol satisfying axioms Axat the root position of a non-ground subterm, the unification process will fail and a warning will be printed.
3.
If a unification problem involves the occurrence of symbols satisfying axioms Ax, but all such occurrences are always in ground subterms of the problem, then this special case of Ax Ax-unification is handled by Maude and the corresponding Ax Ax-unifiers are returned.

Furthermore, the functional module fmod ,E M Ax Ax) endfm (or the analogous functional theory or system module or theory) may import predefined modules such as BOOL or NAT, so that function symbols in such predefined modules can also be used in unification problems.

### 12.4 The unify command

Given a functional module or theory, or a system module or theory, ModId, the user can give to Maude a unification command of the form:

unify [ ] in ModId :
Term-1 =? Term’-1 /\ ... /\ Term-k =? Term’-k .

where k 1; n is an optional argument providing a bound on the number of unifiers requested, so that if the cardinality of the set of unifiers is greater than the specified bound, the unifiers beyond that bound are omitted; and ModId can be any module or theory declared in the current Maude session (as usual, if no module is mentioned, the current module is used).

For a simple example of syntactic order-sorted unification problem illustrating:

• the use of the unify command;
• the use of the predefined operator _^_ in the NAT module, representing exponentiation on natural numbers; and
• the, in general, non-unitary nature of order-sorted unification,

we can define the module

fmod UNIFICATION-EX1 is
protecting NAT .
op f : Nat Nat -> Nat .
op f : NzNat Nat -> NzNat .
op f : Nat NzNat -> NzNat .
endfm

and then give to Maude the following command:

Maude> unify f(X:Nat, Y:Nat) ^ B:NzNat =? A:NzNat ^ f(Y:Nat, Z:Nat) .

Solution 1
X:Nat --> #1:Nat
Y:Nat --> #2:NzNat
B:NzNat --> f(#2:NzNat, #3:Nat)
A:NzNat --> f(#1:Nat, #2:NzNat)
Z:Nat --> #3:Nat

Solution 2
X:Nat --> #1:NzNat
Y:Nat --> #2:Nat
B:NzNat --> f(#2:Nat, #3:NzNat)
A:NzNat --> f(#1:NzNat, #2:Nat)
Z:Nat --> #3:NzNat

The next example in the same module illustrates the use of the unify command with a unification problem consisting of two equations:

Maude> unify f(X:Nat, Y:NzNat) =? f(Z:NzNat, U:Nat)
/\ V:NzNat =? f(X:Nat, U:Nat) .

Solution 1
X:Nat --> #1:NzNat
Y:NzNat --> #2:NzNat
Z:NzNat --> #1:NzNat
U:Nat --> #2:NzNat
V:NzNat --> f(#1:NzNat, #2:NzNat)

Note that, as already mentioned, we could instead invoke the unify command in a functional or system module or theory having additional equations, memberships, or rules, which are always ignored. For example, we could have instead declared the system theory

th UNIFICATION-EX2 is
protecting NAT .
op f : Nat Nat -> Nat .
op f : NzNat Nat -> NzNat .
op f : Nat NzNat -> NzNat .
eq f(f(N:Nat, M:Nat), K:Nat) = f(N:Nat, M:Nat) .
rl f(N:Nat, M:Nat) => 0 .
endth

so that, if we give again the same unify commands above, we will obtain exactly the same sets of order-sorted unifiers as for the UNIFICATION-EX1 module.

The above examples illustrate a further point about the form of the returned unifiers, namely, that in each assignment X --> t in a unifier, the variables appearing in the term t are always fresh variables of the form #n:Sort. The user is required not to use variables of this form in the submitted unification problem. A warning is printed if this requirement is violated:

Maude> unify in NAT : X:Nat ^ #1:Nat =? #2:Nat .
Warning: unsafe variable name #1:Nat in unification problem.

#### 12.4.1 Non-supported unification examples

The handling of unification problems in modules with operators whose equational axioms are excluded from the current unification algorithm can be illustrated by means of the following module:

fmod UNIFICATION-EX3 is
protecting NAT .
op f : Nat Nat -> Nat [idem] .
endfm

As already mentioned, a unification problem using such an idempotent function symbol f in a non-ground subterm will mean that the unification attempt fails and a warning is issued:

Maude> unify f(f(X:Nat, Y:Nat), Z:Nat) =? f(A:Nat, B:Nat) .
Warning: Term f(X:Nat, Y:Nat, Z:Nat) is non-ground and unification
for its top symbol is not currently supported.

Instead, if all symbols satisfying unsupported equational axioms Axonly appear in ground subterms of the unification problem, the unification attempt succeeds with the correct set of order-sorted Ax Ax-unifiers:

Maude> unify X:Nat + f(f(41, 42),43) =? Y:Nat + f(41,f(42,43)) .

Solution 1
X:Nat --> #1:Nat + f(41, f(42, 43))
Y:Nat --> #1:Nat + f(f(41, 42), 43)

Solution 2
X:Nat --> f(41, f(42, 43))
Y:Nat --> f(f(41, 42), 43)

Note, however, that, as already mentioned, unification modulo the idem attribute can be achieved by variant-based unification using the explicit equation f(N:Nat,N:Nat) = N:Nat (see Section 12.10).

#### 12.4.2 Associative-commutative (AC) unification examples

The use of a bound on the number of unifiers, as well as the use of the associative-commutative (AC) operator + in the predefined NAT module (see Section 7.2), plus the fact that even small AC-unification problems can generate a large number of unifiers are all illustrated by the following command:

Maude> unify  in NAT :
X:Nat + X:Nat + Y:Nat =? A:Nat + B:Nat + C:Nat .

Solution 1
X:Nat --> #1:Nat + #2:Nat + #3:Nat + #5:Nat + #6:Nat + #8:Nat
Y:Nat --> #4:Nat + #7:Nat + #9:Nat
A:Nat --> #1:Nat + #1:Nat + #2:Nat + #3:Nat + #4:Nat
B:Nat --> #2:Nat + #5:Nat + #5:Nat + #6:Nat + #7:Nat
C:Nat --> #3:Nat + #6:Nat + #8:Nat + #8:Nat + #9:Nat
...

Solution 100
X:Nat --> #1:Nat + #2:Nat + #3:Nat + #4:Nat
Y:Nat --> #5:Nat
A:Nat --> #1:Nat + #1:Nat + #2:Nat
B:Nat --> #2:Nat + #3:Nat
C:Nat --> #3:Nat + #4:Nat + #4:Nat + #5:Nat

The following unification command in the predefined CONVERSION module (see Section 7.9) illustrates a further point on the handling of built-in constants and functions. Built-in constants, even though infinite in number (all strings, all quoted identifiers, all natural numbers, and so on), are handled and can be used in unification problems. But built-in functions are not considered built-in for unification purposes; therefore, built-in evaluation of such functions is not performed during the unification.

Maude> unify in CONVERSION :
X:String < "foo" + Y:Char =?
Z:String + string(pi) < "foo" + Z:String .

Solution 1
X:String --> #1:Char + string(pi)
Y:Char --> #1:Char
Z:String --> #1:Char

#### 12.4.3 Unification examples with the iter attribute

The following example illustrates the efficiency of order-sorted unification modulo the iter theory (in this example in combination with the comm theory). Consider the following module:

fmod ITER-EXAMPLE is
sorts NzEvenNat EvenNat OddNat NzNat Nat EvenInt OddInt NzInt Int .
subsorts OddNat < OddInt NzNat < NzInt < Int .
subsorts EvenNat < EvenInt Nat < Int .
subsorts NzEvenNat < NzNat EvenNat < Nat .

op 0 : -> EvenNat .

op s : EvenNat -> OddNat [iter] .
op s : OddNat -> NzEvenNat [iter] .
op s : Nat -> NzNat [iter] .
op s : EvenInt -> OddInt [iter] .
op s : OddInt -> EvenInt [iter] .
op s : Int -> Int [iter] .

op _+_ : Int Int -> Int [comm gather (E e)] .
op _+_ : OddInt OddInt -> EvenInt [ditto] .
op _+_ : EvenInt EvenInt -> EvenInt [ditto] .
op _+_ : OddInt EvenInt -> OddInt [ditto] .
op _+_ : Nat Nat -> Nat [ditto] .
op _+_ : Nat NzNat -> NzNat [ditto] .
op _+_ : OddNat OddNat -> NzEvenNat [ditto] .
op _+_ : NzEvenNat EvenNat -> NzEvenNat [ditto] .
op _+_ : EvenNat EvenNat -> EvenNat [ditto] .
op _+_ : OddNat EvenNat -> OddNat [ditto] .
endfm

We can then give the unification commands:

Maude> unify in ITER-EXAMPLE :
s^1000000(X:OddNat) =? s^100000000001(Y:Int) .
Decision time: 1ms cpu (1ms real)

Solution 1
X:OddNat --> s^99999000001(#1:EvenNat)
Y:Int --> #1:EvenNat

and

Maude> unify in ITER-EXAMPLE :
s^1000000(X:OddNat) =? s^100000000001(Y:Int + Z:Int + W:Int) .
Decision time: 2ms cpu (5ms real)

Solution 1
X:OddNat --> s^99999000001(#1:OddNat + (#2:OddNat + #3:EvenNat))
W:Int --> #1:OddNat
Z:Int --> #2:OddNat
Y:Int --> #3:EvenNat

Solution 2
X:OddNat --> s^99999000001(#1:OddNat + (#2:EvenNat + #3:OddNat))
W:Int --> #1:OddNat
Z:Int --> #2:EvenNat
Y:Int --> #3:OddNat

Solution 3
X:OddNat --> s^99999000001(#1:EvenNat + (#2:OddNat + #3:OddNat))
W:Int --> #1:EvenNat
Z:Int --> #2:OddNat
Y:Int --> #3:OddNat

Solution 4
X:OddNat --> s^99999000001(#1:EvenNat + (#2:EvenNat + #3:EvenNat))
W:Int --> #1:EvenNat
Z:Int --> #2:EvenNat
Y:Int --> #3:EvenNat

As already mentioned, assuming that no bound on the number of unifiers is specified by the user, Maude will always compute a complete set of order-sorted unifiers modulo Ax, for Ax the supported equational axioms described in Section 12.3. However, there is no guarantee that the computed set of unifiers is minimal, that is, some of the unifiers in the computed set may be redundant, since they could be obtained as instances (modulo Ax) of other unifiers in the set.

#### 12.4.4 Associative-commutative with identity (ACU) unification examples

To illustrate the use of the unification command in the presence of ACU operators, let us consider yet another version of the vending machine example (first presented in Section 5.1 and in other sections of this document in different forms):

mod UNIF-VENDING-MACHINE is
sorts Coin Item Marking Money State .
subsort Coin < Money .
op empty : -> Money .
op __ : Money Money -> Money [assoc comm id: empty] .
subsort Money Item < Marking .
op __ : Marking Marking -> Marking [assoc comm id: empty] .
op <_> : Marking -> State .
ops \$ q : -> Coin .
ops a c : -> Item .
var M : Marking .
rl [buy-c] : < M \$ > => < M c > .
rl [buy-a] : < M \$ > => < M a q > .
eq [change]: q q q q = \$ .
endm

We can ask whether there is an equational unifier of two configurations, one containing at least two quarters, and another containing at least one dollar.

Maude> unify in UNIF-VENDING-MACHINE :
< q q X:Marking > =? < \$ Y:Marking > .

Solution 1
X:Marking --> \$
Y:Marking --> q q

Solution 2
X:Marking --> \$ #1:Marking
Y:Marking --> q q #1:Marking

Notice that the computed set of unifiers is not minimal because the first solution is the instance of the second obtained by substituting the variable #1:Marking with the constant empty. The Maude implementation tries to eliminate redundant unifiers when this is easy to detect; but for efficiency reasons it does not always return a minimal set of such unifiers. However, Maude computes a minimal set of unifiers modulo axioms when invoked via the variant-based unification command of Section 12.9 below; for instance, take the previous command above and add the word “variant” at the beginning to obtain the minimal set (see Section 12.9 for details on how to invoke the command):

Maude> variant unify in UNIF-VENDING-MACHINE :
< q q X:Marking > =? < \$ Y:Marking > .

Solution 1
X:Marking --> \$ #1:Marking
Y:Marking --> q q #1:Marking

Recall that memberships are discarded completely. For instance, if we modify the previous example to include a membership definition for a new sort Quarter, any unification call with that sort may not succeed.

mod UNIF-VENDING-MACHINE-MB is
sorts Coin Item Marking Money State .
subsort Coin < Money .
op empty : -> Money .
op __ : Money Money -> Money [assoc comm id: empty] .
subsort Money Item < Marking .
op __ : Marking Marking -> Marking [assoc comm id: empty] .
op <_> : Marking -> State .
ops \$ q : -> Coin .
ops a c : -> Item .

sort Quarter .
subsort Quarter < Coin .
mb q : Quarter .

var M : Marking .
rl [buy-c] : < M \$ > => < M c > .
rl [buy-a] : < M \$ > => < M a q > .
eq [change]: q q q q = \$ .
endm

We can ask whether there is an equational unifier of two configurations, one containing at least two quarters, and another containing two quarters and a dollar, but it fails:

Maude> unify in UNIF-VENDING-MACHINE-MB :
< q q X:Marking > =? < \$ Y:Quarter Z:Quarter > .

No unifier.

despite the fact that instantiating both Y and Z to q is part of a solution in the unification call above. The reason is that the membership is not used during ACU unification and therefore the algorithm unification treats the sort Quarter as empty.

#### 12.4.5 Unification examples with an identity symbol

Let us illustrate the use of the different combinations of the identity attribute for unification. Let us consider first a module using the left-id attribute.

mod LEFTID-UNIFICATION-EX is
sorts Magma Elem .
subsorts Elem < Magma .
op __ : Magma Magma -> Magma [gather (E e) left id: e] .
ops a b c d e : -> Elem .
endm

Then the following two unification problems have a different meaning, where we have swapped the position of the variables. First, when we unify two terms where variables of sort Magma are at the left of the terms, we have both a syntactic unifier and a unifier modulo identity.

Maude> unify in LEFTID-UNIFICATION-EX : X:Magma a =? Y:Magma a a .

Solution 1
X:Magma --> a
Y:Magma --> e

Solution 2
X:Magma --> #1:Magma a
Y:Magma --> #1:Magma

When the variables are instead at the right side of the terms of sort Magma, there is clearly no unifier, since the term a a Y:Magma is parsed as (a a) Y:Magma in module LEFTID-UNIFICATION-EX due to the attribute gather (E e) (see Section 3.9).

Maude> unify in LEFTID-UNIFICATION-EX : a X:Magma =? a a Y:Magma .
No unifier.

Consider now a similar module but for the right identity.

mod RIGHTID-UNIFICATION-EX is
sorts Magma Elem .
subsorts Elem < Magma .
op __ : Magma Magma -> Magma [gather (e E) right id: e] .
ops a b c d e : -> Elem .
endm

When we unify two terms where variables of sort Magma are at the left of the terms, there is clearly no unifier, since the term Y:Magma a a is parsed this time as Y:Magma (a a) in module RIGHTID-UNIFICATION-EX due to the attribute gather (e E) (see Section 3.9).

Maude> unify in RIGTHID-UNIFICATION-EX : X:Magma a =? Y:Magma a a .
No unifier.

When the variables are instead at the right side of the terms of sort Magma, we have both a syntactic unifier and a unifier modulo identity:

Maude> unify in RIGTHID-UNIFICATION-EX : a X:Magma =? a a Y:Magma .

Solution 1
X:Magma --> a
Y:Magma --> e

Solution 2
X:Magma --> #1:Magma a
Y:Magma --> #1:Magma

Consider now a similar module but with the identity attribute.

mod ID-UNIFICATION-EX is
sorts Magma Elem .
subsorts Elem < Magma .
op __ : Magma Magma -> Magma [gather (E e) id: e] .
ops a b c d e : -> Elem .
endm

When we unify two terms where variables of sort Magma are at the left of the terms, we have both a syntactic unifier and a unifier modulo identity:

Maude> unify in ID-UNIFICATION-EX : X:Magma a =? Y:Magma a a .

Solution 1
X:Magma --> a
Y:Magma --> e

Solution 2
X:Magma --> #1:Magma a
Y:Magma --> #1:Magma

When the variables of sort Magma are instead at the right side of the terms of sort Magma, we only have a unifier modulo identity:

Maude> unify in ID-UNIFICATION-EX : a X:Magma =? a a Y:Magma .

Solution 1
X:Magma --> a
Y:Magma --> e

And finally, when we add commutativity, we obtain slightly different results.

mod COMM-ID-UNIFICATION-EX is
sorts Magma Elem .
subsorts Elem < Magma .
op __ : Magma Magma -> Magma [gather (E e) comm id: e] .
ops a b c d e : -> Elem .
endm

When we unify two terms where variables of sort Magma are at the left of the terms, we have both a syntactic unifier and a unifier modulo identity and commutativity, but the latter is duplicated:

Maude> unify in COMM-ID-UNIFICATION-EX : X:Magma a =? Y:Magma a a .

Solution 1
X:Magma --> a
Y:Magma --> e

Solution 2
X:Magma --> a #1:Magma
Y:Magma --> #1:Magma

Solution 3
X:Magma --> a
Y:Magma --> e

When the variables of sort Magma are instead at the right side of the terms of sort Magma, we have several unifiers modulo identity and commutativity:

Maude> unify in COMMID-UNIFICATION-EX : a X:Magma =? a a Y:Magma .

Solution 1
X:Magma --> a a
Y:Magma --> a

Solution 2
X:Magma --> a
Y:Magma --> e

Solution 3
X:Magma --> a
Y:Magma --> e

Note that the first solution is intriguing and is obtained by unifying terms (X:Magma a) and ((a a) Y:Magma).

#### 12.4.6 Associative (A) unification examples

In general, unification modulo associativity is not finitary. However the Maude implementation provides a finitary and complete set of unifiers for a substantial class of unification problems where associative unification happens to be finitary. In problems outside of this class, a finite but in general incomplete set of unifiers is returned, with an explicit warning that the set may be incomplete. Let us illustrate the use of the unification command in the presence of A function symbols with the different capabilities and limitations. One of the key ideas for having finitary unification modulo associativity is to have variables over associative symbols being linear, i.e., having only one occurrence among all the terms in a unification problem.

Consider a very simple module with a binary associative function symbol:

fmod UNIFICATION-EX4 is
protecting NAT .
sort NList .
subsort Nat < NList .
op _:_ : NList NList -> NList [assoc] .
endfm

A unification problem using such an associative function symbol returns five unifiers without any problem because the unification problem is linear.

Maude> unify in UNIFICATION-EX4 : X:NList : Y:NList : Z:NList =? P:NList : Q:NList .

Solution 1
X:NList --> #1:NList : #2:NList
Y:NList --> #3:NList
Z:NList --> #4:NList
P:NList --> #1:NList
Q:NList --> #2:NList : #3:NList : #4:NList

Solution 2
X:NList --> #1:NList
Y:NList --> #2:NList : #3:NList
Z:NList --> #4:NList
P:NList --> #1:NList : #2:NList
Q:NList --> #3:NList : #4:NList

Solution 3
X:NList --> #1:NList
Y:NList --> #2:NList
Z:NList --> #3:NList : #4:NList
P:NList --> #1:NList : #2:NList : #3:NList
Q:NList --> #4:NList

Solution 4
X:NList --> #1:NList
Y:NList --> #2:NList
Z:NList --> #3:NList
P:NList --> #1:NList : #2:NList
Q:NList --> #3:NList

Solution 5
X:NList --> #1:NList
Y:NList --> #2:NList
Z:NList --> #3:NList
P:NList --> #1:NList
Q:NList --> #2:NList : #3:NList

The unification problem may not be linear but it may be easy to detect that there is no unifier, e.g. it is impossible to unify a list X concatenated with itself with another list Y concatenated also with itself but with a natural number, e.g. 1, in between.

Maude> unify in UNIFICATION-EX4 : X:NList : X:NList =? Y:NList : 1 : Y:NList .
No unifier.

When associative variables are non-linear, Maude has implemented a cycle detection that may have different outcomes:

1.
There is a cycle, but no risk of an infinite set of most general unifiers. For example, because there is no solution.
Maude> unify in UNIFICATION-EX4 : 0 : Q:NList =? Q:NList : 1 .
No unifier.

Or there is a finite number of solutions

Maude> unify in UNIFICATION-EX4 : P:NList : P:NList =? 1 : Q:NList : 2 .

Solution 1
P:NList --> 1 : #1:NList : 2
Q:NList --> #1:NList : 2 : 1 : #1:NList

Solution 2
P:NList --> 1 : 2
Q:NList --> 2 : 1

2.
There is a cycle with risk of an infinite set of most general unifiers. In this case, a warning will be printed and the acyclic solutions are returned.
Maude> unify in UNIFICATION-EX4 : 0 : X:NList =? X:NList : 0 .
Warning: Unification modulo the theory of operator _:_ has encountered
an instance for which it may not be complete.

Solution 1
X:NList --> 0
Warning: Some unifiers may have been missed due to incomplete
unification algorithm(s).

Note that the unification problem 0:X = ?X:0 has an infinite family of most general unifiers {X 0n} for 0n being a list of n consecutive 0 elements.

3.
There is a cycle but Maude cannot tell whether there is an infinite set of most general unifiers. In this case, a depth bound is imposed and solutions are enumerated down to that depth. If the depth bound is hit, a warning of incompleteness as the one shown above will be printed.
Maude> unify in UNIFICATION-EX4 :
X:NList : X:NList : X:NList =? Y:NList : Y:NList : Z:NList : Y:NList .
Warning: Unification modulo the theory of operator _:_ has encountered
an instance for which it may not be complete.

Solution 1
X:NList --> #1:NList : #1:NList : #1:NList : #1:NList
Y:NList --> #1:NList : #1:NList : #1:NList
Z:NList --> #1:NList : #1:NList : #1:NList

Solution 2
X:NList --> #1:NList : #1:NList : #1:NList
Y:NList --> #1:NList : #1:NList
Z:NList --> #1:NList : #1:NList : #1:NList

Solution 3
X:NList --> #1:NList : #1:NList
Y:NList --> #1:NList
Z:NList --> #1:NList : #1:NList : #1:NList
Warning: Some unifiers may have been missed due to incomplete
unification algorithm(s).

If the verbose mode is activated (see Section 18.13), Maude will print different internal messages associated to the situations above:

• Associative unification using cycle detection.
• Associative unification algorithm detected an infinite family of unifiers.
• Associative unification using depth bound of 5.
• Associative unification algorithm hit depth bound.

### 12.5 Unification at the metalevel: metaUnify and metaDisjointUnify

Following the general Maude philosophy, all Maude functionality is as much as possible moved up to the metalevel, so that it becomes available by reflection (see Chapter 11). This is particularly important for unification for two reasons:

1.
Many of the formal reasoning applications of unification do require access to unification functions at the metalevel. Consider, for example, the computation of critical pairs to determine if a functional module is locally confluent. This will be done by a function that takes the metarepresentation of the given functional module as data, and then this function will have to invoke the unification function at the metalevel as part of its critical pair subcomputations.
2.
The unification algorithm is theory-dependent, since a different order-sorted unification algorithm is derived for each different signature Σ and combination of axioms Ax. Therefore, unification as a function, instead of as a user command, must necessarily be at the metalevel, since it must take the given theory (Σ,Ax) as a parameter.

Thus, the unification command is reflected in the META-LEVEL module (see Section 11.5) by two descent functions:

op metaUnify :
Module UnificationProblem Nat Nat ~> UnificationPair?
[special (...)] .

Module UnificationProblem Nat Nat ~> UnificationTriple?
[special (...)] .

These two metalevel functions work on unification problems constructed by means of the following signature:

sorts UnificandPair UnificationProblem .
subsort UnificandPair < UnificationProblem .
op _=?_ : Term Term -> UnificandPair [ctor prec 71] .
op _/\_ : UnificationProblem UnificationProblem -> UnificationProblem
[ctor assoc comm prec 73] .

The key difference between metaUnify and metaDisjointUnify is that the latter assumes that the variables in the left and righthand unificands are to be considered disjoint even when they are not so, and it generates each solution to the given unification problem not as a single substitution, but as a pair of substitutions, one for left unificands and the other for right unificands. This functionality is very useful for applications, such as critical-pair checking or narrowing, where a disjoint copy of the terms or rules involved must always be computed before unification is performed. Indeed, what the metaDisjointUnify operation avoids is precisely the need for explicitly computing such disjoint copies. The need for two substitutions in each solution is then obvious, since the terms in the given unification problem need not be made explicitly disjoint, but their (accidentally) common variables must be treated differently, as if they were disjoint.

Since it is convenient to reuse variable names from unifiers in new problems, for example in narrowing, this is allowed via the third argument, which is the largest number n appearing in a unificand variable of the form #n:Sort. Then the fresh variables in the computed unifiers will all be numbered from n + 1 on.

As is usual for descent functions, the last argument in the function is used to select which result is wanted, starting from 0. Caching is used so that if unifier i has just been returned, requesting unifier i + 1 gives rise to an incremental computation.

Results are returned using the following constructors:

subsort UnificationPair < UnificationPair? .
subsort UnificationTriple < UnificationTriple? .
op {_,_} : Substitution Nat -> UnificationPair [ctor] .
op {_,_,_} : Substitution Substitution Nat -> UnificationTriple [ctor] .

as appropriate for the descent function. The final Nat component is the largest n occurring in a fresh metavariable of the form #n:Sort. In this way, when we want to reuse variable names from unifiers, the next invocation of the function can use this parameter to make sure that the new variables generated are always fresh.

When no unifier with a given index exists the constant

op noUnifier : -> UnificationPair? [ctor] .

or, respectively, the constant

op noUnifier : -> UnificationTriple? [ctor] .

is returned as appropriate for the corresponding descent function.

Recall that for unification modulo associative symbols no finite set of unifiers may exist, yet a finite set is returned with a warning if the set may be incomplete (see Section 12.4.6). At the metalevel, the role of this warning is played by the constant:

op noUnifierIncomplete : -> UnificationPair? [ctor] .

or, respectively, the constant

op noUnifierIncomplete : -> UnificationTriple? [ctor] .

which is returned when a finite set of most general unifiers cannot be ensured.

We can illustrate the use of these metalevel functions with a few examples. The first one comes from the previous section, but moved up at the metalevel:

Maude> reduce in META-LEVEL :
metaUnify(upModule(’UNIFICATION-EX1, false),
’f[’X:Nat, ’Y:NzNat] =? ’f[’Z:NzNat, ’U:Nat] /\
’V:NzNat =? ’f[’X:Nat, ’U:Nat], 0, 0) .

result UnificationPair:
{’U:Nat <- ’#1:NzNat ;
’V:NzNat <- ’f[’#2:NzNat, ’#1:NzNat] ;
’X:Nat <- ’#2:NzNat ;
’Y:NzNat <- ’#1:NzNat ;
’Z:NzNat <- ’#2:NzNat, 2}

The second example shows that we can request fresh variables with arbitrarily large numbering:

Maude> reduce in META-LEVEL :
metaUnify(upModule(’NAT, false),
’_+_[’X:Nat,’Y:Nat] =? ’_+_[’A:Nat,’B:Nat],
100000000000000000000, 0) .

result UnificationPair:
{’A:Nat <- ’_+_[’#100000000000000000001:Nat,
’#100000000000000000002:Nat] ;
’B:Nat <- ’_+_[’#100000000000000000003:Nat,
’#100000000000000000004:Nat] ;
’X:Nat <- ’_+_[’#100000000000000000001:Nat,
’#100000000000000000003:Nat] ;
’Y:Nat <- ’_+_[’#100000000000000000002:Nat,
’#100000000000000000004:Nat],
100000000000000000004}

The following example shows a similar unification problem but with much smaller numberings in fresh variables, and now involving an invocation of metaDisjointUnify.

Maude> reduce in META-LEVEL :
’_+_[’X:Nat, ’Y:Nat] =? ’_+_[’X:Nat, ’B:Nat], 0, 0) .

result UnificationTriple: {
’X:Nat <- ’_+_[’#1:Nat, ’#2:Nat] ;
’Y:Nat <- ’_+_[’#3:Nat, ’#4:Nat],
’B:Nat <- ’_+_[’#1:Nat, ’#3:Nat] ;
’X:Nat <- ’_+_[’#2:Nat, ’#4:Nat], 4}

Yet another example shows how using variable names in unification problems with larger numbers than declared by the third argument generates a warning and no reduction.

Maude> reduce in META-LEVEL :
metaUnify(upModule(’NAT, false),
’_+_[’X:Nat,’Y:Nat] =? ’_+_[’#1:Nat,’Y:Nat], 0, 0) .

Warning: unsafe variable name #1:Nat in unification problem.

result [UnificationPair?]:
metaUnify(th ’NAT is
including ’NAT .
sorts none .
none
none
none
none
none
endth,
’_+_[’X:Nat, ’Y:Nat] =? ’_+_[’#1:Nat, ’Y:Nat], 0, 0)

And finally an example of incomplete unification for the associative case. If we move to the metalevel the unification problem with an infinite set of most general unifiers, 0:X = ?X:0, we get the first unifier of the family:

Maude> reduce in META-LEVEL :
metaUnify(upModule(’UNIFICATION-EX4, true),
’_:_[’0.Nat,’X:NList] =? ’_:_[’X:NList,’0.Nat], 0, 0) .

Warning: Unification modulo the theory of operator _:_ has encountered
an instance for which it may not be complete.

result UnificationPair: {
’X:NList <- ’0.Zero,0}

but successive calls for unifiers get the constant noUnifierIncomplete:

Maude> reduce in META-LEVEL :
metaUnify(upModule(’UNIFICATION-EX4, true),
’_:_[’0.Nat,’X:NList] =? ’_:_[’X:NList,’0.Nat], 0, 1) .

result UnificationPair?: (noUnifierIncomplete).UnificationPair?

Note that we got the constant noUnifierIncomplete instead of the noUnifier constant, which is the output for the case of a finitary set of most general unifiers.

### 12.6 Some applications of unification

In this section we review briefly some applications that can be developed using a unification infrastructure like the one described in this chapter. We begin by discussing narrowing and narrowing-based unification algorithms. We also explain how narrowing modulo an equational theory can be used for reachability analysis of concurrent systems described by rewrite theories, and, more generally, for symbolic temporal logic model checking of such systems. We then discuss briefly other automated deduction applications, including theorem proving ones.

Chapter 16 contains more information about an implementation of narrowing in Full Maude following the ideas of Section 12.6.2 below.

#### 12.6.1 Narrowing-based unification

If we have a dedicated algorithm (as the one supported by Maude) to solve unification problems in an order-sorted theory (Σ,Ax), then we can use it as a component to obtain a unification algorithm for theories of the form (Σ,E Ax), provided the equations E are unconditional, coherent, confluent and terminating modulo Ax .

The technique used under such conditions to obtain an E Ax-unification algorithm from an Ax-unification algorithm is called narrowing, and is the obvious generalization of term rewriting to handle logical variables and perform a kind of symbolic execution. In ordinary term rewriting, if we want to apply a rewrite rule, say l r, to a term t at position p, the subterm t|p must be an instance of the lefthand side l, that is, there must be a substitution σ such that t|p = σ(l). Instead, in narrowing we can apply the rule l r at a non-variable position p in t, provided the unification problem t|p = ?l (where the variables of l and t are assumed disjoint) has a nonempty set of unifiers. For any such unifier θ we then narrow the original term t to the substitution instance under θ of t[r]p. We then write for such a narrowing step. For example, in the standard, unsorted specification of the natural numbers, we can use the equation x + s(y) = s(x + y) as a rewrite rule to narrow the term x′* (y+ z) at position 2 with substitution θ = {x y′′,y z′′,y y′′,z s(z′′)} to get the narrowing step In this example, θ is the most general unifier for the syntactic unification problem y+ z= ?x + s(y). However, in the same way as we can perform rewriting modulo a set of axioms Ax if we have an Ax-matching algorithm, we can likewise perform narrowing modulo a set Ax of axioms if we have an Ax-unification algorithm. That is, the unification problems t|p = ?l are now solved, not by syntactic unification, but by Ax-unification.

If a theory (Σ,E Ax) satisfies the above coherence, confluence, and termination modulo Ax requirements, we can systematically reduce E Ax-unification problems to narrowing problems as follows:

1.
we add a fresh new sort Truth to Σ with a constant tt;
2.
for each top sort of each connected component of sorts we add a binary predicate eq of sort Truth and add to E the equation eq(x,x) = tt, where x has such a top sort;
3.
we then reduce an E Ax-unification problem t = ?tto the narrowing reachability problem modulo Ax in the theory extending (Σ,E Ax) with these new sorts, operators, and equations, where E and the new equations are used as rewrite rules.

That is, we search for all narrowing paths modulo Ax from eq(t,t) to tt. Each such path then gives us a unifier of the equation t = ?t, just by composing the unifiers of each narrowing step in the path. As explained in , narrowing can be performed not only in order-sorted equational theories, but also in membership equational theories.

The just-described computation of E Ax-unifiers by narrowing modulo Ax yields a complete but in general infinite set of E Ax-unifiers. For the case when Ax = , some sufficient conditions are known ensuring termination of the basic narrowing strategy (see, e.g., ), and therefore ensuring that the complete set of E Ax-unifiers computed by basic narrowing is finite. However, for commonly occurring sets of axioms Ax, such as associativity-commutativity (AC), it is well-known that narrowing modulo AC “almost never terminates” and, furthermore, that narrowing strategies facilitating termination such as basic narrowing are incomplete . Based on the idea of “variants” in , a complete, yet quite efficient in terms of its search space, narrowing strategy modulo Ax called folding variant narrowing has been proposed in . Furthermore, in  sufficient checkable conditions on (Σ,E Ax) have been given ensuring that the E Ax-unification algorithm provided by folding variant narrowing modulo Ax is finitary.

In Maude 2.6, a narrowing library developed by Santiago Escobar implemented the folding variant narrowing as a component, making E Ax-unification available as part of Full Maude. Instead, Maude 2.7 implements E Ax-unification via folding variant narrowing directly in Core Maude’s C++ level (see Section 12.10).

#### 12.6.2 Symbolic reachability analysis in rewrite theories

A rewrite theory5 , say = (Σ,E Ax,R), specified in Maude as a system module, describes a concurrent system whose states are E Ax-equivalence classes of ground terms, and whose local concurrent transitions are specified by the rules R. When formally analyzing the properties of , an important problem is ascertaining for specific patterns t and tthe following symbolic reachability problem: with X the set of variables appearing in t and t, which for this discussion we may assume are a disjoint union of those in t and those in t. That is, t and tsymbolically describe sets of concurrent states [[t]] and [[t]] (namely, all the ground substitution instances of t, resp. t, or, more precisely, the E Ax-equivalence classes associated to such ground instances). And we are asking: is there a state in [[t]] from which we can reach a state in [[t]] after a finite number of rewriting steps?

For example, may specify a cryptographic protocol, t may symbolically describe a set of initial states, and tmay likewise describe a set of attack states. Then, if the above reachability question can be answered in the affirmative, the protocol is insecure against the kinds of attacks described by t. Furthermore, if the way of answering the reachability question is somehow constructive, we should be able to exhibit a concrete attack as a rewrite sequence violating the security of the protocol.

As explained in , provided the rewrite theory = (Σ,E Ax,R) is topmost (that is, all rewrites take place at the root of a term), or, as in the case of AC rewriting of object-oriented systems, is “essentially topmost,” and the rules R are coherent with E modulo Ax, narrowing with the rules R modulo the equations E Ax gives a constructive, sound, and complete method to solve reachability problems of the form X t-→*t, that is, such a problem has an affirmative answer if and only if we can find a finite narrowing sequence modulo E Ax of the form t *θ(t) for some θ. The method is constructive, because instantiating t with the composition of the unifiers for each step in the narrowing sequence gives us a concrete rewrite sequence witnessing the existential formula.

Of course, narrowing with R modulo E Ax requires performing E Ax-unification at each narrowing step. As explained in Section 12.6.1, E Ax-unification can itself be performed by narrowing with the equations E modulo Ax, provided E is coherent, confluent, and terminating modulo Ax. Therefore, in performing symbolic reachability analysis in a rewrite theory = (Σ,E Ax,R) there are usually two levels of narrowing and two levels of unification: narrowing with R modulo E Ax for reachability, and narrowing with E modulo Ax for unification purposes. Similarly, unification modulo E Ax is performed by narrowing, while unification modulo Ax is usually performed in a built-in way.

This is exactly the approach taken in the Maude-NPA protocol analyzer , where cryptographic protocols are formally specified as rewrite theories of the form = (Σ,E Ax,R), and the formal reachability analysis is performed in a backwards way, from an attack state to an initial state. This just means that we perform standard (forwards) reachability analysis with the rewrite theory -1 = (Σ,E Ax,R-1), where R-1 = {r-→l(l-→r) R}. The equational theory E Ax typically specifies the algebraic properties of the cryptographic functions used in the given protocol, for example, public key encryption and decryption, exclusive or, modular exponentiation, and so on. Reasoning modulo such algebraic properties is very important to gain high levels of assurance, since it is well-known that some cryptographic protocols that can be proved secure under the standard Dolev-Yao model, in which the cryptographic functions are treated as a “black box,” can actually be broken by an attacker that makes clever use of the algebraic properties of the cryptographic functions of the protocol. Besides using narrowing with rules modulo equations, the Maude-NPA tool uses several state space reduction techniques, including grammars that can describe sets of unreachable states that need not be explored , to drastically reduce the narrowing search space, often from an infinite set of states to a finite set of them, so that finite failure to find an attack becomes an actual proof of security.

Given a rewrite theory = (Σ,E Ax,R), we may be interested in verifying properties more general than existential questions of the form X t-→*t. Note that we can view such questions as questions about the violation of an invariant, because we can regard the set of states [[t]] as the complement of an invariant set of states, say I, which can be easily specified by an equationally-defined predicate. That is, proving the existential formula X t-→*tis the same thing as finding a counterexample for the assertion ,t I. This is just a temporal logic satisfaction assertion (see Chapter 10), but with the following nonstandard features: (i) the term t does not describe a single initial state, but a possibly infinite set [[t]] of initial states; and (ii) there is no guarantee that the set of reachable states is finite. Therefore, standard model-checking techniques may not be usable, because of a possible double infinity: in the number of initial states, and in the number of states reachable for each of those initial states. One can also generalize the above reachability question ,t I to questions of the form ,t φ, with φ a temporal logic formula. The papers  show how narrowing can be used (again, both at the level of transitions with rules R and at the level of equations E) to perform logical model checking to verify such temporal logic formulas; this is a a kind of symbolic model checking not in the binary decision diagram sense of “symbolic,” which still remains finite-state, but in a much more general sense in which possibly infinite sets of states are finitely described by patterns with logical variables.

#### 12.6.3 Other automated deduction applications

The automated deduction application par excellence, and the one that historically, thanks to Alan Robinson, gave rise to the unification notion is resolution-based theorem proving . Subsequent work by Gordon Plotkin  made it clear that not just syntactic unification, but unification modulo a set of equational axioms Ax is a very useful mechanism supporting theorem proving. Indeed, state-of-the-art resolution-based theorem provers routinely support unification modulo commonly occurring equational theories such as AC. Of course, the use of equational unification need not be restricted to resolution-based theorem provers. For example, the paper  shows how narrowing with sequent rules and equational unification can be used in a sequent-based theorem prover in which one can reason modulo both the equivalences given by the structural rules for sequents and also Boolean equivalences between formulas.

Yet another important application area is that of formal reasoning methods such as Knuth-Bendix equational completion (and its associated “inductionless induction” theorem-proving methods), checking local confluence of rewrite rules, and checking coherence of a set of rewrite rules with respect to a set of equations . In all these formal reasoning methods one needs to compute critical pairs by unification of a term with a subterm of another term. In particular, tools such as the Maude Church-Rosser Checker (CRC) and Coherence Checker (ChC) were before restricted to theories where the only equational axiom supported was commutativity. The present built-in support for unification modulo a wide set Ax of axioms (further extensible to identity axioms by narrowing as explained in Section 12.6.1) has made it possible to have much more general versions of the Maude Church-Rosser Checker and Coherence Checker tools to reason about the confluence and coherence of Maude specifications modulo equational axioms specified as equational attributes in Maude modules and theories .

### 12.7 Endogenous vs. exogenous order-sorted unification algorithms

The current Maude order-sorted unification algorithm modulo axioms Ax is what we might call an endogenous algorithm, in the sense that the computation of order-sorted unifiers is intimately integrated with the order-sorted reasoning process, so that unifiers that do not type under the order-sorted typing restrictions are never generated. This makes such an algorithm typically more efficient, because the order-sorted typing restrictions may drastically cut the number of generated unifiers, particularly modulo axioms such as AC where the number of unsorted unifiers can be very large. That is, order-sorted unification, even though it lacks the unitary property of unsorted syntactic unification and is in general more expensive than unsorted unification in the syntactic case, can often be more efficient in the modulo Ax case because of the drastic reductions that can be achieved by order-sorted typing restrictions in the number of Ax-unifiers. Moreover, even in the syntactic case, the efficiency of deductive processes that use order-sorted unification can substantially increase, because order-sorted unification will fail more often than unsorted unification, leading to smaller search spaces.

However, from the early papers on order-sorted unification such as, e.g.,  a more modular, although typically less efficient, approach to order-sorted unification, which we might call exogenous has been known. The basic idea is to reuse an existing unsorted unification algorithm modulo some axioms Ax (under some conditions on Ax) to compute order-sorted Ax-unifiers in the following way:

1.
type information is removed from the order-sorted Ax-unification problem to convert it into an unsorted Ax-unification problem;
2.
a complete set of unsorted Ax-unifiers is computed; and
3.
the order-sorted Ax-unifiers of the original problem are obtained from the unsorted ones by a process of filtering the unsorted unifiers through an order-sorted reasoning process, in which the sorts of the variables in the original problem are taken into account. Each order-sorted unifier thus obtained is always a specialization of a corresponding unsorted one, where the unsorted variables have been specialized to given sorts; however, some unsorted unifiers, perhaps many, may be filtered out by this process and have no corresponding order-sorted unifiers.

For a state-of-the art study of the exogenous approach, allowing very general axioms Ax and proving the correctness of an order-sorted inference system to generate the order-sorted unifiers from the unsorted ones, see .

Both the endogenous and the exogenous approaches have their own advantages and disadvantages. The endogenous approach is more efficient, but it requires dedicated algorithms and implementations, so that unsorted unification algorithms and tools cannot be reused. The exogenous algorithms are less efficient because: (i) they can generate many unifiers that may later be discarded; (ii) a separate order-sorted filtering process is needed; and (iii) changes of representation, and even parsing, are required between unsorted and order-sorted representations (particularly when existing unsorted algorithms are reused). However, they are more modular and flexible, so that one can with relatively little effort obtain an order-sorted unification algorithm from an unsorted one.

In Maude we have experimented with, and benefited from, both an exogenous algorithm and the current endogenous one. The exogenous algorithm was developed in collaboration with Evelyn Contejean and Claude Marché from Université Paris-Sud, and involved also the efforts of Prasanna Thati and Joe Hendrix at UIUC. It reused the rich library of unsorted unification algorithms modulo axioms of the CiME system , which could be called from Maude in an experimental version. Inside Maude, it used the order-sorted inference system to compute order-sorted unifiers developed by Joe Hendrix and described in .

This exogenous algorithm has been extensively used in a previous version of the Maude-NPA tool, and has been shown effective in finding attacks to cryptographic protocols modulo nontrivial equational theories of the form E Ax . The exogenous algorithm has also been extremely useful in testing the endogenous one. Because of the large number of unifiers generated and the complex nature of semantic unification algorithms, their testing is a nontrivial matter, and the automation of such testing is quite difficult. Thanks to the exogenous algorithm, and through the efforts of Ralf Sasse and Santiago Escobar, it has been possible to generate large numbers of random unification problems of different sizes in which the sets of unifiers generated by the exogenous and endogenous order-sorted unification algorithms have been automatically compared. This testing uncovered several bugs in an earlier alpha version of the Maude endogenous algorithm, and has also served to evaluate in practice the greater efficiency of the endogenous algorithm developed by Steven Eker.

### 12.8 Some notes on the implementation of unification

Order-sorted unification is NP-complete in general because Boolean algebra can be encoded as an order-sorted free theory signature and hence satisfiability can be reduced to an order-sorted free theory unification problem. In practice, reasonable performance can be obtained using a binary decision diagram technique to compute sorts for free variables occurring in unsorted unifiers. Furthermore in the AC case, sort information can be pushed into the unsorted unification algorithm and used to prune the Diophantine basis and the choice of subsets drawn from such a basis .

The unification theory combination framework and AC-unification algorithm are based on , while the Diophantine system solver used by the AC algorithm is based on . The unification algorithm has been thoroughly tested by Santiago Escobar and Ralf Sasse using CiME  as an oracle, and has shown better average performance than CiME on the same problems.

The addition of ACU to the theories handled by the dedicated unification algorithm in Maude required substantial changes to the unification infrastructure implemented in previous versions of Maude for C and AC theories because of the problems associated with collapse theories. In this section we give an overview of the techniques used and highlight a novel algorithm for selecting sets of Diophantine basis elements during the computation of ACU unifiers.

#### 12.8.1 Combining unification algorithms

The basic approach to solving unification problems where function symbols are drawn from more than one theory is variable abstraction where alien subterms, i.e., subterms headed by a symbol from a theory different from that of the top symbol of the parent term, are replaced by fresh variables to form pure unification subproblems which only involve variables and function symbols from a single theory and which can be passed to a unification algorithm for such a theory. Proving termination of combinations of algorithms is nontrivial, as variables are necessarily shared between theories and the unification of variables in one theory can create new unification subproblems in another theory, potentially ad infinitum. Stickel’s algorithm , which combined the AC and free theories, required an elaborate termination proof by Fages . Boudet et al.  proposed a much simpler approach where all unification subproblems and variable bindings in a given theory are solved (and re-solved if another subproblem in that theory is created) simultaneously. This method requires a simultaneous E-unification algorithm for each theory E and was the method implemented in Maude for C, AC, and prior to the addition of ACU.

Collapse theories add two major complications to the combination of unification algorithms. Firstly, theory clashes where two terms with top symbols from different theories are required to unify can no longer be treated as a failure, since if one of the top symbols belongs to a collapse theory, a collapse may occur, yielding solutions. Secondly, compound cycles, that is, problems of the form x1 = ?t1(,x2,),x2 = ?t2(,x3,),,xn = ?tn(,x1,) where the terms ti are pure in different theories, can no longer be treated as failure, since solutions may be possible via collapse.

Several authors have proposed combination schemes that can handle collapse theories. We use a simplified version of an algorithm due to Boudet . The original algorithm also handles nonregular theories but we omit that capability to simplify the implementation. The key idea is that each theory E needs a restricted simultaneous E-unification algorithm which solves the simultaneous unification problem for pure equations that are pure in E but where certain variables may be marked as only being allowed to unify with other variables. A theory clash subproblem f() = ?g(), is split into a disjunction of two subproblems each of which is a conjunction x = ?f() x = ?g() where x is a fresh variable. In one subproblem x is marked in the f equation and in the other subproblem x is marked in the g equation; either or both branches of the search may return solutions. Restricted unification is also used to break compound cycles. Because we do not handle nonregular theories, Boudet-style variable-elimination algorithms are unnecessary.

Boudet’s algorithm assumes that theories are disjoint, i.e., that they do not share function symbols. Because in Maude this is not quite true — identities can contain symbols from other theories — we need to handle a special kind of variable elimination. We illustrate the issue with the following example:

fmod UNIFICATION-CYCLE is
sort S .
vars X Y : S .
ops a b c d : -> S .
op f : S S -> S [assoc comm id: g(c, d)] .
op g : S S -> S [assoc comm id: f(a, b)] .
endfm

Maude> unify X =? f(Y, a, b) /\ Y =? g(X, c, d) .

Here the unification problem would already be in solved form but for the compound cycle formed by the X and Y variables. Restricted unification cannot break this cycle, since neither of the righthand sides can collapse out of their theory. However, putting Y = g(c, d) eliminates Y from the first equation yielding X = f(a, b) which eliminates X from the second equation, thus yielding a solution. This situation is somewhat pathological in Maude programs, and we do not really care about performance in its handling. Maude handles it by looking for this kind of cyclic dependency between theories when the signature is preprocessed and setting a flag so that a brute force variable elimination algorithm will be used to try and break compound cycles at unification time.

#### 12.8.2 Combining incomplete unification algorithms

The unification infrastructure now supports the notion of incomplete unification algorithms (see Section 12.4.6). This has several major components. Firstly the algorithm for combining unification algorithms now prioritizes theories that have complete algorithms and amongst those, prioritizes theories that are expected to have fewer partial solutions. The idea of this heuristic is to minimize branching and when a unification problem with an incomplete unification algorithm has to be solved, to have more information on shared variables (ideally they would be bound to terms in collapse-free theories), in order to minimize the likelihood of encountering an incomplete case. Secondly, when different equations on the same theory are given as a unification problem, equations with a lower likelihood of encountering an incomplete case are prioritized. Thirdly, when incompleteness does occur it is tracked through the various levels of unification calls so that appropriate warnings can be issued.

#### 12.8.3 Diophantine basis element selection

We solve restricted simultaneous ACU unification using an extension of the simultaneous AC unification algorithm in . For an ACU function symbol f we are presented with a set of flattened pure equations that take the form f(x1p1,,xnpn) = ?f(y1q1,,ymqm) or x1 = ?f(y1q1,,ymqm). Each f-equation yields a Diophantine equation p1X1 + + pnXn = q1Y 1 + + qmY m or respectively, X1 = q1Y 1 + + qmY m where the Xi’s and Y i’s are non-negative Diophantine variables. If an original variable is marked in some equation, the corresponding Diophantine variable receives an upper-bound of 1. Also, we may be able to obtain an upper-bound from order-sorting information, using the signature analysis technique in .

The general solution to a set of non-negative Diophantine equations is a set of basis elements from which all solutions can be obtained by linear combination. Upper-bound information may trivially eliminate some basis elements from consideration and can be used by the Diophantine solver to terminate the search for basis elements early.

A fresh variable zk is allocated for each basis element αk and unifiers are formed by finding sets of basis elements that satisfy certain properties and constructing assignments xi f(,zkαk,i,) where k ranges over the indices of the selected basis elements and αk,i is the value of Xi in the basis element αk.

The criteria for choosing the sets of basis elements is the key difference between AC unification, ACU unification, and restricted ACU unification. With AC unification, every selection of basis elements whose sum yields a nonzero value for each Xi and Y i must be considered. With ACU unification that requirement is lifted because of the availability of an identity element. The identity element also means that any assignment including basis element αk generalizes the same assignment with αk removed by assigning the identity element to zk and thus there is a single most general solution, formed by selecting all the basis elements.

In the case of restricted ACU unification, we may have upper-bounds on variables because they are marked. In Maude, order-sorted considerations may place upper-bounds on variables, and may also place a lower-bound of 1 on variables where the corresponding original variable has a sort that cannot take the identity element. In order to find a complete set of unifiers we need to find all maximal sets of basis elements whose sum satisfies the upper and lower-bounds on the variables.

Several explicit schemes for searching the subsets of basis elements were tried but the search was typically the dominant cost for ACU unification, often rendering the solution of quite modest unification problems impractical. In the current implementation this search is performed symbolically using a Binary Decision Diagram (BDD)  based algorithm. A BDD variable is allocated for each basis element, whose value, true or false, denotes whether the basis element is included in the subset. A BDD, called legal, is constructed, which evaluates to true on exactly those valuations that correspond to selections of basis elements that satisfy the upper- and lower-bound constraints on each Diophantine variable. Enforcement of the upper-bounds on the sum is done using dynamic programming and the BDD ite operation. Using the BDD legal, a second BDD, called maximal, is constructed which is true on exactly those valuations where legal is true, and changing a false into a true makes legal false. These valuations of the BDD variables and thus the subsets of basis elements they encode are then recovered by tracing the paths from the root to the true terminal in maximal. This method yielded a dramatic speed up (from hours to milliseconds) on problems of useful size.

### 12.9 Variants

As explained in Section 12.2, Maude 2.7.1 features order-sorted unification modulo axioms Ax, including associativity6 (A), commutativity (C), commutativity-identity (CU), associativity-commutativity (AC), associativity-commutativity-identity (ACU), and (left-, right- or two-sided) identity (Ul, Ur, U). However, order-sorted equational unification in full generality considers a decomposition of an equational theory (Σ,E Ax) into two problems: one of Ax-unification, and another of E Ax-unification that uses an Ax-unification algorithm as a subroutine. As explained in Section 12.6.1, algorithms for E Ax-unification have been extensively defined by using narrowing-based unification, where for a unification problem t = ?twe obtain the search space associated to narrowing the term eq(t,t) using E modulo Ax and search for all paths from eq(t,t) to the truth constant tt. However, we use the notion of variants of a term for generating such a narrowing search space.

Comon-Lundh and Delaune’s notion of variant  characterizes the instances of a term w.r.t. an equational theory E Ax such that the equations E are confluent, terminating, and coherent modulo axioms Ax. The E,Ax-variants of a term t are pairs (t), with θ a substitution and tthe E,Ax-canonical form of θ(t). A preorder relation of generalization that holds between such pairs provides a notion of most general variant and also of completeness of a set of variants. A complete set of E,Ax-variants (up to renaming) of a term t is a subset V of E,Ax-variants of t such that, for each substitution σ, there is a variant (t) V and a substitution ρ such that: (i) tis E,Ax-irreducible, (ii) σ(t)E,Ax = Axρ(t), and (iii) (σE,Ax)|Var(t) = Ax(θ;ρ)|Var(t).

In order to avoid clashing of algorithms and notions, we have decided that the equations used for variant generation (and variant-based unification) should be identifiable and clearly distinguished from standard equations in Maude. For this purpose we have defined a new attribute for equations: the word variant. This implies that if the user wants to use an equation t = t’ both for variant generation and for simplification, it should be duplicated: eq t = t’ . and eq t = t’ [variant] . No equation with the variant attribute can have the owise attribute. Note that what this allows is a greater flexibility at the operational level when combining variant generation and simplification: by the above method, an equation can be used for either purpose (declared only once in the appropriate way), or for both, by a double declaration.

For example, consider the following functional module defining the addition function _+_ on natural numbers built from 0 and s:

fmod NAT-VARIANT is
sort Nat .
op 0 : -> Nat [ctor] .
op s : Nat -> Nat [ctor] .
op _+_ : Nat Nat -> Nat .
vars X Y : Nat .
eq [base] : 0 + Y = Y [variant] .
eq [ind] :  s(X) + Y = s(X + Y) [variant] .
endfm

The term X + s(0) has an infinite number of variants w.r.t. those equations, i.e.,

• (s(0),{X 0}),
• (s(s(Y)),{X s(Y)}),
• (s(s(0)),{X s(0)}),
• (s(s(s(Y))),{X s(s(Y))}),
• (s(s(s(0))),{X s(s(0))}),

Indeed, there is no finite, complete, most general set of variants for that term. However, the term 0 + X has a finite number of most general variants w.r.t. those equations, i.e., (X,id). Obviously, there are many more variants, such as (0,{X 0}), but they are all instances of the most general one.

An equational theory E Ax has the finite variant property iff there is a finite complete set of most general variants for each term. This property also ensures the existence of a generic finitary E Ax-unification algorithm based on computing variants, as shown in Section 12.10. However, if a theory does not have the finite variant property, we are still able to incrementally enumerate all the variants of a term, as explained below in Section 12.9.3.

Consider the following equational theory for exclusive or.

fmod EXCLUSIVE-OR is
sorts Nat NatSet .
op 0 : -> Nat [ctor] .
op s : Nat -> Nat [ctor] .

subsort Nat < NatSet .
op mt : -> NatSet [ctor] .
op _*_ : NatSet NatSet -> NatSet [ctor assoc comm] .

vars X Y Z : [NatSet] .
eq [idem] :     X * X = mt    [variant] .
eq [idem-Coh] : X * X * Z = Z [variant] .
eq [id] :       X * mt = X    [variant] .
endfm

This theory has the finite variant property, as proved manually in  and automatically in . For instance, the term X * X has a finite set of most general variants, just (mt,id). And the term X * Y has also a finite, complete set of most general variants:

1.
(X * Y,id),
2.
(Z,{X mt,Y Z}),
3.
(Z,{X Z,Y mt}),
4.
(Z,{X Z * U,Y U}),
5.
(Z,{X U,Y Z * U}),
6.
(mt,{X U,Y U}), and
7.
(Z1 * Z2,{X U * Z1,Y U * Z2})

Note that if variable X in the equational theory is changed from sort [NatSet] to Nat (or [Nat]), then the theory does not have the finite variant property, since every pair of similar elements has to be separately eliminated, whereas now chunks of similar elements can be eliminated at once. Also, note that the symbol _*_ cannot be made ACU instead of AC, because then the equation X * X * Z = Z is not ACU-terminating.7

The finite variant property happens to be an undecidable problem . However, a new semi-decision procedure for checking the finite variant property has been developed which works well in practice: it has recently  been shown that, in order to prove the finite variant property for an equational theory (Σ,E Ax), it is enough to check, for each function symbol f Σ, whether or not each pattern of the form f(X1,,Xn) has a finite number of variants, where the Xi are distinct variables of the appropriate kind and n is the arity of f. This can be done by attempting to generate all the variants of f(X1,,Xn) as described in Section 12.9.2 below.

Variants are used for variant-based unification in Section 12.10, and such a variant-based unification is later used in Section 16.3 for symbolic reachability analysis. Before defining variant-based unification, in Section 12.9.1 we introduce the class of equational theories admissible for variant generation, and thus for variant-based unification. We also provide in Section 12.9.2 a command get variants for user generation of variants.

#### 12.9.1 Theories supported for variant generation

The equational theories that are admissible for variant generation are as follows. Let fmod,E Ax) endfm (resp. fth ,E Ax,R) endfth) be an order-sorted functional module (resp. functional theory) where E is a set of equations specified with the eq keyword and the attribute variant, and Ax is a set of axioms such that (Σ,Ax) satisfies the restrictions explained in Section 12.3. Furthermore, the equations E must satisfy the following extra conditions:

• The equations E are unconditional, confluent, terminating, sort-decreasing, and coherent modulo Ax (see also Section 15.5 for coherence details).
• An equation’s lefthand side cannot be a variable, and the owise feature is not allowed.

Any system module mod ,GE Ax,R) endm (of system theory th ,GE Ax,R) endth), where G is an additional set of equations (without the variant attribute!) and R is a set of rules, is also considered admissible for variant generation if the equational part (Σ,E Ax) satisfies the conditions described above. Note that when an equational theory (Σ,G E Ax) is entered into Maude, each equation in E (used for variant computation) must include the variant attribute. Note that equations in G do not have any restriction, i.e., they can be conditional equations, with the owise attribute, etc.

#### 12.9.2 The get variants command

Given a module ModId, Maude provides two variant generation commands of the form:

get variants [ ] in ModId : Term .
get irredundant variants [ ] in ModId : Term .

where n is an optional argument providing a bound on the number of variants requested, so that if the cardinality of the set of variants is greater than the specified bound, the variants beyond that bound are omitted; and, as usual, if no module is mentioned, the current module is used.

Maude allows an incremental generation of variants, as described in Section 12.9.3 below. When a theory does not have the finite variant property, an incremental generation of the (possibly infinite) set of most general variants would be returned by the first command get variants. However, the second command, get irredundant variants, is useful for theories that do have the finite variant property, since it will provide the set of most general variants of a term, which is the basis for variant-based unification in Section 12.10.

For example, we can check that the EXCLUSIVE-OR module above has the finite variant property by simply generating the variants for the exclusive-or symbol *.

Maude> get irredundant variants in EXCLUSIVE-OR : X * Y .

Variant #1
[NatSet]: #1:[NatSet] * #2:[NatSet]
X --> #1:[NatSet]
Y --> #2:[NatSet]

Variant #2
NatSet: mt
X --> %1:[NatSet]
Y --> %1:[NatSet]

Variant #3
[NatSet]: %1:[NatSet] * %3:[NatSet]
X --> %1:[NatSet] * %2:[NatSet]
Y --> %2:[NatSet] * %3:[NatSet]

Variant #4
[NatSet]: %1:[NatSet]
X --> %1:[NatSet] * %2:[NatSet]
Y --> %2:[NatSet]

Variant #5
[NatSet]: %2:[NatSet]
X --> %1:[NatSet]
Y --> %1:[NatSet] * %2:[NatSet]

Variant #6
[NatSet]: %1:[NatSet]
X --> mt
Y --> %1:[NatSet]

Variant #7
[NatSet]: %1:[NatSet]
X --> %1:[NatSet]
Y --> mt

No more variants.

The above example illustrates a difference between unifiers returned by the built-in unification modulo axioms and unifiers returned by variant generation or variant-based unification: two forms of fresh variables, the former #n:Sort and the new %n:Sort. The reasons for this distinction are irrelevant and connected to efficiency of algorithms. Both forms represent fresh variables and both share the same counter for new fresh variables. The user is required not to use variables of these two forms in submitted unification problems (either modulo axioms or variant-based). And when used at the metalevel, the counter for new fresh variables must take into account the numbers used for both forms of fresh variables.

Recall that memberships are discarded completely. For instance, we can modify the previous example to include a membership definition for a new sort Empty.

fmod EXCLUSIVE-OR-MB is
sorts Nat NatSet .
op 0 : -> Nat [ctor] .
op s : Nat -> Nat [ctor] .

subsort Nat < NatSet .
op mt : -> NatSet [ctor] .
op _*_ : NatSet NatSet -> NatSet [ctor assoc comm] .

sort Empty .
subsort Empty < NatSet .
mb mt : Empty .

vars X Y Z : [NatSet] .
eq [idem] :     X * X = mt    [variant] .
eq [idem-Coh] : X * X * Z = Z [variant] .
eq [id] :       X * mt = X    [variant] .
endfm

We can ask then for the variants of the exclusive-or symbol * when restricted to the sort Empty, and the results use the variant equations but not the membership information.

Maude> get irredundant variants in EXCLUSIVE-OR-MB : X:Empty * Y:Empty .

Variant #1
NatSet: #1:Empty * #2:Empty
X:Empty --> #1:Empty
Y:Empty --> #2:Empty

Variant #2
Empty: mt
X:Empty --> %1:Empty
Y:Empty --> %1:Empty

No more variants.

Note that the membership is used to compute the least sort of terms involved in the results (like the constant mt above), but is not used during variant generation. For example, this process is not able to instantiate any of the two variables to the constant mt.

Consider now the following version of the vending machine to buy apples (a) or cakes (c) with dollars (\$) and/or quarters (q). The reader can check that the only difference with the UNIF-VENDING-MACHINE module in Section 12.4 is the change equation, where we have added the attribute variant and a variable M to make it ACU-coherent (see Section 15.5 for details on ACU-coherence).

mod VARIANT-VENDING-MACHINE is
sorts Coin Item Marking Money State .
subsort Coin < Money .
op empty : -> Money .
op __ : Money Money -> Money [assoc comm id: empty] .
subsort Money Item < Marking .
op __ : Marking Marking -> Marking [assoc comm id: empty] .
op <_> : Marking -> State .
ops \$ q : -> Coin .
ops a c : -> Item .
var M : Marking .
rl [buy-c] : < M \$ > => < M c > .
rl [buy-a] : < M \$ > => < M a q > .
eq [change] : q q q q M = \$ M [variant] .
endm

Note that the change equation satisfies the finite variant property, as proved by generating the variants of symbol __.

Maude> get irredundant variants in VARIANT-VENDING-MACHINE : X:Marking Y:Marking .

Variant #1
State: #1:Marking #2:Marking
X:Marking --> #1:Marking
Y:Marking --> #2:Marking

Variant #2
State: %1:Marking %2:Marking
X:Marking --> q q q %1:Marking
Y:Marking --> q %2:Marking

Variant #3
State: %1:Marking %2:Marking
X:Marking --> q q %1:Marking
Y:Marking --> q q %2:Marking

Variant #4
State: %1:Marking %2:Marking
X:Marking --> q %1:Marking
Y:Marking --> q q q %2:Marking

We can also generate the variants of the state < \$ q q X:Marking > containing at least a dollar and two quarters.

Maude> get irredundant variants in VARIANT-VENDING-MACHINE : < \$ q q X:Marking > .

Variant #1
State: < \$ q q #1:Marking >
X:Marking --> #1:Marking

Variant #2
State: < \$ \$ %1:Marking >
X:Marking --> q q %1:Marking

These two variants form a finite, complete, and most general set of variants for the given term; for example, the variant

{< \$ \$ q q Y:Marking >, X:Marking --> q q q q Y:Marking}

is an instance of the first variant above, i.e., the canonical form < \$ \$ q q Y:Marking > is an instance of the normal form < \$ q q #1:Marking > of the first variant, and the (normalized version) of the instantiating substitution, i.e., #1:Marking --> \$ Y:Marking, is an instance of the empty substitution of the first variant.

We can consider a more complex equational theory such as the one of Abelian groups specified in the following module.

fmod ABELIAN-GROUP is
sorts Element .
op _+_ : Element Element -> Element [comm assoc prec 30] .
op -_ : Element -> Element [prec 20] .
op 0 : ->  Element .
vars X Y Z : Element  .
eq X + 0 = X [variant] .
eq X + - X = 0 [variant] .
eq X + - X + Y = Y [variant] .
eq - - X = X [variant] .
eq - 0 = 0 [variant] .
eq - X + - Y = -(X + Y) [variant] .
eq -(X + Y) + Y = - X [variant] .
eq -(- X + Y) = X + - Y [variant] .
eq - X + - Y + Z  = -(X + Y) + Z [variant] .
eq -(X + Y) + Y + Z = - X + Z [variant] .
endfm

The generation of the variants for the addition symbol takes more time and provides 47 variants:

Maude> get irredundant variants in ABELIAN-GROUP : X + Y .

Variant #1
Element: #1:Element + #2:Element
X --> #1:Element
Y --> #2:Element

Variant #2
Element: %1:Element
X --> 0
Y --> %1:Element
...

Variant #46
Element: %2:Element + - (%3:Element + %4:Element)
X --> %5:Element + - (%1:Element + %3:Element)
Y --> %1:Element + %2:Element + - (%4:Element + %5:Element)

Variant #47
Element: - (%2:Element + %3:Element)
X --> %4:Element + - (%1:Element + %2:Element)
Y --> %1:Element + - (%3:Element + %4:Element)

And the minus sign symbol has four variants:

Maude> get irredundant variants in ABELIAN-GROUP : - X .

Variant #1
Element: - #1:Element
X --> #1:Element

Variant #2
Element: %1:Element
X --> - %1:Element

Variant #3
Element: 0
X --> 0

Variant #4
Element: %1:Element + - %2:Element
X --> %2:Element + - %1:Element

#### 12.9.3 Incremental variant generation

Another interesting feature is that variant generation is incremental and in this way we are able to give partial support to theories that do not have the finite variant property. Let us consider the functional module for addition NAT-VARIANT in Section 12.9 that does not have the finite variant property. On the one hand, it is possible to have a term with a finite number of most general variants although the theory does not have the finite variant property. For instance, the term s(0) + X is simplified into s(X).

Maude> get variants in NAT-VARIANT : s(0) + X:Nat .

Variant #1
Nat: s(#1:Nat)
X:Nat --> #1:Nat

On the other hand, we can approximate the number of variants of a term that we suspect does not have a finite number of most general variants. For instance, the term X + s(0) has an infinite number of most general variants and we can approximate that infinite set of variants by including a bound in the command, as it is also done for unification modulo axioms.

Maude> get variants  in NAT-VARIANT : X:Nat + s(0) .

Variant #1
Nat: #1:Nat + s(0)
X:Nat --> #1:Nat

Variant #2
Nat: s(0)
X:Nat --> 0

...

Variant #10
Nat: s(s(s(s(s(0)))))
X:Nat --> s(s(s(s(0))))

Note that typing get irredundant variants in the previous command would force Maude to generate the infinite set of variants before collecting the ten most general ones.

This incremental variant generation may be useful in applications where obtaining a complete set of variants when possible is important. If we do not know a priori whether a term has a finite number of most general variants, we can incrementally increase the bound and if we obtain a number of variants smaller than the bound, we know for sure that it had a finite number of most general variants.

When an equational theory satisfies the requirements in Section 12.9.1 and does not have the finite variant property, it is because there are terms with an infinite set of most general variants. Of course, if the user does not provide a bound, Maude does not stop because it always returns a complete set of variants. However, when the user provides such a bound, the process will always terminate with Maude returning a finite set of variants. As said above, the number of returned variants can be smaller than the given bound when the term has indeed a finite set of variants, but it will coincide with the bound otherwise.

#### 12.9.4 Variant generation in incomplete unification examples

The unification infrastructure now supports the notion of incomplete unification algorithms (see Section 12.4.6) and variant generation and variant unification have been updated.

Let us consider an equational theory combining variant equations and associativity.

fmod VARIANT-UNIFICATION-ASSOC is
protecting NAT .
sort NList .
subsort Nat < NList .

op _:_ : NList NList -> NList [assoc ctor] .

var E : Nat .
var L : NList .

ops tail prefix : NList ~> NList .
ops head last : NList ~> Nat .
eq head(E : L) = E [variant] .
eq tail(E : L) = L [variant] .
eq prefix(L : E) = L [variant] .
eq last(L : E) = E [variant] .

op duplicate : NList ~> Bool .
eq duplicate(L : L) = true [variant] .
endfm

Some terms have a finite set of most general variants modulo associativity.

Maude> get variants in VARIANT-UNIFICATION-ASSOC : head(prefix(tail(L))) .

Variant #1
L --> #1:NList

Variant #2
L --> %1:Nat : %2:NList

Variant #3
L --> #1:Nat : #2:NList : #3:Nat

Variant #4
Nat: %3:Nat
L --> %1:Nat : %3:Nat : %4:NList : %2:Nat

No more variants.

However, some terms may hit incomplete associative unification calls (see Section 12.4.6), and an incompleteness warning for associative unification will be printed.

Maude> get variants in VARIANT-UNIFICATION-ASSOC :
duplicate(prefix(L) : tail(L)) .

Variant #1
[Bool]: duplicate(prefix(#1:NList) : tail(#1:NList))
L --> #1:NList

Variant #2
[Bool]: duplicate(%1:NList : tail(%1:NList : %2:Nat))
L --> %1:NList : %2:Nat

Variant #3
[Bool]: duplicate(prefix(%1:Nat : %2:NList) : %2:NList)
L --> %1:Nat : %2:NList

Variant #4
[Bool]: duplicate(#1:Nat : #2:NList : #2:NList : #3:Nat)
L --> #1:Nat : #2:NList : #3:Nat

Variant #5
[Bool]: duplicate(#1:Nat : #2:Nat)
L --> #1:Nat : #2:Nat

Warning: Unification modulo the theory of operator _:_ has encountered
an instance for which it may not be complete.
Variant #6
Bool: true
L --> %1:Nat : %1:Nat : %1:Nat

Variant #7
Bool: true
L --> %1:Nat : %1:Nat

No more variants.
Warning: Some variants may have been missed due to incomplete unification algorithm(s).

Note that the term duplicate(prefix(L) : tail(L)) has an infinite set of most general variants for the case of returning the variant term true, i.e., the family of substitutions {L:NList N:Nat  : :  N:Nat}. This is due to the associative unification call (N:Nat:L:NList) = ?(L:NList:N:Nat) invoked internally by the variant generation process.

#### 12.9.5 Variant generation at the metalevel: getVariants

The procedure for variant generation is also available at the metalevel of Maude thanks to the metaGetVariant and metaGetIrredundantVariant functions provided in the META-LEVEL module.

op metaGetVariant : Module Term TermList Nat Nat ~> Variant?
[special (...)] .
op metaGetIrredundantVariant : Module Term TermList Nat Nat ~> Variant?
[special (...)] .

The third argument is not used right now and is left for future use; thus, for the time being, the constant empty must be used. As in Section 12.5, it is convenient to reuse variable names from terms; this is allowed via the fourth argument, which is the largest number n appearing in fresh variables of the form #n:Sort or %n:Sort. Then the fresh variables in the computed variants will all be numbered from n + 1 on. And, as usual for descent functions, the last argument in the function is used to select which result is wanted, starting from 0. Caching is used so that if variant i has just been returned, requesting unifier i + 1 gives rise to an incremental computation.

The result sort is defined by means of the following data:

sorts Variant Variant? .
subsort Variant < Variant? .
op {_,_,_,_,_} : Term Substitution Nat Parent Bool -> Variant [ctor] .
op noVariant : -> Variant? [ctor] .
op noVariantIncomplete : -> Variant? [ctor] .

Again, the third argument denotes the largest number n used in the fresh variables appearing in the solutions. The fourth and fifth arguments are useful for applications based on the execution narrowing tree rather than the set of variants.

We can illustrate the use of this metalevel function with the variant generation of the configuration < \$ q q X:Marking Y:Marking> for the first variant.

Maude> reduce in META-LEVEL :
metaGetVariant(upModule(’VARIANT-VENDING-MACHINE, true),
’<_>[’__[’\$.Coin,’q.Coin,’q.Coin,’X:Marking,’Y:Marking]], empty, 0, 0) .
result Variant: {’<_>[’__[’\$.Coin,’q.Coin,’q.Coin,’#1:Marking,’#2:Marking]],
’X:Marking <- ’#1:Marking ;
’Y:Marking <- ’#2:Marking,2,none,false}

Then the second possible variant:

Maude> reduce in META-LEVEL :
metaGetVariant(upModule(’VARIANT-VENDING-MACHINE, true),
’<_>[’__[’\$.Coin,’q.Coin,’q.Coin,’X:Marking,’Y:Marking]], empty, 0, 1) .
result Variant: {’<_>[’__[’\$.Coin,’\$.Coin,’%1:Marking,’%2:Marking]],
’X:Marking <- ’__[’q.Coin,’q.Coin,’%1:Marking] ;
’Y:Marking <- ’%2:Marking,2,0,true}

Then the third possible variant:

Maude> reduce in META-LEVEL :
metaGetVariant(upModule(’VARIANT-VENDING-MACHINE, true),
’<_>[’__[’\$.Coin,’q.Coin,’q.Coin,’X:Marking,’Y:Marking]], empty, 0, 2) .
result Variant: {’<_>[’__[’\$.Coin,’\$.Coin,’%1:Marking,’%2:Marking]],
’X:Marking <- ’__[’q.Coin,’%1:Marking] ;
’Y:Marking <- ’__[’q.Coin,’%2:Marking],2,0,true}

Then the fourth possible variant:

Maude> reduce in META-LEVEL :
metaGetVariant(upModule(’VARIANT-VENDING-MACHINE, true),
’<_>[’__[’\$.Coin,’q.Coin,’q.Coin,’X:Marking,’Y:Marking]], empty, 0, 3) .
result Variant: {’<_>[’__[’\$.Coin,’\$.Coin,’%1:Marking,’%2:Marking]],
’X:Marking <- ’%1:Marking ;
’Y:Marking <- ’__[’q.Coin,’q.Coin,’%2:Marking],2,0,false}

Then the fifth possible variant:

Maude> reduce in META-LEVEL :
metaGetVariant(upModule(’VARIANT-VENDING-MACHINE, true),
’<_>[’__[’\$.Coin,’q.Coin,’q.Coin,’X:Marking,’Y:Marking]], empty, 0, 4) .
result Variant: {’<_>[’__[’\$.Coin,’\$.Coin,’\$.Coin,’#1:Marking,’#2:Marking]],
’X:Marking <- ’__[’q.Coin,’q.Coin,’q.Coin,’#1:Marking] ;
’Y:Marking <- ’__[’q.Coin,’q.Coin,’q.Coin,’#2:Marking],2,1,false}

And there are no more variants.

Maude> reduce in META-LEVEL :
metaGetVariant(upModule(’VARIANT-VENDING-MACHINE, true),
’<_>[’__[’\$.Coin,’q.Coin,’q.Coin,’X:Marking,’Y:Marking]], empty, 0, 5) .
result Variant?: noVariant

Using the fourth and fifth arguments of each returned variant, we can reconstruct the execution narrowing tree of Figure 12.1. The fourth argument of each variant is the identifier of the parent variant; the identifier of each variant is indeed the last argument of its associated call to metaGetVariant. The fifth argument is a Boolean: true meaning that there is at least one other variant in that level of the narrowing tree, and false meaning that this is the last one in that level of the narrowing tree. Note that variants return the whole composed substitution and the intermediate unifier shown in Figure 12.1 between variants 1 and 4 has to be extracted manually. Figure 12.1: Folding variant narrowing tree for the term < \$ q q X Y >.

Let us also show an example of an incomplete variant generation at the metalevel. If we move to the metalevel the incomplete variant generation for term duplicate(prefix(L) : tail(L)) of Section 12.9.4, we get the first variant:

Maude> reduce in META-LEVEL :
metaGetVariant(upModule(’VARIANT-UNIFICATION-ASSOC, true),
’duplicate[’_:_[’prefix[’L:NList],’tail[’L:NList]]], empty, 0, 0) .
result Variant: {’duplicate[’_:_[’prefix[’#1:NList],’tail[’#1:NList]]],
’L:NList <- ’#1:NList,1,none,false}

And when we ask for the eighth variant, we get the constant noVariantIncomplete:

Maude> reduce in META-LEVEL :
metaGetVariant(upModule(’VARIANT-UNIFICATION-ASSOC, true),
’duplicate[’_:_[’prefix[’L:NList],’tail[’L:NList]]], empty, 0, 7) .
result Variant?: noVariantIncomplete

Note that we got the constant noVariantIncomplete instead of the noVariant constant, which is the output for the case of a finitary set of variants.

### 12.10 Variant-based equational order-sorted unification

The intimate connection between E,Ax-variants and E Ax-unification is as follows. Suppose that we extend the equational theory (Σ,E Ax) to ( , Ax) by adding to Σ a new sort Truth, not related to any sort in Σ, with a constant tt, and, for each top sort of a connected component [s], an operator eq : [s] [s] -> Truth; and where is the result of adding for each top sort [s] an extra (oriented) equation eq(x,x) = tt (where x is a variable of sort [s]) to E. Then, given any two terms t,t, if θ is an E Ax-unifier of t and t, then the E,Ax-canonical forms of θ(t) and θ(t) must be Ax-equal and therefore the pair (tt) must be a variant of the term eq(t,t). Furthermore, if the term eq(t,t) has a finite set of most general variants, then we are guaranteed that the set of most general E Ax-unifiers of t and tis finite.

At a practical level, variants are generated using narrowing (see Chapter 16 for narrowing capabilities in Full Maude). Narrowing with oriented equations E (with or without modulo Ax) enjoys well-known completeness results, including the generation of complete sets of unifiers and covering all rewriting sequences from instances of a term using normalized substitutions (i.e., variants). For instance,  showed that narrowing with E without axioms enjoyed good completeness results, and  showed that narrowing with E modulo axioms Ax enjoyed also good completeness results. But narrowing can be quite inefficient, generating a huge search space, and different narrowing strategies have been devised to reduce the search space while remaining complete, specially for unification purposes (see  for a survey on narrowing termination). The basic narrowing strategy of  provided a restriction of narrowing that, while being complete, it was terminating for specific classes of theories. However, very little was known about effective narrowing strategies in the modulo case, and some of the known anomalies ring a cautionary note, to the effect that the naive extensions of standard narrowing strategies, for example basic narrowing modulo AC, were incomplete . In , the folding variant narrowing strategy is defined for the modulo case and it is proved to be complete for variants and with good termination properties, providing a finitary and complete unification algorithm for equational unification for the theories described in Section 12.9.1 that also satisfy the finite variant property. Moreover, it is even better than the basic narrowing strategy in the case without axioms, since it can terminate for equational theories where basic narrowing cannot (see ).

In Maude 2.6, variant generation and variant-based equational unification were implemented in Maude and made available in Full Maude. Instead, in Maude 2.7 variant generation and variant-based equational unification have been implemented in Core Maude’s C++ level for efficiency purposes and using the Ax-unification algorithm described in Section 12.4. Furthermore, the variant generation and variant-based equational unification available in Maude 2.6 were accepting only equational theories where the righthand side of the equations was a strongly irreducible term (e.g., a variable or a constant), while the current version implements the folding variant narrowing strategy in full generality. By “full generality” we mean not just any equational theory (Σ,E Ax) having the finite variant property with Ax satisfying the requirements in Section 12.3 (so that unification is finitary), but any confluent, coherent, and terminating modulo Ax decomposition (Σ,Ax,E), thus obtaining an incremental generation for the (in general infinite) set of E Ax-unifiers in that case.

The key distinction, now supported for the first time in Maude in full generality, is one between dedicated unification algorithms for a limited set of axioms Ax and generic unification algorithms which can be applied to a much wider range of user-definable theories and can even deal with incremental generation of infinite sets of unifiers.

#### 12.10.1 The variant unify command

Given a module ModId, of the general form mod ,GE Ax,R) endm where (Σ,E Ax) satisfies the requirements of Section 12.9.1 and satisfies also the finite variant property, Maude provides a command for E Ax-equational unification based on variant generation of the form:

variant unify [ ] in ModId :
Term-1 =? Term’-1 /\ ... /\ Term-k =? Term’-k .

where k 1; n is an optional argument providing a bound on the number of unifiers requested, so that if the cardinality of the set of unifiers is greater than the specified bound, the unifiers beyond that bound are omitted; and, as usual, if no module is mentioned, the current module is used.

Consider again the module VARIANT-VENDING-MACHINE introduced in Section 12.9.2. We can ask whether there is an E Ax-equational unifier of two configurations, one containing at least two quarters, and another containing at least one dollar, by invoking the following command:

Maude> variant unify in VARIANT-VENDING-MACHINE :
< q q X:Marking > =? < \$ Y:Marking >  .

Unifier #1
X:Marking --> \$ %1:Marking
Y:Marking --> q q %1:Marking

Unifier #2
X:Marking --> q q #1:Marking
Y:Marking --> #1:Marking

There are no most general unifiers; for instance, the unifier X:Marking --> q q, Y:Marking --> empty is an instance of the first solution by using the identity property of the operator for markings.

And we can ask again whether there is an E Ax-equational unifier of the two previous configurations but variable Y:Marking also appears in a second unification problem with two new configurations, one containing again at least two quarters, and another with just variable Y:Marking.

Maude> variant unify in VARIANT-VENDING-MACHINE :
< q q X:Marking > =? < \$ Y:Marking > /\
< q q Z:Marking > =? < Y:Marking >  .

Unifier #1
X:Marking --> \$ %1:Marking
Z:Marking --> %1:Marking
Y:Marking --> q q %1:Marking

Unifier #2
X:Marking --> \$ q q %1:Marking
Z:Marking --> q q %1:Marking
Y:Marking --> \$ %1:Marking

#### 12.10.2 Incremental variant unification

Similarly to the incremental generation of variants in Section 12.9.3, one can obtain an incremental number of unifiers for a given unification problem. Let us consider again the NAT-VARIANT module in Section 12.9 that does not have the finite variant property. On the one hand, it is possible to have a finite number of most general unifiers for a unification problem although the theory does not have the finite variant property. For instance, the unification problem between s(0) + X and s(s(s(0))) returns just one unifier.

Maude> variant unify in NAT-VARIANT : s(0) + X:Nat =? s(s(s(0)))  .

Unifier #1
X:Nat --> s(s(0))

No more unifiers.

On the other hand, we can approximate the number of unifiers of a unification problem that we suspect does not have a finite number of most general unifiers. For instance, the unification problem between terms X + s(0) and s(s(s(0))) has only one solution X s(s(0)) and we can obtain that solution by including a bound in the command, as it is also done for variant generation.

Maude> variant unify  in NAT-VARIANT : X:Nat + s(0) =? s(s(s(0)))  .

Unifier #1
X:Nat --> s(s(0))

However, if we tried to obtain two unifiers, Maude would not stop because it would keep trying to generate a second unifier for a unification problem that has only one unifier, without knowing that it could stop. This differs from the incremental generation of variants (Section 12.9.3), where we can incrementally increase the bound even if the theory does not have the finite variant property and Maude will always stop with a new variant. The problem here is due to the fact that Maude needs to compute the set of variants before computing the set of unifiers; when the equational theory does not have the finite variant property, such a set of variants can be infinite and Maude is not able to complete this computation, even when the user provides a bound, because such a bound refers to the number of requested unifiers, but not to the number of variants, which in this process is just part of the internal process for computing unifiers. As in the example above, this can happen even when there are no further unifiers.

#### 12.10.3 Variant unification in incomplete unification examples

Similarly to the incomplete variant generation in Section 12.9.4, we can have variant unification calls that cannot provide a finitary set of most general unifiers. Let us consider again the VARIANT-UNIFICATION-ASSOC theory of Section 12.9.4.

We can force unification calls, within variant unification, to an associative unification problem with an infinite set of most general unifiers, e.g., the previously described associative unification problem N:Nat:L:NList = ?L:NList:N;Nat with the family of substitutions {L:NList N:Nat  : :  N:Nat}:

Maude> variant unify in VARIANT-UNIFICATION-ASSOC :
head(L) =? last(L) /\ prefix(L) =? tail(L)  .

Warning: Unification modulo the theory of operator _:_ has encountered
an instance for which it may not be complete.

Unifier #1
L --> %1:Nat : %1:Nat : %1:Nat

Unifier #2
L --> %1:Nat : %1:Nat

No more unifiers.
Warning: Some unifiers may have been missed due to incomplete unification algorithm(s).

Notice that the unification problem head(L) =? last(L) /\ prefix(L) =? tail(L) has the same solutions as the unification problem N:Nat : L:NList =? L:NList : N:Nat.

#### 12.10.4 Variant unification at the metalevel: metaVariantUnify and metaVariantDisjointUnify

The procedure for variant-based equational unification is also available at the metalevel by means of the following functions provided in the META-LEVEL module.

op metaVariantUnify :
Module UnificationProblem TermList Nat Nat ~> UnificationPair?
[special (...)] .

op metaVariantDisjointUnify :
Module UnificationProblem TermList Nat Nat ~> UnificationTriple?
[special (...)] .

The unification problems and the result sort are the same as in Section 12.5.

sorts UnificandPair UnificationProblem .
subsort UnificandPair < UnificationProblem .
op _=?_ : Term Term -> UnificandPair [ctor prec 71] .
op _/\_ : UnificationProblem UnificationProblem -> UnificationProblem
[ctor assoc comm prec 73] .

subsort UnificationPair < UnificationPair? .
subsort UnificationTriple < UnificationTriple? .
op {_,_} : Substitution Nat -> UnificationPair [ctor] .
op {_,_,_} : Substitution Substitution Nat -> UnificationTriple [ctor] .

We can illustrate the use of this metalevel function with the variant unification of the two terms of Section 12.10.1: < q q X:Marking > and < \$ Y:Marking >:

Maude> reduce in META-LEVEL :
metaVariantUnify(upModule(’VARIANT-VENDING-MACHINE, true),
’<_>[’__[’q.Coin,’q.Coin,’X:Marking]] =?
’<_>[’__[’\$.Coin,’Y:Marking]],
empty, 0, 0) .
result UnificationPair: {
’X:Marking <- ’__[’\$.Coin,’%1:Marking] ;
’Y:Marking <- ’__[’q.Coin,’q.Coin,’%1:Marking],1}

Let us also illustrate the use of incomplete variant unification by moving to the metalevel the incomplete unification problem of Section 12.10.3: head(L:NList) =? last(L:NList) /\ prefix(L:NList) =? tail(L:NList).

Maude> reduce in META-LEVEL :
metaVariantUnify(upModule(’VARIANT-UNIFICATION-ASSOC, true),
’prefix[’L:NList] =? ’tail[’L:NList], empty, 0, 0) .

Warning: Unification modulo the theory of operator _:_ has encountered
an instance for which it may not be complete.

result UnificationPair: {
’L:NList <- ’_:_[’%1:Nat,’%1:Nat,’%1:Nat],1}

And when we try to obtain the third unifier we get the constant noUnifierIncomplete.

Maude> reduce in META-LEVEL :
metaVariantUnify(upModule(’VARIANT-UNIFICATION-ASSOC, true),