Logic Programming and the Andorra Kernel Language

Logic Programming

Skip this bit, if you already know something about logic programming. The introduction is intended for those who don't know anything about the subject.

Logic Programming is not, as you might expect, a very low-level part of computer science, working with gates. Logic Programming is rather an attempt to use predicate logic as the basis of a programming language, leading to some very high level languages.

Declarative Programming

Logic programming languages tend to be declarative programming languages, as opposed to the more traditional imperative programming languages, such as C or Pascal.

In a declarative language, you concentrate on the relationships between the parts of the problem, rather than on the exact sequence of steps required to solve the problem. An example of a well-known declarative language is SQL; SQL queries are translated by the database engine into various lookup strategies, where the presence of indexes and the sizes of various tables and views are taken into account. All this detail doesn't matter too much to the programmer1, who is only interested in the result of the query, rather than the exact steps taken to retrieve the result.

Imperative languages, in contrast to their declarative brethren, expect the programmer to provide an exact description of the steps required to solve some problem.

In a logic programming language, what is declared is the logical relationship between various entities. After that, if you supply some sort of query, the logic programming language can deduce the results of the query by tracing its path through the relationship.

An Example

As an example of how a set of logical statements can be used as a programming language, here is the ancestor relationship expressed logically:

Sue is-parent-of Bill.
Sue is-parent-of James.
Sue is-parent-of Edith.
Fred is-parent-of Bill.
Fred is-parent-of James.
Authur is-parent-of Edith.
Mary is-parent-of Kylie.
Mary is-parent-of Jason.
Mary is-parent-of Matilda.
James is-parent-of Kylie.
James is-parent-of Jason.
James is-parent-of Matilda.
Edith is-parent-of David.
William is-parent-of David.

X is-ancestor-of Y if
	X is-parent-of Y
X is-ancestor-of Y if
	there is some Z such that
	X is-parent-of Z and
	Z is-ancestor-of Y

This example contains two relationships: is-parent-of and is-ancestor-of. These relationships are called predicates.

The simpler predicate is is-parent-of. All the statements which make up is-parent-of are simple facts and can be easily interpreted. We can easily ask "Is James the parent of Matilda?" and, since James is-parent-of Matilda is present in the above list, the question will be answered with a "true". If we ask "Is James the parent of David?" then, since James is-parent-of David is not in the list, we will be answered with a "false"2.

is-ancestor-of is a more complex predicate, since it expresses the is-ancestor-of relationship as a pair of rules. Logically, the first relationship can be read as "Someone is the ancestor of somebody else if they are that person's parent." This is the simplest expression of what it means to be sombody's ancestor. The second part of the is-ancestor-of relationship is more complex. This part can be read as "Somebody is the ancestor of somebody else if there is another person who is the child of the first person, and this third person is the ancestor of the second person." If this is a little confusing, try this example: my grandmother is my ancestor, because she is the parent of my father and my father is my ancestor, because he is my parent.

Directionless Programming

There are four ways in which we may want to use the above example:

  • We can ask it a question. Eg. Sue is-ancestor-of David? is true (through the line Sue-Edith-David).
  • We can ask it who the ancestors of somebody are. Eg. Who is-ancestor-of David? (Edith, William, Authur and Sue).
  • We can ask it who the descendants of somebody is. Eg. Edith is-ancestor-of Who? (David).
  • We can ask it for the complete set of ancestor relationships. Eg. Who is-ancestor-of Whom?
The way in which we have declared is-ancestor-of allows you to ask any of these questions. It's up to the logic programming language itself to provide you with all the answers.

Logic Programming Resources

Here is a list of useful URLs concerned with logic programming:

Virtual Library - Logic Programming
The WWW Virtual Library logic programming directory.
Database Systems and Logic Programming
A fairly complete search site for papers on logic programming.
The Journal of Functional and Logic Programming
An online journal.

Prolog

Prolog (Programming in Logic) is the first logic programming language and still the most common. Prolog now has a large number of commercial and free implementations.

Syntax

A Prolog predicate is written as name(arg1, ..., argn). For example, the parent relationship shown in the example would be written as parent(sue, bill). Predicate names start with lower-case letters3

The arguments to predicates are constructed from terms.. The most primitive form of term is a constant which is represented by a word with a lower-case initial letter, eg. sue, walter, binary. Constants starting with upper-case letters or containing punctuation need to be surrounded by single quotes, eg. 'Sue', '#something', 'ancestor(sue, bill)'. Numbers are also constants.

The next most primitive form of term is a variable. Variables are represented by words beginning with an upper-case letter, or an underscore, eg. X, Alpha1, _IgnoreMe. Variables beginning with an underscore are anonymous variables; throw-away variables used as place-holders.

Complex terms are built from function symbols, consisting of a named function and some arguments in brackets, eg. g(X, Y), alpha(sue, matter), xor(1, 1), s(s(s(0))). In some cases, function symbols can be used as infix operators, eg. X + Y = '+'(X, Y).

Lists are used extensively in Prolog and have a special syntax. A list is represented by [ elt1, ..., eltn ] eg. [1, 2, a, b, f(a, Y)] is a list with 5 elements. The empty list is represented as []. Lists are separated into head and tail elements by using the | symbol. [ elt1, ..., eltn | T ] is a list where the first n elements are specified, and the tail of the list is T. Eg. [a, b, f(a, b), 1, 6, g(2)] = [X, Y | Z] will give X = a, Y = b and Z = [f(a, b), 1, 6, g(2)]. In some code, a dot notation is used to separate head and tail; X.Y = [X | Y].

Predicates are defined by either stating facts, eg. parent(sue, bill) or rules, eg. ancestor(X, Y) :- parent(X, Z), ancestor(Z, Y). In rules, the ':-' reads as an if and the comma reads as an and. Each fact or rule in a predicate is called a clause.

Queries in Prolog are expressed as ?- Goal1, ..., Goaln. Eg. ?- parent(sue, X), ancestor(fred, X)

The Example in Prolog

The ancestor example, written in Prolog would look like:

parent(sue, bill).
parent(sue, james).
parent(sue, edith).
parent(fred, bill).
parent(fred, james).
parent(arthur, edith).
parent(mary, kylie).
parent(mary, jason).
parent(mary, matilda).
parent(james, kylie).
parent(james, jason).
parent(james, matilda).
parent(edith, david).
parent(william, david).

ancestor(X, Y) :-
	parent(X, Y).
ancestor(X, Y) :-
	parent(X, Z), ancestor(Z, Y).

An example query would be ?- ancestor(fred, david), "Is fred the ancestor of david" which would return a no.

Another example query would be ?- parent(sue, X), ancestor(fred, X), "Is sue a parent of somebody who also has fred as an ancestor?" In this case, Prolog will respond with X=bill ?; you may respond with a return to indicate that you are satisfied with this answer. If you respond with a ; then the system will search for another answer and return with X=james ?. If you respond with another ;, then you will get the response no indicating that there are no more answers.

Prolog's Search Strategy

When Prolog is searching for an answer to a query, it starts by choosing one of the goals that it has, to see if it can find an answer to that goal. Eg. if the query were ?- parent(sue, X), ancestor(X, david), then Prolog would choose parent(sue, X) as the first goal to try.

Once Prolog has chosen a goal, it starts to try and match the clauses of the goal, top-to-bottom until it finds one that fits. In the above example, parent(sue, bill) matches parent(sue, X) if X was bound to bill.

The Prolog search rule now tries the ancestor(X, david) part of the query, only, since X has been bound to bill the query is now really ancestor(bill, david).

The first clause of ancestor is now tried. This clause simply tries parent(bill, david). There is no parent(bill, david) clause, so the program backtracks. When a Prolog program backtracks, it moves back to the next possible choice of clause that it has and tries that clause instead. Any variables that have been bound after that choice are undone, leaving them ready to try something else. In the current case, there are no variables to be unbound - X was bound before we started choosing clauses of ancestor.

The program now tries the next clause of ancestor, which leads to two further goals, parent(bill, Z), ancestor(Z, david). Prolog picks the leftmost goal again and tries parent(bill, Z).

Eventually, ancestor(bill, david) will fail. At this point, the program backtracks, unbinding X and choosing the next clause from parent(sue, X), which will lead to X being bound to james and the whole thing starting off again.

After another round of backtracking, X will be bound to edith and ancestor(edith, david) will succeed.

The Prolog search strategy, therefore, always picks the leftmost, innermost goal to try next and always tries clauses in top to bottom order. This strategy is called a depth-first search strategy, as it always searches down the possible goals, rather than across them. The depth-first strategy is very simple and efficient to implement, but can lead to some very odd infinite loops. For example:

p(a) :- p(Z).

q(b).

?- p(X), q(X)

In this example ?- p(X), q(X) should fail, since X will be bound to a by p and to b by q. However, the p(X) goal will go into an infinite loop, and q(X) will never be tried4.

Cut

In ordinary use of Prolog, you are interested in any possible answer that can be computed. Occasionally, however, you want just a few answers, and then wish to cut away any further answers. This process is normally referred to as pruning.

In Prolog the cut operator, written as a ! allows pruning. When a cut operator is encountered in a clause, all possible alternatives below the clause are removed, as are any alternatives which have been computed to the left of the cut.

An example of the use of cut is where you have a series of alternatives, with a default at the end. If you encounter an alternative, then you want to ignore the default. As an example, the following table works out whether a foreign exchange deal is a spot deal or not5:

dealType(Date, Today, cad, _Currency2, spot) :-
	addDays(Today, 1, Date),
	!.
dealType(Date, Today, _Currency1, cad, spot) :-
	addDays(Today, 1, Date),
	!.
dealType(Date, Today, _Currency1, _Currency2, spot) :-
	addDays(Today, 2, Date),
	!.
dealType(_Date, _Today, _Currency1, _Currency2, outright).

addDays(date1, n, date2) is defined to be true when Date2 is n days ahead of Date16. If we assume that dates are represented by a term: date(day, month, year), then we could define addDays in this way:

addDays(date(Day, Month, Year), Days, Date) :-
	plus(Day, Days, Day1),
	normalizeDate(date(Day1, Month, Year), Date).

normalizeDate(date(Day, Month, Year), Date) :-
	Day =< 0,
	!,
	nextMonthYear(Month1, Year1, Month, Year),
	daysInMonth(Month1, Year1, DIM),
	plus(Day, DIM, Day1),
	normalizeDate(date(Day1, Month1, Year1), Date).
normalizeDate(date(Day, Month, Year), Date) :-
	daysInMonth(Month, Year, DIM),
	Day > DIM,
	!,
	nextMonthYear(Month, Year, Month1, Year1),
	plus(Day1, DIM, Day),
	normalizeDate(date(Day1, Month1, Year1), Date).
normalizeDate(Date, Date).

daysInMonth(jan, _Year, 31).
daysInMonth(feb, Year, 29) :-
	isLeapYear(Year), !.
daysInMonth(feb, Year, 28).
daysInMonth(mar, _Year, 31).
daysInMonth(apr, _Year, 30).
daysInMonth(may, _Year, 31).
daysInMonth(jun, _Year, 30).
daysInMonth(jul, _Year, 31).
daysInMonth(aug, _Year, 31).
daysInMonth(sep, _Year, 30).
daysInMonth(oct, _Year, 31).
daysInMonth(nov, _Year, 30).
daysInMonth(dec, _Year, 31).

nextMonthYear(jan, Year, feb, Year).
nextMonthYear(feb, Year, mar, Year).
nextMonthYear(mar, Year, apr, Year).
nextMonthYear(apr, Year, may, Year).
nextMonthYear(may, Year, jun, Year).
nextMonthYear(jun, Year, jul, Year).
nextMonthYear(jul, Year, aug, Year).
nextMonthYear(aug, Year, sep, Year).
nextMonthYear(sep, Year, oct, Year).
nextMonthYear(oct, Year, nov, Year).
nextMonthYear(nov, Year, dec, Year).
nextMonthYear(dec, Year, jan, NewYear) :-
	plus(Year, 1, NewYear).

leapYear(Year) :-
	Year \\ 400 =:= 0, !.
leapYear(Year) :-
	Year \\ 4 =:= 0,
	Year \\ 100 =\= 0.

In this example, =:= tests for numerical equality, =\= tests for numerical inequality > tests for the greater-than relationship and =< tests for the less than or equal to relationship. \\ is the modulus operator. plus(A, B, C) succeeds if A+B=C, with one of A, B or C being bound to the correct value if the other two arguments are numbers.

Cut damages the idea of directionless programming7. For example, the query ?- daysInMonth(M, 1996, DIM) will not give each month and the number of days in the month for each month; the cut in daysInMonth(feb, 1996, 29) eliminates any further choices.

Prolog Resources

Here is a list of handy URLs for getting various Prolog resources:

Quintus Prolog
A fully-fledged commercial Prolog system.
SICStus Prolog
Another commercial Prolog system, with academic and free student licenses. This page also contains a good set of links to Prolog libraries.
Prolog Library
A collection of Prolog routines.
jProlog
A Prolog to Java compiler??!

Agents Kernel Language

Prolog's search strategy can be very inefficient. Very often, many fruitless search branches are explored which could be ignored if more information from later in the query were made available. As an example, consider the following program and query:

number(0).
number(s(X)) :- number(X).

?- number(X), X = s(s(s(0))).

With a normal Prolog search strategy, the program would generate 0, s(0) and s(s(0)) before generating s(s(s(0))). Each value of X which is rejected causes the program to backtrack. If the program could somehow move ahead to the X = s(s(s(0))) part, then the call to number becomes a simple check.

The easiest goals in a query to execute are those with only one possible clause to execute. These goals are called determinate goals, in contrast to nondeterminate where several branches may have to be tried. Determinate goals represent "no regrets" goals; they have to be executed sometime, so the program should execute them as soon as possible and see if they bind enough variables to make another goal determinate.

The Andorra Principle essentially says "Do the determinate bits first". The Agents Kernel Language, or AKL is a Prolog-like language designed to take advantage of the Andorra principle.

Syntax

The AKL syntax is similar to Prolog-syntax, but somewhat more complex to allow the AKL to decide how much of a goal to try to see whether it is determinate or not.

The AKL is a guarded language. Clauses are written as Head :- Guard, Operator, Body, where Guard and Body are normal, comma-separated lists of goals and Operator is one of three guard operators:

Wait (?)
Implies a normal search, with the ? used to separate the guard from the body.
Conditional (->)
Works in a similar manner to the Prolog cut. If a guard completely succeeds, then all possible clauses below this clause are pruned - possibly leading to a determinate goal.
Commit (|)
If a guard completely succeeds then the commit operator prunes away all other possible branches, leading to a determinate goal.

All clauses in a predicate must have the same guard operator.

If no guard operator is specified, then a guard and guard operator of true ? is assumed. Facts are assumed to have a guard and body of true ? true, eg. parent(james, kylie) would be more correctly written as parent(james, kylie) :- true ? true..

Queries follow the Prolog model.

The Example in AKL

The ancestor example, written in AKL would look like:

parent(sue, bill).
parent(sue, james).
parent(sue, edith).
parent(fred, bill).
parent(fred, james).
parent(arthur, edith).
parent(mary, kylie).
parent(mary, jason).
parent(mary, matilda).
parent(james, kylie).
parent(james, jason).
parent(james, matilda).
parent(edith, david).
parent(william, david).

ancestor(X, Y) :-
	parent(X, Y) ? true.
ancestor(X, Y) :-
	parent(X, Z) ? ancestor(Z, Y).

Putting the parent part of the ancestor predicate in the guard allows the system to quickly check to see whether the clause is likely to succeed. A query such as ?- ancestor(sue, kylie) will immediately cause the first clause to fail, leaving the way open for the program to determinately commit to the second clause.

Execution

An AKL program essentially acts like a Prolog program, except that the leftmost, depthmost goal is not always selected. Preference is given to any goal where the AKL can detect determinism.

When a goal is selected, all the guards in each clause in the predicate are all tried. With a bit of luck, all the guards except one will fail and the goal can be determinatly promoted. A determinately promoted goal first allows its guard to complete, and then sets about executing the body of the clause.

If there are two or more possible clauses when the guards complete, then the goal suspends until further bindings cause enough guards to fail. Any variable bindings made by the guards are treated as speculative; they apply to the guard, but do not leak into the surrounding computation. Part of promotion is the promotion of speculative bindings into real bindings.

Clauses (and predicates) with conditional or commit guards must have quiet guards before the operators are allowed to prune. Quiet guards have no speculative bindings; either the guard doesn't care about the state of a variable, or the variable has been really bound. Quietness makes sure that pruning operators don't get activated by an over-enthusiastic guard. For example:

p(X) :- X = a | true.
p(X) :- X = b | true.

q(X) :- true ? X = b.

?- p(X), q(X).

In the example above, if quietness were not required, then each guard in p would speculatively bind X and then commit. Without quietness, therefore, there is a race to see whether X = a or X = b8. Quietness ensures that no pruning occurs until q is called, which is determinate and binds X to b.

At some point, most AKL programs won't be able to find a determinate goal9. At this point, the AKL program must fall back on the Prolog-style search. Since the an AKL program has usually tried many goals in the program, there is a wider variety of possible choices. The AKL picks a goal which is wait-guarded and then promotes the first clause, leaving the remaining clauses for backtracking. Hopefully, this nondeterminate promotion will cause something else to be determinate and start the ball rolling again.

Concurrency and Parallelism

The AKL chooses a goal to execute from a set of determinate goals. Since the order of execution of these goals does not matter, the AKL allows both concurrent programming and parallelism.

In concurrent programming, although each goal is evaluated one at a time, the system is free to switch between goals as it sees fit. This kind of behaviour is similar to a multitasking operating system, and concurrent systems allow logic programs to be viewed as a set of threads, all acting in concert.

Concurrent programming gives the appearance of parallelism, without actually having a parallel set of systems. In true parallelism, several determinate goals can be evaluated at once.

Stream and-parallelism views individual predicates as independent, communicating processes. The processes communicate by incrementally filling out a list; as one process binds the head of a list, another process can commit to a single clause and process the list element. As an example, the following program and query sums up a series of numbers:

numbers(0, X) :- true -> X = [].
numbers(N, X) :- true ->
	X = [N | X1],
	plus(N1, 1, N),
	numbers(N1, X1).

sum([], S, T) :- true ? T = S.
sum([N | N1], S, T) :- true ?
	plus(S, N, S1),
	sum(N1, S1, T).

sum(N, T) :- sum(N, 0, T).

?- numbers(5, X), sum(X, T).

Resources

Here are some links to various AKL resources:

SICS AKL Page
SICS (Swedish Institute of Computer Science) is the home of the AKL. This site includes v1.0 of the AKL abstract machine.
Penny
Penny is a parallel implementation of the AKL.
AKL Papers
Papers on the AKL from SICS.

Notes

  1. OK, well, actually it does. Most programmers end up with a pretty good idea of how a supposedly "declarative" language actually works. For some bizarre reason, most programmers care about efficiency.
  2. If we were using the strict rules of mathematical logic, we wouldn't be able to assume that James is-parent-of David is false. There may be some unstated fact that makes it true. Strictly, all we can deduce are positive statements - we can know for sure that James is-parent-of Matilda. However, in most logic programming languages, the negation as failure rule is assumed: if it can't be shown that something is true, then it's assumed to be false.
  3. Or by enclosing the name in single quotes, eg. 'Parent'(sue, bill).
  4. Infinite loops often show up in cases where a predicate has an ambiguous interpretation. In the example case of p(a) :- p(Z) we can have two, equally valid, models for p. If we take p(X) to be false for all X, then p(a) <- exists Z p(Z) is a true statement (false implies false). If we take, at least, p(a) to be true, then exits Z p(Z) is true and p(a) <- exists Z p(Z) is a true statement (true implies true). Of course, not every infinite loop is caused by ambiguous predicates, but it all ties in with the fact that logic programming languages can be interpreted as statements in predicate calculus and I (at least) think that it is an interesting example of mathematics creeping into computer science. Sorry. Go back to what you were doing.
  5. "Spot" deals are two days ahead of today, unless one of the currencies involved is the Canadian dollar (CAD), in which case a spot deal is one day ahead of today. Any deal that is not a spot deal is an "Outright" deal.
  6. Actually, these should be "Business days" which take account of weekends and holidays. This makes things far too complex and has been ignored.
  7. As do the various arithmetic operators such as =:=.
  8. You can still have a race if you want, even with quietness. Just rewrite p to be p(X) :- true | X = a and p(X) :- true | X = b. Nothing will then stop the commit operators from racing.
  9. This actually occurs whenever part of the program becomes stable. Stability is a local version of "no determinate goals" which allows multiple independent nondeterminism. You can pull some rather neat tricks with stability, but if you want to do that kind of thing, have a look at Sverker's paper.