Graph reduction with multiple meta-levels

Author:	Armin + Fare
Date:	16 sep 2002

Introduction

Represent the state as a linear graph whose evolution is controlled by reduction rules.

When the scale of the problem grows, we can take two incompatible points of view:

the whole computer runs a single big soup, with a rich set of reduction rules -- in this view, modules are seen as programming tools that let us reasonably define independent rules, and the job of the compiler is to put them all together coherently.
interpretation of the graph occurs inside of modules which are still separated at run-time, with special logic to cross the module boundaries and help preserve them.

We want to be able to do both. Indeed, whether we are in (1) or in (2) only depends on one's point of view. Ultimately, the processor is just a big soup interpreter as in (1), so that we cannot escape this. Nevertheless, if you just forget all boundaries at any conceptual level, you are basically loosing the ability to do any interesting meta-programming. So (2) is more appropriate when looking at the system from a higher-level, more abstract points of view.

Our purpose is to define a framework in which we can not only specify graph reduction rules, but take as first-level objects rules, whole or sub-graphs, reduction systems, abstraction levels, and cross-level transformations (functors).

Examples

A few examples will (hopefully) eventually help us in defining the precise default semantic we are looking for, and a default syntax.

I will stick with two examples: implementing lists, and calling a sort method.

From your application's point of view, you will probably want lists to be abstract objects on which you can apply methods. As a graph:

                          i1  i2  i3
                          |   |   |
                          |   |   |
<your application> ------ list-node

where i1, i2, i3... are the list items.

A possible implementation of lists is via cons cells, a la Lisp:

         item     item     item
          |        |        |
          |        |        |
root --- cons --- cons --- cons --- nil

Assuming that the above list-node is currently implemented like this, the relationship between the two graphs exists as a link from the low-level graph to a sub-graph of the high-level one -- the subgraph being in this case just the single list-node:

                          i1  i2  i3
                          |   |   |
                      /===|===|===|===\
                      I   |   |   |   I
<your application> ------ list-node   I
                      I               I
                      \===============/

                              /\
                             /__\
                              ||   (big fat arrow)
                              ||

              /======item===item===item=========\
              I       |      |      |           I
              root - cons - cons - cons - nil   I
              I                                 I
              \=================================/

On the low-level graph, some nodes are shown "on the edge" of the graph because they correspond to outside links at the higher level.

Of course, the purpose of this is to be able to switch to another representation at any point. The fact that the high-level graph is linear helps a lot in keeping things in sync: we can change the representation at any time because we know who is pointing to us -- at least, this information can be found in the high-level graph, althought of course an implementation might choose to factor it out. (Other complementary ways of ensuring consistency are e.g. "broken hearts" or systematic indirection etc.)

Each level has it own language: the node names present at the upper level (like list-node) are completely disjoint from the node names at the lower level (root, cons, item, nil). The above big fat arrow does not mean "is a subgraph of", but is a cross-meta-level correspondance. One thing an application might want to do is actually cross the meta-levels. In general, we want to be able to reflect things from one level to the other. When doing so, we must specify and fix which levels we are interested in (say, the two levels pictured above); as long as we don't, the system is free to change the implementation levels for optimization (say, change the way the list is implemented). In the above example, we might for example want to:

reify a particular implementation of the list, i.e. obtain a copy of the list which corresponds to a particular implementation it chooses, but written in the application's own language. In other words, it means actually putting a copy of the low-level graph inside of the high-level graph, renaming the nodes along the way ("catamorphism"). For example, if the above application-level list is a list of integers, then you can import a copy of the low-level graph renaming cons to + and nil to 0; then you have exactly the graph expression that computes the sum of all integers in the list. Or you can import the list with your own notions of cons and nil on which you can then work directly.
compile a list representation built by the application: the high-level produced some data structure, and when it is done it only wants to consider it as an abstract list. This would create the low-level implementation of the list. There is an interesting boundary problem here: we want to put into a separated low-level graph some nodes that originally belonged to the high level, but how much? Where should we stop pulling things down? In the example of the list, if the high-level has built a cons-list, we must pull the cons nodes down, and their cdr, recursively, but not their car -- the list items must stay at the high level. Various languages have various solutions to delimitate things, like quoting (corresponding to "start compilation here") and unquoting (corresponding to "but stop pulling things down at this point"). Now unquoting might be horrific with arbitrary graphs (as opposed to trees), and as the list example showed there are cases where we don't need any quoting (it is somehow implicit in the types). The boundary question admits multiple answers depending on the cases, and so is not solved by the framework: it is up to the compilation algorithm to "consume" the high-level nodes that it wants to pull down, and not the other ones.

We now proceed to adding a sort method to our lists. To invoke this method, we will want the application to build a graph like this:

                                   i1  i2  i3
                                   |   |   |
                                   |   |   |
<your application> ------ sort --- list-node

In traditional graph reduction systems, some compiler has produced a rule according to which when a sort node is next to a list-node, it expands to something corresponding to the corresponding method call. However, conceptually, there are levels at which the call is an atomic operation: at these abstraction levels, the above graph should reduce in a single step to:

                          i3  i1  i2
                          |   |   |
                          |   |   |
<your application> ------ list-node

where the list is now sorted. "Atomic" is thus not an absolute word: some operations which are atomic at some conceptual level are not atomic at some lower levels. (There are always lower levels at which any operation is not atomic!) (The same could be said about e.g. having side-effects or performing i/o -- these expressions have no absolute meaning.)

In our example, applying the sort method is atomic at the application level, but it spawns a new lower level at which the operation is detailled. This is done by matching the pattern --- sort --- list-node:

                                   i1  i2  i3
                                   |   |   |
                      /============|===|===|===\
                      I            |   |   |   I
<your application> ------ sort --- list-node   I
                      I                        I
                      \========================/

                              /\
                             /__\
                              ||   (big fat arrow)
                              ||

                /==================================\
                I                                  I
                I                                  I
                I     ...                          I
                I                                  I
                \==================================/

The low-level graph is written in a potentially different language that implements the sorting algorithm. It is actually a closure: the high-level list-node is also encoded in the low-level graph. The big fat arrow means that the initial low-level state corresponds ("means the same things as") the matched pattern in the high-level graph. When the low-level graph evolves, it no longer directly corresponds to anything at the high-level. This is typical; intermediate low-level operations do not correspond to anything when seen at a higher level. In the above example, the two levels can even be seen as initially out-of-synch, but striving for synchronization, which is achieved when the method execution is finished. Thus method execution can be seen as synchronizing the low-level with the high-level. When this is done, the big fat arrow is reduced (again by catamorphism) to push the result back at the high-level.

It is interesting to note that even the classical single-level pattern-based reduction can be seen like this. Just imagine that the above low-level box is written in the same language as the high-level graph, and contains the next step of an atomic reduction you want to perform on the matched pattern. Again, the "low-level" graph conceptually represents the same thing as the original subgraph -- in this case, the low-level graph just happens to be written in the same language as the high-level graph. The low-level is then put back in place, replacing the original matched subgraph, using the identity catamorphism (i.e. the catamorphism which need not translate anything at all). When this is done, the high-level graph has evolved of one step.

More generally, using variants of pattern matching will be useful to send information back and forth across the meta-levels. Here is how for example the sort method can generically perform binary comparisons on the items on the list. We introduce a name, say compare, in the language in which sort is written (the low-level in the above picture). The caller (high-level) provides a concrete interpretation for it as follows: the rule that spawns the low-level sort closure is paired with another cross-level rule that translates a low-level pattern into a high-level one:

       |                                |
    compare                       int-lower-than
     /   \            ___|\          /     \
    /     \              |/         /       \
item1     item2                   i1         i2

and possibly another rule to send back the concrete boolean result of the latter. Thus the site of the function or method application plays the role of meta-level for the execution of the function, and can add some new rules. Of course, we will want to avoid adding too many rules, as they might mess up the internal state of the function exection. In some cases the low-level will even want to protect its state: it can for example ask the meta-meta-level to protect it from the meta-level.

Nevertheless, if you are authorized and careful, you can add interesting behavior to the function you are calling. In this model, the meta-level directly controls what the low-level sees. We can independently add various meta-features to the language running the function. It can be with the function's support (as in the compare example above, or for exception handling, etc.) or not (as in running a function with a lazy argument or in statistic-collecting mode).

Syntax and Semantic

(to be completed.)

(yes I know it's what we wanted to do in the sprints. Of course we are out of schedule :-)