Jones N.D., Partial Evaluation and Automatic Program Generation, 1999 (.pdf)

Congruence and consistency of annotations 171

8.3.1 Well-annotated expressions

A simple and traditional way to preclude type check errors is to devise a type system. In typed functional languages, a type inference algorithm such as algorithm W checks that a program is well-typed prior to program execution [184]. If it is, then no run-time summand tags or checks are needed. Type checking is quite well understood and can be used to get a nice formulation of the problem to be solved by binding-time analysis [104,202].

We saw in Section 5.7 that type rules can be used to check well-annotatedness, and we now apply similar reasoning to the lambda calculus.

Definition 8.1 The two-level types t are as follows, where α ranges over type variables:

t ::= α | S | D | t → t

A type environment τ is a mapping from program variables to types. □

Definition 8.2 Let τ be a type environment mapping the free variables of a two-level expression te to their types. Then te is well-annotated if τ ⊢ te : t can be deduced from the inference rules in Figure 8.4 for some type t. □

For example, the two-level expression power-ann of Example 8.1 is well-annotated in type environment τ = [n ↦ S, x ↦ D]. The whole expression has type D, and the part (fix p ...) has type S → D → D.

Our lambda calculus is basically untyped, but well-annotatedness ensures that all program parts evaluated at partial evaluation time will be well-typed, thus securing specialization against type errors. The well-annotatedness criterion is, however, completely permissive concerning the run-time part of a two-level expression. Thus a lambda expression without static operators is trivially well-typed at partial evaluation time.

Two-level expressions of type S evaluate (completely) to first-order constants, and expressions of type t1 → t2 evaluate to a function applicable only at partial evaluation time. The value produced by T for a two-level expression te of type D is a one-level expression e. For partial evaluation we are only interested in fully annotated programs p-ann that have type D. In that case, T [[p-ann]] s (if defined) will be a piece of code, namely the residual program.
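The three cases can be made concrete with a small executable model. The following Python sketch is not Lambdamix's T: the constructors, the function name teval, and the representation of residual code as strings are all assumptions made for illustration. Static parts compute actual values now, while dynamic (underlined) parts assemble residual code:

```python
# A toy model of two-level evaluation: static (S) parts compute now,
# dynamic (D) parts build residual code, here represented as strings.
# Expression forms (illustrative, not Lambdamix syntax):
#   ('cst', n)          static constant
#   ('var', x)          variable, looked up in the environment s
#   ('lift', e)         turn a static value into residual code
#   ('add', e1, e2)     static addition, performed at specialization time
#   ('add_d', e1, e2)   dynamic (underlined) addition: emits code

def teval(te, s):
    tag = te[0]
    if tag == 'cst':
        return te[1]                        # type S: a first-order value
    if tag == 'var':
        return s[te[1]]
    if tag == 'lift':
        return str(teval(te[1], s))         # S value becomes code (type D)
    if tag == 'add':
        return teval(te[1], s) + teval(te[2], s)
    if tag == 'add_d':                      # residual code for the sum
        return '(+ %s %s)' % (teval(te[1], s), teval(te[2], s))
    raise ValueError(tag)

# (lift (2+3)) dynamically added to y, where y is bound to the code 'y':
te = ('add_d', ('lift', ('add', ('cst', 2), ('cst', 3))), ('var', 'y'))
print(teval(te, {'y': 'y'}))                # prints (+ 5 y)
```

Note how the static addition disappears entirely from the residual code, while the dynamic addition survives as text: this is the whole point of the S/D distinction.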

In our context, the result about error freedom of well-typed programs can be formulated as follows. Proof is omitted since the result is well-known.

Definition 8.3 Let t be a two-level type and v be a two-level value. We say that t suits v iff one of the following holds:

1. t = S and v = ct↑Const for some ct ∈ Const.

2. t = D and v = cd↑Code for some cd ∈ Code.


3. (a) t = t1 → t2, v = f↑2Funval for some f ∈ 2Funval, and

(b) ∀v ∈ 2Val: t1 suits v implies t2 suits f(v).

A type environment τ suits an environment ρ if for all variables x bound by ρ, τ(x) suits ρ(x). □

The following is a non-standard application of a standard result [184].

Proposition 8.1 (`Well-annotated programs do not go wrong') If τ ⊢ te : t, and τ suits s, then T [[te]] s does not yield a projection error. □

Of course T can `go wrong' in other ways than by committing type errors. Reduction might proceed infinitely (so T [[p-ann]] s is not defined) or residual code might be duplicated. We shall not discuss these problems here.

8.4 Binding-time analysis

Definition 8.4 The annotation-forgetting function φ : 2Exp → Exp, when applied to a two-level expression te, returns a one-level expression e which differs from te only in that all annotations (underlines) and lift operators are removed. □

Definition 8.5 Given two-level expressions te and te1, define te ⊑ te1 by

1. φ(te) = φ(te1)

2. All operators underlined in te are also underlined in te1. □

Thus ⊑ is a preorder on the set of two-level expressions. Given a λ-expression e, let a binding-time assumption τ for e be a type environment mapping each free variable of e to either S or D.

Definition 8.6 Given an expression e and a binding-time assumption τ, a completion of e for τ is a two-level expression te1 with φ(te1) = e and τ ⊢ te1 : t for some type t. A minimal completion of e is a completion te2 fulfilling te2 ⊑ te1 for all completions te1 of e. □

Minimal completions are in general not unique. Assume τ = [y ↦ D] and e = (λx.x+y) 4. There are two minimal completions, te1 = (λx.x+y) (lift 4) and te2 = (λx.(lift x)+y) 4, which yield identical residual programs when partially evaluated. The definition of ⊑ does not distinguish between (minimal) completions which differ only in the choice of lift-points. Residual programs are identical for completions te1 and te2 if te1 ⊑ te2 and te2 ⊑ te1, and the impact of different choices on efficiency of the partial evaluation process itself is of little importance.


The requirement that τ be a binding-time assumption implies that all free variables are first-order. This ensures the existence of a completion. Note that a λ-bound variable x can get any type in completions, in particular a functional type. Possible conflicts can be resolved by annotating the abstraction(s) and application(s) that force x to have a functional type.

The task of binding-time analysis in the λ-calculus is briefly stated: given an expression e and a binding-time assumption τ, find a minimal completion of e for τ. In Section 8.6 we show by example that this can be done by type inference, and in Section 8.7 we show how to do it in a much more efficient way.

Proposition 8.2 Given an expression e and a binding-time assumption τ there exist(s) minimal completion(s) of e for τ.

Proof Follows from the properties of the constraint-based binding-time analysis algorithm in Section 8.7. □

8.5 Simplicity versus power in Lambdamix

A value of type t ≠ D can only be bound to a variable by applying a function of type t → t′. The partial-evaluation-time result of such a statically performed application is found by evaluating the function body, no matter what the type of the argument or the result is. This corresponds closely to unfolding on the fly of all static function calls (see Section 5.5).

Lambdamix does not perform specialization of named program points. Rather, generation of multiple variants of a source expression can be accomplished as an implicit result of unfolding a fix operator, since static variables may be bound to different values in the different unfoldings.

The only way to prevent a function call from being unfolded is to annotate the function as dynamic: λ̲. All applications of that function must accordingly be annotated as dynamic. Dynamic functions can only have dynamic arguments (Figure 8.4). Note that this restriction does not exist in Chapter 5, where named functions are specialized. As an example, consider the append function, app, written as a lambda expression:

(fix λapp. λxs. λys. if (null? xs) then ys
                     else (cons (car xs) (app (cdr xs) ys))) xs0 ys0

Partial evaluation with xs0 = '(a b) and dynamic ys0 yields (cons 'a (cons 'b ys0)), a result similar to that produced by the Scheme0 specializer from Chapter 5 (with any reasonable unfolding strategy). Lambdamix handles this example well because the recursive calls to app should be unfolded to produce the optimal residual program. Unfolding the calls allows Lambdamix to exploit the static argument, (cdr xs).
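The unfolding behaviour on this example can be mimicked in a few lines of Python (an illustrative sketch, not the Lambdamix algorithm; the name spec_app and the code-as-string representation are assumptions): the static recursion over xs runs at specialization time, and one residual cons is emitted per element.

```python
# Specializing append to a static first argument: the (null? xs) test and
# the car/cdr operations are static, so the recursion unfolds completely;
# only the cons operations and the dynamic ys survive into the residual code.

def spec_app(xs, ys_code):
    if not xs:                    # static (null? xs)
        return ys_code            # dynamic ys becomes the residual tail
    return "(cons '%s %s)" % (xs[0], spec_app(xs[1:], ys_code))

print(spec_app(['a', 'b'], 'ys0'))    # prints (cons 'a (cons 'b ys0))
```

With xs static the recursion is bounded by the length of xs, so full unfolding is safe; the next paragraphs show what goes wrong when the roles of the arguments are swapped.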


Now assume that xs0 is dynamic and that ys0 is static with value '(c d). When applied to a corresponding problem, the techniques from Chapter 5 would produce the residual Scheme0 program

(define (app-cd xs)
  (if (null? xs)
      '(c d)
      (cons (car xs) (app-cd (cdr xs)))))

where the recursive call to app-cd is not unfolded. Now consider this problem in the Lambdamix framework. With dynamic xs0 and static ys0, a minimal completion of the append program is:

(fix λapp. λxs. λys. if (null? xs) then (lift ys)
                     else (cons (car xs) (app (cdr xs) ys))) xs0 ys0

Note that even though xs0 and xs are dynamic, the function λxs.λys. ... is still static in the minimal completion. Lambdamix will loop infinitely by unfolding the recursive applications of app. To avoid infinite unfolding, the recursive application (app (cdr xs) ys) must be annotated as dynamic, which forces the whole expression fix λapp. ... to be annotated as dynamic. This means that no computation can be done by terminating partial evaluation.

In this particular example, specialization of the named function app with respect to first-order data ys0 = '(c d) could be obtained by simple methods, but to get a general solution to this class of problems we must also consider specialization with respect to higher-order values, i.e., functions. We shall return to this in Chapter 10.

8.5.1 Optimality of Lambdamix

Lambdamix has been tested on several interpreters derived from denotational language definitions [106]. Such interpreters are compositional in the program argument, which means that recursive calls in the interpreter can be safely unfolded when the interpreter is specialized with respect to a concrete source program. Lambdamix often performs well on interpreters fulfilling compositionality, and is often able to specialize away interpretive overhead such as syntactic dispatch, environment lookups, etc.

A compelling example: when the self-interpreter from Figure 8.1 (after removing all tagging and untagging operations) is specialized with respect to a lambda expression e, the residual program is an expression e′ which is identical to e modulo renaming of variables and insignificant coding of base function applications. Thus Lambdamix is nearly optimal as defined in Chapter 6. (A small difference: the call (+ e1 e2) is transformed into (apply '+ e1 e2), etc. The problem can be fully eliminated by treating base functions as free variables, bound in the initial environment [106], or by a simple post-processing as in Chapter 11.)

8.6 Binding-time analysis by type inference

An intuitively natural approach to binding-time analysis for the lambda calculus uses a variant of the classical Algorithm W for polymorphic type inference [106, 185,202]. The guiding principle is that the static parts of an annotated program must be well-typed. This naturally leads to an algorithm that tries to type a given program in its given type environment.

If this succeeds, all is well and specialization can proceed. If type inference fails, the application of a user-defined or base function that led to the type conflict is made dynamic (i.e. an underline is added), and the process is repeated. Eventually, enough underlines will be added to make the whole expression well-typed and so suitable for specialization.

For brevity we only give an example, since the next section contains a much more efficient algorithm. Recall the power program of Example 8.1:

(fix λp. λn'. λx'.
   if (= n' 0) then 1
   else (* x' (p (- n' 1) x'))) n x

with initial type environment τ = [n ↦ S, x ↦ D]. At the if, Algorithm W works with the type environment:

[p ↦ (S → D → α), n′ ↦ S, x′ ↦ D, n ↦ S, x ↦ D]

where α is an as yet unbound type variable. Thus expression (p (- n' 1) x') has type α, which is no problem. This leads, however, to a type conflict in expression (* x' (p (- n' 1) x')), since the static operator * has type S × S → S, in conflict with x', which has type D.

The problem is resolvable by changing * to its dynamic (underlined) version, with type D × D → D. This forces α = D, so the else expression has type D. The single remaining conflict, that 1 has type S ≠ D, is easily resolved by changing the 1 to lift 1, or by underlining it. The first solution leads to the annotation of Example 8.1.

8.7 BTA by solving constraints

In this section we shall show an elegant and efficient solution to the problem of binding-time analysis for the lambda calculus. The key observation is that the requirement that a two-level expression be well-annotated can be formulated as a set of constraints on the types of the individual subexpressions. These constraints can be efficiently reduced to a normal form from which the minimal completion is easily computed. The description is adapted from a paper by Fritz Henglein [114]. That paper also gives the details of the efficient constraint reduction and proofs of the stated propositions.

The definition of completions places no restriction on the insertion of lifts (apart from the obvious demand that the expression must be well-annotated). To simplify the exposition, we shall assume for now that no lifts are allowed. Once the basic concepts are treated we present the modifications needed to account for lift-insertion (Section 8.7.3).

8.7.1 Combining static and dynamic type rules

Consider the type rules for λ and its dynamic (underlined) counterpart:

(Abstr)
    τ[x ↦ t2] ⊢ te : t1
    -------------------------
    τ ⊢ λx.te : t2 → t1

(Abstr-dyn)
    τ[x ↦ D] ⊢ te : D
    -------------------------
    τ ⊢ λ̲x.te : D

Compare these rules with a combined rule, which we call Abstr-comb:

(Abstr-comb)
    τ[x ↦ t2] ⊢ e : t1    (t2 → t1 = t ∨ t = t1 = t2 = D)
    --------------------------------------------------------
    τ ⊢ λx.e : t

An application of rule (Abstr-comb) corresponds exactly to an application of either (Abstr) or (Abstr-dyn), depending on which disjunct holds in (t2 → t1 = t ∨ t = t1 = t2 = D). By making combined rules for the other operators that have both static and dynamic versions we get a type system, TypeComb, in which an expression e is typable if and only if e has completions. Given the types of all subexpressions of e in TypeComb, we immediately have the corresponding completion of e. For example, if a subexpression of form λx.e1 has type D, then in the corresponding completion the lambda will be annotated as dynamic (underlined). We find completions for an expression e by finding typings of e in TypeComb.

8.7.2 Constraints on expression types

Let e be a λ-expression and τ a binding-time environment for e. We associate a unique type variable αx with every λ-bound variable x occurring in e and a unique type variable αe1 with every subexpression e1 in e. We assume that all λ-bound variables are distinct.

As an informal example, consider the expression (λx.x y) z and assume binding-time environment τ = [y ↦ D, z ↦ D]. Let α be the type variable associated with (λx.x y). All other expressions must have type D, but both α = D and α = D → D would give correct typings. The corresponding completions are respectively (λ̲x.x @̲ y) @̲ z and (λx.x @̲ y) @ z.

Our strategy is to translate the inference rules of the system TypeComb into constraints on the type variables associated with the subexpressions and bound variables of a given e to be analysed. The next step is to find the most general substitution of type terms for type variables such that all constraints are satisfied. This substitution characterizes all possible completions of e, and among these we choose a minimal completion.

We define ≤b and ≤f to be the `flat' partial orders on type terms whose only strict inequalities are

S <b D
D → D <f D

Note that (t2 → t1 = t ∨ t = t1 = t2 = D) ⟺ t2 → t1 ≤f t.
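These two flat orders are small enough to check mechanically. The Python sketch below encodes types as 'S', 'D', or nested ('->', t1, t2) tuples (the encoding is an assumption made for illustration) and verifies the stated equivalence over a small universe of ground types:

```python
# The flat orders: S <_b D and (D -> D) <_f D are the only strict pairs;
# both orders are otherwise just equality.

def leq_b(t, u):
    return t == u or (t == 'S' and u == 'D')

def leq_f(t, u):
    return t == u or (t == ('->', 'D', 'D') and u == 'D')

# Check: (t2 -> t1 = t  or  t = t1 = t2 = D)  iff  t2 -> t1 <=_f t
univ = ['S', 'D', ('->', 'S', 'D'), ('->', 'D', 'D')]
for t in univ:
    for t1 in univ:
        for t2 in univ:
            lhs = (t == ('->', t2, t1)) or (t == t1 == t2 == 'D')
            assert lhs == leq_f(('->', t2, t1), t)
```

The exhaustive loop passing is exactly the `Note that' equivalence above: the ≤f order packages the two disjuncts of (Abstr-comb) into a single inequality.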

A constraint system C is a multiset¹ of constraints of the form

α′ → α″ ≤f α,
β ≤b α,
α = α′, and
α ▷ α′,

where α, α′, α″, β are type variables or one of the type constants S or D. A substitution V (of ground type expressions for type variables) is a solution of C if the following two conditions are fulfilled:

1. For every constraint of the form shown on the left, V fulfils the condition on the right:

α′ → α″ ≤f α        V(α′ → α″) ≤f V(α)
β ≤b α              V(β) ≤b V(α)
α = α′              V(α) = V(α′)
α ▷ α′              V(α) = D ⇒ V(α′) = D

2. For every type variable α not occurring in C we have V(α) = α.²

We write Sol(C) for the set of all solutions of C.

We define the constraint system Cτ(e) by induction as follows.

¹Using multisets instead of sets leads to a simpler formulation and a faster implementation of the constraint transformation rules.

²This condition guarantees that solutions V and V′ are equal whenever their restrictions to the variables occurring in C are equal.


Form of e                     Cτ(e) =

λx.e1                         {αx → αe1 ≤f αe} ∪ Cτ(e1)
e1 e2                         {αe2 → αe ≤f αe1} ∪ Cτ(e1) ∪ Cτ(e2)
fix e1                        {αe → αe ≤f αe1} ∪ Cτ(e1)
if e1 then e2 else e3         {S ≤b αe1, αe = αe2, αe = αe3, αe1 ▷ αe}
                              ∪ Cτ(e1) ∪ Cτ(e2) ∪ Cτ(e3)
c                             {S = αe}
op e1 e2                      {S ≤b αe1, αe1 = αe2, αe2 = αe} ∪ Cτ(e1) ∪ Cτ(e2)
A λ-bound variable x          {αx = αe}
Free x with τ(x) = t          {t = αe}
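The table translates directly into a recursive generator. The following Python sketch produces Cτ(e) for the no-lift system; the expression representation, the constraint tags, and the variable naming are all assumptions made for illustration, not Lambdamix's concrete syntax:

```python
# A sketch of the constraint generator C_tau(e) (no lifts), following the
# table above.  Expressions: ('cst',c), ('var',x), ('lam',x,e),
# ('app',e1,e2), ('fix',e), ('op',e1,e2), ('if',e1,e2,e3).
# A type variable is ('tv', id); a constraint ('leq_f', a1, a2, a)
# means (a1 -> a2) <=_f a; 'dep' is the |> constraint.
import itertools

def constraints(e, tau, bound, fresh, out):
    a_e = ('tv', next(fresh))              # alpha_e for this occurrence
    tag = e[0]
    if tag == 'cst':
        out.append(('eq', 'S', a_e))
    elif tag == 'var':
        x = e[1]
        if x in bound:
            out.append(('eq', bound[x], a_e))    # lambda-bound: alpha_x = alpha_e
        else:
            out.append(('eq', tau[x], a_e))      # free: tau(x) = alpha_e
    elif tag == 'lam':
        a_x = ('tv', next(fresh))
        a_b = constraints(e[2], tau, {**bound, e[1]: a_x}, fresh, out)
        out.append(('leq_f', a_x, a_b, a_e))
    elif tag == 'app':
        a1 = constraints(e[1], tau, bound, fresh, out)
        a2 = constraints(e[2], tau, bound, fresh, out)
        out.append(('leq_f', a2, a_e, a1))
    elif tag == 'fix':
        a1 = constraints(e[1], tau, bound, fresh, out)
        out.append(('leq_f', a_e, a_e, a1))
    elif tag == 'op':
        a1 = constraints(e[1], tau, bound, fresh, out)
        a2 = constraints(e[2], tau, bound, fresh, out)
        out.extend([('leq_b', 'S', a1), ('eq', a1, a2), ('eq', a2, a_e)])
    elif tag == 'if':
        a1 = constraints(e[1], tau, bound, fresh, out)
        a2 = constraints(e[2], tau, bound, fresh, out)
        a3 = constraints(e[3], tau, bound, fresh, out)
        out.extend([('leq_b', 'S', a1), ('eq', a_e, a2),
                    ('eq', a_e, a3), ('dep', a1, a_e)])
    return a_e

# The informal example from above: (lambda x. x y) z, tau = [y -> D, z -> D]
e = ('app', ('lam', 'x', ('app', ('var', 'x'), ('var', 'y'))), ('var', 'z'))
cs = []
constraints(e, {'y': 'D', 'z': 'D'}, {}, itertools.count(), cs)
print(len(cs), 'constraints generated')    # prints: 6 constraints generated
```

Each syntax node contributes a constant number of constraints, so the system is linear in the size of e; this is what makes the constraint-based analysis efficient.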

Every type derivation for a λ-expression e corresponds uniquely to a type labelling of the syntax tree of e; that is, to a mapping of (λ-expression) occurrences in e into type expressions. A type labelling that arises from a type derivation in this fashion can, however, equally well be viewed as a mapping from the canonical type variables associated above with the occurrences in e to type expressions. Consequently every (implicit) type derivation for a λ-expression e determines uniquely a substitution on these type variables by mapping every other type variable to itself. By induction on the syntax of λ-expressions e it can be shown that such a substitution is a solution of the constraint system Cτ(e) and vice versa: every solution of Cτ(e) is a substitution determined by a type derivation for e. Since every implicit type derivation of e corresponds to a unique completion of e, we have the following proposition.

Proposition 8.3 For every λ-expression e and binding-time assumption τ for e there is a one-to-one correspondence between the completions of e and the solutions of Cτ(e). □

8.7.3 Inserting lift expressions

The definition of a completion allows any subexpression e of type S to be replaced by lift e of type D. When given an unannotated expression e there is no obvious a priori way to determine where to insert lifts to obtain completions. Our solution is to associate two type variables αe1 and ᾱe1 with each subexpression e1 of e. As above, αe1 represents the `immediate' type of e1, and ᾱe1 represents its type after possible lifting. Given a solution V, the relation between the two variables must be V(αe1) ≤b V(ᾱe1), where equality means `no lift inserted' and <b indicates insertion of a lift. Below is the revised definition of Cτ(e).

It could be argued that it would be conceptually cleaner to introduce a separate class of constraints, say ≤l, to control lift insertion instead of using ≤b, which was introduced for other purposes. Running the risk of unnecessary confusion, we have chosen to use ≤b anyway because its properties also capture lift insertion. Leaving out a separate ≤l makes the presentation shorter.

Form of e                     Cτ(e) =

λx.e1                         {αx → ᾱe1 ≤f αe, αe ≤b ᾱe} ∪ Cτ(e1)
e1 e2                         {ᾱe2 → αe ≤f ᾱe1, αe ≤b ᾱe} ∪ Cτ(e1) ∪ Cτ(e2)
fix e1                        {αe → αe ≤f ᾱe1, αe ≤b ᾱe} ∪ Cτ(e1)
if e1 then e2 else e3         {S ≤b ᾱe1, αe = ᾱe2, αe = ᾱe3, ᾱe1 ▷ αe, αe ≤b ᾱe}
                              ∪ Cτ(e1) ∪ Cτ(e2) ∪ Cτ(e3)
c                             {S = αe, αe ≤b ᾱe}
op e1 e2                      {S ≤b ᾱe1, ᾱe1 = ᾱe2, ᾱe2 = αe, αe ≤b ᾱe}
                              ∪ Cτ(e1) ∪ Cτ(e2)
A λ-bound variable x          {αx ≤b ᾱe}
Free x with τ(x) = t          {t = αe, αe ≤b ᾱe}

Note that an abstraction λx.e can never get type S and thus can never be lifted. This insight could yield a small `optimization' of the constraint set generated above.
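Once a solution V is found, the lift insertion points can be read off directly: a lift goes exactly where V separates the two variables of an occurrence. A tiny Python sketch of that final read-off step (the function name, the solution dictionary, and the pair list are all hypothetical):

```python
# Each subexpression has two variables: alpha_e (immediate type) and
# abar_e (type after possible lifting).  A lift is inserted exactly at
# occurrences where the solution maps alpha_e to S but abar_e to D.

def lift_points(solution, pairs):
    """pairs: list of (label, alpha_name, abar_name) per occurrence."""
    return [lbl for (lbl, a, ab) in pairs
            if solution[a] == 'S' and solution[ab] == 'D']

# For (\x. x + y) 4 with dynamic y (cf. te2 in Section 8.4): the constant
# 4 stays static, while x's occurrence in x + y must be lifted.
sol = {'a_4': 'S', 'ab_4': 'S', 'a_x': 'S', 'ab_x': 'D'}
pairs = [('4', 'a_4', 'ab_4'), ('x', 'a_x', 'ab_x')]
print(lift_points(sol, pairs))    # prints ['x']
```

A different solution for the same expression would mark '4' instead, reproducing the two minimal completions te1 and te2 discussed earlier.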

8.7.4 Normalization of type constraints

In Section 8.7.2 we have seen that the type derivations for a λ-expression e under binding-time assumption τ, and thus its completions, can be characterized by the solutions of a constraint system Cτ(e). In this section we present transformations that preserve the set of solutions of such a constraint system. A constraint system in normal form with respect to these transformations (i.e., one that cannot be transformed any further) will have the property that it directly defines a minimal solution.

Our transformation rules define a labelled reduction relation C ⇒V C′, where C and C′ are constraint systems and V is a substitution. If the substitution is the identity substitution we simply write C ⇒ C′. For a substitution V and a constraint system C, the result of applying V to all type expressions in C is written V(C). Let G(C) be the directed graph whose nodes are the variables appearing in the constraint system C and where the edge (α, β) is included if and only if there is an inequality constraint of the form α → α′ ≤f β or α′ → α ≤f β in C. If G(C) contains a cycle we say C is cyclic; acyclic otherwise.³ The transformation rules are given in Figure 8.5. The first two inequality constraint rules show how inequality constraints with identical right-hand sides are eliminated: if the left-hand sides have the same type constructor then these left-hand sides are equated in the `reduced' system (Rule 1a); if the left-hand sides have different type constructors then the right-hand side is equated with D (Rule 1b) and the inequalities are eventually eliminated by Rules 1g and 1h.

The transitive closure of the transformation rules is defined by: C ⇒+V C′ if C ⇒V C′, and C ⇒+V′∘V C′ if C ⇒+V C″ and C″ ⇒V′ C′ for some C″, where V′ ∘ V denotes the composition of V′ and V. We say C is a normal form (or normalized) constraint system if there is no C′ such that C ⇒V C′ for any V. We say C has a normal form if there is a normal form C′ such that C ⇒+V C′ for some substitution V. The correctness of the transformations is captured in the following proposition, which is easily proved by induction on the length of transformation sequences and by case analysis of the individual rules, using elementary properties of ≤b and ≤f.

³Constraints of the form α = α′ and β ≤b α′ need not be considered in the definition of cyclicity since our transformation rules eliminate all equational constraints, and ≤b-inequality constraints remaining in a normal form constraint system are irrelevant.
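The `minimal solution read-off' can be illustrated with a deliberately simplified sketch: start with every variable static and propagate D until nothing changes. This Python fragment models only the D-propagation aspect of normalization; the constraint forms and names are assumptions, and it ignores unification of arrow types, occurs-checks, and the ≤b constraints of the full rule set:

```python
# Simplified minimal-solution sketch: everything not forced to D stays
# static.  Constraint forms (an assumed toy representation, not
# Henglein's full rule set):
#   ('eq', a, b)          a = b : dynamism flows both ways
#   ('dep', a, b)         a |> b : if a is D then b is D
#   ('leq_f', a1, a2, a)  (a1 -> a2) <=_f a : if a is D, so are a1, a2

def minimal_dynamic(cs):
    dyn = set()                       # variables forced to be D
    def is_d(v):
        return v == 'D' or v in dyn
    changed = True
    while changed:                    # fixpoint: propagate D
        changed = False
        for c in cs:
            new = []
            if c[0] == 'eq':
                if is_d(c[1]): new.append(c[2])
                if is_d(c[2]): new.append(c[1])
            elif c[0] == 'dep' and is_d(c[1]):
                new.append(c[2])
            elif c[0] == 'leq_f' and is_d(c[3]):
                new += [c[1], c[2]]
            for v in new:
                if v != 'D' and v not in dyn:
                    dyn.add(v)
                    changed = True
    return dyn

# A dynamic y forces the result of x + y dynamic via the eq/dep chain:
cs = [('eq', 'D', 'ay'), ('eq', 'ay', 'ax'), ('dep', 'ax', 'ar')]
print(sorted(minimal_dynamic(cs)))    # prints ['ar', 'ax', 'ay']
```

Because D only ever spreads and never retracts, the fixpoint is the least dynamic assignment consistent with the constraints, mirroring why a normal form directly yields the minimal completion.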

Proposition 8.4 (Soundness and completeness of transformations) Let C ⇒+V C′. Then Sol(C) = {V′ ∘ V | V′ ∈ Sol(C′)}. □

The transformations can be used to derive an algorithm for normalizing constraint systems based on the following proposition.

Proposition 8.5 (Normalization of constraint systems)

1. The transformations of Figure 8.5 are weakly normalizing; that is, every C has a normal form.

2. If C′ is a normal form constraint system then

(a) it has no equational constraints;

(b) it is acyclic;

(c) its constraints are of the form α′ → α″ ≤f α, β ≤b α, or α ▷ α′, where α′, α″ (and both sides of ▷) are type variables, α is a type variable or the type constant D, and β is a type variable or the type constant S;

(d) for every inequality constraint of the form α′ → α″ ≤f α, the type variable α does not occur on the right-hand side of other ≤f-inequalities or on the left-hand side of ≤b-inequalities;

(e) for every inequality constraint of the form S ≤b α, the type variable α does not occur on the right-hand side of ≤f-inequalities or on either side of other ≤b-inequalities.

3. If C contains no constraints of the form β ≤b α′ where β is a type variable, and C ⇒+V C′, then C′ contains no constraint of that form either.