How to add laziness to a strict language, without even being odd Philip Wadler Bell Laboratories, Lucent Technologies wadler@research.bell-labs.com It's easy to add laziness to a strict language. A number of schemes have been proposed, all variants of the following idea: add a type constructor for suspensions, and language constructs to delay and force evaluation. e : t ------------------ delay e : t susp e : t susp ------------- force e : t The delayed expression is not evaluated unless the corresponding suspension is forced, and if this happens the suspension is overwritten with a value, so the expression is not evaluated again. The `delay' operation must be a new language construct, not a function, because it does not evaluate its argument. For symmetry, `force' is also taken to be a language construct. However, although easy to understand and implement, these schemes are not necessarily easy to use. In particular, every scheme I know of encourages a style of programming that forces more evaluation than traditional lazy languages (I call this style odd), and discourages a style of programming that delays evaluation as in traditional lazy languages (I call this style even). This note explicates the two styles, showing how the odd style is easy to encode in the traditional `delay' and `force' syntax, while the even style is harder to encode. It then presents a new `lazy' syntax, and shows how the even style is easy to encode in this syntax, while the odd style is harder to encode. The new `lazy' syntax is defined by translation into the `delay-force' syntax. Comparisons are drawn with two other syntaxes, one used in CAML and one proposed by Chris Okasaki. Standard ML is taken as the strict language to be extended with lazy constructs. The odd style, with ease ~~~~~~~~~~~~~~~~~~~~~~~~ Consider first a language simply augmented with `delay', `force', and `susp'. Streams in the odd style are defined as follows. One can find definitions like this in the texts of Abelson and Sussman (1985) and Paulson (1991). datatype 'a stream = Nil | Cons of 'a * 'a stream susp; (* map : ('a -> 'b) -> 'a stream -> 'b stream *) fun map f Nil = Nil | map f (Cons(x,xs)) = Cons(f x, delay (map f (force xs))); (* countdown : int -> int stream *) fun countdown n = Cons(n, delay (countdown (n-1))); (* cutoff : int -> 'a stream -> 'a list *) fun cutoff 0 xs = [] | cutoff n Nil = [] | cutoff n (Cons(x,xs)) = x :: cutoff (n-1) (force xs); In this style, here is the stream corresponding to the list [1,2]. Cons (1, delay Cons (2, delay Nil)) This style is called odd because any finite stream will contain an odd number of constructors, where Cons, Nil, and delay are counted as constructors. This style is also called odd because it exhibits behaviour that may seem peculiar to anyone familiar with streams. One would expect the expression cutoff 5 (map (sqrt o real) (countdown 4)) to return the list of square roots of the integers from four to zero [2.0, 1.73, 1.41, 1.0, 0.0] but instead it returns uncaught exception Sqrt. The problem is that this definition of streams forces more evaluation than necessary. Even when the cutoff reaches zero, the stream must be evaluated either to Nil or Cons, forcing one to compute the square root of minus one without need, and indeed with some harm. The even style, with difficulty ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Here is the same program, rewritten to exhibit the traditional behaviour. The key change is to the definition of the type for streams, from which all else follows. We use a convention of appending an underbar to name the auxiliary type and the operations upon it. Note that the `steam_' of this definition is equivalent to the `stream' of the previous definition. datatype 'a stream_ = Nil_ | Cons_ of 'a * 'a stream withtype 'a stream = 'a stream_ susp; (* map : ('a -> 'b) -> 'a stream -> 'b stream *) (* map_ : ('a -> 'b) -> 'a stream_ -> 'b stream_ *) fun map f ws = delay map_ f (force ws) and map_ f Nil_ = Nil_ | map_ f (Cons_(x,xs)) = Cons_(f x, map f xs) (* countdown : int -> int stream *) fun countdown n = delay Cons_(n, countdown (n-1)); (* cutoff : int -> 'a stream -> 'a list *) (* cutoff_ : int -> 'a stream_ -> 'a list *) fun cutoff 0 ws = [] | cutoff n ws = cutoff_ n (force ws) and cutoff_ n Nil_ = [] | cutoff_ n (Cons_(x,xs)) = x :: cutoff (n-1) xs; In this style, here is the stream corresponding to the list [1,2]. delay Cons (1, delay Cons (2, delay Nil)) This style is called even because any finite stream will contain an even number of constructors, where Cons, Nil, and delay are counted as constructors. The new program gives the expected result for the previous expression, returning a list of square roots rather than an exception. Alas, our definitions has nearly doubled in size, and halved in perspicuity. Paulson wrote to me that `My book originally used the second definitions you present, but that extra bit of laziness cost too much in legibility.' He reverted to the odd style, relegating even style to an exercise. The even style, with ease ~~~~~~~~~~~~~~~~~~~~~~~~~ Here is what the definition looks like under the new proposal. datatype lazy 'a stream = Nil | Cons of 'a * 'a stream; (* map : ('a -> 'b) -> 'a stream -> 'b stream *) fun lazy map f Nil = Nil | map f (Cons(x,xs)) = Cons(f x, map f xs); (* countdown : int -> int stream *) fun lazy countdown n = Cons (n, countdown (n-1)); (* cutoff : int -> 'a stream -> 'a list *) fun cutoff 0 xs = [] | cutoff n Nil = [] | cutoff n (Cons(x,xs)) = x :: cutoff (n-1) xs; Intuitively, the effect of writing `lazy' in a datatype definition is to make each constructor return a suspension, and the effect of writing `lazy' in a function definition is to delay evaluation of the right-hand side. The exact meaning of these constructs will be given later, by translation into `delay', `force', and `susp'. Under that translation, this program is equivalent to the previous program. A semantic subtlety: all functions and constructors here are strict, in that their arguments are always evaluated. Delayed evaluation is always indicated by adding the keyword `lazy' to a function or value definition. We will return to this point after discussing the translation of programs. A syntactic subtlety: it may seem more natural to say `lazy fun' than `fun lazy', but latter works better in the case where there are mutually recursive types or functions, only some of which are lazy. We'll see examples in the next section. The odd style, with difficulty ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ If the odd style is truly what you intend, it is still possible. Here is a program for that purpose in the new notation. Again, the types dictate the structure of the program. datatype 'a stream = Nil | Cons of 'a * 'a stream_ and lazy 'a stream_ = Delay of 'a stream; (* map : ('a -> 'b) -> 'a stream -> 'b stream *) (* map_ : ('a -> 'b) -> 'a stream_ -> 'b stream_ *) fun map f Nil = Nil | map f (Cons(x,xs)) = Cons(f x, map_ f xs) and lazy map_ f (Delay xs) = Delay (map f xs); (* countdown : int -> int stream *) (* countdown_ : int -> int stream_ *) fun countdown n = Cons (n, countdown_ n) and lazy countdown_ n = Delay (countdown (n-1)); (* cutoff : int -> 'a stream -> 'a list *) fun cutoff 0 xs = [] | cutoff n Nil = [] | cutoff n (Cons(x,Delay(xs))) = x :: cutoff (n-1) xs; Under the translation rules given below, this program has exactly the same semantics as the previous program for odd style. The situation is pleasantly dual. The odd style in the new notation is almost exactly as hard to express as the even style was in the old. The translation ~~~~~~~~~~~~~~~ While the new syntax looks pleasant, it is useless unless there is a clear, rigorous, and uniform way to assign it a meaning. Fortunately, this can be done in a fairly straightforward way, by specifying a translation from the new language, with the `lazy' keyword, to the old language, augmented with `delay', `force', and `susp'. There is one rewrite rule for each form of `lazy' construct added to the language. (datatype) datatype lazy t = c1 of t1 | ... | cn of tn ===> datatype t_ = c1_ of t1 | ... | cn_ of tn withtype t = t_ susp where t is a type variable list followed by a type identifier, t_ is the same with a fresh type identifier name, and c1_, ..., cn_ are fresh constructor names. (fun) fun lazy f p11 ... p1n = e1 | ... | f pm1 ... pmn = em ===> fun f x1 ... xn = delay (f_ x1 ... xm) | f_ p11 ... p1n = force e1 | ... | f_ pm1 ... pmn = force em where f_ is a fresh variable name. (val) val lazy x = e ==> val x = delay (force e) (constructor) c ===> fn x => delay (c_ x) where c is declared in a lazy datatype, and c_ is the corresponding fresh constructor. (destructor) [We assume that before this step is applied that all uses of destructors are translated to case expressions, in the usual way.] case e0 of c1 x1 => e1 | ... | cn xn => en ===> case (force e0) of c1_ x1 => e1 | ... | cn_ xn => en where c1,...,cn are declared in a lazy datatype, and c1_,...,cn_ are the corresponding fresh constructors. Returning to the semantic subtlety mentioned previously, note that we have the translation cutoff 0 (Cons(sqrt ~1.0, Nil)) ===> cutoff 0 (fn x => delay (Cons_ x)) (sqrt ~1.0, Nil) and so evaluating this expression raises an error. On the other hand, we also have the translation let val lazy x = Cons(sqrt ~1.0, Nil) in cutoff 0 x end ===> let val x = delay (force (Cons(sqrt ~1.0, Nil))) in cutoff 0 x end and so evaluating this expression returns []. All functions and constructors evaluate their arguments; evaluation is suspended only where the keyword `lazy' appears. Previous work: CAML ~~~~~~~~~~~~~~~~~~~ Some versions of CAML allow summands of datatypes or fields of records to be declared lazy (Weiss 1990, Mauny 1991). Again, this notation makes it easy to use the odd style, harder to use the even style. Furthermore, one needs to know whether a constructor contains `lazy' in its declaration before one knows whether its arguments will be evaluated, unlike the notation described above. For instance, in the presence of the CAML declaration type 'a susp = Delay of 'a then the CAML term `Delay e' is like `delay e' in our old notation, and the CAML term `case e of Delay x => x' is like `force e' in our old notation. Here's a CAML type declaration for streams in odd style. type 'a stream = Nil | Cons of {head: 'a, lazy tail: 'a stream}; Here's one for streams in even style. type 'a stream_ = Nil | Cons of 'a * 'a stream and 'a stream = lazy Delay of 'a stream_; And here's one that corresponds to none of the previous examples. type 'a stream = Nil | lazy Cons of 'a * 'a stream; This last declaration suspends evaluation of both the head and the tail of the stream. Previous work: Okasaki ~~~~~~~~~~~~~~~~~~~~~~ Okasaki (1996) proposes that one abbreviate `delay e' by writing $e, and, dually, that the equivalent of `force e' be written using pattern matching, as (case e of $x => x). The `$' sign may be nested with other patterns. With this proposal it is even easier to implement streams in odd style. Here is the equivalent program. datatype 'a stream = Nil | Cons of 'a * 'a stream susp; (* map : ('a -> 'b) -> 'a stream -> 'b stream *) fun map f Nil = Nil | map f (Cons(x,$xs)) = Cons(f x, $map f xs); (* countdown : int -> int stream *) fun countdown n = Cons (n, $countdown (n-1)); (* cutoff : int -> 'a stream -> 'a list *) fun cutoff 0 xs = [] | cutoff n Nil = [] | cutoff n (Cons(x,$xs)) = x :: cutoff (n-1) xs; This looks elegant. Too bad it is not the program we desire! Okasaki extended ~~~~~~~~~~~~~~~~ In the course of pursuing this topic, I discovered a simple and logical addition to Okasaki's notation that makes it relatively easy to express even style, in at least some situations. The addition is to treat the declaration fun $f p11 ... p1n = e1 | ... | $f pm1 ... pmn = em as equivalent to fun f p11 ... p1n = force e1 | ... | f pm1 ... pmn = force em. This is clearly in line with the other abbreviations. Here is even style in Okasaki's notation, thus extended. datatype 'a stream_ = Nil | Cons of 'a * 'a stream withtype 'a stream = 'a stream_ susp; (* map : ('a -> 'b) -> 'a stream -> 'b stream_ *) fun $map f ($Nil) = $Nil | $map f ($Cons(x,xs)) = $Cons(f x, $map f xs) (* countdown : int -> int stream_ *) fun $countdown n = $Cons(n, $countdown (n-1))); (* cutoff : int -> 'a stream -> 'a list *) fun cutoff 0 xs = [] | cutoff n ($Nil) = [] | cutoff n ($Cons(x,xs)) = x :: cutoff (n-1) xs; The code for the functions can be derived from the corresponding code in `lazy' notation by uniformly replacing `Nil' with `$Nil', `Cons(x,xs)' with `$Cons(x,xs)', `map f xs' with `$map f xs', and `countdown n' with `$countdown n'. Thus, the test expression becomes cutoff 5 ($map (sqrt o real) ($countdown 4)). The meaning of this program is identical to that given previously, although a little extra laziness has been introduced. In the lazy notation, the decrement in `countdown (n-1)' is performed immediately, while in this notation the decrement in `$countdown (n-1)' is performed only if the result of the suspension is demanded. Compared to the `lazy' notation, the `$' notation has three drawbacks. First, the program may become buried under dollar signs. Second, in the `lazy' notation one can switch between lazy and strict representations by adding a keyword at the definition site, while in the `$' notation one must change the definition site and add a symbol at each call site as well; so the `lazy' notation is more abstract. Third, the `$' notation works best only when all function and constructor applications are fully saturated. For instance, in the `lazy' notation one may square each element in a stream of streams by writing `map (map sqr) xss' while in the `$' notation one must write `$map (fn xs => $map sqr xs) xss'. On the other hand, the `$' notation has the advantage of fine control over laziness. This will be useful in those few situations where the odd style is actually desired, because one is willing to pay an extra price in complexity to eliminate every unnecessary vestige of lazy evaluation. Okasaki, even better ~~~~~~~~~~~~~~~~~~~~ After the first part of this note was written, Chris Okasaki observed that the `lazy' syntax combines perfectly well with the `$' syntax. Neither syntactic sugar interferes with the other, and the sum appears to combine the advantages of both. Acknowledgements ~~~~~~~~~~~~~~~~ I owe a special debt to Chris Okasaki, who entered into a long correspondence on this topic, and whose feedback has greatly sharpened my thinking. For comments on earlier drafts, I also thank Bob Harper, Dave MacQueen, Michel Mauny, Larry Paulson, and Pierre Weis. References ~~~~~~~~~~ Harold Abelson and Gerald Jay Sussman (1985), The Structure and Interpretation of Computer Programs, MIT Press. Michel Mauny (1991), Integrating lazy evaluation in Strict ML, INRIA, Technical report 137. Chris Okasaki (1996), Purely Functional Data Structures, PhD Thesis, Carnegie Mellon University, Technical report CMU-CS-96-177. Pierre Weis et al. (1990), The CAML Reference Manual, version 2.6.1, INRIA, Technical report 121. Larry Paulson (1991), ML for the Working Programmer, Cambridge University Press.