How to add laziness to a strict language,
without even being odd
Philip Wadler
Bell Laboratories, Lucent Technologies
wadler@research.bell-labs.com
It's easy to add laziness to a strict language. A number of schemes
have been proposed, all variants of the following idea: add a type
constructor for suspensions, and language constructs to delay and
force evaluation.
e : t
------------------
delay e : t susp
e : t susp
-------------
force e : t
The delayed expression is not evaluated unless the corresponding
suspension is forced, and if this happens the suspension is
overwritten with a value, so the expression is not evaluated again.
The `delay' operation must be a new language construct, not a
function, because it does not evaluate its argument. For symmetry,
`force' is also taken to be a language construct.
However, although easy to understand and implement, these schemes are
not necessarily easy to use. In particular, every scheme I know of
encourages a style of programming that forces more evaluation than
traditional lazy languages (I call this style odd), and
discourages a style of programming that delays evaluation as in
traditional lazy languages (I call this style even).
This note explicates the two styles, showing how the odd style is easy
to encode in the traditional `delay' and `force' syntax, while the even
style is harder to encode. It then presents a new `lazy' syntax, and
shows how the even style is easy to encode in this syntax, while the
odd style is harder to encode. The new `lazy' syntax is defined by
translation into the `delay-force' syntax. Comparisons are drawn with
two other syntaxes, one used in CAML and one proposed by Chris
Okasaki. Standard ML is taken as the strict language to be extended
with lazy constructs.
The odd style, with ease
~~~~~~~~~~~~~~~~~~~~~~~~
Consider first a language simply augmented with `delay', `force', and
`susp'. Streams in the odd style are defined as follows. One can
find definitions like this in the texts of Abelson and Sussman (1985)
and Paulson (1991).
datatype 'a stream = Nil | Cons of 'a * 'a stream susp;
(* map : ('a -> 'b) -> 'a stream -> 'b stream *)
fun map f Nil = Nil
| map f (Cons(x,xs)) = Cons(f x, delay (map f (force xs)));
(* countdown : int -> int stream *)
fun countdown n = Cons(n, delay (countdown (n-1)));
(* cutoff : int -> 'a stream -> 'a list *)
fun cutoff 0 xs = []
| cutoff n Nil = []
| cutoff n (Cons(x,xs)) = x :: cutoff (n-1) (force xs);
In this style, here is the stream corresponding to the list [1,2].
Cons (1, delay Cons (2, delay Nil))
This style is called odd because any finite stream will contain
an odd number of constructors, where Cons, Nil, and delay are
counted as constructors.
This style is also called odd because it exhibits behaviour that may
seem peculiar to anyone familiar with streams. One would expect the
expression
cutoff 5 (map (sqrt o real) (countdown 4))
to return the list of square roots of the integers from four to zero
[2.0, 1.73, 1.41, 1.0, 0.0]
but instead it returns
uncaught exception Sqrt.
The problem is that this definition of streams forces more evaluation
than necessary. Even when the cutoff reaches zero, the stream must
be evaluated either to Nil or Cons, forcing one to compute the
square root of minus one without need, and indeed with some harm.
The even style, with difficulty
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Here is the same program, rewritten to exhibit the traditional
behaviour. The key change is to the definition of the type for
streams, from which all else follows. We use a convention of
appending an underbar to name the auxiliary type and the operations
upon it. Note that the `steam_' of this definition is equivalent to
the `stream' of the previous definition.
datatype 'a stream_ = Nil_ | Cons_ of 'a * 'a stream
withtype 'a stream = 'a stream_ susp;
(* map : ('a -> 'b) -> 'a stream -> 'b stream *)
(* map_ : ('a -> 'b) -> 'a stream_ -> 'b stream_ *)
fun map f ws = delay map_ f (force ws)
and map_ f Nil_ = Nil_
| map_ f (Cons_(x,xs)) = Cons_(f x, map f xs)
(* countdown : int -> int stream *)
fun countdown n = delay Cons_(n, countdown (n-1));
(* cutoff : int -> 'a stream -> 'a list *)
(* cutoff_ : int -> 'a stream_ -> 'a list *)
fun cutoff 0 ws = []
| cutoff n ws = cutoff_ n (force ws)
and cutoff_ n Nil_ = []
| cutoff_ n (Cons_(x,xs)) = x :: cutoff (n-1) xs;
In this style, here is the stream corresponding to the list [1,2].
delay Cons (1, delay Cons (2, delay Nil))
This style is called even because any finite stream will contain
an even number of constructors, where Cons, Nil, and delay are
counted as constructors.
The new program gives the expected result for the previous expression,
returning a list of square roots rather than an exception.
Alas, our definitions has nearly doubled in size, and halved in
perspicuity. Paulson wrote to me that `My book originally used the
second definitions you present, but that extra bit of laziness cost
too much in legibility.' He reverted to the odd style, relegating
even style to an exercise.
The even style, with ease
~~~~~~~~~~~~~~~~~~~~~~~~~
Here is what the definition looks like under the new proposal.
datatype lazy 'a stream = Nil | Cons of 'a * 'a stream;
(* map : ('a -> 'b) -> 'a stream -> 'b stream *)
fun lazy map f Nil = Nil
| map f (Cons(x,xs)) = Cons(f x, map f xs);
(* countdown : int -> int stream *)
fun lazy countdown n = Cons (n, countdown (n-1));
(* cutoff : int -> 'a stream -> 'a list *)
fun cutoff 0 xs = []
| cutoff n Nil = []
| cutoff n (Cons(x,xs)) = x :: cutoff (n-1) xs;
Intuitively, the effect of writing `lazy' in a datatype definition is
to make each constructor return a suspension, and the effect of
writing `lazy' in a function definition is to delay evaluation of the
right-hand side. The exact meaning of these constructs will be given
later, by translation into `delay', `force', and `susp'. Under that
translation, this program is equivalent to the previous program.
A semantic subtlety: all functions and constructors here are strict,
in that their arguments are always evaluated. Delayed evaluation is
always indicated by adding the keyword `lazy' to a function or value
definition. We will return to this point after discussing the
translation of programs.
A syntactic subtlety: it may seem more natural to say `lazy fun' than
`fun lazy', but latter works better in the case where there are
mutually recursive types or functions, only some of which are lazy.
We'll see examples in the next section.
The odd style, with difficulty
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If the odd style is truly what you intend, it is still possible.
Here is a program for that purpose in the new notation. Again, the
types dictate the structure of the program.
datatype 'a stream = Nil | Cons of 'a * 'a stream_
and lazy 'a stream_ = Delay of 'a stream;
(* map : ('a -> 'b) -> 'a stream -> 'b stream *)
(* map_ : ('a -> 'b) -> 'a stream_ -> 'b stream_ *)
fun map f Nil = Nil
| map f (Cons(x,xs)) = Cons(f x, map_ f xs)
and lazy map_ f (Delay xs) = Delay (map f xs);
(* countdown : int -> int stream *)
(* countdown_ : int -> int stream_ *)
fun countdown n = Cons (n, countdown_ n)
and lazy countdown_ n = Delay (countdown (n-1));
(* cutoff : int -> 'a stream -> 'a list *)
fun cutoff 0 xs = []
| cutoff n Nil = []
| cutoff n (Cons(x,Delay(xs))) = x :: cutoff (n-1) xs;
Under the translation rules given below, this program has exactly the
same semantics as the previous program for odd style. The situation
is pleasantly dual. The odd style in the new notation is almost
exactly as hard to express as the even style was in the old.
The translation
~~~~~~~~~~~~~~~
While the new syntax looks pleasant, it is useless unless there is a
clear, rigorous, and uniform way to assign it a meaning. Fortunately,
this can be done in a fairly straightforward way, by specifying a
translation from the new language, with the `lazy' keyword, to the old
language, augmented with `delay', `force', and `susp'. There is one
rewrite rule for each form of `lazy' construct added to the language.
(datatype)
datatype lazy t = c1 of t1 | ... | cn of tn
===>
datatype t_ = c1_ of t1 | ... | cn_ of tn
withtype t = t_ susp
where t is a type variable list followed by a type identifier,
t_ is the same with a fresh type identifier name, and
c1_, ..., cn_ are fresh constructor names.
(fun)
fun lazy f p11 ... p1n = e1
| ...
| f pm1 ... pmn = em
===>
fun f x1 ... xn = delay (f_ x1 ... xm)
| f_ p11 ... p1n = force e1
| ...
| f_ pm1 ... pmn = force em
where f_ is a fresh variable name.
(val)
val lazy x = e
==>
val x = delay (force e)
(constructor)
c
===>
fn x => delay (c_ x)
where c is declared in a lazy datatype,
and c_ is the corresponding fresh constructor.
(destructor)
[We assume that before this step is applied that all
uses of destructors are translated to case expressions,
in the usual way.]
case e0 of
c1 x1 => e1
| ...
| cn xn => en
===>
case (force e0) of
c1_ x1 => e1
| ...
| cn_ xn => en
where c1,...,cn are declared in a lazy datatype,
and c1_,...,cn_ are the corresponding fresh constructors.
Returning to the semantic subtlety mentioned previously, note
that we have the translation
cutoff 0 (Cons(sqrt ~1.0, Nil))
===>
cutoff 0 (fn x => delay (Cons_ x)) (sqrt ~1.0, Nil)
and so evaluating this expression raises an error. On the
other hand, we also have the translation
let val lazy x = Cons(sqrt ~1.0, Nil) in cutoff 0 x end
===>
let val x = delay (force (Cons(sqrt ~1.0, Nil))) in cutoff 0 x end
and so evaluating this expression returns []. All functions and
constructors evaluate their arguments; evaluation is suspended only
where the keyword `lazy' appears.
Previous work: CAML
~~~~~~~~~~~~~~~~~~~
Some versions of CAML allow summands of datatypes or fields of records
to be declared lazy (Weiss 1990, Mauny 1991). Again, this notation
makes it easy to use the odd style, harder to use the even style.
Furthermore, one needs to know whether a constructor contains `lazy'
in its declaration before one knows whether its arguments will be
evaluated, unlike the notation described above.
For instance, in the presence of the CAML declaration
type 'a susp = Delay of 'a
then the CAML term `Delay e' is like `delay e' in our old notation,
and the CAML term `case e of Delay x => x' is like `force e' in our
old notation.
Here's a CAML type declaration for streams in odd style.
type 'a stream = Nil | Cons of {head: 'a, lazy tail: 'a stream};
Here's one for streams in even style.
type 'a stream_ = Nil | Cons of 'a * 'a stream
and 'a stream = lazy Delay of 'a stream_;
And here's one that corresponds to none of the previous examples.
type 'a stream = Nil | lazy Cons of 'a * 'a stream;
This last declaration suspends evaluation of both the head and
the tail of the stream.
Previous work: Okasaki
~~~~~~~~~~~~~~~~~~~~~~
Okasaki (1996) proposes that one abbreviate `delay e' by writing $e,
and, dually, that the equivalent of `force e' be written using pattern
matching, as (case e of $x => x). The `$' sign may be nested with
other patterns.
With this proposal it is even easier to implement streams in odd style.
Here is the equivalent program.
datatype 'a stream = Nil | Cons of 'a * 'a stream susp;
(* map : ('a -> 'b) -> 'a stream -> 'b stream *)
fun map f Nil = Nil
| map f (Cons(x,$xs)) = Cons(f x, $map f xs);
(* countdown : int -> int stream *)
fun countdown n = Cons (n, $countdown (n-1));
(* cutoff : int -> 'a stream -> 'a list *)
fun cutoff 0 xs = []
| cutoff n Nil = []
| cutoff n (Cons(x,$xs)) = x :: cutoff (n-1) xs;
This looks elegant. Too bad it is not the program we desire!
Okasaki extended
~~~~~~~~~~~~~~~~
In the course of pursuing this topic, I discovered a simple and
logical addition to Okasaki's notation that makes it relatively easy
to express even style, in at least some situations. The addition is
to treat the declaration
fun $f p11 ... p1n = e1
| ...
| $f pm1 ... pmn = em
as equivalent to
fun f p11 ... p1n = force e1
| ...
| f pm1 ... pmn = force em.
This is clearly in line with the other abbreviations.
Here is even style in Okasaki's notation, thus extended.
datatype 'a stream_ = Nil | Cons of 'a * 'a stream
withtype 'a stream = 'a stream_ susp;
(* map : ('a -> 'b) -> 'a stream -> 'b stream_ *)
fun $map f ($Nil) = $Nil
| $map f ($Cons(x,xs)) = $Cons(f x, $map f xs)
(* countdown : int -> int stream_ *)
fun $countdown n = $Cons(n, $countdown (n-1)));
(* cutoff : int -> 'a stream -> 'a list *)
fun cutoff 0 xs = []
| cutoff n ($Nil) = []
| cutoff n ($Cons(x,xs)) = x :: cutoff (n-1) xs;
The code for the functions can be derived from the corresponding code
in `lazy' notation by uniformly replacing `Nil' with `$Nil',
`Cons(x,xs)' with `$Cons(x,xs)', `map f xs' with `$map f xs', and
`countdown n' with `$countdown n'. Thus, the test expression becomes
cutoff 5 ($map (sqrt o real) ($countdown 4)).
The meaning of this program is identical to that given previously,
although a little extra laziness has been introduced. In the lazy
notation, the decrement in `countdown (n-1)' is performed immediately,
while in this notation the decrement in `$countdown (n-1)' is
performed only if the result of the suspension is demanded.
Compared to the `lazy' notation, the `$' notation has three drawbacks.
First, the program may become buried under dollar signs. Second, in
the `lazy' notation one can switch between lazy and strict
representations by adding a keyword at the definition site, while in
the `$' notation one must change the definition site and add a symbol
at each call site as well; so the `lazy' notation is more abstract.
Third, the `$' notation works best only when all function and
constructor applications are fully saturated. For instance, in
the `lazy' notation one may square each element in a stream of streams
by writing `map (map sqr) xss' while in the `$' notation one must
write `$map (fn xs => $map sqr xs) xss'.
On the other hand, the `$' notation has the advantage of fine control
over laziness. This will be useful in those few situations where the
odd style is actually desired, because one is willing to pay an extra
price in complexity to eliminate every unnecessary vestige of lazy
evaluation.
Okasaki, even better
~~~~~~~~~~~~~~~~~~~~
After the first part of this note was written, Chris Okasaki observed
that the `lazy' syntax combines perfectly well with the `$' syntax.
Neither syntactic sugar interferes with the other, and the sum appears
to combine the advantages of both.
Acknowledgements
~~~~~~~~~~~~~~~~
I owe a special debt to Chris Okasaki, who entered into a long
correspondence on this topic, and whose feedback has greatly sharpened
my thinking. For comments on earlier drafts, I also thank Bob Harper,
Dave MacQueen, Michel Mauny, Larry Paulson, and Pierre Weis.
References
~~~~~~~~~~
Harold Abelson and Gerald Jay Sussman (1985), The Structure and
Interpretation of Computer Programs, MIT Press.
Michel Mauny (1991), Integrating lazy evaluation in Strict ML,
INRIA, Technical report 137.
Chris Okasaki (1996), Purely Functional Data Structures, PhD Thesis,
Carnegie Mellon University, Technical report CMU-CS-96-177.
Pierre Weis et al. (1990), The CAML Reference Manual, version 2.6.1,
INRIA, Technical report 121.
Larry Paulson (1991), ML for the Working Programmer, Cambridge
University Press.