Programming with abstract data types

More than with any other part of Standard ML programming with abstypes requires a certain discipline. We have in the abstype concept a means of localising the creation of values of a particular type and the reason for wishing to do this is that we can validate the creation of these values, rejecting certain undesirable ones, perhaps by raising exceptions. Moreover, we can exploit the validation within the abstype definition itself. Here we implement sets as ordered lists without repetitions.

abstype 'a ordered_set = Set of (('a * 'a) -> bool) * 'a list
  fun emptyset (op <=) = Set (op <=, [])
  fun addset (x, Set (op <=, s)) =
    let fun ins [] = [x]
          | ins (s as h::t) = 
            if x = h then s else
            if x <= h then x :: s else h :: ins t
     in Set (op <=, ins s)
  fun memberset (x, Set (_, [])) = false
    | memberset (x, Set (op <=, h::t)) = 
      h <= x andalso (x = h orelse memberset (x, Set (op <=, t)))

The abstype mechanism ensures that sets have been created either by the function emptyset or by the function addset. Both of these functions construct sorted lists and thus it is safe to exploit this ordering in memberset. By this time, because we are imposing a particular order on the values in the list it is crucially important we do not provide access to the Set constructor (say via val mk_ set = Set) because otherwise a user of this ordered_set abstype could create an unordered set thus.

val myset = mk_set (op <=, [1, 5, 3, 2, 8])
This directly constructed set value will then give unexpected results, as detailed below.
memberset (1, myset) = true
memberset (5, myset) = true
memberset (3, myset) = false
memberset (2, myset) = false
memberset (8, myset) = true
In the context of a larger program development these occasional unexpected results might sometimes lead to observable errors which would be difficult to diagnose.


Would it be prudent to add the following definition of equal within the definition of ordered set? What are the circumstances in which its use might give a misleading result?

fun equal (Set (_, s1), Set (_, s2)) = s1 = s2;

Having seen a simple implementation of sets as ordered lists we could improve this to an implementation which used binary trees and improve that to an implementation which used balanced trees. Fortunately this has already been done for us by Stephen Adams [Ada93] whose paper describes his implementation.