Binary Search Trees
Many algorithms make use of data structures that represent dynamic sets, that is, collections of elements that can grow, shrink, or otherwise change over time. Stacks, queues, priority queues, and dictionaries may all be viewed as dynamic sets. If algorithms are to use dynamic sets efficiently, the underlying data structures must be chosen carefully. The choice of implementation may be affected by the particular types of element involved, and by the relative frequencies of the different operations performed on the dynamic set. In this note we introduce search trees, which support a variety of operations on dynamic sets.
A heap is a vertically-ordered tree; a search tree is horizontally ordered. A binary search tree is a binary tree whose nodes are labelled by items in such a way that in-order traversal of the tree gives an ordered list of items. Searching for an item in a search tree is an O(h) operation, where h is the height of the tree. Balanced trees are important because the height of a balanced tree is O(lg n), where n is the number of nodes in the tree. In this section we look at functions to insert elements into, and retrieve elements from, a binary search tree without worrying about keeping the tree balanced. Techniques for balancing will be covered later.
Recall the type declaration for a binary tree. We declare it in a signature for later use.
signature BTreeSig =
sig
  datatype 'a Tree = Lf
                   | Nd of 'a Tree * 'a * 'a Tree
end
We will implement sets of items as search trees of type Item Tree, where the type Item is equipped with an ordering, <. A binary tree is a search tree if, and only if, for each internal node Nd(lt, v, rt), every label in the left subtree, lt, is less than v, and every label in the right subtree, rt, is greater than v.
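The definition above can be transcribed almost word for word into a checking function. The following is only a sketch: the names all and isSearchTree are hypothetical, items are assumed to be integers, and the check is written for clarity rather than efficiency (it revisits each subtree once per ancestor).

```sml
datatype 'a Tree = Lf
                 | Nd of 'a Tree * 'a * 'a Tree

(* all p t: does the predicate p hold for every label in t?
   (hypothetical helper, not part of the original note) *)
fun all p Lf = true
  | all p (Nd(lt, v, rt)) = p v andalso all p lt andalso all p rt

(* Direct transcription of the definition; items are assumed to be ints. *)
fun isSearchTree Lf = true
  | isSearchTree (Nd(lt, v, rt)) =
      all (fn x => x < v) lt andalso
      all (fn x => v < x) rt andalso
      isSearchTree lt andalso isSearchTree rt

val good = Nd(Nd(Lf, 1, Lf), 2, Nd(Lf, 3, Lf))   (* satisfies the condition *)
val bad  = Nd(Nd(Lf, 3, Lf), 2, Nd(Lf, 1, Lf))   (* 3 sits to the left of 2 *)
```

Here isSearchTree good evaluates to true and isSearchTree bad to false.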
The basic idea is to build into our data structure the divide-and-conquer strategy used in algorithms like quicksort and mergesort. In the quicksort algorithm, we use a pivot to divide the sorting problem into two independent parts. In a binary search tree, each internal node divides the data structure into two independent parts. We place smaller items in the left sub-tree, and larger items in the right sub-tree. An in-order traversal of a binary search tree gives an ordered list.
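An in-order traversal visits the left sub-tree, then the root, then the right sub-tree. A minimal sketch follows; the name inorder is our own choice, and list append is used for simplicity even though it is not the most efficient formulation.

```sml
datatype 'a Tree = Lf
                 | Nd of 'a Tree * 'a * 'a Tree

(* in-order traversal: left sub-tree, then root, then right sub-tree *)
fun inorder Lf = []
  | inorder (Nd(lt, v, rt)) = inorder lt @ (v :: inorder rt)

val t  = Nd(Nd(Lf, 1, Lf), 2, Nd(Nd(Lf, 3, Lf), 4, Lf))
val xs = inorder t    (* [1, 2, 3, 4] *)
```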
Here is a function, based on our first implementation of quicksort, that builds a binary search tree from a list.
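(* divide x l splits l into the items not greater than x and the items
   greater than x; items equal to x go into the first list *)
fun divide x (h :: t) =
      let val (low, high) = divide x t
      in if x < h then (low, h :: high)
                  else (h :: low, high)
      end
  | divide _ [] = ([], [])

(* the head of the list is the pivot and becomes the root of the tree *)
fun mkTree (h :: t) =
      let val (x, y) = divide h t
      in Nd( mkTree x, h, mkTree y )
      end
  | mkTree [] = Lf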
We can picture the action of quicksort by building this tree and then producing an in-order traversal, except that in quicksort we don’t actually bother to build the tree. It may pay to build the tree, in order to implement operations, such as member, that involve searching. The function to look for a given element may be written as
fun member Lf k = false
  | member (Nd(lt, k', rt)) k =
      if k < k' then member lt k
      else if k' < k then member rt k
      else true (* k = k' *)
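To make the quicksort picture concrete, the sketch below builds a tree with mkTree, flattens it with an in-order traversal, and then runs two membership queries. The definitions of divide, mkTree and member are repeated so that the fragment stands alone; inorder and the sample list are our own additions, and items are assumed to be integers.

```sml
datatype 'a Tree = Lf
                 | Nd of 'a Tree * 'a * 'a Tree

fun divide x (h :: t) =
      let val (low, high) = divide x t
      in if x < h then (low, h :: high) else (h :: low, high) end
  | divide _ [] = ([], [])

fun mkTree (h :: t) =
      let val (x, y) = divide h t
      in Nd(mkTree x, h, mkTree y) end
  | mkTree [] = Lf

(* in-order traversal (our own helper) *)
fun inorder Lf = []
  | inorder (Nd(lt, v, rt)) = inorder lt @ (v :: inorder rt)

fun member Lf k = false
  | member (Nd(lt, k', rt)) k =
      if k < k' then member lt k
      else if k' < k then member rt k
      else true

val t       = mkTree [5, 2, 8, 1, 9, 3]
val sorted  = inorder t     (* [1, 2, 3, 5, 8, 9] *)
val found   = member t 8    (* true *)
val missing = member t 4    (* false *)
```

Building the tree and traversing it in order sorts the list, exactly as quicksort would; keeping the tree around is what makes the later calls to member cheap.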
The cost of a call to member is bounded by the height of the tree; if the tree is balanced, this is O(lg n). The work invested in building the tree is O(n lg n), provided the tree comes out balanced; as with quicksort, an already-sorted input list is the bad case. If we only expect to make O(lg n) calls to member, we might as well use a list (with an O(n) implementation of member) to represent our set, since the total cost of those calls is then also O(n lg n). Otherwise, the investment is probably worthwhile.
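How much the shape of the tree depends on the order of the input can be seen by measuring heights. The sketch below assumes integer items; height is a hypothetical helper that counts the nodes on the longest path from the root, and the two sample lists are our own.

```sml
datatype 'a Tree = Lf
                 | Nd of 'a Tree * 'a * 'a Tree

fun divide x (h :: t) =
      let val (low, high) = divide x t
      in if x < h then (low, h :: high) else (h :: low, high) end
  | divide _ [] = ([], [])

fun mkTree (h :: t) =
      let val (x, y) = divide h t
      in Nd(mkTree x, h, mkTree y) end
  | mkTree [] = Lf

(* height: number of nodes on the longest path from the root to a leaf *)
fun height Lf = 0
  | height (Nd(lt, _, rt)) = 1 + Int.max(height lt, height rt)

val sorted = mkTree [1, 2, 3, 4, 5, 6, 7]   (* every left sub-tree is empty *)
val mixed  = mkTree [4, 2, 6, 1, 3, 5, 7]   (* perfectly balanced *)
val hs     = (height sorted, height mixed)  (* (7, 3) *)
```

The same seven items give a tree of height 7 or of height 3, so a call to member may cost anything between lg n and n comparisons steps depending on how the tree was built.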
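(* Insert e into a search tree. If e is already present, the nodes along
   the search path are rebuilt, but the contents are unchanged. *)
fun insert (e, Lf) = Nd(Lf, e, Lf)
  | insert (e, Nd(lt, r, rt)) =
      if e < r then Nd(insert(e, lt), r, rt)
      else if r < e then Nd(lt, r, insert(e, rt))
      else Nd(lt, r, rt) (* e = r *)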
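(* Variant that avoids rebuilding: ins raises NoChange when e is already
   present, and insert then returns the original tree t. *)
exception NoChange

fun ins (e, Lf) = Nd(Lf, e, Lf)
  | ins (e, Nd(lt, r, rt)) =
      if e < r then Nd(ins(e, lt), r, rt)
      else if r < e then Nd(lt, r, ins(e, rt))
      else raise NoChange

fun insert(e, t) = ins(e, t) handle NoChange => t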
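As a usage sketch, a set can be built by folding insert over a list; repeated items raise NoChange inside ins and leave the tree as it was. The helper names fromList and inorder are our own, and items are assumed to be integers.

```sml
datatype 'a Tree = Lf
                 | Nd of 'a Tree * 'a * 'a Tree

exception NoChange

fun ins (e, Lf) = Nd(Lf, e, Lf)
  | ins (e, Nd(lt, r, rt)) =
      if e < r then Nd(ins(e, lt), r, rt)
      else if r < e then Nd(lt, r, ins(e, rt))
      else raise NoChange

fun insert (e, t) = ins(e, t) handle NoChange => t

(* fromList (hypothetical helper): insert the items one at a time,
   starting from the empty tree *)
fun fromList xs = foldl insert Lf xs

(* in-order traversal (our own helper) *)
fun inorder Lf = []
  | inorder (Nd(lt, v, rt)) = inorder lt @ (v :: inorder rt)

val t  = fromList [3, 1, 4, 1, 5, 9, 2, 6, 5]
val xs = inorder t    (* [1, 2, 3, 4, 5, 6, 9] *)
```

The duplicates 1 and 5 appear only once in the result, as expected of a set.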
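(* getmax t removes the largest label from a non-empty tree t, returning
   the remaining tree together with that label *)
fun getmax (Nd(lt, v, Lf)) = (lt, v)
  | getmax (Nd(lt, v, rt)) =
      let val (r, m) = getmax rt
      in (Nd(lt, v, r), m) end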
Deletion
The delete operation is more interesting. The entry to be deleted may occur anywhere in the tree; we must be able to re-constitute a binary search tree from the remainder. Fortunately, it suffices to consider only one case. If we can write a function join to re-constitute a binary search tree from the two orphan children that remain when we remove the root node of a tree, we can implement delete as follows:
fun delete(e, Lf) = Lf
  | delete(e, Nd(lt, v, rt)) =
      if e < v then Nd(delete(e, lt), v, rt)
      else if v < e then Nd(lt, v, delete(e, rt))
      else join lt rt
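(* join lt rt: the largest label of lt becomes the new root.
   In a source file, join must be declared before delete. *)
fun join Lf x = x
  | join x Lf = x
  | join lt rt =
      let val (l, m) = getmax lt
      in Nd(l, m, rt) end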
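The pieces can be assembled into a small worked example. This is a sketch under a few assumptions: items are integers, the definitions are arranged in dependency order (getmax, join, delete), an extra clause with a hypothetical exception EmptyTree makes getmax exhaustive, and inorder is our own helper for inspecting the result.

```sml
datatype 'a Tree = Lf
                 | Nd of 'a Tree * 'a * 'a Tree

exception EmptyTree   (* hypothetical: getmax applied to an empty tree *)

fun getmax (Nd(lt, v, Lf)) = (lt, v)
  | getmax (Nd(lt, v, rt)) =
      let val (r, m) = getmax rt
      in (Nd(lt, v, r), m) end
  | getmax Lf = raise EmptyTree

fun join Lf x = x
  | join x Lf = x
  | join lt rt =
      let val (l, m) = getmax lt
      in Nd(l, m, rt) end

fun delete (e, Lf) = Lf
  | delete (e, Nd(lt, v, rt)) =
      if e < v then Nd(delete(e, lt), v, rt)
      else if v < e then Nd(lt, v, delete(e, rt))
      else join lt rt

(* in-order traversal (our own helper) *)
fun inorder Lf = []
  | inorder (Nd(lt, v, rt)) = inorder lt @ (v :: inorder rt)

val t  = Nd(Nd(Nd(Lf, 1, Lf), 2, Nd(Lf, 3, Lf)), 4, Nd(Lf, 6, Lf))
val t' = delete (4, t)   (* 3, the largest label on the left, becomes the root *)
val xs = inorder t'      (* [1, 2, 3, 6] *)
```

Deleting the root 4 promotes 3 to the root, and the in-order traversal is still sorted. Deleting an item that is not present, such as 5, returns a tree with the same contents.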
An implementation of several set operations is provided by the functor TREESET given in Figure 1.
A binary search tree can also be used to support dictionary operations, as shown in Figure 2.
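The figures are not reproduced here, so the following is only a standalone sketch of the idea, not the TREESET-based code. It assumes that the tree stores (key, item) pairs ordered by key, that keys are integers, and that lookup reports a missing key with an option value; all three are our own assumptions.

```sml
datatype 'a Tree = Lf
                 | Nd of 'a Tree * 'a * 'a Tree

(* lookup compares on the key only and returns SOME item when the key is
   present, NONE otherwise (the option result is an assumption) *)
fun lookup (Lf, k) = NONE
  | lookup (Nd(lt, (k', v), rt), k) =
      if k < k' then lookup (lt, k)
      else if k' < k then lookup (rt, k)
      else SOME v

val d  = Nd(Nd(Lf, (1, "one"), Lf), (2, "two"), Nd(Lf, (3, "three"), Lf))
val r1 = lookup (d, 3)   (* SOME "three" *)
val r2 = lookup (d, 5)   (* NONE *)
```

The search follows the same path as member; the only difference is that the item stored alongside the key is returned.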
We implement a dictionary as a search tree of Key * Item pairs. TREESET provides most of the operations, but we need to access the representation directly to implement lookup.
© Michael Fourman 1994-2006