A balanced binary search tree is a data structure enabling
efficient retrieval of data. For training hidden Markov models on a
large set of training utterances with a large vocabulary, it is
necessary to provide efficient retrieval of word pronunciation models.
This section describes a Haskell implementation of balanced binary
search trees as described in~\cite{BirdWadl88} that is sufficient for
our needs (i.e., we don't implement all of the functions normally
associated with the abstract type).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{The Balanced Binary Search Tree Datatype}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
While the data type and functions are basically those of Bird
\& Wadler~\cite[Chapter 9]{BirdWadl88}, we have extended their search
tree structure by tagging each node in the tree with two values
instead of one: a {\em key\/} and a {\em definition}. Note that we
only provide a partial implementation of balanced binary search trees;
for example, we have not bothered to implement a ``delete'' function.
\begin{haskell}{BalBinSTrees}
> module BalBinSTrees(
> BalBinSTree, -- don't export the data constructors
> bbstBuild,
> bbstInsert, -- error upon finding duplicate keys
> bbstInsertQuiet, -- don't complain about duplicate keys
> bbstLookUp,
> bbstMember,
> bbstDepth,
> bbstShowKeys,
> bbstFlatten
> ) where
\end{haskell}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Implementation}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%====================================================================
\subsection*{Representation}
%%====================================================================
We let the balanced binary search tree data type inherit the
methods of the class ``\verb~Text~'' so that we can easily read and
write such trees from/to plain text files. The type variable \verb~k~
represents the type of the key and the type variable \verb~b~
represents the type of the data to be retrieved.
\begin{haskell}{BalBinSTree}
> data BalBinSTree a b =
> Nil | Node a b (BalBinSTree a b) (BalBinSTree a b)
> deriving Show{-was:Text-}
\end{haskell}
%%====================================================================
\subsection*{bbstBuild}
%%====================================================================
The function \verb~bbstBuild~ takes an association list (i.e.,
a list of pairs where the first element of the pair is the ``key'' and
the second element of the pair is the ``definition'') as an argument
and returns a balanced binary search tree. The key type must belong to
the Haskell class \verb~Ord~ because the function \verb~bbstInsert~
uses the methods of that class.
\begin{haskell}{bbstBuild}
> bbstBuild :: (Ord a) => [(a,b)] -> BalBinSTree a b
> bbstBuild = foldl bbstInsert Nil
\end{haskell}
%%====================================================================
\subsection*{bbstInsert}
%%====================================================================
The function \verb~bbstInsert~ takes a balanced binary search
tree and an association pair and returns a new balanced binary search
tree which includes the pair, provided that the key is not already
found in the tree. If the key is already in the tree, an error is
signaled and evaluation is halted. The definition of
\verb~bbstInsert~ follows the definition of {\em insert\/}
of~\cite[p.\ 255]{BirdWadl88} except for the way we handle duplicate
keys.
\begin{haskell}{bbstInsert}
> bbstInsert :: (Ord a) => BalBinSTree a b -> (a,b) ->
> BalBinSTree a b
> bbstInsert Nil (x,d) = Node x d Nil Nil
> bbstInsert (Node y e l r) (x,d)
> | x < y = rebalance (Node y e (bbstInsert l (x,d)) r)
> | x == y = error "duplicate key"
> | otherwise = rebalance (Node y e l (bbstInsert r (x,d)))
\end{haskell}
The function \verb~bbstInsertQuiet~ is similar to
\verb~bbstInsert~ but doesn't complain about duplicate keys, quietly
returning the original tree. The definition of this function follows
the definition of {\em insert\/} of~\cite[p.\ 255]{BirdWadl88}.
\begin{haskell}{bbstInsertQuiet}
> bbstInsertQuiet :: (Ord a) => BalBinSTree a b -> (a,b) ->
> BalBinSTree a b
> bbstInsertQuiet Nil (x,d) = Node x d Nil Nil
> bbstInsertQuiet t@(Node y e l r) a@(x,d)
> | x < y = rebalance (Node y e (bbstInsertQuiet l a) r)
> | x == y = t
> | otherwise = rebalance (Node y e l (bbstInsertQuiet r a))
\end{haskell}
The function \verb~rebalance~ is used to bring a binary tree
that is slightly out of balance back into balance. This is the
function {\em rebal\/} of~\cite[p.\ 255]{BirdWadl88}.
\begin{haskell}{rebalance}
> rebalance :: BalBinSTree a b -> BalBinSTree a b
> rebalance t = case slope t of
> 2 -> shift_right t
> -2 -> shift_left t
> _ -> t
\end{haskell}
The function \verb~slope~ is the function {\em slope\/}
of~\cite[p.\ 253]{BirdWadl88}.
\begin{haskell}{slope}
> slope :: BalBinSTree a b -> Int
> slope Nil = 0
> slope (Node _ _ l r) = bbstDepth l - bbstDepth r
\end{haskell}
The function \verb~bbstDepth~ computes the depth of a binary
search tree; it is the function {\em depth\/} of~\cite[p.\
235]{BirdWadl88}.
\begin{haskell}{bbstDepth}
> bbstDepth :: BalBinSTree a b -> Int
> bbstDepth Nil = 0
> bbstDepth (Node _ _ l r) = 1 + (bbstDepth l `max` bbstDepth r)
\end{haskell}
The functions \verb~shift_right~ and \verb~shift_left~ are
used to rebalance a tree. These are the functions {\em shiftr\/} and
{\em shiftl\/} of~\cite[p.\ 255]{BirdWadl88}.
\begin{haskell}{shift_right}
> shift_right (Node x d l r)
> | slope l == -1 = rotate_right (
> Node x d (rotate_left l) r)
>
> | otherwise = rotate_right (
> Node x d l r)
\end{haskell}
\fixhaskellspacing\begin{haskell}{shift_left}
> shift_left (Node x d l r)
> | slope r == 1 = rotate_left (
> Node x d l (rotate_right r))
>
> | otherwise = rotate_left (
> Node x d l r)
\end{haskell}
The two rotation operations are defined as follows. These are
the functions {\em rotr\/} and {\em rotl\/} of~\cite[p.\
255]{BirdWadl88}.
\begin{haskell}{rotate_right}
> rotate_right (Node x d (Node y e t1 t2) t3) =
> Node y e t1 (Node x d t2 t3)
\end{haskell}
\fixhaskellspacing\begin{haskell}{rotate_left}
> rotate_left (Node x d t1 (Node y e t2 t3)) =
> Node y e (Node x d t1 t2) t3
\end{haskell}
%%====================================================================
\subsection*{bbstLookUp}
%%====================================================================
The function \verb~bbstLookUp~ looks for a given key in the
search tree and returns the definition associated with that key. We
restrict the key to the class \verb~Ord~ so that it can be compared to
other keys and to the class \verb~Text~ so that an informative error
message can be printed when a key is not found.
\begin{haskell}{bbstLookUp}
> bbstLookUp :: (Ord a, Show{-was:Text-} a) => BalBinSTree a b -> a -> b
> bbstLookUp Nil x = error ("key " ++ shows x " not found in tree")
> bbstLookUp (Node k d l r) x
> | x < k = bbstLookUp l x
> | x == k = d
> | x > k = bbstLookUp r x
\end{haskell}
%%====================================================================
\subsection*{bbstMember}
%%====================================================================
The function \verb~bbstMember~ looks for a given key in the
search tree and returns \verb~True~ if found and \verb~False~ if not.
This is the function {\em member\/} of~\cite[p.\ 246]{BirdWadl88}.
\begin{haskell}{bbstMember}
> bbstMember :: (Ord a) => BalBinSTree a b -> a -> Bool
> bbstMember Nil _ = False
> bbstMember (Node k d l r) x
> | x < k = bbstMember l x
> | x == k = True
> | x > k = bbstMember r x
\end{haskell}
%%====================================================================
\subsection*{bbstShowKeys}
%%====================================================================
We provide a function for displaying the tree structure along
with the keys for the special case when the key type belongs to the
type class \verb~Text~. The function uses tab characters to indent
the different levels.
\begin{haskell}{bbstShowKeys}
> bbstShowKeys :: (Show{-was:Text-} a) => Int -> BalBinSTree a b -> String
> bbstShowKeys ntabs Nil = tabs ntabs ++ "NIL\n"
> bbstShowKeys ntabs (Node x _ l r) =
> bbstShowKeys (ntabs+1) r ++
> tabs ntabs ++ show x ++ "\n" ++
> bbstShowKeys (ntabs+1) l
> tabs ntabs = take ntabs (repeat '\t')
\end{haskell}
%%======================================================================
\subsection*{bbstFlatten}
%%======================================================================
The function \verb~bbstFlatten~ returns a list of all of
key-data pairs within the binary search tree. It is basically the
function {\em labels\/} of~\cite[p.\ 247]{BirdWadl88}.
\begin{haskell}{bbstFlatten}
> bbstFlatten :: BalBinSTree a b -> [(a,b)]
> bbstFlatten Nil = []
> bbstFlatten (Node k v l r) = bbstFlatten l ++ [(k,v)] ++ bbstFlatten r
\end{haskell}
%%%%%%%%%% End of BalBinSTrees.lhs %%%%%%%%%%
|