The last imperative features of Standard ML that we present are its facilities for input and output.
Pre-defined streams are TextIO.stdIn of type TextIO.instream and TextIO.stdOut of type TextIO.outstream. A new input stream can be created by using the function TextIO.openIn of type string -> TextIO.instream. A new output stream can be created by using the TextIO.openOut function of type string -> TextIO.outstream. There are TextIO.closeIn and TextIO.closeOut functions as well.
Attempting to open a file which is not present is an exceptional case and raises the exception IO.Io, which carries a record describing the nature of the I/O failure. This exception may be handled and alternative action taken.
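As a minimal sketch of taking such alternative action, the helper below returns NONE when a file cannot be opened instead of letting the exception propagate. The name tryOpenIn is hypothetical and not part of the library; the sketch assumes the Basis Library exception IO.Io, whose record has the fields name, function and cause.

```sml
(* tryOpenIn is a hypothetical helper, not a library function. *)
fun tryOpenIn file =
  SOME (TextIO.openIn file)
  handle IO.Io {name, function, ...} =>
    (* Report which operation failed on which file, then give up quietly. *)
    (TextIO.output (TextIO.stdErr,
                    function ^ " failed on " ^ name ^ "\n");
     NONE);
```

A caller can then pattern match on the result and fall back to another file, or to TextIO.stdIn, when the result is NONE.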
The functions for text I/O are the following.
| TextIO.input | : | TextIO.instream -> string |
| TextIO.inputN | : | TextIO.instream * int -> string |
| TextIO.lookahead | : | TextIO.instream -> char option |
| TextIO.endOfStream | : | TextIO.instream -> bool |
| TextIO.output | : | TextIO.outstream * string -> unit |
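The following short sketch exercises several of these functions. To stay independent of any file on disk it reads from a stream built with TextIO.openString, a Basis Library function which is not listed in the table above; the sample text is made up for illustration.

```sml
(* An input stream over a fixed string, so no file is needed. *)
val is    = TextIO.openString "abc\ndef\n"

val peek  = TextIO.lookahead is       (* SOME #"a"; nothing is consumed *)
val two   = TextIO.inputN (is, 2)     (* "ab" *)
val rest  = TextIO.inputN (is, 100)   (* fewer than 100 characters remain,
                                         so this reads up to end of stream *)
val atEnd = TextIO.endOfStream is     (* true: the stream is exhausted *)
val ()    = TextIO.closeIn is

(* Write the text back out in one piece. *)
val ()    = TextIO.output (TextIO.stdOut, two ^ rest)
```

Note that TextIO.inputN returns fewer characters than requested only when the end of the stream is reached, which is why the second call can safely ask for more than remains.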
A familiar C programming metaphor for processing files may be easily implemented in Standard ML. The function below simulates the behaviour of the UNIX cat command.
fun cat s =
  let
    val f = TextIO.openIn s
    and c = ref ""
  in
    while (c := TextIO.inputN (f, 1); !c <> "") do
      TextIO.output (TextIO.stdOut, !c);
    TextIO.closeIn f
  end;
The next function simulates the behaviour of the UNIX strings command: it reads in a binary file and returns those strings of printable characters which have length four or more.
fun strings s =
  let
    local
      val is = BinIO.openIn s
    in
      val binfile = BinIO.inputAll is
      val _ = BinIO.closeIn is
    end
    val ws = String.str o Char.chr o Word8.toInt
    val fold = Word8Vector.foldr (fn (w, s) => ws w ^ s) ""
    val tokenise = String.tokens (Bool.not o Char.isPrint)
    val select = List.filter (fn s => String.size s >= 4)
  in
    (select o tokenise o fold) binfile
  end;
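The tokenise and select steps can be tried on their own, without a binary file. In the sketch below the string sample is a made-up stand-in for file contents, with the non-printable characters 0 and 1 separating runs of printable ones.

```sml
(* A hand-made string standing in for the contents of a binary file. *)
val sample   = "ab\000cdef\001ghijk\000xyz"

(* Split at every non-printable character. *)
val tokenise = String.tokens (Bool.not o Char.isPrint)

(* Keep only the runs of length four or more. *)
val select   = List.filter (fn s => String.size s >= 4)

val tokens   = tokenise sample   (* ["ab", "cdef", "ghijk", "xyz"] *)
val found    = select tokens     (* ["cdef", "ghijk"] *)
```

Because String.tokens never returns empty tokens, consecutive non-printable bytes in a real file do not produce empty strings in the result.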
We can present another C programming metaphor: a pre-processor which
includes files as specified by a #include directive. It
searches for the include files in one of a list of directories,
handling possible exceptions and trying the next directory in its
turn.
fun mlpp dir is os =
  let
    val os = TextIO.openOut os
    (* Try f as given, then prefixed by each directory in turn. *)
    fun findAndOpen [] f = TextIO.openIn f
      | findAndOpen (h::t) f =
          TextIO.openIn f
          handle IO.Io _ => TextIO.openIn (h ^ f)
          handle IO.Io _ => findAndOpen t f
    fun inc f =
      let
        val is = findAndOpen dir f
      in
        while not (TextIO.endOfStream is) do
          (* TextIO.inputLine returns NONE at the end of the stream. *)
          (case TextIO.inputLine is of
               NONE => ()
             | SOME line =>
                 let
                   val len = String.size line
                 in
                   (* Expects directives of the form: #include "name" *)
                   if len > 8 andalso
                      String.substring (line, 0, 8) = "#include"
                   then inc (String.substring (line, 10, len - 12))
                   else TextIO.output (os, line)
                 end);
        TextIO.closeIn is
      end
  in
    inc is;
    TextIO.closeOut os
  end;
Finally, we show that text input and binary output can be combined by implementing a text-to-binary file translator which decodes a Base 64 encoded file. Base 64 is the encoding used for Internet mail in order to safeguard data from unintentional corruption. It operates by encoding three eight-bit characters as four six-bit ones. Each six-bit value, from 0 to 63, is mapped onto the uppercase letters, the lowercase letters, the digits and the symbols plus and slash, in that order. The Base 64 translator is presented below; it uses auxiliary functions charToWord and wordListToVector together with infixed versions of the functions Word.<<, Word.>>, Word.orb and Word.andb.
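The auxiliary definitions are named but not shown. One possible set is sketched here as an assumption, not as the author's own code: the fixity levels and the bodies of charToWord and wordListToVector are guesses that fit the types the translator needs. The sketch ends with a worked check of the bit arithmetic on the four characters "TWFu", whose six-bit values are 19, 22, 5 and 46.

```sml
(* Assumed definitions: the text names these helpers but does not show them. *)
infix 5 << >>
infix 4 orb andb
val op <<   = Word.<<
val op >>   = Word.>>
val op orb  = Word.orb
val op andb = Word.andb

(* The character code of a character, as a word. *)
val charToWord = Word.fromInt o Char.ord

(* Pack a list of words, each expected to be below 256, into a byte vector. *)
val wordListToVector = Word8Vector.fromList o map (Word8.fromInt o Word.toInt)

(* Worked example: four six-bit values become one 24-bit word ... *)
val w = (0w19 << 0wx12) orb (0w22 << 0wx0C) orb (0w5 << 0wx06) orb 0w46

(* ... which is then split into its high, middle and low bytes. *)
val bytes = [w >> 0wx10, (w andb 0wx00FF00) >> 0wx08, w andb 0wx0000FF]

val text = Byte.bytesToString (wordListToVector bytes)   (* "Man" *)
```

With declarations like these in scope, the translator itself reads as follows.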
fun base64decode infile outfile =
  let
    val is = TextIO.openIn infile
    val os = BinIO.openOut outfile
    fun decode #"/" = 0wx3F
      | decode #"+" = 0wx3E
      | decode c =
          if Char.isDigit c then charToWord c + 0wx04
          else if Char.isLower c then charToWord c - 0wx47
          else if Char.isUpper c then charToWord c - 0wx41
          else 0wx00
    fun convert (w0::w1::w2::w3::_) =
          let val w = (w0 << 0wx12) orb (w1 << 0wx0C) orb (w2 << 0wx06) orb w3
          in [w >> 0wx10, (w andb 0wx00FF00) >> 0wx08, w andb 0wx0000FF]
          end
      | convert _ = []
    fun next is = (convert o map decode o explode) (TextIO.inputN (is, 4))
  in
    while not (TextIO.endOfStream is) do
      if TextIO.lookahead is = SOME #"\n"
      then (TextIO.input1 is; ())
      else (BinIO.output (os, wordListToVector (next is)));
    TextIO.closeIn is;
    BinIO.closeOut os
  end;
Exercise
The base64decode function uses masks to select out the middle and low bytes in a word. Why could these not be obtained by shifting up sixteen bits and down eight and shifting down sixteen bits respectively?