Description
Construct a lexer for a subset of tiger that includes the following
exp:
    lvalue
    int
    string
    exp binop exp
    lvalue := exp
    id ( explist )
    ( expseq )
    let declist in expseq end
    if exp then exp
    if exp then exp else exp
lvalue:
    # These are simple variables
    id
binop:
    + - * / = <> < > <= >= & |
declist:
    dec
    dec declist
dec: 
    type id = id
    var id : id := exp
    function id ( typefields ) : id = exp
expseq:
    exp
    expseq ; exp
explist:
    exp
    explist , exp
typefields:
    typefield
    typefield , typefields
typefield:
    id : id
And comments delimited by /* and */.
NOTE: the only valid typeid in our language are aliases of the built in "int" and "string" types. We do not have arrays or structures or other compound types.
Your assignment is to use the sample code in
chapter 2 to
implement (in tiger.lex) a lexer for this subset of tiger. The full
description of tiger is in the Appendix A of the book, and you can get
a
short description of Tiger. Remember,
you don't have to implement the whole language, just what I describe
above.
You need to consider carefully how you will handle strings and comments, and in particular, what to do if unfinished strings or unfinished comments are in the programs you are lexing.
You will definitely need to read Chapter 2 to see examples of building a lexer, and the ML-Lex manual is also useful.
Example code
You need to make some changes to the sources.cm file, you can use one like:
Group is
driver.sml
errormsg.sml
tokens.sig
tokens.sml
tiger.lex
$/smlnj-lib.cm
$/basis.cm
You can parse a sample program using:
- CM.make "sources.cm";
- Parse.parse "test.tig";
  () : unit
Here's a simple program that should lex when you have completed the assignment.
/* factorial */
let
  function fac(n : int) : int =
    if n <= 1 then
        1
    else
        n * fac(n - 1)
in
    fac(5)
end