first commit

2011-04-19 11:37:05 +02:00 · 2011-04-19 11:37:05 +02:00 · 063194f8be
commit 063194f8be
349 changed files with 36508 additions and 0 deletions
--- a/jasmin/jasmin-2.4/lib/java_cup/manual.html
+++ b/jasmin/jasmin-2.4/lib/java_cup/manual.html
@@ -0,0 +1,883 @@
+<html><head>
+<title>Java(tm) CUP User's Manual</title>
+</head><body>
+
+<hr>
+<img src="java_cup.logo.new.gif" alt="Java CUP Logo Image">
+<hr>
+<h1>Java(tm) CUP User's Manual</h1>
+<h3><a href="http://www.cc.gatech.edu/gvu/people/Faculty/Scott.E.Hudson.html">
+Scott E. Hudson</a><br> 
+<a href="http://www.cc.gatech.edu/gvu/gvutop.html">
+Graphics Visualization and Usability Center</a><br> 
+<a href="http://www.gatech.edu/TechHome.html">
+Georgia Institute of Technology</a><br>
+<i>January 1996</i> (v0.9d release)</h3>
+<hr>
+
+<h3>Table of Contents</h3>
+<dl compact>
+  <dt> 1.  <dd> <a href="#intro">Introduction and Example</a>
+  <dt> 2.  <dd> <a href="#spec">Specification Syntax</a>
+  <dt> 3.  <dd> <a href="#running">Running Java CUP</a>
+  <dt> 4.  <dd> <a href="#parser">Customizing the Parser</a>
+  <dt> 5.  <dd> <a href="#errors">Error Recovery</a>
+  <dt> 6.  <dd> <a href="#conclusion">Conclusion</a>
+  <dt>     <dd> <a href="#refs">References</a>
+  <dt> A.  <dd> <a href="#appendixa">Grammar for Java CUP Specification Files</a>
+  <dt> B.  <dd> <a href="#appendixb">A Very Simple Example Scanner</a>
+</dl>
+
+<a name=intro>
+<h3>1. Introduction and Example</h3></a>
+
+This manual describes the basic operation and use of the 
+Java<a href="#trademark">(tm)</a>
+Based Constructor of Useful Parsers (Java CUP for short).
+Java CUP is a system for generating LALR parsers from simple specifications.
+It serves the same role as the widely used program YACC 
+<a href="#YACCref">[1]</a> and in fact offers most of the features of YACC.  
+However, Java CUP is written in Java, uses specifications including embedded 
+Java code, and produces parsers which are implemented in Java.<p>
+
+Although it covers all aspects of the Java CUP system, this manual is relatively
+brief.  It assumes you have at least a little knowledge of LR parsing,
+and preferably some experience with a program such as YACC.  
+A number of compiler construction textbooks (such as 
+<a href="#dragonbook">[2</a>,<a href="#crafting">3]</a>) cover this material, 
+and discuss the YACC system (which is quite similar to this one) as a 
+specific example. <p> 
+
+Using Java CUP involves creating a simple specification based on the
+grammar for which a parser is needed, along with construction of a
+scanner capable of breaking characters up into meaningful tokens (such
+as keywords, numbers, and special symbols).<p> 
+
+As a simple example, consider a 
+system for evaluating simple arithmetic expressions over integers.  
+This system would read expressions from standard input (each terminated 
+with a semicolon), evaluate them, and print the result on standard output.  
+A grammar for the input to such a system might look like: <pre>
+  expr_list ::= expr_list expr_part | expr_part
+  expr_part ::= expr ';'
+  expr      ::= expr '+' term | expr '-' term | term
+  term      ::= term '*' factor | term '/' factor | term '%' factor | factor
+  factor    ::= number | '-' factor | '(' expr ')'
+</pre>
+To specify a parser based on this grammar, our first step is to identify and
+name the set of terminal symbols that will appear on input, and the set of 
+non terminal symbols.  In this case, the non terminals are: 
+
+<pre><tt>  expr_list, expr_part, expr, term,</tt> and <tt>factor</tt>.</pre>
+
+For terminal names we might choose:
+
+<pre><tt>  SEMI, PLUS, MINUS, TIMES, DIVIDE, MOD, NUMBER, LPAREN,</tt> and <tt>RPAREN</tt></pre>
+
+Based on these namings we can construct a small Java CUP specification 
+as follows:<br>
+<hr>
+<pre><tt>// Java CUP specification for a simple expression evaluator (no actions)
+
+import java_cup.runtime.*;
+
+/* Preliminaries to set up and use the scanner.  */
+init with {: scanner.init();              :};
+scan with {: return scanner.next_token(); :};
+
+/* Terminals (tokens returned by the scanner). */
+terminal token     SEMI, PLUS, MINUS, TIMES, DIVIDE, MOD,  LPAREN, RPAREN;
+terminal int_token NUMBER;
+
+/* Non terminals */
+non terminal symbol     expr_list, expr_part;
+non terminal int_token  expr, term, factor;
+
+/* The grammar */
+expr_list ::= expr_list expr_part | 
+              expr_part;
+expr_part ::= expr SEMI;
+expr      ::= expr PLUS term | 
+              expr MINUS term | 
+              term;
+term      ::= term TIMES factor | 
+              term DIVIDE factor | 
+              term MOD factor | 
+              factor;
+factor    ::= NUMBER | 
+              MINUS factor | 
+              LPAREN expr RPAREN;
+</tt></pre>
+<hr><br>
+We will consider each part of the specification syntax in detail later.  
+However, here we can quickly see that the specification contains three 
+main parts.  The first part provides preliminary and miscellaneous declarations
+to specify how the parser is to be generated, and supply parts of the 
+runtime code.  In this case we indicate that the <tt>java_cup.runtime</tt>
+classes should be imported, then supply a small bit of initialization code,
+and some code for invoking the scanner to retrieve the next input token.
+The second part of the specification declares terminals and non terminals,
+and associates object classes with each.  In this case, we declare our terminals
+as being represented at runtime by two object types: <tt>token</tt> and 
+<tt>int_token</tt> (which are supplied as part of the Java CUP runtime system), 
+while various non terminals are represented by objects of types <tt>symbol</tt>
+and <tt>int_token</tt> (again supplied from the runtime system).  The final 
+part of the specification contains the grammar.<p>
+
+To produce a parser from this specification we use the Java CUP generator.
+If this specification were stored in a file <tt>parser.cup</tt>, then 
+(on a Unix system at least) we might invoke Java CUP using a command like:
+<pre><tt> java java_cup.Main &lt; parser.cup</tt> </pre>
+In this case, the system will produce two Java source files containing 
+parts of the generated parser: <tt>sym.java</tt> and <tt>parser.java</tt>.  
+As you might expect, these two files contain declarations for the classes 
+<tt>sym</tt> and <tt>parser</tt>. The <tt>sym</tt> class contains a series of 
+constant declarations, one for each terminal symbol.  This is typically used 
+by the scanner to refer to symbols (e.g. with code such as 
+"<tt>return new token(sym.SEMI);</tt>" ).  The <tt>parser</tt> class 
+implements the parser itself.<p>
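+
+As a rough illustration of how a scanner might use these constants, the
+sketch below declares a hand-written stand-in for the generated <tt>sym</tt>
+class.  The class name <tt>SymSketch</tt>, the numeric values, and the helper
+<tt>classify</tt> are assumptions made for this example only; the real
+constants are assigned by Java CUP.<p>

```java
// Hypothetical stand-in for a generated symbol-constant class.
// The numeric values are illustrative; Java CUP assigns the real ones.
class SymSketch {
    static final int SEMI = 1;
    static final int PLUS = 2;
    static final int MINUS = 3;
    static final int TIMES = 4;
    static final int DIVIDE = 5;
    static final int MOD = 6;
    static final int NUMBER = 7;
    static final int LPAREN = 8;
    static final int RPAREN = 9;

    // Map a single punctuation character to its terminal constant,
    // or return -1 if the character is not one of the known symbols.
    static int classify(char c) {
        switch (c) {
            case ';': return SEMI;
            case '+': return PLUS;
            case '-': return MINUS;
            case '*': return TIMES;
            case '/': return DIVIDE;
            case '%': return MOD;
            case '(': return LPAREN;
            case ')': return RPAREN;
            default:  return -1;
        }
    }
}
```

+A scanner written along these lines would wrap the returned constant in a
+token object, in the spirit of "<tt>return new token(sym.SEMI);</tt>".<p>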
+
+The specification above, while constructing a full parser, does not perform 
+any semantic actions -- it will only indicate success or failure of a parse.
+To calculate and print values of each expression, we must embed Java
+code within the parser to carry out actions at various points.  In Java CUP,
+actions are contained in <i>code strings</i> which are surrounded by delimiters 
+of the form <tt>{:</tt> and <tt>:}</tt> (we can see examples of this in the 
+<tt>init with</tt> and <tt>scan with</tt> clauses above).  In general, the 
+system records all characters within the delimiters, but does not try to check 
+that they form valid Java code.<p>
+
+A more complete Java CUP specification for our example system (with actions 
+embedded at various points in the grammar) is shown below:<br>
+<hr>
+<pre><tt>// Java CUP specification for a simple expression evaluator (w/ actions)
+
+import java_cup.runtime.*;
+
+/* Preliminaries to set up and use the scanner.  */
+init with {: scanner.init();              :};
+scan with {: return scanner.next_token(); :};
+
+/* Terminals (tokens returned by the scanner). */
+terminal token     SEMI, PLUS, MINUS, TIMES, DIVIDE, MOD,  LPAREN, RPAREN;
+terminal int_token NUMBER;
+
+/* Non terminals */
+non terminal symbol     expr_list, expr_part;
+non terminal int_token  expr, term, factor;
+
+/* The grammar */
+expr_list ::= expr_list expr_part 
+	      | 
+              expr_part;
+
+expr_part ::= expr:e 
+	      {: System.out.println("= " + e.int_val); :} 
+              SEMI              
+	      ;
+
+expr      ::= expr:e1 PLUS term:e2    
+	      {: RESULT.int_val = e1.int_val + e2.int_val; :} 
+	      | 
+              expr:e1 MINUS term:e2    
+              {: RESULT.int_val = e1.int_val - e2.int_val; :} 
+	      | 
+              term:e1                  
+	      {: RESULT.int_val = e1.int_val; :} 
+	      ;
+
+term      ::= term:e1 TIMES factor:e2 
+	      {: RESULT.int_val = e1.int_val * e2.int_val; :} 
+	      | 
+              term:e1 DIVIDE factor:e2 
+	      {: RESULT.int_val = e1.int_val / e2.int_val; :} 
+	      | 
+              term:e1 MOD factor:e2 
+	      {: RESULT.int_val = e1.int_val % e2.int_val; :} 
+	      | 
+              factor:e                 
+	      {: RESULT.int_val = e.int_val; :} 
+	      ;
+
+factor    ::= NUMBER:n                 
+	      {: RESULT.int_val = n.int_val;  :} 
+	      | 
+              MINUS factor:e             
+	      {: RESULT.int_val = -e.int_val; :} 
+	      | 
+              LPAREN expr:e RPAREN     
+	      {: RESULT.int_val = e.int_val;  :} 
+	      ;
+</tt></pre>
+<hr><br>
+Here we can see several changes.  Most importantly, code to be executed at 
+various points in the parse is included inside code strings delimited by 
+<tt>{:</tt> and <tt>:}</tt>.  In addition, labels have been placed on various 
+symbols in the right hand side of productions.  For example in:<br>
+<pre>  expr ::= expr:e1 PLUS term:e2   
+           {: RESULT.int_val = e1.int_val + e2.int_val; :}
+</pre>
+the non terminal <tt>expr</tt> has been labeled with <tt>e1</tt>, while 
+<tt>term</tt> has been labeled with <tt>e2</tt>.  The left hand side
+symbol of each production is always implicitly labeled as <tt>RESULT</tt>.<p>
+
+Each symbol appearing in a production is represented at runtime by an 
+object (on the parse stack).  These labels allow code embedded in a
+production to refer to these objects.  Since <tt>expr</tt> and <tt>term</tt>
+were both declared as <tt>int_token</tt>, they are both represented by
+an object of class <tt>int_token</tt>.  These objects are created 
+as the result of matching some other production.  The code in that production
+fills in various fields of its result object, which are in turn used here to 
+fill in a new result object, and so on.  Overall this is a very common
+form of syntax directed translation related to attribute grammars and 
+discussed at length in compiler construction textbooks such as 
+<a href="#dragonbook">[2</a>,<a href="#crafting">3]</a>.
+<p>
+
+In our specific example, the <tt>int_token</tt> class includes an 
+<tt>int_val</tt> field which stores an <tt>int</tt> value.  We use this 
+field to compute the value of the expression from its component parts.  
+In the production above, we compute the <tt>int_val</tt> field of the 
+left hand side symbol (i.e. <tt>RESULT</tt>) as the sum of the values 
+computed by the <tt>expr</tt> and <tt>term</tt> parts making up this 
+expression.  That value in turn may be combined with others to compute a 
+final result.<p>
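+
+The way values flow from the right hand side into <tt>RESULT</tt> can be
+sketched in plain Java.  The classes <tt>IntTokenSketch</tt> and
+<tt>ActionSketch</tt> below are hypothetical stand-ins, not part of the Java CUP
+runtime; each method simply mirrors the body of one action from the
+specification above.<p>

```java
// Stand-in for the runtime's int_token class: a symbol carrying one int.
class IntTokenSketch {
    int int_val;
    IntTokenSketch(int v) { int_val = v; }
}

class ActionSketch {
    // Mirrors the action for: expr ::= expr:e1 PLUS term:e2
    // In a generated parser, e1 and e2 would come from the parse stack.
    static IntTokenSketch reducePlus(IntTokenSketch e1, IntTokenSketch e2) {
        IntTokenSketch RESULT = new IntTokenSketch(0);
        RESULT.int_val = e1.int_val + e2.int_val;
        return RESULT;
    }

    // Mirrors the action for: term ::= term:e1 TIMES factor:e2
    static IntTokenSketch reduceTimes(IntTokenSketch e1, IntTokenSketch e2) {
        IntTokenSketch RESULT = new IntTokenSketch(0);
        RESULT.int_val = e1.int_val * e2.int_val;
        return RESULT;
    }
}
```

+For the input <tt>2 + 3 * 4</tt>, the multiplication is reduced first, so
+<tt>reducePlus(new IntTokenSketch(2), reduceTimes(new IntTokenSketch(3), new
+IntTokenSketch(4)))</tt> yields an object whose <tt>int_val</tt> is 14.<p>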
+
+The final step in creating a working parser is to create a <i>scanner</i> (also
+known as a <i>lexical analyzer</i> or simply a <i>lexer</i>).  This routine is 
+responsible for reading individual characters, removing things like
+white space and comments, recognizing which terminal symbols from the 
+grammar each group of characters represents, then returning token objects
+representing these symbols to the parser. Example code for a workable (if 
+not elegant or efficient) scanner for our example system can be found in 
+<a href="#appendixb">Appendix B</a>.<p>
+
+Like the very simple one given in Appendix B, all scanners need to return 
+objects which are instances of <tt>java_cup.runtime.token</tt> (or one of 
+its subclasses).  The runtime system predefines three such classes: 
+<tt>token</tt> which contains no specific information beyond the token 
+type (and some internal information used by the parser), <tt>int_token</tt> 
+which also records a single <tt>int</tt> value, and <tt>str_token</tt> which 
+records a single string value. <p>
+
+The code contained in the <tt>init with</tt> clause of the specification 
+will be executed before any tokens are requested.  Each token will be 
+requested using whatever code is found in the <tt>scan with</tt> clause.
+Beyond this, the exact form the scanner takes is up to you. <p>
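+
+To make the division of labour concrete, here is a minimal scanner sketch for
+a subset of the example language.  It is not the scanner from Appendix B.  The
+classes <tt>Tok</tt> and <tt>IntTok</tt> are hypothetical stand-ins for the
+runtime token classes, the terminal numbers are invented, and
+<tt>init</tt> takes the input as a string (an assumption made so that the
+example is self-contained).<p>

```java
// Stand-ins for the runtime token classes described above (hypothetical names).
class Tok {
    final int sym;                 // terminal number, as found in the sym class
    Tok(int sym) { this.sym = sym; }
}

class IntTok extends Tok {
    final int int_val;             // value carried by a NUMBER token
    IntTok(int sym, int val) { super(sym); this.int_val = val; }
}

class ScannerSketch {
    // Illustrative terminal numbers; a real scanner would use the generated sym class.
    static final int EOF = 0, SEMI = 1, PLUS = 2, NUMBER = 3;

    private static String input = "";
    private static int pos = 0;

    // Plays the role of scanner.init(): reset the scanner over new input.
    static void init(String text) { input = text; pos = 0; }

    private static boolean isDigit(char c) { return c >= '0' && c <= '9'; }

    // Plays the role of scanner.next_token(): skip white space, then return
    // the next token, or an EOF token once the input is exhausted.
    static Tok next_token() {
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) pos++;
        if (pos >= input.length()) return new Tok(EOF);
        char c = input.charAt(pos);
        if (isDigit(c)) {
            int val = 0;
            while (pos < input.length() && isDigit(input.charAt(pos))) {
                val = val * 10 + (input.charAt(pos) - '0');
                pos++;
            }
            return new IntTok(NUMBER, val);
        }
        pos++;
        switch (c) {
            case ';': return new Tok(SEMI);
            case '+': return new Tok(PLUS);
            default:  throw new IllegalArgumentException("unexpected character: " + c);
        }
    }
}
```

+With a scanner of this shape, the <tt>scan with</tt> clause would only need to
+return the result of <tt>next_token()</tt>.<p>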
+
+In the <a href="#spec">next section</a> a more detailed and formal 
+explanation of all parts of a Java CUP specification will be given.  
+<a href="#running">Section 3</a> describes options for running the 
+Java CUP system.  <a href="#parser">Section 4</a> discusses the details 
+of how to customize a Java CUP parser, while <a href="#errors">Section 5</a>
+considers error recovery.  Finally, <a href="#conclusion">Section 6</a> 
+provides a conclusion.
+
+<a name="spec">
+<h3>2. Specification Syntax</h3></a>
+Now that we have seen a small example, we present a complete description of all 
+parts of a Java CUP specification.  A specification has four sections with 
+a total of eight specific parts (however, most of these are optional).  
+A specification consists of:
+<ul>
+<li> <a href="#package_spec">package and import specifications</a>,
+<li> <a href="#code_part">user code components</a>,
+<li> <a href="#symbol_list">symbol (terminal and non-terminal) lists</a>, and
+<li> <a href="#production_list">the grammar</a>.
+</ul>
+Each of these parts must appear in the order presented here.  (A complete 
+grammar for the specification language is given in 
+<a href="#appendixa">Appendix A</a>.)  The particulars of each part of
+the specification are described in the subsections below.<p>
+
+<h5><a name="package_spec">Package and Import Specifications</a></h5>
+
+A specification begins with optional <tt>package</tt> and <tt>import</tt> 
+declarations.  These have the same syntax, and play the same 
+role, as the package and import declarations found in a normal Java program.
+A package declaration is of the form:
+
+<pre><tt>    package <i>name</i>;</tt></pre>
+
+where <tt><i>name</i></tt> is a Java package identifier, possibly in
+several parts separated by ".".  In general, Java CUP employs Java lexical
+conventions.  So for example, both styles of Java comments are supported,
+and identifiers are constructed beginning with a letter, dollar
+sign ($), or underscore (_), which can then be followed by zero or more
+letters, numbers, dollar signs, and underscores.<p>
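+
+The identifier rule just described can be checked with a short regular
+expression.  The sketch below is an approximation: the class name
+<tt>IdentifierSketch</tt> is hypothetical, "letter" is assumed to mean an
+ASCII letter, and reserved words are not excluded.<p>

```java
import java.util.regex.Pattern;

class IdentifierSketch {
    // A letter, '$' or '_' followed by zero or more letters, digits, '$' or '_'.
    // Assumption: only ASCII letters are treated as letters here.
    private static final Pattern ID = Pattern.compile("[A-Za-z$_][A-Za-z0-9$_]*");

    // True if the whole string has the shape of an identifier.
    static boolean isIdentifier(String s) {
        return ID.matcher(s).matches();
    }
}
```

+Under these assumptions <tt>expr_list</tt> and <tt>$x1</tt> are accepted,
+while <tt>1abc</tt> and the empty string are rejected.<p>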
+
+After an optional <tt>package</tt> declaration, there can be zero or more 
+<tt>import</tt> declarations. As in a Java program these have the form:
+
+<pre><tt>    import <i>package_name.class_name</i>;</tt>
+</pre>
+or
+<pre><tt>    import <i>package_name</i>.*;</tt>
+</pre>
+
+The package declaration indicates what package the <tt>sym</tt> and 
+<tt>parser</tt> classes that are generated by the system will be in.  
+Any import declarations that appear in the specification will also appear
+in the source file for the <tt>parser</tt> class allowing various names from
+that package to be used directly in user supplied action code.
+
+<h5><a name="code_part">User Code Components</a></h5>
+
+Following the optional <tt>package</tt> and <tt>import</tt> declarations
+are a series of optional declarations that allow user code to be included
+as part of the generated parser (see <a href="#parser">Section 4</a> for a 
+full description of how the parser uses this code).  As a part of the parser 
+file, a separate non-public class to contain all embedded user actions is 
+produced.  The first <tt>action code</tt> declaration section allows code to 
+be included in this class.  Routines and variables for use by the code 
+embedded in the grammar would normally be placed in this section (a typical 
+example might be symbol table manipulation routines).  This declaration takes 
+the form:
+
+<pre><tt>    action code {: ... :};</tt>
+</pre>
+
+where <tt>{: ... :}</tt> is a code string whose contents will be placed
+directly within the <tt>action class</tt> class declaration.<p>
+
+After the <tt>action code</tt> declaration is an optional 
+<tt>parser code</tt> declaration.  This declaration allows methods and
+variables to be placed directly within the generated parser class.
+Although this is less common, it can be helpful when customizing the 
+parser -- it is possible for example, to include scanning methods inside
+the parser and/or override the default error reporting routines.  This 
+declaration is very similar to the <tt>action code</tt> declaration and 
+takes the form:
+
+<pre><tt>    parser code {: ... :};</tt>
+</pre>
+
+Again, code from the code string is placed directly into the generated parser
+class definition.<p>
+
+Next in the specification is the optional <tt>init</tt> declaration 
+which has the form:
+
+<pre><tt>    init with {: ... :};</tt></pre>
+
+This declaration provides code that will be executed by the parser
+before it asks for the first token.  Typically, this is used to initialize
+the scanner as well as various tables and other data structures that might
+be needed by semantic actions.  In this case, the code given in the code
+string forms the body of a <tt>void</tt> method inside the <tt>parser</tt> 
+class.<p>
+
+The final (optional) user code section of the specification indicates how 
+the parser should ask for the next token from the scanner.  This has the
+form:
+
+<pre><tt>    scan with {: ... :};</tt></pre>
+
+As with the <tt>init</tt> clause, the contents of the code string form
+the body of a method in the generated parser.  However, in this case
+the method returns an object of type <tt>java_cup.runtime.token</tt>.
+Consequently the code found in the <tt>scan with</tt> clause should 
+return such a value.<p>
+
+<h5><a name="symbol_list">Symbol Lists</a></h5>
+
+Following user supplied code comes the first required part of the 
+specification: the symbol lists.  These declarations are responsible 
+for naming and supplying a type for each terminal and non-terminal
+symbol that appears in the grammar.  As indicated above, each terminal
+and non-terminal symbol is represented at runtime with an object.  In
+the case of terminals, these are returned by the scanner and placed on
+the parse stack.  In the case of non terminals these replace a series
+of symbol objects on the parse stack whenever the right hand side of
+some production is recognized.  In order to tell the parser which object
+types should be used for which symbol, <tt>terminal</tt> and 
+<tt>non terminal</tt> declarations are used.  These take the forms:
+
+<pre><tt>    terminal <i>classname</i> <i>name1, name2,</i> ...;</tt>
+</pre>
+
+and
+
+<pre><tt>    non terminal <i>classname</i> <i>name1, name2,</i> ...;</tt>
+</pre>
+
+where <tt><i>classname</i></tt> can be a multiple part name separated with
+"."s.  Since the parser uses these objects for internal bookkeeping, the 
+classes used for non terminal symbols must be a subclass of 
+<tt>java_cup.runtime.symbol</tt>.  Similarly, the classes used for terminal 
+symbols must be a subclass of <tt>java_cup.runtime.token</tt> (note that
+<tt>java_cup.runtime.token</tt> is itself a subclass of 
+<tt>java_cup.runtime.symbol</tt>). 
+
+<h5><a name="production_list">The Grammar</a></h5>
+
+The final section of a Java CUP specification provides the grammar.  This 
+section optionally starts with a declaration of the form:
+
+<pre><tt>    start with <i>nonterminal</i>;</tt>
+</pre>
+
+This indicates which non terminal is the <i>start</i> or <i>goal</i> 
+non terminal for parsing.  If a start non terminal is not explicitly
+declared, then the non terminal on the left hand side of the first 
+production will be used.<p>
+
+The grammar itself follows the optional <tt>start</tt> declaration.  Each
+production in the grammar has a left hand side non terminal followed by 
+the symbol "<tt>::=</tt>", which is then followed by a series of zero or more
+actions, terminal, or non terminal symbols, and terminated with a semicolon (;).
+Each symbol on the right hand side can optionally be labeled with a name.
+Label names appear after the symbol name separated by a colon (:).  Label
+names must be unique within the production, and can be used within action
+code to refer to the runtime object that represents the symbol.
+If there are several productions for the same non terminal they may be 
+declared together.  In this case the productions start with the non terminal 
+and "<tt>::=</tt>".  This is followed by multiple right hand sides each 
+separated by a bar (|).  The full set of productions is then terminated by a 
+semicolon.<p>
+
+Actions appear in the right hand side as code strings (e.g., Java code inside
+<tt>{:</tt> ... <tt>:}</tt> delimiters).  These are executed by the parser
+at the point when the portion of the production to the left of the 
+action has been recognized.  (Note that the scanner will have returned the 
+token one past the point of the action since the parser needs this extra
+<i>lookahead</i> token for recognition.)
+
+<a name="running">
+<h3>3. Running Java CUP</h3></a>
+
+As mentioned above, Java CUP is written in Java.  To invoke it, one needs
+to use the Java interpreter to run the class <tt>java_cup.Main</tt>, whose
+static <tt>main()</tt> method receives an array of strings containing options.  
+Assuming a Unix machine, the simplest way to do this is typically to invoke it 
+directly from the command line with a command such as: 
+
+<pre><tt>    java java_cup.Main <i>options</i> &lt; <i>inputfile</i></tt></pre>
+
+Once running, Java CUP expects to find a specification file on standard input
+and produces two Java source files as output. <p>
+
+In addition to the specification file, Java CUP's behavior can also be changed
+by passing various options to it.  Legal options include:
+<dl>
+  <dt><tt>-package</tt> <i>name</i>  
+  <dd>Specify that the <tt>parser</tt> and <tt>sym</tt> classes are to be 
+       placed in the named package.  By default, no package specification 
+       is put in the generated code (hence the classes default to the special 
+       "unnamed" package).
+
+  <dt><tt>-parser</tt> <i>name</i>   
+  <dd>Output parser and action code into a file (and class) with the given
+      name instead of the default of "<tt>parser</tt>".
+
+  <dt><tt>-symbols</tt> <i>name</i>  
+  <dd>Output the symbol constant code into a class with the given
+      name instead of the default of "<tt>sym</tt>".
+
+  <dt><tt>-nonterms</tt>      
+  <dd>Place constants for non terminals into the  symbol constant class.
+      The parser does not need these symbol constants, so they are not normally
+      output.  However, it can be very helpful to refer to these constants
+      when debugging a generated parser.
+
+  <dt><tt>-expect</tt> <i>number</i>      
+  <dd>During parser construction the system may detect that an ambiguous 
+      situation would occur at runtime.  This is called a <i>conflict</i>.  
+      In general, the parser may be unable to decide whether to <i>shift</i> 
+      (read another symbol) or <i>reduce</i> (replace the recognized right 
+      hand side of a production with its left hand side).  This is called a 
+      <i>shift/reduce conflict</i>.  Similarly, the parser may not be able 
+      to decide between reduction with two different productions.  This is 
+      called a <i>reduce/reduce conflict</i>.  Normally, if one or more of 
+      these conflicts occur, parser generation is aborted.  However, in 
+      certain carefully considered cases it may be advantageous to 
+      arbitrarily break such a conflict.  In this case Java CUP uses YACC 
+      convention and resolves shift/reduce conflicts by shifting, and 
+      reduce/reduce conflicts using the "highest priority" production (the 
+      one declared first in the specification).  In order to enable automatic 
+      breaking of conflicts the <tt>-expect</tt> option must be given 
+      indicating exactly how many conflicts are expected.
+
+  <dt><tt>-compact_red</tt>   
+  <dd>Including this option enables a table compaction optimization involving
+      reductions.  In particular, it allows the most common reduce entry in 
+      each row of the parse action table to be used as the default for that 
+      row.  This typically saves considerable room in the tables, which can 
+      grow to be very large.  This optimization has the effect of replacing 
+      all error entries in a row with the default reduce entry.  While this 
+      may sound dangerous, if not downright incorrect, it turns out that this 
+      does not affect the correctness of the parser.  In particular, some
+      changes of this type are inherent in LALR parsers (when compared to 
+      canonical LR parsers), and the resulting parsers will still never 
+      read past the first token at which the error could be detected.
+      The parser can, however, make extra erroneous reduces before detecting
+      the error, so this can degrade the parser's ability to do 
+      <a href="#errors">error recovery</a>.
+      (Refer to reference [2] pp. 244-247 or reference [3] pp. 190-194 for a 
+      complete explanation of this compaction technique.) <br><br>
+
+      <i>Special note</i>: at the time of this writing the standard 
+      javac compiler had a bug which caused it to produce corrupted 
+      class files when very large statically initialized arrays (i.e., large 
+      parse tables) are used.  Consequently, if you have a large grammar, you 
+      may be <i>forced</i> to use this option in order to create tables
+      that are small enough to compile correctly.
+
+  <dt><tt>-nowarn</tt>        
+  <dd>This option causes all warning messages (as opposed to error messages)
+      produced by the system to be suppressed.
+
+  <dt><tt>-nosummary</tt>     
+  <dd>Normally, the system prints a summary listing such things as the 
+      number of terminals, non terminals, parse states, etc. at the end of
+      its run.  This option suppresses that summary.
+
+  <dt><tt>-progress</tt>      
+  <dd>This option causes the system to print short messages indicating its
+      progress through various parts of the parser generation process.
+
+  <dt><tt>-dump_grammar</tt>  
+  <dt><tt>-dump_states</tt>   
+  <dt><tt>-dump_tables</tt>   
+  <dt><tt>-dump</tt>          
+  <dd> These options cause the system to produce a human readable dump of
+       the grammar, the constructed parse states (often needed to resolve
+       parse conflicts), and the parse tables (rarely needed), respectively.
+       The <tt>-dump</tt> option can be used to produce all of these dumps.
+
+  <dt><tt>-time</tt>          
+  <dd>This option adds detailed timing statistics to the normal summary of
+      results.  This is normally of great interest only to maintainers of 
+      the system itself.
+
+  <dt><tt>-debug</tt>          
+  <dd>This option produces voluminous internal debugging information about
+      the system as it runs.  This is normally of interest only to maintainers 
+      of the system itself.
+
+</dl>
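+
+The idea behind the <tt>-compact_red</tt> option described above can be
+pictured with a small sketch.  Everything here is an assumption made for
+illustration: the class <tt>CompactRowSketch</tt>, the integer encoding of
+table entries, and the use of a map for the explicit entries do not reflect
+how Java CUP actually stores its tables.<p>

```java
import java.util.HashMap;
import java.util.Map;

class CompactRowSketch {
    // Assumed encoding for this sketch only:
    //   0 = error, n > 0 = shift to state n, n < 0 = reduce by production -n.

    // Return the most frequent reduce entry in the row, or 0 if the row has none.
    static int defaultReduce(int[] row) {
        Map<Integer, Integer> counts = new HashMap<>();
        int best = 0, bestCount = 0;
        for (int entry : row) {
            if (entry < 0) {
                int c = counts.merge(entry, 1, Integer::sum);
                if (c > bestCount) { best = entry; bestCount = c; }
            }
        }
        return best;
    }

    // Keep only entries that differ from the default.  Error entries are
    // dropped as well, so a lookup that misses falls back to the default reduce.
    static Map<Integer, Integer> compact(int[] row) {
        int dflt = defaultReduce(row);
        Map<Integer, Integer> explicit = new HashMap<>();
        for (int sym = 0; sym < row.length; sym++) {
            if (row[sym] != 0 && row[sym] != dflt) explicit.put(sym, row[sym]);
        }
        return explicit;
    }

    // Look up the action for a lookahead symbol in a compacted row.
    static int lookup(Map<Integer, Integer> explicit, int dflt, int sym) {
        return explicit.getOrDefault(sym, dflt);
    }
}
```

+For the row <tt>{0, -2, 5, -2, -3, 0}</tt> the default is the reduce entry
+<tt>-2</tt>, only two explicit entries remain, and a lookup on a symbol whose
+original entry was an error now returns the default reduce.  This is the
+behaviour that lets the parser make extra reductions before it detects an
+error.<p>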
+
+<a name="parser">
+<h3>4. Customizing the Parser</h3></a>
+
+Each generated parser consists of three generated classes.  The 
+<tt>sym</tt> class (which can be renamed using the <tt>-symbols</tt>
+option) simply contains a series of <tt>int</tt> constants,
+one for each terminal.  Non terminals are also included if the <tt>-nonterms</tt>
+option is given.  The source file for the <tt>parser</tt> class (which can
+be renamed using the <tt>-parser</tt> option) actually contains two 
+class definitions, the public <tt>parser</tt> class that implements the 
+actual parser, and another non-public class (called <tt>CUP$action</tt>) which 
+encapsulates all user actions contained in the grammar, as well as code from 
+the <tt>action code</tt> declaration.  In addition to user supplied code, this
+class contains one method: <tt>CUP$do_action</tt> which consists of a large 
+switch statement for selecting and executing various fragments of user 
+supplied action code.  In general, all names beginning with the prefix of 
+<tt>CUP$</tt> are reserved for internal uses by Java CUP generated code. <p> 
+
+The <tt>parser</tt> class contains the actual generated parser.  It is 
+a subclass of <tt>java_cup.runtime.lr_parser</tt> which implements a 
+general table driven framework for an LR parser.  The generated <tt>parser</tt>
+class provides a series of tables for use by the general framework.  
+Three tables are provided:
+<dl compact>
+<dt>the production table 
+<dd>provides the symbol number of the left hand side non terminal, along with
+    the length of the right hand side, for each production in the grammar,
+<dt>the action table
+<dd>indicates what action (shift, reduce, or error) is to be taken on each 
+    lookahead symbol when encountered in each state, and
+<dt>the reduce-goto table
+<dd>indicates which state to shift to after reduces (under each non-terminal
+from each state). 
+</dl>
+(Note that the action and reduce-goto tables are not stored as simple arrays,
+but use a compacted "list" structure to save a significant amount of space.
+See comments in the runtime system source code for details.)<p>
+
+Beyond the parse tables, generated (or inherited) code provides a series 
+of methods that can be used to customize the generated parser.  Some of these
+methods are supplied by code found in part of the specification and can 
+be customized directly in that fashion.  The others are provided by the
+<tt>lr_parser</tt> base class and can be overridden with new versions (via
+the <tt>parser code</tt> declaration) to customize the system.  Methods
+available for customization include:
+<dl compact>
+<dt><tt>public void user_init()</tt>
+<dd>This method is called by the parser prior to asking for the first token 
+    from the scanner.  The body of this method contains the code from the 
+    <tt>init with</tt> clause of the specification.  
+<dt><tt>public java_cup.runtime.token scan()</tt>
+<dd>This method encapsulates the scanner and is called each time a new token is
+    needed by the parser.  The body of this method is supplied by the
+    <tt>scan with</tt> clause of the specification.
+<dt><tt> public void report_error(String message, Object info)</tt>
+<dd>This method should be called whenever an error message is to be issued.  In
+    the default implementation of this method, the first parameter provides 
+    the text of a message which is printed on <tt>System.err</tt> 
+    and the second parameter is simply ignored.  It is very typical to
+    override this method in order to provide a more sophisticated error
+    reporting mechanism.
+<dt><tt>public void report_fatal_error(String message, Object info)</tt>
+<dd>This method should be called whenever a non-recoverable error occurs.  It 
+    responds by calling <tt>report_error()</tt>, then aborts parsing
+    by calling the parser method <tt>done_parsing()</tt>, and finally
+    throws an exception.  (In general <tt>done_parsing()</tt> should be called 
+    at any point that parsing needs to be terminated early).
+<dt><tt>public void syntax_error(token cur_token)</tt>
+<dd>This method is called by the parser as soon as a syntax error is detected
+    (but before error recovery is attempted).  In the default implementation it
+    calls: <tt>report_error("Syntax error", null);</tt>.
+<dt><tt>public void unrecovered_syntax_error(token cur_token)</tt>
+<dd>This method is called by the parser if it is unable to recover from a 
+    syntax error.  In the default implementation it calls:
+    <tt>report_fatal_error("Couldn't repair and continue parse", null);</tt>.
+<dt><tt> protected int error_sync_size()</tt>
+<dd>This method is called by the parser to determine how many tokens it must
+    successfully parse in order to consider an error recovery successful.
+    The default implementation returns 3.  Values below 2 are not recommended.
+    See the section on <a href="#errors">error recovery</a> for details.
+</dl>
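+
+To make the override pattern concrete, here is a minimal, self-contained
+sketch.  It is not the real runtime: <tt>BaseParserSketch</tt> is a hypothetical
+stand-in that only mimics the default behavior described above, and
+<tt>CollectingParser</tt> is an illustrative subclass that overrides
+<tt>report_error()</tt> to collect messages instead of printing them.  The
+<tt>throws Exception</tt> clauses and the exception text are assumptions.<p>

```java
import java.util.ArrayList;
import java.util.List;

/* Small driver exercising the hooks (illustrative only). */
class ErrorHooksDemo {
  public static void main(String[] args) {
    CollectingParser p = new CollectingParser();
    p.syntax_error(null);                 // recoverable: just reports
    try {
      p.unrecovered_syntax_error(null);   // fatal: reports, stops, throws
    } catch (Exception e) {
      /* expected: a fatal error ends in an exception */
    }
    System.out.println(p.errors);
    System.out.println(p.done);
  }
}

/* Hypothetical stand-in for the parser's error hooks; NOT java_cup.runtime. */
class BaseParserSketch {
  boolean done = false;

  public void done_parsing() { done = true; }

  /* default: print the message, ignore info */
  public void report_error(String message, Object info) {
    System.err.println(message);
  }

  /* report, stop parsing, then throw (exception text is a placeholder) */
  public void report_fatal_error(String message, Object info) throws Exception {
    report_error(message, info);
    done_parsing();
    throw new Exception("fatal parse error");
  }

  public void syntax_error(Object cur_token) {
    report_error("Syntax error", null);
  }

  public void unrecovered_syntax_error(Object cur_token) throws Exception {
    report_fatal_error("Couldn't repair and continue parse", null);
  }

  protected int error_sync_size() { return 3; }
}

/* Illustrative override: collect messages and demand a longer sync window. */
class CollectingParser extends BaseParserSketch {
  final List<String> errors = new ArrayList<>();

  @Override
  public void report_error(String message, Object info) {
    errors.add(info == null ? message : message + " at " + info);
  }

  @Override
  protected int error_sync_size() { return 5; }
}
```

+Because <tt>syntax_error()</tt> and <tt>report_fatal_error()</tt> both funnel
+through <tt>report_error()</tt>, overriding that one method is enough to
+redirect every message in this sketch.  In a real specification such overrides
+would presumably be supplied through the <tt>parser code</tt> part.<p>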
+
+Parsing itself is performed by the method <tt>public void parse()</tt>.  
+This method starts by getting references to each of the parse tables, 
+then initializes a <tt>CUP$action</tt> object (by calling 
+<tt>protected void init_actions()</tt>). Next it calls <tt>user_init()</tt>,
+then fetches the first lookahead token with a call to <tt>scan()</tt>.
+Finally, it begins parsing.  Parsing continues until <tt>done_parsing()</tt>
+is called (this is done automatically, for example, when the parser accepts).<p>
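+
+The order of calls described above can be pictured with a toy skeleton.  This
+is only a sketch under stated assumptions: <tt>ParseLifecycleSketch</tt> is a
+hypothetical class, it has no parse tables, tokens are plain strings, and the
+"parse" merely consumes tokens until end of input.  Only the sequence
+<tt>init_actions()</tt>, <tt>user_init()</tt>, <tt>scan()</tt>, ...,
+<tt>done_parsing()</tt> is taken from the text.<p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/* Toy model of the call order inside parse(); not the real runtime. */
class ParseLifecycleSketch {
  final List<String> calls = new ArrayList<>();  // records each hook as it runs
  private final Iterator<String> tokens;
  private boolean done = false;

  ParseLifecycleSketch(List<String> input) { tokens = input.iterator(); }

  protected void init_actions() { calls.add("init_actions"); }
  public void user_init()       { calls.add("user_init"); }

  /* hand out the next token, or "EOF" once the input is exhausted */
  public String scan() {
    calls.add("scan");
    return tokens.hasNext() ? tokens.next() : "EOF";
  }

  public void done_parsing() { done = true; calls.add("done_parsing"); }

  public void parse() {
    init_actions();
    user_init();
    String lookahead = scan();      // first lookahead token
    while (!done) {
      if (lookahead.equals("EOF"))
        done_parsing();             // toy stand-in for "accept"
      else
        lookahead = scan();         // toy stand-in for "shift"
    }
  }

  public static void main(String[] args) {
    ParseLifecycleSketch p = new ParseLifecycleSketch(Arrays.asList("NUM", "SEMI"));
    p.parse();
    // prints [init_actions, user_init, scan, scan, scan, done_parsing]
    System.out.println(p.calls);
  }
}
```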
+
+In addition to the normal parser, the runtime system also provides a debugging
+version of the parser.  This operates in exactly the same way as the normal
+parser, but prints debugging messages by calling 
+<tt>public void debug_message(String mess)</tt>, whose default implementation
+prints the message to <tt>System.err</tt>.<p>
+
+Based on these routines, invocation of a Java CUP parser is typically done
+with code such as:
+<pre>
+      /* create a parsing object */
+      parser parser_obj = new parser();
+
+      /* open input files, etc. here */
+
+      try {
+        if (do_debug_parse)
+          parser_obj.debug_parse();
+        else
+          parser_obj.parse();
+      } catch (Exception e) {
+        /* do cleanup here -- possibly rethrow e */
+      } finally {
+	/* do close out here */
+      }
+</pre>
+
+<a name="errors">
+<h3>5. Error Recovery</h3></a>
+
+A final important aspect of building parsers with Java CUP is 
+support for syntactic error recovery.  Java CUP uses the same 
+error recovery mechanisms as YACC.  In particular, it supports
+a special error symbol (denoted simply as <tt>error</tt>).
+This symbol plays the role of a special non terminal which, instead of
+being defined by productions, matches an erroneous input 
+sequence.<p>
+
+The error symbol only comes into play if a syntax error is
+detected.  If a syntax error is detected then the parser tries to replace
+some portion of the input token stream with <tt>error</tt> and then
+continue parsing.  For example, we might have productions such as:
+
+<pre><tt>    stmt ::= expr SEMI | while_stmt SEMI | if_stmt SEMI | ... |
+	     error SEMI
+	     ;</tt></pre>
+
+This indicates that if none of the normal productions for <tt>stmt</tt> can
+be matched by the input, then a syntax error should be declared, and recovery
+should be made by skipping erroneous tokens (equivalent to matching and 
+replacing them with <tt>error</tt>) up to a point at which the parse can 
+be continued with a semicolon (and additional context that legally follows a 
+statement).  An error is considered to be recovered from if and only if a 
+sufficient number of tokens past the <tt>error</tt> symbol can be successfully 
+parsed.  (The number of tokens required is determined by the 
+<tt>error_sync_size()</tt> method of the parser and defaults to 3). <p>
+
+Specifically, the parser first looks for the closest state to the top
+of the parse stack that has an outgoing transition under
+<tt>error</tt>.  This generally corresponds to working from
+productions that represent more detailed constructs (such as a specific
+kind of statement) up to productions that represent more general or
+enclosing constructs (such as the general production for all
+statements or a production representing a whole section of declarations) 
+until we get to a place where an error recovery production
+has been provided for.  Once the parser is placed into a configuration
+that has an immediate error recovery (by popping the stack to the first
+such state), the parser begins skipping tokens to find a point at
+which the parse can be continued.  After discarding each token, the
+parser attempts to parse ahead in the input (without executing any
+embedded semantic actions).  If the parser can successfully parse past
+the required number of tokens, then the input is backed up to the point
+of recovery and the parse is resumed normally (executing all actions).
+If the parse cannot be continued far enough, then another token is
+discarded and the parser again tries to parse ahead.  If the end of
+input is reached without making a successful recovery (or there was no
+suitable error recovery state found on the parse stack to begin with)
+then error recovery fails.
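+<p>
+The skip-and-parse-ahead loop can be sketched in a few lines.  This is a
+deliberately simplified model, not the runtime's implementation: it assumes the
+stack has already been popped to a state that can shift <tt>error</tt>, it uses
+a toy grammar in which every statement is <tt>NUM SEMI</tt> with the recovery
+production <tt>stmt ::= error SEMI</tt>, and tokens are plain strings.  The
+names <tt>RecoverySketch</tt>, <tt>canParseAhead</tt> and <tt>recover</tt> are
+hypothetical; <tt>SYNC_SIZE</tt> plays the role of
+<tt>error_sync_size()</tt>.<p>

```java
import java.util.Arrays;
import java.util.List;

/* Toy model of error recovery by token skipping; not the real runtime. */
class RecoverySketch {
  /* stand-in for error_sync_size(); the manual gives 3 as the default */
  static final int SYNC_SIZE = 3;

  /* Trial parse from index 'from' with no semantic actions.  After 'error'
     the toy grammar needs SEMI, then any number of "NUM SEMI", then EOF. */
  static boolean canParseAhead(List<String> tokens, int from) {
    boolean expectSemi = true;   // 'error' has just been shifted
    int parsed = 0;
    for (int i = from; i < tokens.size(); i++) {
      String t = tokens.get(i);
      if (expectSemi) {
        if (!t.equals("SEMI")) return false;
        expectSemi = false;
      } else if (t.equals("NUM")) {
        expectSemi = true;
      } else if (t.equals("EOF")) {
        return true;             // whole input accepted
      } else {
        return false;
      }
      if (++parsed >= SYNC_SIZE) return true;  // parsed far enough
    }
    return false;                // ran out of input
  }

  /* Returns the index at which normal parsing resumes, or -1 on failure. */
  static int recover(List<String> tokens, int errorPos) {
    for (int pos = errorPos; pos < tokens.size(); pos++) {
      if (canParseAhead(tokens, pos)) return pos;
      /* otherwise discard tokens.get(pos) and try again */
    }
    return -1;
  }

  public static void main(String[] args) {
    List<String> input =
      Arrays.asList("NUM", "BAD", "BAD", "SEMI", "NUM", "SEMI", "EOF");
    System.out.println(recover(input, 1));  // prints 3
  }
}
```

+In the example the two <tt>BAD</tt> tokens are discarded and parsing resumes at
+the <tt>SEMI</tt> at index 3, since three further tokens can be parsed from
+there.  A larger sync size makes a false recovery less likely, at the cost of
+skipping more input before the parse resumes.<p>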
+
+<a name="conclusion">
+<h3>6. Conclusion</h3></a>
+
+This manual has briefly described the Java CUP LALR parser generation system.
+Java CUP is designed to fill the same role as the well known YACC parser
+generator system, but is written in and operates entirely with Java code 
+rather than C or C++.  Additional details on the operation of the system can 
+be found in the parser generator and runtime source code.  See the Java CUP
+home page below for access to the API documentation for the system and its
+runtime.<p>
+
+This document covers the system as it stands at the time of its fourth alpha 
+release (v0.9d).  Check the Java CUP home page: 
+<a href="http://www.cc.gatech.edu/gvu/people/Faculty/hudson/java_cup/home.html">
+http://www.cc.gatech.edu/gvu/people/Faculty/hudson/java_cup/home.html</a>
+for the latest release information, instructions for downloading the
+system, and additional news about the system.  Bug reports and other 
+comments for the developers can be sent to 
+<a href="mailto:java-cup@cc.gatech.edu">java-cup@cc.gatech.edu</a>.<p>
+
+Java CUP was originally written by 
+<a href="http://www.cc.gatech.edu/gvu/people/Faculty/Scott.E.Hudson.html">
+Scott Hudson</a>, in August of 1995.<p>
+
+<a name="refs">
+<h3>References</h3></a>
+<dl compact>
+
+<dt><a name = "YACCref">[1]</a> 
+<dd>S. C. Johnson, 
+"YACC -- Yet Another Compiler Compiler",
+CS Technical Report #32, 
+Bell Telephone Laboratories,  
+Murray Hill, NJ, 
+1975.
+
+<dt><a name = "dragonbook">[2]</a> 
+<dd>A. Aho, R. Sethi, and J. Ullman, 
+<i>Compilers: Principles, Techniques, and Tools</i>, 
+Addison-Wesley Publishing,
+Reading, MA, 
+1986.
+
+<dt><a name = "crafting">[3]</a> 
+<dd>C. Fischer and R. LeBlanc,
+<i>Crafting a Compiler with C</i>,
+Benjamin/Cummings Publishing,
+Redwood City, CA,
+1991.
+
+</dl>
+
+<h3><a name="appendixa">
+Appendix A. Grammar for Java CUP Specification Files</a></h3>
+<hr><br>
+<pre><tt>java_cup_spec      ::= package_spec import_list code_part init_code 
+		       scan_code symbol_list start_spec production_list
+package_spec       ::= PACKAGE multipart_id SEMI | empty
+import_list        ::= import_list import_spec | empty
+import_spec        ::= IMPORT import_id SEMI
+code_part          ::= action_code_part parser_code_part 
+action_code_part   ::= ACTION CODE CODE_STRING SEMI | empty
+parser_code_part   ::= PARSER CODE CODE_STRING SEMI | empty
+init_code          ::= INIT WITH CODE_STRING SEMI | empty
+scan_code          ::= SCAN WITH CODE_STRING SEMI | empty
+symbol_list        ::= symbol_list symbol | symbol
+symbol             ::= TERMINAL type_id term_name_list SEMI | 
+                       NON TERMINAL type_id non_term_name_list SEMI
+term_name_list     ::= term_name_list COMMA new_term_id | new_term_id
+non_term_name_list ::= non_term_name_list COMMA new_non_term_id |
+	               new_non_term_id
+start_spec         ::= START WITH nt_id SEMI | empty
+production_list    ::= production_list production | production
+production         ::= nt_id COLON_COLON_EQUALS rhs_list SEMI
+rhs_list           ::= rhs_list BAR rhs | rhs
+rhs                ::= prod_part_list
+prod_part_list     ::= prod_part_list prod_part | empty
+prod_part          ::= symbol_id opt_label | CODE_STRING
+opt_label          ::= COLON label_id | empty
+multipart_id       ::= multipart_id DOT ID | ID
+import_id          ::= multipart_id DOT STAR | multipart_id
+type_id            ::= multipart_id
+new_term_id        ::= ID
+new_non_term_id    ::= ID
+nt_id              ::= ID
+symbol_id          ::= ID
+label_id           ::= ID
+</tt></pre>
+<hr><p><p>
+
+<h3><a name = "appendixb">Appendix B. A Very Simple Example Scanner</a></h3>
+<hr><br>
+<pre>
+<tt>// Simple Example Scanner Class
+
+import java_cup.runtime.*;
+
+public class scanner {
+  /* single lookahead character */
+  protected static int next_char;
+
+  /* advance input by one character */
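+  protected static void advance() throws java.io.IOException
+    { next_char = System.in.read(); }
+
+  /* initialize the scanner (System.in.read() may throw IOException) */
+  public static void init() throws java.io.IOException { advance(); }
+
+  /* recognize and return the next complete token */
+  public static token next_token() throws java.io.IOException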
+    {
+      for (;;)
+        switch (next_char)
+	  {
+	    case '0': case '1': case '2': case '3': case '4': 
+	    case '5': case '6': case '7': case '8': case '9': 
+	      /* parse a decimal integer */
+	      int i_val = 0;
+	      do {
+	        i_val = i_val * 10 + (next_char - '0');
+	        advance();
+	      } while (next_char >= '0' && next_char <= '9');
+	    return new int_token(sym.NUMBER, i_val);
+
+	    case ';': advance(); return new token(sym.SEMI);
+	    case '+': advance(); return new token(sym.PLUS);
+	    case '-': advance(); return new token(sym.MINUS);
+	    case '*': advance(); return new token(sym.TIMES);
+	    case '/': advance(); return new token(sym.DIVIDE);
+	    case '%': advance(); return new token(sym.MOD);
+	    case '(': advance(); return new token(sym.LPAREN);
+	    case ')': advance(); return new token(sym.RPAREN);
+
+	    case -1: return new token(sym.EOF);
+
+	    default: 
+	      /* in this simple scanner we just ignore everything else */
+	      advance();
+	    break;
+	  }
+    }
+};
+</tt></pre>
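+
+For experimenting with the scanning logic without the Java CUP runtime, the
+following self-contained variant may be useful.  It is a sketch under
+assumptions, not part of the distribution: <tt>StringScannerSketch</tt> and
+<tt>scanAll</tt> are hypothetical names, input comes from a <tt>String</tt>
+rather than <tt>System.in</tt>, and tokens are returned as text in place of
+<tt>token</tt> objects.<p>

```java
import java.util.ArrayList;
import java.util.List;

/* String-based variant of the example scanner; tokens are plain text. */
class StringScannerSketch {
  private final String input;
  private int pos = 0;
  private int nextChar;          // single lookahead character, -1 at end

  StringScannerSketch(String input) { this.input = input; advance(); }

  /* advance input by one character */
  private void advance() {
    nextChar = pos < input.length() ? input.charAt(pos++) : -1;
  }

  /* recognize and return the next complete token, e.g. "NUMBER(12)" */
  String nextToken() {
    for (;;) {
      if (nextChar >= '0' && nextChar <= '9') {
        int value = 0;           // parse a decimal integer
        do {
          value = value * 10 + (nextChar - '0');
          advance();
        } while (nextChar >= '0' && nextChar <= '9');
        return "NUMBER(" + value + ")";
      }
      switch (nextChar) {
        case ';': advance(); return "SEMI";
        case '+': advance(); return "PLUS";
        case '-': advance(); return "MINUS";
        case '*': advance(); return "TIMES";
        case '/': advance(); return "DIVIDE";
        case '%': advance(); return "MOD";
        case '(': advance(); return "LPAREN";
        case ')': advance(); return "RPAREN";
        case -1:  return "EOF";
        default:  advance();     // ignore everything else
      }
    }
  }

  /* collect every token up to and including EOF */
  static List<String> scanAll(String text) {
    StringScannerSketch s = new StringScannerSketch(text);
    List<String> out = new ArrayList<>();
    String t;
    do {
      t = s.nextToken();
      out.add(t);
    } while (!t.equals("EOF"));
    return out;
  }

  public static void main(String[] args) {
    System.out.println(scanAll("12 + (3*4);"));
  }
}
```

+Running <tt>main</tt> prints
+<tt>[NUMBER(12), PLUS, LPAREN, NUMBER(3), TIMES, NUMBER(4), RPAREN, SEMI, EOF]</tt>.
+As in the original scanner, whitespace and unknown characters are silently
+skipped.<p>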
+
+<hr>
+<a name="trademark">
+Java and HotJava are
+trademarks of <a href="http://www.sun.com/">Sun Microsystems, Inc.</a>,
+and refer to Sun's Java programming language and HotJava browser
+technologies.
+Java CUP is not sponsored by or affiliated with Sun Microsystems, Inc.
+</a>
+ 
+<hr><p><p>
+
+
+</body></html>