EECS665 – Compiler Construction

Project 4

Due on October 23rd 11:59 PM (Accepted with no penalty).

Accepted late by October 24th 11:59 PM for 1 penalty

Accepted late by October 25th 11:59 PM for 2 penalties

Not accepted on or after October 26th 12:00 AM

Each penalty incurs a 15% stacking reduction on the project grade or uses 1 "penalty token". Penalty tokens are applied by course personnel to your maximum benefit.

Updates

None yet!

Resources

Starter code: You can use the code from your previous project, enhanced with the new options to the main file entry point. If you'd prefer to start fresh, you may use the tarball p4.tgz
Project Oracle: If you're confused about the correct output for a specific case, you can submit a sample to the Project Oracle and it will respond with the expected output.

Overview

For this assignment, you'll enhance the A compiler by implementing name analysis, which collects each semantic symbol (e.g. variable or function) of the input program and builds a correspondance between identifiers and their symbol.

From a black-box view of ac, the changes needed are fairly straightforward: Implement a new command-line option for ac that augments unparsing so that every use or definition of an identifier has its type, in curly brackets, after its name. You'll also write error checking code which reports name analysis issues (these errors are described below). Make any changes in the code necessary to support the above tasks.

How we'll grade:

We will run the new -n <output_file> option for ac with various test-case input files, then diff the expected results from the output of your code. We'll use test files that we expect to unparse correctly, and test files that have type errors that we expect to report type errors.

Requirements
This section contains information that affects how your project will be graded. It should be considered mandatory reading.

Your submission

You should submit a single tarred-and-zipped directory containing source code and a makefile such that the following invocations build and execute your code:


	make


./ac <infile.a> -n <namedunparse.txt> 2> <errors.txt>

Where

<infile.a> is a lexically- and syntactically-valid A file, which may contain binding errors (i.e. the kinds of errors name analysis would catch).
<namedunparse.txt> is the file to which the augmented unparse file will be written in the required output format (described in detail below).
<errors.txt> is the file to which the required error messages will be written in the required error message format (described in detail below).

You should include a file named TEAM.txt in your tarball with each person who worked on the project included in the following format. (lastname1,firstname1) (lastname2,firstname2) You may work with a partner for the project. If you do, include both names. Otherwise, include only your name in TEAM.txt.

Unparse Output Format

Since we'll be grading with diff, we'll require a very specific format for the unparsed data. The following format should be used for the type data in curly brackets after the identifier:

For names of functions, the information should be of the form
(param1Type,param2Type,...,paramNType) -> returnType
For names of global variables, parameters, and local variables output the type.

Error Reporting

Your name analysis should find all of the errors described in the table given below; it should report the specified position of the error, and it should give exactly the specified error message (each message should appear on a single line, even if your browser adds a line break in the following table). Error messages should have the same format as in the scanner and parser.

Type of Error	Error Message Description
More than one declaration of an identifier in a given scope	`<p>: Multiply declared identifier`
Use of an undeclared identifier	`<p>: Undeclared identifier`
Bad declaration type (variable or parameter of `void` type)	`<p>: Invalid type in declaration`

Where <p> is the position of the IDNode for each error.

Note that the ID names should not be printed as part of the error messages.

During name analysis, if a function name is multiply declared you should still process the formals and the body of the function; don't add a new entry to the symbol table for the function, but do create a new scope for the function's body, and add any formals to that new scope.

If you find a bad variable declaration (a variable of type void or of a bad aggregate type), give an error message and add nothing to the symbol table.

If a declaration is both "bad" (e.g., a non-function declared void) and is a declaration of a name that has already been declared in the same scope, you should give two error messages (first the "bad" declaration error, then the "multiply declared" error).

Advice
This section contains tips and guidance. It may be helpful, but it's optional reading.

For this assignment, you'll enhance the ac compiler by implementing name analysis by traversing the AST. You'll also need to modify the AST nodes to assist in the collection of symbol table entries.

While there are a variety of ways to implement name analysis, the basic workflow should take the form of a traversal over the AST, doing the following:

Keep track of the scope
When an IDNode is discovered as part of a declaration, check if the declaration is valid and, if appropriate, create a new semantic symbol.
When an IDNode is discovered as part of a use or definition, check if that use or definition corresponds to a valid symbol and, if appropriate, builds a link between the IDNode and the symbol.