EECS665 – Compiler Construction

Project 5

Due on November 15th 11:59 PM (Accepted with no penalty).

Accepted late by November 16th 11:59 PM for 1 penalty

Accepted late by November 17th 11:59 PM for 2 penalties

Not accepted on or after November 18th 12:00 AM

Each penalty incurs a 15% stacking reduction on the project grade or uses 1 "penalty token". Penalty tokens are applied by course personnel to your maximum benefit.

Updates

None yet!

Resources

The oracle will be released shortly.
You can download the starter code here.

Overview

For this assignment you will write a type checker for A programs represented as abstract-syntax trees. Your main task will be to create type checking functions for the nodes of the AST. This part of the compiler is intended to be an extension of the previous project code, but will be invoked via a new command-line argument.

How We'll Grade

As in previous projects, we will rely heavily on an automated testing suite to determine the correctness of your compiler. This suite works by doing a diff against the output produced by your code and the expected output. As such, you should prioritize basic functionality that could affect numerous cases.

Your code should be built and invoked via the command sequence

		make all
		./ac <file.a> -c
		

Where <file.a> is an input file that passes name analysis. Note that this project only checks for errors and produces no output file, so -c has no argument. All error messages should be printed to the standard error stream (i.e. stderr/std::cerr).

Getting Started

You are encouraged to use your own implementation of the compiler up to and including name analysis. If you're not confident in your implementation, a set of starter files is available here: (link).

Specifications
This section goes over the requirements for the project and should be considered mandatory reading.

The type checker will determine the type of every expression represented in the abstract-syntex tree. You must report errors when types are incompatbile with each other and when types are invalid for an operator.

Details of the A type system are available at the Type system section of the language spec. You should read the info at this link.

Error Messages

Your type checker must report the specified position of the error, and it must give the specified error message. (Each message should appear on a single line, rather than how it is formatted in the following table.)

Type of Error	Error Message	Position to Report
Printing a function; e.g., `toconsole f`, where `f` is a function name.	`Attempt to output a function`	Function name IDNode
Printing a `custom`-type value function); e.g., `toconsole c`, where `c` is the name of a custom type or instance of a custom.	`Attempt to output a custom`	Expression node being written.
Printing a `void` value (note: this can only happen if there is an attempt to write the return value from a `void` function); e.g., `toconsole f()`, where `f` is a `void` function.	`Attempt to output void`	Call expression node
Reading a function: e.g., `fromconsole f`, where `f` is a function name.	`Attempt to assign user input to function`	Function name IDNode
Reading a class: e.g., `fromconsole c`, where `c` is a custom name or instance of a custom.	`Attempt to assign user input to class`	Argument to fromconsole
Calling something other than a function; e.g., "`x();`", where `x` is not a function name. Note: In this case, you should not type-check the actual parameters.	`Attempt to call a non-function`	Variable IDNode
Calling a function with the wrong number of arguments.	`Function call with wrong number of args`	IDNode for the function name.
Calling a function with argument(s) of the wrong type. If there are several arguments with the wrong type, give an error message for each such argument.	`Type of actual does not match type of formal`	Actual argument subexpression node
Using an empty `return` statement (i.e., one that does not return a value) in a function with a non-void return type.	`Missing return value`	Return subexpression
Returning a value from a `void` function.	`Return with a value in void function`	Return subexpression
Returning a value of the wrong type from a non-`void` function.	`Bad return value`	Return subexpression
Applying an arithmetic operator (`+`, `-`, `*`, `/`, `++`, or `--`) to an invalid type (e.g. a basic bool type, a function).	`Arithmetic operator applied to invalid operand`	The offending expression operand root node
An invalid operand given to a relational operator (`<`, `>`, `<=`, `>=`).	`Relational operator applied to non-numeric operand`	The offending expression operand root node
An invalid operand given to a logical operator ( `!`, `and`, `or`).	`Logical operator applied to non-bool operand`	The offending expression operand root node
Using a non-`bool` expression as a condition (such as an `if` statement, or loop.	`Non-bool expression used as a condition`	The offending expression operand root node
Applying an (in)equality operator to an invalid operand, such as a void function call result, custom type or a function name (not a call) .	`Invalid equality operand`	The offending expression operand root node
Applying an (in)equality operator to two valid but incompatible operands, such as an int to a bool.	`Invalid equality operation`	The offending expression operand root node
Comparing two `custom` names for (in)equality, e.g., "`A == B`" or "`A != B`", where `A` and `B` are the names of `custom` types.	`Equality operator applied to custom names`	IDNode of the `custom` name.
Comparing two `custom` variables for (in)equality, e.g., "`a == b`" or "`a != b`", where `a` and `b` are variables declared to be of `custom` types.	`Equality operator applied to custom variables`	1^st character of the first `custom` variable.
Applying an invalid operand to an assignment (either as the target or source). e.g., "`f = g;`", where `f` and/or `g` are function names or custom names, or variables of custom type.	`Invalid assignment operand`	The offending expression operand root node
Applying an assignment to two valid but incompatible types. e.g., "`f = g;`", where `f` is a bool and `g` is an int.	`Invalid assignment operation`	The offending expression operand root node
Printing a `custom` variable; e.g., "`toconsole p`", where `p` is a variable declared to be of a `custom` type.	`Attempt to output a custom variable`	1^st character of the `custom` variable.
Assigning a `custom` name to a `custom` name; e.g., "`A = B;`", where `A` and `B` are the names of `custom` types.	`custom name assignment`	1^st character of the first `custom` name.
Assigning a `custom` variable to a `custom` variable; e.g., "`a = b;`", where `a` and `b` are variables declared to be of `custom` types.	`custom variable assignment`	Position of the first `custom` variable.
Getting a `custom` name from the OS; e.g., "`fromconsole P`", where `P` is the name of a `custom` type.	`Attempt to fromconsole a custom name`	Position of the `custom` name.
Getting a `custom` variable from the OS; e.g., "`fromconsole p`", where `p` is a variable declared to be of a `custom` type.	`Attempt to fromconsole a custom variable`	Position of the `custom` variable.

Preventing Cascading Errors

A single type error in an expression or statement should not trigger multiple error messages. For example, assume that

p is a variable declared to be of bool type,
f is a function that has one integer parameter and returns a bool.

Each of the following should cause only one error message:

toconsole p + 1          // p + 1 is an error; the write is OK
(true + 3) * 4         // true + 3 is an error; the * is OK
true && (false || 3)   // false or 3 is an error; the and is OK
f("a" * 4);            // "a" * 4 is an error; the call is OK
1 + p();               // p() is an error; the + is OK
(true + false) == 7    // true + false is an error; the == is OK

The following should each cause two error messages (assuming the same declarations of f as above):

true + "hello"    // one error for each of the non-int operands of the +
1 + f(true)       // one for the bad arg type and one for the 2nd operand of the +
1 + f(1, 2)       // one for the wrong number of args and one for the 2nd operand of the +
return 3+true;    // in a void function: one error for the 2nd operand to +
                  // and one for returning a value

If you need clarification on these issues, please feel free to ask a question on Piazza.

Guide
This section gives advice for completing the project and should be considered optional, FYI reading.

A good way to implement your type checker is by writing member functions for the different subclasses of ASTNode. Your type checker should find all of the type errors described in the table below.

Thinking About Type Analysis

At a high level, the type analysis for a language like A should yield:

A correspondance between AST Nodes and the type of that node
A (possibly empty) collection of error messages
An indication of whether the program is well-typed

Achieving 1. is crucial in achieving 2. which is crucial in achieving 3. Thus, the main "action" of the type checker is a traversal of the AST, assigning types according to the rules of the type system, and flagging violations as errors.

High-Level Implementation Suggestions

Much like name analysis, visiting every node of the AST can take the form of a careful recursive walk over the AST. A bespoke solution for type analysis is to imbue each ASTNode (sub)class with a typeAnalysis function, set up so that each node invokes typeAnalysis on it's children nodes.

Many nodes (i.e. StmtNode (sub)classes, the ProgramNode, and (perhaps surprisingly) TypeNode (sub)classes do not need to have a type attached to them. ExpNode (subclasses) do, so one reasonable implementation is to add a protected instance variable to the ExpNode (call it myDataType or whatever) that indicates that node's data type. During the recursive type analysis walk, visiting each instance of an ExpNode subclass in the AST should result in setting this->myDataType to the appropriate value.

Supressing Cascading Errors

In order to prevent an error from being re-reported, it's helpful to use a special ErrorType for expressions that contain type errors. In the first example above, the type given to (true + 3) should be ErrorType, and the type-check method for the multiplication node should not report "Arithmetic operator applied to non-numeric operand" for the first operand.