Project 5
Due on November 15th 11:59 PM (Accepted with no penalty).
Accepted late by November 16th 11:59 PM for 1 penalty
Accepted late by November 17th 11:59 PM for 2 penalties
Not accepted on or after November 18th 12:00 AM
Each penalty incurs a 15% stacking reduction on the project grade or uses 1 "penalty token". Penalty tokens are applied by course personnel to your maximum benefit.
Updates

None yet!

Resources

  • The oracle will be released shortly.
  • You can download the starter code here.

Overview

For this assignment you will write a type checker for A programs represented as abstract-syntax trees. Your main task will be to create type checking functions for the nodes of the AST. This part of the compiler is intended to be an extension of the previous project code, but will be invoked via a new command-line argument.

How We'll Grade

As in previous projects, we will rely heavily on an automated testing suite to determine the correctness of your compiler. This suite works by doing a diff against the output produced by your code and the expected output. As such, you should prioritize basic functionality that could affect numerous cases.

Your code should be built and invoked via the command sequence

make all ./ac <file.a> -c

Where <file.a> is an input file that passes name analysis. Note that this project only checks for errors and produces no output file, so -c has no argument. All error messages should be printed to the standard error stream (i.e. stderr/std::cerr).

Getting Started

You are encouraged to use your own implementation of the compiler up to and including name analysis. If you're not confident in your implementation, a set of starter files is available here: (link).

Specifications
This section goes over the requirements for the project and should be considered mandatory reading.

The type checker will determine the type of every expression represented in the abstract-syntex tree. You must report errors when types are incompatbile with each other and when types are invalid for an operator.

Details of the A type system are available at the Type system section of the language spec. You should read the info at this link.

Error Messages

Your type checker must report the specified position of the error, and it must give the specified error message. (Each message should appear on a single line, rather than how it is formatted in the following table.)

Type of Error Error Message Position to Report
Printing a function; e.g., toconsole f, where f is a function name. Attempt to output a function Function name IDNode
Printing a custom-type value function); e.g., toconsole c, where c is the name of a custom type or instance of a custom. Attempt to output a custom Expression node being written.
Printing a void value (note: this can only happen if there is an attempt to write the return value from a void function); e.g., toconsole f(), where f is a void function. Attempt to output void Call expression node
Reading a function: e.g., fromconsole f, where f is a function name. Attempt to assign user input to function Function name IDNode
Reading a class: e.g., fromconsole c, where c is a custom name or instance of a custom. Attempt to assign user input to class Argument to fromconsole
Calling something other than a function; e.g., "x();", where x is not a function name. Note: In this case, you should not type-check the actual parameters. Attempt to call a non-function Variable IDNode
Calling a function with the wrong number of arguments. Function call with wrong number of args IDNode for the function name.
Calling a function with argument(s) of the wrong type. If there are several arguments with the wrong type, give an error message for each such argument. Type of actual does not match type of formal Actual argument subexpression node
Using an empty return statement (i.e., one that does not return a value) in a function with a non-void return type. Missing return value Return subexpression
Returning a value from a void function. Return with a value in void function Return subexpression
Returning a value of the wrong type from a non-void function. Bad return value Return subexpression
Applying an arithmetic operator (+, -, *, /, ++, or --) to an invalid type (e.g. a basic bool type, a function). Arithmetic operator applied to invalid operand The offending expression operand root node
An invalid operand given to a relational operator (<, >, <=, >=). Relational operator applied to non-numeric operand The offending expression operand root node
An invalid operand given to a logical operator ( !, and, or). Logical operator applied to non-bool operand The offending expression operand root node
Using a non-bool expression as a condition (such as an if statement, or loop. Non-bool expression used as a condition The offending expression operand root node
Applying an (in)equality operator to an invalid operand, such as a void function call result, custom type or a function name (not a call) . Invalid equality operand The offending expression operand root node
Applying an (in)equality operator to two valid but incompatible operands, such as an int to a bool. Invalid equality operation The offending expression operand root node
Comparing two custom names for (in)equality, e.g., "A == B" or "A != B", where A and B are the names of custom types. Equality operator applied to custom names IDNode of the custom name.
Comparing two custom variables for (in)equality, e.g., "a == b" or "a != b", where a and b are variables declared to be of custom types. Equality operator applied to custom variables 1st character of the first custom variable.
Applying an invalid operand to an assignment (either as the target or source). e.g., "f = g;", where f and/or g are function names or custom names, or variables of custom type. Invalid assignment operand The offending expression operand root node
Applying an assignment to two valid but incompatible types. e.g., "f = g;", where f is a bool and g is an int. Invalid assignment operation The offending expression operand root node
Printing a custom variable; e.g., "toconsole p", where p is a variable declared to be of a custom type. Attempt to output a custom variable 1st character of the custom variable.
Assigning a custom name to a custom name; e.g., "A = B;", where A and B are the names of custom types. custom name assignment 1st character of the first custom name.
Assigning a custom variable to a custom variable; e.g., "a = b;", where a and b are variables declared to be of custom types. custom variable assignment Position of the first custom variable.
Getting a custom name from the OS; e.g., "fromconsole P", where P is the name of a custom type. Attempt to fromconsole a custom name Position of the custom name.
Getting a custom variable from the OS; e.g., "fromconsole p", where p is a variable declared to be of a custom type. Attempt to fromconsole a custom variable Position of the custom variable.
Preventing Cascading Errors

A single type error in an expression or statement should not trigger multiple error messages. For example, assume that

  • p is a variable declared to be of bool type,
  • f is a function that has one integer parameter and returns a bool.
Each of the following should cause only one error message:

toconsole p + 1          // p + 1 is an error; the write is OK
(true + 3) * 4         // true + 3 is an error; the * is OK
true && (false || 3)   // false or 3 is an error; the and is OK
f("a" * 4);            // "a" * 4 is an error; the call is OK
1 + p();               // p() is an error; the + is OK
(true + false) == 7    // true + false is an error; the == is OK

The following should each cause two error messages (assuming the same declarations of f as above):

true + "hello"    // one error for each of the non-int operands of the +
1 + f(true)       // one for the bad arg type and one for the 2nd operand of the +
1 + f(1, 2)       // one for the wrong number of args and one for the 2nd operand of the +
return 3+true;    // in a void function: one error for the 2nd operand to +
                  // and one for returning a value

If you need clarification on these issues, please feel free to ask a question on Piazza.

Guide
This section gives advice for completing the project and should be considered optional, FYI reading.

A good way to implement your type checker is by writing member functions for the different subclasses of ASTNode. Your type checker should find all of the type errors described in the table below.

Thinking About Type Analysis

At a high level, the type analysis for a language like A should yield:

  1. A correspondance between AST Nodes and the type of that node
  2. A (possibly empty) collection of error messages
  3. An indication of whether the program is well-typed

Achieving 1. is crucial in achieving 2. which is crucial in achieving 3. Thus, the main "action" of the type checker is a traversal of the AST, assigning types according to the rules of the type system, and flagging violations as errors.

High-Level Implementation Suggestions

Much like name analysis, visiting every node of the AST can take the form of a careful recursive walk over the AST. A bespoke solution for type analysis is to imbue each ASTNode (sub)class with a typeAnalysis function, set up so that each node invokes typeAnalysis on it's children nodes.

Many nodes (i.e. StmtNode (sub)classes, the ProgramNode, and (perhaps surprisingly) TypeNode (sub)classes do not need to have a type attached to them. ExpNode (subclasses) do, so one reasonable implementation is to add a protected instance variable to the ExpNode (call it myDataType or whatever) that indicates that node's data type. During the recursive type analysis walk, visiting each instance of an ExpNode subclass in the AST should result in setting this->myDataType to the appropriate value.

Supressing Cascading Errors

In order to prevent an error from being re-reported, it's helpful to use a special ErrorType for expressions that contain type errors. In the first example above, the type given to (true + 3) should be ErrorType, and the type-check method for the multiplication node should not report "Arithmetic operator applied to non-numeric operand" for the first operand.