- Overview
- Spim
- Auxiliary Fields and Methods
- Code Generation for Global Variable Declarations
- Code Generation for Functions
- Code Generation for Statements
- Code Generation for Expressions
- Control-Flow Code
Code can be generated by a syntax-directed translation while parsing or by traversing the abstract syntax tree after the parse (i.e., by writing a codeGen method for the appropriate kinds of AST nodes). We will assume the latter approach, and will discuss code generation for a subset of the C language. In particular, we will discuss generating MIPS assembly code suitable for input to the Spim interpreter. Some information on Spim is provided in the next section; the following sections discuss code generation for:
- global variables
- functions (entry and exit)
- statements
- expressions
Documentation on Spim is available on-line:
To run the (plain) Spim interpreter, type:
- spim -file <name>
To run Spim with an X-windows interface (on a Linux machine), type just: xspim. This will cause a window to open. Click on the load button in that window, then type in the name of your assembly-code file, then press return. If there are syntax errors, you will see error messages. If there are no errors, you can run your program by clicking on run, then (in the small window that will be opened) on OK. If your program generates any output, a new window will be opened to display that output.
Spim uses the following special registers (there are others; you can see the on-line Spim documentation for those, but you should not need them for the class project):
Register Purpose $sp stack pointer $fp frame pointer $ra return address $v0 used for system calls and to return int values from function calls, including the syscall that reads an int $f0 used to return double values from function calls, including the syscall that reads a double $a0 used for output of int and string values $f12 used for output of double values $t0 - $t7 temporaries for ints $f0 - $f30 registers for doubles (used in pairs; i.e., use $f0 for the pair $f0, $f1) Auxiliary Constants and Methods
To simplify the task of code generation, it is convenient to have a set of constants (final static fields) that define the string representations of the registers that will be used in the generated code and the values used to represent true and false, as well as a set of methods for actually writing the generated code to a file. We will assume that we have the following register constants: SP, FP, RA, V0, A0, T0, T1, F0, F12, as well as the constants TRUE and FALSE (and that TRUE is represented as 1, and false as 0). We will also assume that we have the following methods:
Code Generation for Global Variable DeclarationsMethod Purpose generate write the given op code and arguments, nicely formatted, to the output file generateIndexed the arguments are: an op code, a register R1, another register R2, and an offset; generate code of the form: op R1, offset(R2) genPush(String reg, int bytes) generate code to push the value in the given register onto the stack;
parameter bytes is 4 for an int and 8 for a doublegenPop(String reg, int bytes) generate code to pop the top-of-stack value into the given register nextLabel return a string to be used as a label (more on this later) genLabel given a label L, generate: L: For each global variable v, generate:
.data .align 2 # align on a word boundary _v: .space N
where N is the size of the variable in bytes. (Scalar integer variables require 4 bytes; double variables require 8 bytes.) This code tells the assembler to set aside N bytes in the static data area, in a location labeled with the name _v.Example: Given this source code:
int x; double y;
you should generate this code:.data .align 2 _x: .space 4 .data .align 2 _y: .space 8
It is not actually necessary to generate .data if the previous generated code was also for a global variable declaration; however, since function declarations can be intermixed with global variable declarations (and cause code to be generated in the text area, not the static data area), this may not be the case; it is safe (and easier) just to generate those directives for every global variable.For every function you will generate code for:
- the function "preamble"
- the function entry (to set up the function's Activation Record)
- the function body (its statements)
- function exit (restoring the stack, and returning to the caller).
For the main function, generate:
.text .globl main main:
For all other functions, generate:
.text _<functionName>:
using the actual name in place of <functionName>. This tells the assembler to store the following instructions in the text area, labeled with the given name.After generating this "preamble" code, you will generate code for (1) function entry, (2) function body, and (3) function exit.
We assume that when a function starts executing, the stack looks like this:
- push the return address
- push the control link
- set the FP
- push space for local variables
# (1) push return addr sw $ra, 0($sp) subu $sp, $sp, 4 # (2) push control link sw $fp, 0($sp) subu $sp, $sp, 4 # (3) set the FP # note: the following sets the FP to point to the "bottom" # of the new AR; the reason for "+ 8" is: # 4 bytes each for the control link and the return addr addu $fp, $sp, <size of params in bytes + 8> # (4) push space for locals subu $sp, $sp, <size of locals in bytes>
Note: <size of params> and <size of locals> will need to be available to the code generator. The symbol-table entry for the function name will have information about the parameters (because that will have been used for type checking). For example, it might have a list of the symbol-table entries for the parameters. You could also store the total size of the parameters in the function name's symbol-table entry, or you could write a method that takes the list of parameters as its argument and computes the total size. It is not so easy to compute the total size of the local variables at code-generation time; it is probably a better idea to do that during name analysis. The name-analysis phase will be computing the offsets for the parameters and local variables anyway; it should not be difficult to extend that code to also compute the total size of the locals (and to store that information in the function name's symbol-table entry).Note: we are talking about the codeGen method for the FnBodyNode, whose subtree will look like this:
There is no need to generate any code for the declarations. So to generate code for the function body, just call the codeGen method of the StmtListNode, which will in turn call the codeGen method of each statement in the list. What those methods will do is discussed below in the section on Code Generation for Statements.
Just before a function returns, the stack looks like this:
Here is the code that needs to be generated:
lw $ra, -<param size>($fp) # load return address move $t0, $fp # save control link lw $fp, -<paramsize+4>($fp) # restore FP move $sp, $t0 # restore SP jr $ra # return
Note that there are two things that cause a function to return:- A return statement is executed, or
- The last statement in the function is executed (i.e., execution "falls off the end" of the function).
- Generate the "return" code just once after generating the code for the function body. Label that code with a unique label (e.g., the result of calling nextLabel).
- For each return statement, generate a jump to the label you used (the op code for an unconditional jump is just b).
What about a return statement that returns a value? As discussed below, the codeGen method for the returned expression will generate code to evaluate that expression, leaving the value on the stack. The MIPS convention is to use register V0 to return a int value from a function and to use register F0 to return a double value. So the codeGen method for the return statement should generate code to pop the value from the stack into the appropriate register (before generating the "return" code or the jump to the return code discussed above).
Code Generation for Statements
You will write a different codeGen method for each kind of StmtNode. You are strongly advised to write this method for the WriteIntStmtNode, WriteDblStmtNode, and WriteStrStmtNode first. Then you can test code generation for the other kinds of statements and the expressions by writing a program that computes and prints a value. It will be much easier to find errors in your code this way (by looking at the output produced when a program is run) than by looking at the assembly code you generate.
To generate code for a write statement whose expression is of type int you must:
- Call the codeGen method of the expression being printed. That method will generate code to evaluate the expression, leaving that value on the top of the stack.
- Generate code to pop the top-of-stack value into register A0 (a special register used for output of strings and ints)
- Generate code to set register V0 to 1.
- Generate a syscall instruction.
Below is the code you would write for the codeGen method of the WriteIntStmtNode.
// step (1) myExp.codeGen(); // step (2) genPop(A0, 4); // step (3) generate("li", V0, 1); // step (4) generate("syscall");
The code for the WriteStrStmtNode and the WriteDblStmtNode is similar, except for the following:- For a string, the codeGen method of the expression being printed will leave the address of the string on the stack.
- For a double, the value to be written must be popped into register F12 instead of A0.
- For a string, register V0 must be set to 4.
- For a double, register V0 must be set to 3.
If-Then Statement The AST for an if-then statement looks like:
------------ | IfStmtNode | ------------ / | \ --------- -------------- ------------- | ExpNode | | DeclListNode | | StmtListNode| --------- -------------- -------------
There are two different approaches to generating code for statements that involve conditions (e.g., for if statements and while loops):- The numeric method, and
- the control-flow method.
- Evaluate the condition, leaving the value on the stack.
- Pop the top-of-stack value into register T0.
- Jump to FalseLabel if T0 == FALSE.
- Code for the statement list.
- FalseLabel:
Note that the code generated for an if-then statement will need to include a label. Each label in the generated code must have a unique name (although we will refer to labels in these notes using names like "FalseLabel" as above). As discussed above, we will assume that there is a method called nextLabel that returns (as a String) a new label every time it is called, and we will assume that there is a method called genLabel that prints the given label to the assembly-code file.
TEST YOURSELF #1 Question 1: What is the form of the code generated by an IfElseStmtNode's codeGen method?
Question 2: What is the actual code that needs to be written for the IfStmtNode's codeGen method?
Question 3: What is the form of the code generated by a WhileStmtNode's codeGen method?
The AST for a return statement is either:
---------------- | ReturnStmtNode | ----------------
or:---------------- | ReturnStmtNode | ---------------- | ----------- | ExpNode | -----------
As discussed above, if a value is being returned, the ReturnStmtNode's codeGen method should call its ExpNode's codeGen method (to generate code to evaluate the returned expression, leaving the value on the stack), then should generate code to pop that value into register V0 or register F0 (depending on its type).To generate the code that actually does the return, use one of the following approaches:
- For each return statement in the program, generate a copy of the code that pops the AR off the stack then jumps to the return address (that code was discussed above under Function Exit), or
- For each return statement in the program, generate a jump to the "return" code that is generated at the end of the function. Note that in this case you will need to label that return code, and you will need to know what that label is when generating code for a return statement.
The AST for a read statement is one of the following:
------------------- ------------------- | ReadIntStmtNode | | ReadDblStmtNode | ------------------- ------------------- | | ---------- ---------- | IdNode | | IdNode | ---------- ----------
To read an integer value into register V0, you must generate this code:
li $v0, 5 syscall
- function calls (the name of the called function)
- expressions (an expression can be just a name, or a name can be the operand of any operator)
- assignment statements (on the left-hand side)
- For a function call, we will need to generate a jump-and-link instruction using the name of the function (the same name that was generated as a label in the function's "preamble" code).
- For an expression, we will need to generate code to fetch the current value either from the static data area or from the current Activation Record, and to push that value onto the stack.
- For an assignment, we will need to generate code to push the address of the variable (either the address in the static data area, or in the current Activation Record) onto the stack. Then we will generate code to store the value of the right-hand-side expression into that address.
- genJumpAndLink,
- codeGen,
- genAddr.
- The symbol-table entry includes a "kind" field (which distinguishes between globals and locals).
- Different sub-classes of the Sym class are used for globals and for local variables (so you can tell whether you have a global or a local using "instanceof", or using an IsGlobal method that you write for each sub-class of Sym).
- The symbol-table entry includes an "offset" field; for local variables, that field has a value less than or equal to zero, while for globals, the value is greater than zero.
- Evaluate the right-hand-side expression, leaving the value on the stack.
- Push the address of the left-hand-side Id onto the stack.
- Store the value into the address.
- Leave a copy of the value on the stack.
- <label> needs to be a new label; e.g., returned by a call to nextLabel.
- The <string value> needs to be a string in quotes. You should be storing string literals that way, so just write out the value of the string literal, quotes and all.
- Evaluate each actual parameter, pushing the values onto the stack;
- Jump and link (jump to the called function, leaving the return address in the RA register).
- Push the returned value (which will be in register V0 or F0 depending on the return type) onto the stack.
- Call each child's codeGen method to generate code that will evaluate the operand(s), leaving the value(s) on the stack.
- Generate code to pop the operand value(s) off the stack into register(s) (e.g., T0 and T1 for ints; F0 and F2 for doubles). Remember that if there are two operands, the right one will be on the top of the stack.
- Generate code to perform the operation (see Spim documentation for a list of opcodes).
- Generate code to push the result onto the stack.
- The numeric approach: This is the approach that we have assumed so far. Using the numeric approach, the codeGen method for a statement with a condition generates code to evaluate the condition, leaving the value on the stack. That value is then popped, and a jump is executed if it has a particular value.
- The control-flow or jump-code approach: In this case, the code-generation method for the condition has two parameters, both of which are labels (i.e., Strings) named TrueLabel and FalseLabel. Instead of leaving the value of the condition on the stack, the code generated to evaluate the condition jumps to the TrueLabel if the condition is true, and jumps to the FalseLabel if the condition is false.
- Start by calling the genJumpCode method of the left child. That call will generate code to evaluate the left operand, jumping to the "TrueLabel" that we pass if its value is true, and jumping to the "FalseLabel" that we pass if its value is false.
- So what labels should be passed?
- If the left operand is false, then the value of the whole expression is false, so pass the given "FalseLabel" as the "FalseLabel" in the recursive call.
- If the left operand is true, then we must evaluate the right operand, so pass a new label as the "TrueLabel" in the recursive call.
- After the call to the left child's genJumpCode method, call genLabel to generate the new label.
- Finally, call the genJumpCode method of the right child.
- What labels should be passed as the arguments to the right child's
genJumpCode method?
- This expression will only be evaluated if the left operand evaluated to true; in that case, the value of this expression (the right operand) is the value of the whole expression. Therefore, for this recursive call, pass the original "TrueLabel" and "FalseLabel".
Code Generation
Contents
To read a double value, load the value 7 into V0, do a syscall, and expect the input value to be in register F0 (not V0).
Digression: Code Generation for IdNodes
Before considering other kinds of statements, let's think about the role identifiers will play in code generation. Names show up in the following contexts:
We use "codeGen" for the second case (fetching the value and pushing it onto the stack) since that is what the codeGen methods of all ExpNodes must do.
genJumpAndLink
The genJumpAndLink method will simply generate a jump-and-link instruction (with opcode jal) using the appropriate label as the target of the jump. If the called function is "main", the label is just "main". For all other functions, the label is of the form:-
_<functionName>
codeGen
The codeGen method must copy the value of the global / local variable into a register (e.g., T0 for an int variable and F0 for a double), then push the value onto the stack. Different code will be generated for a global and a local variable. Below are four examples.lw $t0 _g
| // load the value of int global g into T0 |
lw $t0 0($fp)
| // load the value of the int local stored at offset 0 into T0 |
l.d $f0 _d | // load the value of dbl global d into F0 |
l.d $f0 -4($fp) | // load the value of the dbl local stored at offset -4 into F0 |
Note that this means there must be a way to tell whether an IdNode represents a global or a local variable. There are several possible ways to accomplish this:
genAddr
The genAddr method must load the address of the identifier into a register (e.g., T0), then push it onto the stack. The code is very similar to the code to load the value of the identifier; we just use the la (load address) opcode instead of the lw (load word) or l.d (load double) opcode.Here is the code you need to generate to load the address of a global variable g into register T0 (this works whether g is int or double, since an address always takes 4 bytes):
-
la $t0, _g
-
la $t0, -8($fp)
Assignment Statement
The AST for an assignment statement looks like this:---------------- | AssignStmtNode | ---------------- | ------------ | AssignNode | ------------ / \ -------- --------- | IdNode | | ExpNode | -------- ---------The AssignStmtNode's codeGen method can call the AssignNode's method to do the assignment, but be careful: an AssignNode is a subclass of ExpNode, so like all nodes that represent expressions, it must leave the value of the expression on the stack. Therefore, the AssignStmtNode must generate code to pop (and ignore) that value.
Code Generation for Expressions
The codeGen method for the subclasses of ExpNode must all generate code that evaluates the expression and leaves the value on top of the stack. We have already talked about how to do this for IdNodes; in the subsections below we discuss code generation for other kinds of expressions.
The codeGen method for an assignment expression must generate code to:
The codeGen methods for IntLitNodes must simply generate code to push the literal value onto the stack. The generated code will look like this:
li $t0, <value> # load value into T0 sw $t0, ($sp) # push onto stack subu $sp, $sp, 4
The code for a DblLitNode is similar, except that the value must be loaded into a double register using the appropriate opcode, and the "push" code is different for a double value. For example:
li.d $f0, <value> # load value into F0 s.d $f0, -4($sp) # push onto stack subu $sp, $sp, 8
For a StringLitNode, the string literal itself must be stored in the static data area, and its address must be pushed. The code to store a string literal in the static data area looks like this:
.data <label>: .asciiz <string value>Note:
The code you need to generate to push the address of a string literal onto the stack looks like this:
.text la $t0, <label> # load addr into $t0 sw $t0, ($sp) # push onto stack subu $sp, $sp, 4
The AST for a function call looks like:
-------------- | CallExpNode | -------------- / \ -------- ------------- | IdNode | | ExpListNode | -------- -------------We need to generate code to:
Note that there is also a call statement:
-------------- | CallStmtNode | -------------- | -------------- | CallExpNode | -------------- / \ -------- ------------- | IdNode | | ExpListNode | -------- -------------In this case, the called function may not actually return a value (i.e., may have return type void). It doesn't hurt to have the CallExpNode's codeGen method push the value in V0 (or F0) after the call (it will just be pushing some random garbage), but it is important for the CallStmtNode's codeGen method to pop that value.
In general, the codeGen methods for the non short-circuited operators (PlusNode, MinusNode, ..., NotNode, LessNode, ..., EqualsNode, etc.) must all do the same basic sequence of tasks:
To minimize the amount of code you must write, you might want to define subclasses of ExpNode that have very similar code-generation methods. For example, you could define a BinaryExpNode class whose codeGen method does all of the steps above, calling an opCode method (that would be implemented in each BinaryExpNode subclass) to get the appropriate opcode for use in step 3.
Note that the comparison operators (described in Section 2.6 of the Spim Reference Manual) produce one when the result of the comparison is true. However, the not operator does a bitwise logical negation, so it will not work correctly if you use 0 and 1 to represent true and false. Thus, it is better to use the seq operator instead of the not operator.
Example: Recall that the AST for an addition looks like this:
---------- | PlusNode | ---------- / \ --------- --------- | ExpNode | | ExpNode | --------- ---------Here is the codeGen method for the PlusNode, assuming that both operands are ints.
public void codeGen() { // step 1: evaluate both operands myExp1.codeGen(); myExp2.codeGen(); // step 2: pop values in T0 and T1 genPop(T1, 4); genPop(T0, 4); // step 3: do the addition (T0 = T0 + T1) generate("add", T0, T0, T1); // step 4: push result genPush(T0, 4) }
To illustrate how code is generated for an expression involving several operators, consider generating code for the expression: b + c * d. Here is the AST for the expression:
PlusNode / \ IdNode TimesNode b / \ IdNode IdNode c dBelow is the sequence of calls that would be made at compile time to generate code for this expression, and a description of what the generated code does.
Sequence of calls What the generated code does ----------------- ---------------------------- +--- PlusNode.codeGen() | IdNode.codeGen() ---------> push b's value | +- TimesNode.codeGen() | | IdNode.codeGen() ---------> push c's value | | IdNode.codeGen() ---------> push d's value | | | +-------------------------------> pop d's value into T1 | pop c's value into T0 | T0 = T0 * T1 | push T0's value | +---------------------------------> pop result of * into T1 pop b's value into T0 T0 = T0 + T1 push T0's value
The short-circuited operators are represented by AndNodes and OrNodes. "Short-circuiting" means that the right operand is evaluated only if necessary. For example, for the expression (j != 0) && (k/j > epsilon), the sub-expression (k/j > epsilon) is evaluated only if variable j is not zero. Therefore, the code generated for an AndNode must work as follows:
evaluate the left operand if the value is true then evaluate the right operand; that value is the value of the whole expression else don't bother to evaluate the right operand the value of the whole expression is falseSimilarly, for an OrNode:
evaluate the left operand if the value is false then evaluate the right operand; that value is the value of the whole expression else don't bother to evaluate the right operand the value of the whole expression is trueThis means that the code generated for the logical operators will need to involve some jumps depending on the values of some expressions.
Expand the outlines given above for the code generated for AndNodes and OrNodes, giving a lower-level picture of the generated code. Use the outline of the code generated for an if-then statement as a model of what to write.
As mentioned above in the section on If-Then Statements, there are actually two different approaches to generating code for statements with conditions (like if statements and while loops):
First, let's reconsider code generation for an if-then statement; this time we'll use the control-flow method instead of the numeric method for evaluating the condition part of the statement. Under this new assumption, the code generated by the IfStmtNode's codeGen method will have the following form:
-- code to evaluate the condition, jumping to TrueLab if it is true, and to DoneLab if it is false -- TrueLab: -- code for the statement list -- DoneLab:The actual code written for the IfStmtNode's codeGen method will be:
public void codeGen() { String trueLab = nextLabel(); String doneLab = nextLabel(); myExp.genJumpCode(trueLab, doneLab); genLabel(trueLab); myStmtList.codeGen(); genLabel(doneLab); }
To implement the control-flow approach, in addition to changing the codeGen method for the statements that have conditions, we must also write a new genJumpCode method for each AST node that could represent a condition. Below, we look at two representative cases, for IdNode and LessNode. We give the code generated for the (old) numeric approach, and for the (new) control-flow approach so that we can see which code is better (in terms of the number of instructions). (The code given below assumes that all operands are int, and it is not quite assembly code -- for example, for clarity, we use "push" instead of the actual two instructions that implement a push operation.)
IdNode: numeric control-flow ------- ------------ lw $t0, <var's addr> lw $t0, <var's addr> push $t0 beq $t0, FALSE, falseLab b trueLabNote that in both approaches to generating code for an IdNode, 3 instructions are generated (because "push" is actually 2 instructions). However, while all 3 of the numeric-approach instructions will always execute, the last instruction generated by the control-flow approach will execute only if the value of the Id is true.
LessNode: numeric control-flow ------- ------------ -- code to evaluate both operands -- ditto -- pop values into T1, T0 -- ditto slt $t2, $t0, $t1 blt $t0, $t1, trueLab push $t2 b falseLabThe operands of a LessNode are integer expressions, not boolean expressions, so both approaches start by generating (the same) code to evaluate the operands and to pop their values into registers T0 and T1. After that, however, the numeric approach will generate 3 instructions (to set $t2 to TRUE or FALSE as appropriate, then to push that value onto the stack -- remember that "push" is really two instructions), while the control-flow code will generate only 2 instructions. Furthermore, as was the case for the IdNode, all three of the numeric-approach instructions will always execute, while the last instruction generated by the control-flow approach will only execute if the comparison evaluates to FALSE.
Now let's consider how to write the new genJumpCode method for the short-circuited operators (AndNodes and OrNodes).
Recall that the AST for an && expression looks like this:
--------- | AndNode | --------- / \ --------- --------- | ExpNode | | ExpNode | --------- ---------Here's how the genJumpCode method of the AndNode works:
The AndNode's genJumpCode method would be:
public void genJumpCode(String trueLab, String falseLab) { String newLab = nextLabel(); myExp1.genJumpCode(newLab, falseLab); genLabel(newLab); myExp2;genJumpCode(trueLab, falseLab); }Example: Consider the code that would be generated for the statement: if (a && b>0) { ... }, represented by the AST:
------------ | IfStmtNode | ------------ / | \ --------- -------------- ------------- | AndNode | | DeclListNode | | StmtListNode| --------- -------------- ------------- / \ -------- ---------- | IdNode | | LessNode | -------- ---------- / \ ... ...The IfStmtNode's codeGen method would create two labels, TrueLab and DoneLab, and would call the AndNode's genJumpCode method, passing those labels as the arguments. The AndNode's genJumpCode method would create one new label (NewLab) and then would call the genJumpCode method of its left child (the IdNode), passing NewLab and DoneLab as the arguments. It would then generate the NewLab label, and then would call its right child's genJumpCode method, passing TrueLab and DoneLab. The code generated for the whole condition would look like this:
Generated Code Generated By -------------- ------------ -- code to load the value of a into T0 IdNode jump to DoneLab if T0 == FALSE IdNode jump to NewLab IdNode NewLab: AndNode -- code to push the value of b LessNode's child -- code to push the iteral 0 LessNode's child pop into T1 LessNode pop into T0 LessNode jump to TrueLab if T0 < T1 LessNode jump to DoneLab LessNode(Of course, the actual label would be something like L3, not NewLab.) After calling the AndNode's genJumpCode method, the IfStmtNode's codeGen method would call genLabel to print TrueLab, then would call its StmtList child's codeGen method to generate code for the list of statements. Finally, it would call genLabel to print DoneLab. So the code generated for this if statement would be like this:
-- code to load the value of a into T0 jump to DoneLab if T0 == FALSE jump to NewLab NewLab: -- code to push the value of b -- code to push the iteral 0 pop into T1 pop into T0 jump to TrueLab if T0 < T1 jump to DoneLab TrueLab: -- code for the list of statements DoneLab:
Question 1: What is the form of the code generated by an OrNode's genJumpCode method?
Question 2: What is the form of the code generated by a NotNode's genJumpCode method?
How does the code generated for an AndNode using the control-flow method compare to the code generated using the numeric method? Here are outlines of the code generated in each case:
Numeric Code Control-Flow Code ------------ ----------------- -- code to evaluate left -- code to evaluate left -- operand, leaving the -- operand, including jumps -- value on the stack -- to NewLab and FalseLab pop into T0 goto TrueLab if T0 == TRUE newLab: push FALSE goto DoneLab TrueLab: -- code to evaluate right -- code to evaluate right -- operand, leaving the -- operand, including -- value on the stack -- jumps to TrueLab and -- FalseLab DoneLab:Note that the numeric code includes 6 instructions (shown in bold) in addition to the ones generated by the codeGen methods of the two children, while in the control-flow case, no instructions are generated by the AndNode itself (just a label); the instructions are all generated by the genJumpCode methods of its two children. Those children could represent names, boolean literals, comparisons or logical expressions. We have already seen that in the first three cases, better code is generated using the control-flow approach. Now we see that for logical expressions (at least for AndNodes) fewer instructions are generated using the control-flow approach than using the numeric approach.
Compare the code generated by the two approaches for an OrNode and for a NotNode.
Finally, let's compare the code generated by the numeric and control-flow approaches for an if-then statement. Here are outlines of the two different versions of the code that would be generated:
Numeric Code Control-Flow Code ------------ ----------------- -- code for condition, leaving -- code for condition, -- value on the stack -- including jumps to -- trueLab and falseLab pop into T0 goto falseLab if T0 == FALSE trueLab: -- code for "then" stmts -- code for "then" stmts goto doneLab goto doneLab falseLab: falseLab: -- code for "else" stmts -- code for "else" stmts doneLab: doneLab:Note that much of the code is the same for the two methods; the code generated to evaluate the condition will be different, and the numeric method has three extra instructions: a pop, followed by a conditional goto (those instructions are shown in bold font). So as long as the code generated for the condition is no worse for the control-flow method than for the numeric method, the control-flow method is better (both in terms of the number of instructions generated, and in terms of the number of instructions that will be executed).
But we have already looked at the code generated for all the different kinds of conditions, for each of the two approaches, and in fact the control-flow method was never worse, and was sometimes better. Thus, the control-flow method is the winner, in terms of generating less and more efficient code!