- Local variables: variables that are declared in the method that accesses them.
- Global variables: variables that are not declared in any method, but are accessible in all methods. For example, the public static fields of a Java class can be used in any method; variables can be declared at the file level in a C or C++ program, and used in any function; variables can be declared in the outermost scope of a Pascal program and can be used in any procedure or function.
- Non-local variables: we will use this term for two situations:
- In languages that allow sub-programs to be nested, it refers to variables that are declared in one sub-program and used in a nested sub-program. This is allowed, for example, in Pascal.
- In languages with dynamic scope, it refers to variables that are used in a method without being declared there (so the use corresponds to the declaration in the "most recently called, still active" method).
- x: offset 0
- a: offset -12
- b: offset -16
- Keep track of the current offset in a global variable (e.g., a static ASTnode field) or in the symboltable (i.e., add a new field to the SymbolTable class, with methods to set and get the field's value).
- When the symbol-table-building code starts processing a method, set the current offset to zero.
- For each parameter: add the name to the symboltable as usual, but also add the value of the current offset as an attribute, and update the current offset by subtracting the size of the parameter (which will depend on its type).
- After processing all parameters, subtract 8 from the current offset (to leave room for the return address and the control link).
- For each local variable, add the name to the symboltable with the current offset, and subtract the appropriate amount from the current offset.
- x: offset -4
- a: offset -20
- b: offset -28
- In statically-scoped languages that allow nested sub-programs, a variable can be declared in one sub-program and accessed in another, nested sub-program. Such variables are "non-local" in the nested sub-program that accesses them. These variables are stored in the activation record of the sub-program that declares them. When they are used as non-local variables, that activation record is found at runtime either using access links or a display (discussed below).
- In dynamically-scoped languages, a variable can be declared in one method, and accessed in a called method. The variable is non-local in the method that accesses it. Two different approaches to supporting the use of non-local variables in dynamically-scoped languages are discussed below.
- Add a new field to each AR -- the access link field.
- If P is (lexically) nested inside Q, then at runtime, P's AR's access link will point to the access link field in the AR of the most recent activation of Q.
- Therefore, at runtime, access links will form a chain corresponding
to the nesting of sub-programs.
For each use of a non-local x:
- At compile time, use the "Level Number" attribute of x and the "Current Level Number" (of the sub-program that is accessing x) to determine how many links of the chain to follow at runtime.
- If P at level i uses variable x, declared at level j, follow i-j links, then use x's "Offset" attribute to find x's storage space inside the AR.
- The link is set up by the calling procedure.
- How to set up the link depends on the relative nesting levels of the calling and called procedures.
- case #1: Calling procedure's level is less than called procedure's level: Push the value of the FP to be the access link field of the called procedure's AR.
- case #2: Calling procedure's level is greater than or equal to the called procedure's level: Find the value to push by starting with the value of the calling procedure's access links, then following X-Y links, where X = calling procedure's level, and Y = called procedure's level.
- When procedure P at nesting level k is executing, DISPLAY[0],...DISPLAY[k-2] hold pointers to the ARs of the most recent activations of the k-1 procedures that lexically enclose P. DISPLAY[k-1] holds a pointer to P's AR.
- To access a non-local variable declared at level x, use DISPLAY[x-1] to get to the AR that holds x, then use the usual offset to get x itself.
- When a procedure at nesting level k is called:
- The current value of DISPLAY[k-1] is saved in the "save-display" field of the new AR.
- DISPLAY[k-1] is set to point to (the save-display field of) the new AR.
- When the procedure returns, DISPLAY[k-1] is restored using the value saved in the "save-display" field (of the returning procedure).
- Access links can require more time (at runtime) to access non-locals (especially when the non-local is many nesting levels away).
- It can also require more time to set up a new access link when a procedure is called (if the nesting level of the called procedure is much smaller than the nesting level of the calling procedure).
- Displays require more space (at runtime).
- If the compiler is flexible enough to implement the display using registers if enough are available, or using space in the static data area, then the compiler code itself may be rather complicated (and error-prone).
- In practice, sub-programs usually are not very deeply nested, so the runtime considerations may not be very important.
Variable Access
Contents
In this set of notes we will consider how three different kinds of variables: local, global, and non-local, are accessed at runtime. For the purposes of this discussion, we will define these three categories as follows:
Local variables (including parameters) are stored in the activation record of the method in which they are declared. They are accessed at runtime using an offset from the frame pointer (FP). Since we're assuming that "up" in the stack means a lower address, these offsets will be negative numbers. For example, given this code:
void P(int x) { int a, b; ... }and assuming that activation records do not include an access link field or space for saved registers, P's activation records will be organized as follows:
<--- Stack Pointer | | |_______________| b: | | |_______________| a: | | |_______________| | Control Link | |_______________| | Return Addr | |_______________| x: | | |_______________| <--- Frame PointerAssuming 4 bytes for each pointer and each integer, here are the offsets for each of P's locals:
lw $t1, -12($fp) # load a lw $t2, -16($fp) # load bTo be able to generate this code at compile time (e.g., to process a statement like x = a + b), the offset of each local variable must be known by the compiler. Therefore, the offset of each local variable from the Frame Pointer should be stored as an attribute of the variable in the symbol table. This can be done as follows:
For MIPS code, each double variable requires 8 bytes, and its offset is in the "middle" of those bytes. For example, given this code:
void P(double x) { double a, b; ... }variable x will be stored in the bytes at offsets 0 to -7 from the Frame Pointer, and the offset that should be stored in x's symboltable entry (and used to access x at runtime) will be -4. Here are the offsets for all of P's locals:
l.d $f0, -20($fp) # load a l.d $f2, -28($fp) # load b
Assume that both an address and an integer require 4 bytes of storage, and that a double requires 8 bytes. Also assume that each activation record includes parameters, a return address, a control link, and space for local variables as illustrated above. What are the offsets (in bytes) for each of the parameters and local variables in the following functions?
void P1(int x, int y) { int a, b, c; ... while (...) { double a, w; } } void P2() { int x, y; ... if (...) { double a; ... } else { int b, c; ... } }
As noted above, global variables are stored in the static data area. Using MIPS code, each global is stored in space labeled with the name of the variable, and the variable is accessed at runtime using its name. (Since we will be using the SPIM simulator, and since SPIM has some reserved words that can also be used as Simple variable names, you will need to add an underscore to global variable names to prevent clashes with SPIM reserved words.)
For example, if the source code includes:
// global variables int g; double d;The following code would be generated to reserve space in the static data area for variables g and d:
.data # put the following in the static data area .align 2 # align on a word boundary _g: .space 4 # set aside 4 bytes .data .align 2 _d: .space 8And the following code would be generated to load the value of variable g into register t0, and to load the value of variable d into register f0:
lw $t0, _g # load contents of g into t0 l.d $f0, _dd # load contents of dd into f0Non-Local Variables
Recall that we are using the term "non-local variable" to refer to two situations:
First, let's consider an example (Pascal) program that includes accesses to non-local variables (the nesting levels of the procedures are given for later reference):
+ program MAIN; | var x: integer; | | + procedure P; | (2) write(x); | + | | + procedure Q; | | var y: integer = x; | | | | + procedure R; | | | x = x + 1; | | | y = y + x; (1) (2) (3) if y<6 call R; | | | call P | | + | | | | call R; | | call P; | | if x < 5 call Q; | + | | x = 2; | call Q; +
Trace the execution of this program, drawing the activation records for the main program and for each called procedure (include only local variables and control links in the activation records). Each time a non-local variable is accessed, determine which activation record it is stored in. Notice that the relative nesting levels of the variable's declaration and its use does not tell you how far down the stack to look for the activation record that contains the non-local variable. In fact, this number changes for different calls (e.g., due to recursion).
The idea behind the use of access links is as follows:
____________ P | |------------+ <-- FP +==========+ | R | |----------+ | +==========+ | | R | |------+ | | +==========+ | | | y:| 9 | | | | Q +----------| | | | | |--+ <-+ <-+ | +==========+ | | x:| 4 | | | MAIN |----------| | | | null |<-+ <--+ |__________|To access the value of x from procedure R, access links must be followed. The number of links that must be followed is:
-
(level # of R) - (level # of decl of x)
= 3 - 1 = 2
move $t0, FP // no links followed yet; t0 holds ptr to R's access link field lw $t0, ($t0) // 1 link: t0 holds ptr to Q's AR's access link field +-> lw $t0, ($t0) // 2 links: t0 holds ptr to main's AR's access link field | lw $t0, -12($t0) // t0 holds value of x | sw $t0, -12(FP) // y = x | This code would be repeated if we needed to follow more links.How to set up access links:
Case 1: The calling procedure's level is less than the called procedure's level (i.e., the called procedure is nested directly inside the calling procedure, because if it's not, it's invisible and it can't be called). In this case, the called procedure's access link should point to the access link field of the calling procedure's AR. This case occurs in the example when Q calls R.
In this case, the calling procedure sets up the called procedure's access link by pushing the value of the FP just before the call, since the FP points to its access link. For example, when Q is calling R, here is a picture of the stack just after R's access link has been set up by Q:
<-- SP |----------| | |------+ |==========| | y:| | | Q |----------| | | |--+ <-+ <-- FP |==========| | x:| | | MAIN |----------| | | null |<-+ |__________|Case 2: The calling procedure's level is greater than or equal to the called procedure's level. In this case, the called procedure's access link should point to an AR that is already on the stack. The value for the new access link is found by starting with the value of the calling procedure's access link field (which is pointed to by the FP), and following X-Y access links, where:
-
X = calling procedure's level
Y = called procedure's level
lw $t0, 0(FP)If X == Y (i.e., no links should be followed to find the value of the new access link), then the value of t0 should simply be pushed onto the stack. If X is greater than Y, then the code:
lw $t0, 0($t0)should be generated X - Y times, before pushing the value of t0.
To illustrate this situation, consider two cases from the example code: R calls P, and Q calls itself. Recall that the nesting structure of the example program is:
+-- | | +-- | | | P| | | | +-- | MAIN | +-- | | | | +-- | | | | Q| R| | | | | | +-- | | | +-- | +--When R is about to call P, the stack looks like this:
+==========+ R | |----------+ +==========+ | R | |------+ | +==========+ | | y:| 9 | | | Q +----------| | | | |--+ <-+ <-+ +==========+ | x:| 4 | | MAIN |----------| | | null |<-+ |__________|Since P is nested inside MAIN, P's access link should point to MAIN's AR (i.e., the bottom AR on the stack). R's nesting level is 3 and P's nesting level is 2. Therefore, we start with the value of R's access link (the pointer to Q's AR) and follow one link. This retrieves the values of Q's access link, which is (as desired) a pointer to MAIN's AR. This is the value that will be pushed onto the stack (copied into P's AR as its access link).
When Q is about to call itself, the stack looks like this:
|==========| y:| | Q |----------| | |--+ |==========| | x:| | | MAIN |----------| | | null |<-+ |__________|The access link in the new AR for Q should be the same as the access link in the current AR for Q; namely, it should be a pointer to the AR for MAIN. This value is found by starting with the value of Q's access link (a pointer to MAIN's AR) and following zero links (because X = Y = 1).
Trace the execution of the example program again. Each time a procedure is called, determine which case applies in terms of how to set up the called procedure's access link. Then use the appropriate algorithm to find the value of the new access link, and draw the new AR with its access link.
Each time a non-local variable x is accessed, make sure that you find its activation record by following i - j access links (where i is the nesting level of the procedure that uses x and j is the nesting level of the procedure that declares x).
To use an access link:
-
Follow i - j links to find the AR with space for non-local x,
where i is the nesting level of the procedure that uses x and j is the
nesting level of the procedure that declares x.
The motivation for using a display is to avoid the runtime overhead of following multiple access links to find the activation record that contains a non-local variable. The idea is to maintain a global "array" called the display. Ideally, the display is actually implemented using registers (one for each array element) rather than an actual array; however, it can also be implemented using an array in the static data area.
The size of the display is the maximum nesting depth of a procedure in the program (which is known at compile time). The display is used as follows:
+-- |int x; | | +-- | | | P|...x... | | | +-- | MAIN | +-- | | int y; | | | | +-- | | |...x...y... | Q| R|call R | | |call P | | +-- | | | | call R | | call P | | if (...) call Q | +-- | | call Q +--Below are two pictures comparing the use of access links with the use of a display. Both show the same moment at runtime.
USING ACCESS LINKS USING A DISPLAY ____________ ---------- P | |------------+ P | |<--+ |==========| | |========| | +--+ R | |----------+ | R | |<------| | [2] |==========| | | |========| | +--+ y:| | | | y:| | +---| | [1] Q |----------| | | Q |--------| +--+ | |------+ <-+ | | | +---| | [0] |==========| | | |========| | +--+ x:| | | | x:| | | DISPLAY MAIN |----------| | | MAIN |--------| | | null | <-+ <--+ | |<--+ |__________| ---------- STACK STACK
To maintain the display, a new field (called the "save-display" field) is added to each activation record. The display is maintained as follows:
Before R calls P: ---------------- ________ ____________ [2] | |--\ R | | |______| -------------->| | [1] | |--\ |==========| |______| \ y:| | [0] | |--\ \ Q |----------| |______| \ ------------>| | \ |==========| \ x:| | \ MAIN |----------| ---------->| | |==========| After R calls P: --------------- ________ [2] | |--\ ------------ |______| \ P | |---+ [1] | |----\------------>| | | |______| \ |==========| | [0] | |--\ \ R | | | |______| \ ---------->| | | \ |==========| | \ x:| | | \ Q |----------| | \ | |<--+ \ |==========| \ y:| | \ MAIN |----------| ------>| | |----------|
Trace the execution of the running example program, assuming that a display is used instead of access links. Each time a non-local variable x is accessed, make sure that you understand how to find its AR using the display.
Comparison: Access Links vs The Display
Recall that under dynamic scoping, a use of a non-local variable corresponds to the declaration in the "most recently called, still active" method. So the question of which non-local variable to use can't be determined at compile time. It can only be determined at run-time. There are two ways to implement access to non-locals under dynamic scope: "deep access" and "shallow access", described below.
Using this approach, given a use of a non-local variable, control links are used to search back in the stack for the most recent AR that contains space for that variable. Note that this requires that it be possible to tell which variables are stored in each AR; this is more natural for languages that are interpreted rather than being compiled (which was indeed the case for languages that used dynamic scope). Note also that the number of control links that must be followed cannot be determined at compile time; in fact, a different number of links may be followed at different times during execution, as illustrated by the following example program:
void P() { write x; } void Q() { x = x + 1; if (x < 23) Q(); else P(); } void R() { int x = 20; Q(); P(): } void main() { int x = 10; R(); P(); }
Trace the execution of the program given above. Note that method P includes a use of non-local variable x. How many control links must be followed to find the AR with space for x each time P is called?
Using this approach, space is allocated (in registers or in the static data area) for every variable name that is in the program (i.e., one space for variable x even if there are several declarations of x in different methods). For every reference to x, the generated code refers to the same location.
When a method is called, it saves, in its own AR, the current values of all of the variables that it declares itself (i.e., if it declares x and y, then it saves the values of x and y that are currently in the space for x and y). It restores those values when it finishes.
Note that this means that when a method accesses a non-local variable x, the value of x from the most-recently-called, still-active method is stored in the (single) location for x. There is no need to go searching down the stack!
Question 1: Trace the execution of the program given above again, this time assuming that shallow access is used.
Question 2: What are the advantages and disadvantages of shallow access compared with deep access? (Consider both time and space requirements.)