 Topic Introduction
 TopDown SDT
 Example: Counting Parentheses
 SelfStudy #1
 Handling NonLL(1) Grammars
 SelfStudy #2
 BottomUp SDT
 Summary
 The semantic stack holds nonterminals' translations; when the parse is finished, it will hold just one value: the translation of the root nonterminal (which is the translation of the whole input).
 Values are pushed onto the semantic stack (and popped off) by
adding actions to the grammar rules. The action for one
rule must:
 Pop the translations of all righthandside nonterminals.
 Compute and push the translation of the lefthandside nonterminal.
 The actions themselves are represented by action numbers, which become part of the righthand sides of the grammar rules. They are pushed onto the (normal) stack along with the terminal and nonterminal symbols. When an action number is the topofstack symbol, it is popped and the action is carried out.
 Pop all righthandside nonterminals' translations from the semantic stack.
 Compute and push the lefthandside nonterminal's translation.
SyntaxDirected Translation (SDT)
Our journey through the syntactic analysis has, thus far, gone as follows: We first showed how to define the parse tree, then explored how to actually build it. Likewise, we can define the AST through a syntaxdirected definition, so we now turn to exploring how to actually build the AST using the two parsing strategies (topdown and bottomup) that we presented. The actual task of producing the AST according to the syntaxdirected definition is known as syntaxdirected translation (SDT).
Now we consider how to implement a syntaxdirected translation using a predictive parser. It is not obvious how to do this, since the predictive parser works by building the parse tree topdown, while the syntaxdirected translation needs to be computed bottomup (since the translations attached to LHS symbol translations depend on the translations attached to RHS symbols). Of course, we could design the parser to actually build the parse tree (topdown), then use the translation rules to build the translation (bottomup). However, that would not be very efficient.
Instead, we avoid explicitly building the parse tree by giving the parser a second stack called the semantic stack:
So what actually happens is that the action for a grammar rule $X \longrightarrow Y_1 \; Y_2 \; \ldots \; Y_n$ is pushed onto the (normal) stack when the derivation step $X \longrightarrow Y_1 Y_2 \ldots Y_n$ is made, but the action is not actually performed until complete derivations for all of the $Y$s have been carried out.
For example, consider the following syntaxdirected translation for the language of balanced parentheses and square brackets. The translation of a string in the language is the number of parenthesis pairs in the string.
CFG  Transition Rules  
$Exp$  $\longrightarrow$  $\varepsilon$  $Exp$.trans = 0  
  ( $Exp$ )  $Exp_1$.trans = $Exp_2$.trans + 1  
  [ $Exp$ ]  $Exp_1$.trans = $Exp_2$.trans 
The first step is to replace the transition rules with translation actions. Each action must:
CFG  Transition Actions  
$Exp$  $\longrightarrow$  $\varepsilon$  push 0;  
  ( $Exp$ )  exp2trans = pop() ; push(exp2trans + 1)  
  [ $Exp$ ]  exp2trans = pop() ; push(exp2trans) 
Next, each action is represented by a unique action number, and those action numbers become part of the grammar rules:
CFG with Embedded Actions  
$Exp$  $\longrightarrow$  $\varepsilon$ #1 
  ( $Exp$ ) #2  
  [ $Exp$ ] #3 
where
#1 is  push 0; 
#2 is  exp2trans = pop() ; push(exp2trans + 1) 
#3 is  exp2trans = pop() ; push(exp2trans) 
Note that since action #3 just pushes exactly what is popped, that action is redundant, and it is not necessary to have any action associated with the third grammar rule. Here's a picture that illustrates what happens when the input "([])" is parsed (assuming that we have removed action #3):
Input so far  Stack  Semantic Stack  Action  
(  $Exp$  pop, push ( exp ) #2  
(  ( $Exp$ ) 2  pop, scan  
([  $Exp$ ) 2  pop, push "[ exp ]"  
([  [ $Exp$ ] ) 2  pop, scan  
([]  $Exp$ ] ) 2  pop, push $\varepsilon$ 1  
([]  1 ] ) 2  pop, do action 1  
([]  ] ) 2  0  pop, scan  
([]  ) 2  0  pop, scan  
([])  2  0  pop, do action 2  
([])  1  pop, scan  
([])  empty stack, accept 
translation of input = 1
In the example above, there is no grammar rule with more than one nonterminal on the righthand side. If there were, the translation action for that rule would have to do one pop for each righthandside nonterminal. For example, suppose we are using a grammar that includes the rule:
CFG Rule: methodBody > { varDecls stmts } Translation Rule: methodBody.trans = varDecls.trans + stmts.trans Translation Action: stmtsTrans = pop(); declsTrans = pop(); push( stmtsTrans + declsTrans ); CFG rule with Action: methodBody > { varDecls stmts } #1 #1: stmtsTrans = pop(); declsTrans = pop(); push( stmtsTrans + declsTrans );
Note that the righthandside nonterminals' translations are popped from the semantic stack righttoleft. That is because the predictive parser does a leftmost derivation, so the varDecls nonterminal gets "expanded" first; i.e., its parse tree is created before the parse tree for the stmts nonterminal. This means that the actions that create the translation of the varDecls nonterminal are performed first, and thus its translation is pushed onto the semantic stack first.
Another issue that has not been illustrated yet arises when a lefthandside nonterminal's translation depends on the value of a righthandside terminal. In that case, it is important to put the action number before that terminal symbol when incorporating actions into grammar rules. This is because a terminal symbol's value is available during the parse only when it is the "current token". For example, if the translation of an arithmetic expression is the value of the expression:
CFG Rule:  Factor $\longrightarrow$ intlit 
Translation Rule:  factor.trans = intlit.value 
Translation Action:  push ( intlit.value ) 
CFG rule with Action:  Factor $\longrightarrow$ #1
intlit // action BEFORE terminal
#1: push( currToken.value )

For the following grammar, give (a) translation rules, (b) translation actions with numbers, and (c) a CFG with action numbers, so that the translation of an input expression is the value of the expression. Do not worry about the fact that the grammar is not LL(1).
Exp  $\longrightarrow$  Exp plus Term 
$$  Exp minus Term  
$$  Term  
Term  $\longrightarrow$  Term times Factor 
$$  Term divide Factor  
$$  Factor  
Factor  $\longrightarrow$  intlit 
$$  lparen Exp rparen 
Handling NonLL(1) Grammars
Recall that a nonLL(1) grammar must be transformed to an equivalent LL(1) grammar if it is to be parsed using a predictive parser. Recall also that the transformed grammar usually does not reflect the underlying structure the way the original grammar did. For example, when left recursion is removed from the grammar for arithmetic expressions, we get grammar rules like this:
CFG  
$Exp$  $\longrightarrow$  $Term$ $Exp'$ 
$Exp'$  $\longrightarrow$  $\varepsilon$ 
  + $Term$ $Exp'$ 
For example:

NonLL(1) Grammar Rules With Actions
$Exp$  $\longrightarrow$  $Exp$ + $Term$ #1 
  $Term$  
$Term$  $\longrightarrow$  $Term$ * $Factor$ #2 
  $\mathit{Factor}$ 
#1 is  TTrans = pop() ; ETrans = pop() ; push(ETrans + TTrans); 
#2 is  FTrans = pop() ; TTrans = pop() ; push(TTrans * FTrans); 
$Exp$  $\longrightarrow$  $Term$ $Exp'$ 
$Exp'$  $\longrightarrow$  + $Term$ #1 $Exp'$ 
  $\varepsilon$  
$\mathit{Term}$  $\longrightarrow$  $\mathit{Factor}$ $\mathit{Term}'$ 
$\mathit{Term}'$  $\longrightarrow$  * $\mathit{Factor}$ 2 $\mathit{Term}'$ 
  $\varepsilon$ 
Transform the grammar rules with actions that you wrote for the "Test Yourself #1" exercise to LL(1) form. Trace the actions of the predictive parser on the input 2 + 3 * 4.
In contrast to topdown SDT, performing SDT bottomup SDT is fairly straightforward, since RHS symbol translations will naturally be available at the time of the RHS symbol translations. Consequently, we will not cover the implementation in very much depth. It is sufficient to simply note that when a single grammar production has been reduced according to the grammar, we can simply fire the corresponding rule from the SDD, referencing the position of the RHS symbol according to it's appearance in the production.
A syntaxdirected translation is used to define the translation of a sequence of tokens to some other value, based on a CFG for the input. A syntaxdirected translation is defined by associating a translation rule with each grammar rule. A translation rule defines the translation of the lefthandside nonterminal as a function of the righthandside nonterminals' translations, and the values of the righthandside terminals. To compute the translation of a string, build the parse tree, and use the translation rules to compute the translation of each nonterminal in the tree, bottomup; the translation of the string is the translation of the root nonterminal.
There is no restriction on the type of a translation; it can be a simple type like an integer, or a complex type list an abstractsyntax tree.
To implement a syntaxdirected translation using a predictive parser, the translation rules are converted to actions that manipulate the parser's semantic stack. Each action must pop all righthandside nonterminals' translations from the semantic stack, then compute and push the lefthandside nonterminal's translation. Next, the actions are incorporated (as action numbers) into the grammar rules. Finally, the grammar is converted to LL(1) form (treating the action numbers just like terminal or nonterminal symbols).