Three Address Code in Compiler Design
Three Address Code
Three address code is an intermediate code used by optimizing compilers whereby a given expression is broken down into several separate instructions which can be easily translated into assembly language.
A statement involving no more than three references(two for operands and one for result) is known as three address statement. A sequence of three address statements is known as three address code. Three address statement is of the form x = y op z , here x, y, z will have address (memory location). Sometimes a statement might contain less than three references but it is still called three address statement.
For example a source language expression a+b*c is translated into a sequence of three-address instruction as follows,
t1 = b * c
t2 = a + t1
here t1 and t2 are compiler generated temporary names.
Three address code presents multi-operator arithmetic expressions and nested flow-of-control statements which makes it useful for generating and optimizing target code.
We can also view three-address code as a linearized representation of syntax to a directed acyclic graph(DAG) whereby names correspond to the interior nodes of the DAG as shown below.
An example DAG for the expression: a + a * ( b – c ) + ( b - c ) * d
General Form
Here,
- a, b and c are the operands.
- Operands may be constants, names, or compiler generated temporaries.
- op represents the operator.
Example-
- a = b + c
- c = a * b
Addresses and instructions
- Three-address code involves addresses and instructions.
- In object-oriented terms addresses will correspond to classes while instructions correspond to the appropriate subclasses.
- An address can either be a name, constant or a compiler-generated temporary. In a sequence of instructions symbolic labels will represent the index of a three-address instruction.
- These symbolic labels are used by instructions which alter the flow of control. Labels can be used in place of actual indexes by making a separate pass or by backpatching.
Forms of three-address instructions
1. Assignment Statement
- Assignment in the form of a = b op c where op is a binary arithmetic or logical operation and a, b and c are addresses.
- Assignments in the form a = op b where op is a unary operation. Essential unary operations include unary minus, logical negation and conversion operators.
2. Copy Statement
- x and y are the operands.
- = is an assignment operator.
3. Conditional Jump
- x & y are the operands.
- X is the tag or label of the target statement.
- relop is a relational operator.
- The control is sent directly to the location specified by label X.
- All the statements in between are skipped.
- The control is not sent to the location specified by label X.
- The next statement appearing in the usual sequence is executed.
4. Unconditional Jump
- The control is sent directly to the location specified by label X.
- All the statements in between are skipped.
5. Procedure Call
Implementation of Three Address Code
- Quadruple
- Triples
1. Quadruple
Example: See the below expression:
a = b * - c + b * - c
Here, we have used the special operator uminus to differ the unary minus operator from the binary minus operator.
Three address code for the given statements are:
t1 = uminus ct2 = b * t1
t3 = uminus c
t4 = b * t3
t5 = t2 + t4
a = t5