Three Address Code in Compiler Design

𝐀𝐝𝐦𝐢𝐧

We will Learn about the concept of three address code in compiler design. Understand its role in translating high-level code into machine code.

Three Address Code

Three address code is an intermediate code used by optimizing compilers whereby a given expression is broken down into several separate instructions which can be easily translated into assembly language.

A statement involving no more than three references(two for operands and one for result) is known as three address statement. A sequence of three address statements is known as three address code. Three address statement is of the form x = y op z , here x, y, z will have address (memory location). Sometimes a statement might contain less than three references but it is still called three address statement.

For example a source language expression a+b*c is translated into a sequence of three-address instruction as follows,

t1 = b * c

t2 = a + t1

here t1 and t2 are compiler generated temporary names.

Three address code presents multi-operator arithmetic expressions and nested flow-of-control statements which makes it useful for generating and optimizing target code.

We can also view three-address code as a linearized representation of syntax to a directed acyclic graph(DAG) whereby names correspond to the interior nodes of the DAG as shown below.

An example DAG for the expression: a + a * ( b – c ) + ( b - c ) * d

Example – The three address code for the expression a + b * c + d :

T 1 = b * c

T 2 = a + T 1

T 3 = T 2 + d

Where T 1 , T 2 , T 3 are temporary variables.

General Form

In general, Three Address instructions are represented as-

a = b op c

Here,

a, b and c are the operands.
Operands may be constants, names, or compiler generated temporaries.
op represents the operator.

Example-

a = b + c
c = a * b

Addresses and instructions

Three-address code involves addresses and instructions.
In object-oriented terms addresses will correspond to classes while instructions correspond to the appropriate subclasses.
An address can either be a name, constant or a compiler-generated temporary. In a sequence of instructions symbolic labels will represent the index of a three-address instruction.
These symbolic labels are used by instructions which alter the flow of control. Labels can be used in place of actual indexes by making a separate pass or by backpatching.

Forms of three-address instructions

The common forms of Three Address instructions are-

1. Assignment Statement

Assignment in the form of a = b op c where op is a binary arithmetic or logical operation and a, b and c are addresses.
Assignments in the form a = op b where op is a unary operation. Essential unary operations include unary minus, logical negation and conversion operators.

It assigns the result obtained after solving the right side expression of the assignment operator to the left side operand.

2. Copy Statement

x = y

Here,

x and y are the operands.
= is an assignment operator.

It copies and assigns the value of operand y to operand x.

3. Conditional Jump

If x relop y goto X

Here,

x & y are the operands.
X is the tag or label of the target statement.
relop is a relational operator.

If the condition x relop y gets satisfied, then-

The control is sent directly to the location specified by label X.
All the statements in between are skipped.

If the condition x relop y fails, then-

The control is not sent to the location specified by label X.
The next statement appearing in the usual sequence is executed.

4. Unconditional Jump

goto X

Here, X is the tag or label of the target statement.

On executing the statement,

The control is sent directly to the location specified by label X.
All the statements in between are skipped.

5. Procedure Call

param x call p return y

Here, p is a function which takes x as a parameter and returns y.

Implementation of Three Address Code

There are 2 main representations of three address code namely

Quadruple
Triples

1. Quadruple

A quadruples consists of four fields such as op, arg1, arg2, and result. The internal code for the operator is stored in the op field. For instance, the three-address instruction a = b + c is depicted by putting + in the op, b in arg1, c in arg2, and a in the result.

Example: See the below expression:

a = b * - c + b * - c

Here, we have used the special operator uminus to differ the unary minus operator from the binary minus operator.

Three address code for the given statements are:

t1 = uminus c
t2 = b * t1
t3 = uminus c
t4 = b * t3
t5 = t2 + t4
a = t5

Quadruple representation for given expression:

Quadruple representation of three address code in compiler design

2. Triples

A triple has only three fields, which are known as op, arg1, and arg2. Result fields in quadruple are used for temporary names. Triples using the position of an operation to store results rather than temporary names. So triple using position (0) rather than temporary t1.

Example – Consider expression: a = b * – c + b * – c

Triple of this expression are shown below.

Triple representation of three address code in compiler design

Conclusion

In short, three address code is an intermediate representation used in compiler design to break down complex expressions into separate instructions. It simplifies the translation process and enables code optimization. By using three addresses for operands and results, it facilitates efficient generation of machine code. Overall, three address code plays a crucial role in improving compiler performance and generating optimized executable programs.

Three Address Code in Compiler Design

Three Address Code

General Form

Addresses and instructions