
Lecture


2.9 Attribute Grammars
An attribute grammar is a device used to describe more of the
structure of a programming language than can be described with a
context-free grammar.
An attribute grammar is an extension to a context-free grammar.

2.9.1 Static Semantics


There are some characteristics of programming languages that are
difficult to describe with BNF, and some that are impossible.
As an example of a syntax rule that is difficult to specify with BNF,
consider type compatibility rules.
In Java, for example, a floating-point value cannot be assigned to an
integer type variable, although the opposite is legal. Although this
restriction can be specified in BNF, it requires additional
nonterminal symbols and rules. If all of the typing rules of Java were
specified in BNF, the grammar would become too large to be useful,
because the size of the grammar determines the size of the syntax
analyzer.
As an example of a syntax rule that cannot be specified in BNF,
consider the common rule that all variables must be declared before
they are referenced.
These problems exemplify the categories of language rules called
static semantics rules.
The static semantics of a language is only indirectly related to the
meaning of programs during execution; rather, it has to do with the
legal forms of programs (syntax rather than semantics).
Many static semantic rules of a language state its type constraints.
Static semantics is so named because the analysis required to check
these specifications can be done at compile time.
Because of the problems of describing static semantics with BNF, a
variety of more powerful mechanisms has been devised for that task.
One such mechanism, attribute grammars, was designed to describe
both the syntax and the static semantics of programs.
Attribute grammars are a formal approach both to describing and
checking the correctness of the static semantics rules of a program.
Attribute grammars are context-free grammars to which have been
added attributes, attribute computation functions, and predicate
functions.
• Attributes, which are associated with grammar symbols (the
terminal and nonterminal symbols), are similar to variables in
the sense that they can have values assigned to them.
• Attribute computation functions, sometimes called semantic
functions, are associated with grammar rules. They are used to
specify how attribute values are computed.
• Predicate functions, which state the static semantic rules of
the language, are associated with grammar rules.

An attribute grammar is a grammar with the following additional
features:
• Associated with each grammar symbol X is a set of attributes
A(X). The set A(X) consists of two disjoint sets S(X) and
I(X), called synthesized and inherited attributes, respectively.
o Synthesized attributes are used to pass semantic
information up a parse tree, while
o inherited attributes pass semantic information down
and across a tree.
• Associated with each grammar rule is a set of semantic
functions and a possibly empty set of predicate functions over
the attributes of the symbols in the grammar rule. For a rule
X0 → X1 … Xn, the synthesized attributes of X0 are computed
with semantic functions of the form
S(X0) = f(A(X1), …, A(Xn)).
• A predicate function has the form of a Boolean expression on
the union of the attribute set {A(X0), …, A(Xn)} and a set of
literal attribute values. The only derivations allowed with an
attribute grammar are those in which every predicate
associated with every nonterminal is true. A false predicate
function value indicates a violation of the syntax or static
semantics rules of the language.

2.9.2 Examples of Attribute Grammars


The following fragment of an attribute grammar describes the rule
that the name at the end of an Ada procedure must match the
procedure’s name. (This rule cannot be stated in BNF.)
The string attribute of <proc_name>, denoted by
<proc_name>.string, is the actual string of characters that were
found immediately following the reserved word procedure by the
compiler.
Notice that when there is more than one occurrence of a
nonterminal in a syntax rule in an attribute grammar, the
nonterminals are subscripted with brackets to distinguish them.
Neither the subscripts nor the brackets are part of the described
language.
Syntax rule: <proc_def> → procedure <proc_name>[1]
             <proc_body> end <proc_name>[2];
Predicate: <proc_name>[1].string == <proc_name>[2].string
In this example, the predicate rule states that the name string
attribute of the <proc_name> nonterminal in the subprogram header
must match the name string attribute of the <proc_name>
nonterminal following the end of the subprogram.
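The predicate above can be sketched in Python as a check on an already-built <proc_def> node. This is only an illustration under assumptions: the class names ProcName and ProcDef, the field names, and the sample procedure names are hypothetical and do not come from the lecture.

```python
from dataclasses import dataclass


@dataclass
class ProcName:
    """A <proc_name> node; `string` is its only attribute."""
    string: str


@dataclass
class ProcDef:
    """A <proc_def> node: procedure <proc_name>[1] <proc_body> end <proc_name>[2];"""
    header_name: ProcName  # <proc_name>[1]
    body: str              # <proc_body>, kept opaque in this sketch
    end_name: ProcName     # <proc_name>[2]


def proc_def_predicate(node: ProcDef) -> bool:
    """Predicate: <proc_name>[1].string == <proc_name>[2].string"""
    return node.header_name.string == node.end_name.string


# Hypothetical sample nodes.
ok = ProcDef(ProcName("swap"), "...", ProcName("swap"))
bad = ProcDef(ProcName("swap"), "...", ProcName("sort"))
print(proc_def_predicate(ok))   # True
print(proc_def_predicate(bad))  # False
```

A derivation containing the second node would be rejected, because its predicate is false.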
Next, we consider a larger example of an attribute grammar. In this
case, the example illustrates how an attribute grammar can be used
to check the type rules of a simple assignment statement. The syntax
and static semantics of this assignment statement are as follows:
• The only variable names are A, B, and C.
• The right side of the assignments can be either a variable or
an expression in the form of a variable added to another
variable.
• The variables can be one of two types: int or real. When there
are two variables on the right side of an assignment, they need
not be the same type.
• The type of the expression when the operand types are not the
same is always real. When they are the same, the expression
type is that of the operands.
• The type of the left side of the assignment must match the
type of the right side. So the types of operands in the right
side can be mixed, but the assignment is valid only if the
target and the value resulting from evaluating the right side
have the same type.
The attribute grammar specifies these static semantic rules.
The syntax portion of our example attribute grammar is
<assign> → <var> = <expr>
<expr> → <var> + <var>
| <var>
<var> → A | B | C
The attributes for the nonterminals in the example attribute grammar
are described in the following paragraphs:
• actual_type: A synthesized attribute associated with the
nonterminals <var> and <expr>. It is used to store the actual
type, int or real, of a variable or expression. In the case of a
variable, the actual type is intrinsic. In the case of an
expression, it is determined from the actual types of the child
node or children nodes of the <expr> nonterminal.
• expected_type: An inherited attribute associated with the
nonterminal <expr>. It is used to store the type, either int or
real, that is expected for the expression, as determined by the
type of the variable on the left side of the assignment
statement.
The complete attribute grammar follows in Example 2.6.
Example 2.6
An Attribute Grammar for Simple Assignment Statements
1. Syntax rule: <assign> → <var> = <expr>
   Semantic rule: <expr>.expected_type ← <var>.actual_type
2. Syntax rule: <expr> → <var>[2] + <var>[3]
   Semantic rule: <expr>.actual_type ←
       if (<var>[2].actual_type = int) and
          (<var>[3].actual_type = int)
       then int
       else real
       end if
   Predicate: <expr>.actual_type == <expr>.expected_type
3. Syntax rule: <expr> → <var>
   Semantic rule: <expr>.actual_type ← <var>.actual_type
   Predicate: <expr>.actual_type == <expr>.expected_type
4. Syntax rule: <var> → A | B | C
   Semantic rule: <var>.actual_type ← look-up(<var>.string)
The look-up function looks up a given variable name in the symbol
table and returns the variable’s type.
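To make the flow of attribute values concrete, the four rules of Example 2.6 can be sketched as a small Python function. This is a minimal sketch, not a general attribute evaluator: the names SYMBOL_TABLE, look_up, and check_assign are hypothetical, and the declared types of A, B, and C are assumed here, since the lecture does not state them.

```python
# Hypothetical symbol table; the declared types are an assumption.
SYMBOL_TABLE = {"A": "real", "B": "int", "C": "int"}


def look_up(name):
    """Rule 4: return the declared type of a variable name."""
    return SYMBOL_TABLE[name]


def check_assign(target, operands):
    """Evaluate the attributes for `target = operands[0] [+ operands[1]]`.

    Returns (actual_type, expected_type, predicate_value) for <expr>.
    """
    if len(operands) not in (1, 2):
        raise ValueError("<expr> is either <var> or <var> + <var>")
    # Rule 4: the actual_type of each <var> is intrinsic (symbol table).
    var_types = [look_up(v) for v in operands]
    # Rule 1: <expr>.expected_type is inherited from the left-side <var>.
    expected_type = look_up(target)
    # Rules 2 and 3: <expr>.actual_type is synthesized from its children.
    if len(operands) == 2:
        actual_type = "int" if var_types == ["int", "int"] else "real"
    else:
        actual_type = var_types[0]
    # Predicate of rules 2 and 3.
    return actual_type, expected_type, actual_type == expected_type


print(check_assign("A", ["A", "B"]))  # ('real', 'real', True)
print(check_assign("B", ["A", "B"]))  # ('real', 'int', False)
```

With the assumed types, A = A + B satisfies the predicate, while B = A + B does not, because the mixed-type expression is real and the target is int.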

A parse tree of the sentence A = A + B generated by the grammar
in Example 2.6 is shown in Figure 2.6. As in the grammar, bracketed
numbers are added after the repeated node labels in the tree so they
can be referenced unambiguously.
Figure 2.6 A parse tree for A = A + B

2.10 Operational Semantics


The idea behind operational semantics is to describe the meaning
of a statement or program by specifying the effects of running it on a
machine. The effects on the machine are viewed as the sequence of
changes in its state, where the machine’s state is the collection of the
values in its storage. An obvious operational semantics description,
then, is given by executing a compiled version of the program on a
computer.
Most programmers have written a small test program to determine
the meaning of some programming language construct, often while
learning the language. Essentially, what such a programmer is doing
is using operational semantics to determine the meaning of the
construct.
There are several problems with using this approach for complete
formal semantics descriptions.
• First, the individual steps in the execution of machine
language and the resulting changes to the state of the machine
are too small and too numerous.
• Second, the storage of a real computer is too large and
complex.
There are usually several levels of memory devices, as well as
connections to innumerable other computers and memory devices
through networks. Therefore, machine languages and real computers
are not used for formal operational semantics. Rather,
intermediate-level languages and interpreters for idealized
computers are designed specifically for the process.

There are different levels of uses of operational semantics. At the
highest level, the interest is in the final result of the execution of a
complete program. This is sometimes called natural operational
semantics. At the lowest level, operational semantics can be used to
determine the precise meaning of a program through an examination
of the complete sequence of state changes that occur when the
program is executed. This use is sometimes called structural
operational semantics.
The first step in creating an operational semantics description of a
language is to design an appropriate intermediate language, where
the primary desired characteristic of the language is clarity. Every
construct of the intermediate language must have an obvious and
unambiguous meaning. This language is at the intermediate level,
because machine language is too low-level to be easily understood
and another high-level language is obviously not suitable. If the
semantics description is to be used for natural operational semantics,
a virtual machine (an interpreter) must be constructed for the
intermediate language.
The virtual machine can be used to execute either single
statements, code segments, or whole programs. The semantics
description can be used without a virtual machine if the meaning of a
single statement is all that is required. In this use, which is structural
operational semantics, the intermediate code can be visually
inspected.
For example, the semantics of the C for construct can be
described in terms of simpler statements, as in
C Statement                      Meaning
for (expr1; expr2; expr3) {            expr1;
  ...                            loop: if expr2 == 0 goto out
}                                      ...
                                       expr3;
                                       goto loop
                                 out:  ...
The human reader of such a description is the virtual computer and
is assumed to be able to “execute” the instructions in the definition
correctly and recognize the effects of the “execution.”
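The translation of the for construct shown above can be written as a short Python function that emits the goto-based form as text. This is only a sketch: the function name desugar_for and the sample loop are hypothetical, and the labels loop and out are fixed, so nested loops would need fresh label names.

```python
def desugar_for(expr1, expr2, expr3, body):
    """Return the goto-based meaning of `for (expr1; expr2; expr3) { body }`.

    `body` is a list of statement strings, copied through unchanged.
    """
    lines = [
        f"      {expr1};",
        # Parentheses keep the comparison with 0 separate from expr2 itself.
        f"loop: if ({expr2}) == 0 goto out",
    ]
    lines += [f"      {stmt}" for stmt in body]
    lines += [f"      {expr3};", "      goto loop", "out:  ..."]
    return lines


# Hypothetical sample loop.
for line in desugar_for("i = 0", "i < n", "i++", ["sum += i;"]):
    print(line)
```

The output has the same shape as the Meaning column: the initialization runs once, the test guards each iteration, and expr3 runs after the body before control returns to loop.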
The intermediate language and its associated virtual machine used
for formal operational semantics descriptions are often highly
abstract. The intermediate language is meant to be convenient for the
virtual machine, rather than for human readers.
Consider the following list of statements, which would be adequate
for describing the semantics of the simple control statements of a
typical programming language:
ident = var
ident = ident + 1
ident = ident - 1
goto label
if var relop var goto label

In these statements, relop is one of the relational operators from


the set {=,<>, >, <, >=, <=}, ident is an identifier, and var is either
an identifier or a constant. These statements are all simple and
therefore easy to understand and implement. A slight generalization
of these three assignment statements allows more general arithmetic
expressions and assignment statements to be described. The new
statements are
ident = var bin_op var
ident = un_op var
where bin_op is a binary arithmetic operator and un_op is a unary
operator.
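A virtual machine for these statements can be sketched in a few lines of Python. This is a minimal sketch under assumptions, not a definitive design: the function run, the "label:" prefix syntax, the whitespace-separated tokens, the step limit, and the particular operators in BIN_OPS and UN_OPS are all hypothetical choices. Only the statement forms and the relop set come from the text above.

```python
import operator

RELOPS = {"=": operator.eq, "<>": operator.ne, ">": operator.gt,
          "<": operator.lt, ">=": operator.ge, "<=": operator.le}
BIN_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}  # assumed
UN_OPS = {"-": operator.neg}                                         # assumed


def run(source, state=None, max_steps=10_000):
    """Execute the program text and return the final state (a dict)."""
    state = dict(state or {})
    code, labels = [], {}
    for line in source.strip().splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0].endswith(":"):       # leading "label:"
            labels[tokens[0][:-1]] = len(code)
            tokens = tokens[1:]
        code.append(tokens)

    def value(var):
        # var is either an integer constant or an identifier
        return int(var) if var.lstrip("-").isdigit() else state[var]

    pc = steps = 0
    while pc < len(code):
        steps += 1
        if steps > max_steps:
            raise RuntimeError("step limit exceeded")
        t = code[pc]
        pc += 1
        if not t:                         # bare label, nothing to execute
            continue
        if t[0] == "goto":                # goto label
            pc = labels[t[1]]
        elif t[0] == "if":                # if var relop var goto label
            if RELOPS[t[2]](value(t[1]), value(t[3])):
                pc = labels[t[5]]
        elif len(t) == 3:                 # ident = var
            state[t[0]] = value(t[2])
        elif len(t) == 4:                 # ident = un_op var
            state[t[0]] = UN_OPS[t[2]](value(t[3]))
        else:                             # ident = var bin_op var
            state[t[0]] = BIN_OPS[t[3]](value(t[2]), value(t[4]))
    return state


# Hypothetical program: sum the integers 1..n.
PROGRAM = """
i = 1
sum = 0
loop: if i > n goto out
sum = sum + i
i = i + 1
goto loop
out:
"""
print(run(PROGRAM, {"n": 4}))  # {'n': 4, 'i': 5, 'sum': 10}
```

The machine state here is just the dictionary of variable values, which matches the view of state as the collection of values in storage. Recording the dictionary after each step would give the sequence of state changes.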
