Describing a Programming Language
Chapter 3
Rearranged by 14203100

Language and Sentence
• A language is a set of strings of characters from some alphabet.
• The strings of a language are called sentences or statements.
• Sentences are built from small units called lexemes, which include numeric literals, operators and special words.
• Each group of lexemes is represented by a name, or token.

Chapter 3: Describing a Programming Language

Syntax and Semantics
• Syntax: the set of rules for grammar and spelling that specify which sequences of symbols form a correctly structured program.
• Semantics: the meaning of the expressions, statements and program units of a programming language.

Regular Expression
• Each regular expression (RE) corresponds to a regular language.
• The regular expressions over Σ are the smallest set of expressions including
  ε
  'c'    where c ∈ Σ
  A+B    where A, B are RE over Σ
  AB     where A, B are RE over Σ
  A*     where A is a RE over Σ

Backus-Naur Form (BNF)
• BNF is a metalanguage that is used to describe programming languages.
• BNF uses abstractions for syntactic structures. For example, a Java assignment statement might be represented by the abstraction <assign>, and the definition of <assign> can be given by
  <assign> → <var> = <expression>
• The abstractions in a BNF grammar are often called non-terminal symbols, or simply non-terminals, and the lexemes and tokens of the rules are called terminal symbols, or simply terminals.
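The lexeme and token distinction above can be illustrated with a small tokenizer. This is only a sketch: the token names (identifier, int_literal, and so on) and the function name tokenize are assumptions chosen for illustration, and each token is described by a regular expression from Python's standard re module.

```python
import re

# Hypothetical token names; each pattern is a regular expression
# describing one group of lexemes.
TOKEN_SPEC = [
    ("int_literal", r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("assign_op", r"="),
    ("plus_op", r"\+"),
    ("mult_op", r"\*"),
    ("semicolon", r";"),
    ("skip", r"\s+"),  # whitespace separates lexemes and is discarded
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Split text into (token, lexeme) pairs, left to right."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "skip":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for token, lexeme in tokenize("index = 2 * count + 17;"):
    print(f"{lexeme:8} {token}")
```

Here index and count are two different lexemes that share the single token identifier, which is the sense in which a token names a group of lexemes.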
• A grammar is a finite nonempty set of rules.
• A rule has a left-hand side (LHS), which is a single non-terminal, and a right-hand side (RHS), which can contain more than one symbol.
  <program> → <stmts>
  <stmts> → <stmt> | <stmt> ; <stmts>
  <stmt> → <var> = <expr>
  <var> → a | b | c | d
  <expr> → <term> + <term> | <term> - <term>
  <term> → <var> | const

BNF Rules
• A rule is recursive if its LHS appears in its RHS.
• The following rules illustrate how recursion is used to describe lists:
  <ident_list> → identifier | identifier, <ident_list>

Derivation
• A derivation is the process of generating a sentence by repeated application of rules, starting from the start symbol.
• A leftmost derivation is one in which the leftmost non-terminal in each sentential form is the one that is expanded.
  <program> => <stmts>
            => <stmt>
            => <var> = <expr>
            => a = <expr>
            => a = <term> + <term>
            => a = <var> + <term>
            => a = b + <term>
            => a = b + const

Grammars and Derivations

Parse Tree
• A parse tree is the hierarchical representation of a derivation.
• Every internal node of a parse tree is labeled with a non-terminal symbol.
• Every leaf is labeled with a terminal symbol.
• The parse tree for the derivation of a = b + const is rooted at <program>.
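The link between a parse tree and a leftmost derivation can be sketched in code. The representation below is an assumption made for illustration: an internal node is a (label, children) tuple, a leaf is a plain string, and the names TREE, leaves and leftmost_derivation are hypothetical. Repeatedly replacing the leftmost unexpanded node by its children reproduces the sentential forms of the derivation of a = b + const.

```python
# Parse tree for "a = b + const" under the <program> grammar.
# Internal node: (non-terminal label, list of children); leaf: terminal string.
TREE = ("<program>", [
    ("<stmts>", [
        ("<stmt>", [
            ("<var>", ["a"]),
            "=",
            ("<expr>", [
                ("<term>", [("<var>", ["b"])]),
                "+",
                ("<term>", ["const"]),
            ]),
        ]),
    ]),
])


def leaves(node):
    """Return the terminals at the leaves, read left to right."""
    if isinstance(node, str):
        return [node]
    _, children = node
    result = []
    for child in children:
        result.extend(leaves(child))
    return result


def render(form):
    """Show a sentential form: labels for unexpanded nodes, text for terminals."""
    return " ".join(n if isinstance(n, str) else n[0] for n in form)


def leftmost_derivation(tree):
    """Expand the leftmost non-terminal at each step and record every form."""
    form = [tree]
    steps = [render(form)]
    while True:
        index = next((i for i, n in enumerate(form) if not isinstance(n, str)), None)
        if index is None:
            return steps
        form[index:index + 1] = form[index][1]
        steps.append(render(form))


for step in leftmost_derivation(TREE):
    print("=>", step)
```

The leaves of the tree spell out the sentence, and each printed line corresponds to one rule application in the leftmost derivation.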
Parse tree for a = b + const:
  <program>
    <stmts>
      <stmt>
        <var> a
        =
        <expr>
          <term>
            <var> b
          +
          <term> const

Parse Tree of A=B*(A+C)

Show a parse tree and a leftmost derivation for A := A * (B + (C * A))
  <assign> => <id> := <expr>
           => A := <expr>
           => A := <id> * <expr>
           => A := A * <expr>
           => A := A * ( <expr> )
           => A := A * ( <id> + <expr> )
           => A := A * ( B + <expr> )
           => A := A * ( B + ( <expr> ) )
           => A := A * ( B + ( <id> * <expr> ) )
           => A := A * ( B + ( C * <expr> ) )
           => A := A * ( B + ( C * <id> ) )
           => A := A * ( B + ( C * A ) )

A = A + (B + (C * A))
  <assign> => <id> = <expr>
           => A = <expr>
           => A = <id> + <expr>
           => A = A + <expr>
           => A = A + ( <expr> )
           => A = A + ( <id> + <expr> )
           => A = A + ( B + <expr> )
           => A = A + ( B + ( <expr> ) )
           => A = A + ( B + ( <id> * <expr> ) )
           => A = A + ( B + ( C * <expr> ) )
           => A = A + ( B + ( C * <id> ) )
           => A = A + ( B + ( C * A ) )

Ambiguity
• A grammar is ambiguous if and only if it generates a sentential form that has two or more distinct parse trees. For example, the following grammar can generate two distinct parse trees for const - const / const.
  <expr> → <expr> <op> <expr> | const
  <op> → / | -

  Tree 1, grouping (const - const) / const:
  <expr>
    <expr>
      <expr> const
      <op> -
      <expr> const
    <op> /
    <expr> const

  Tree 2, grouping const - (const / const):
  <expr>
    <expr> const
    <op> -
    <expr>
      <expr> const
      <op> /
      <expr> const

Removing Ambiguity
• Associativity: when an expression includes two operators that have the same precedence, for example A / B * C, a semantic rule is required to specify which operator is applied first. This rule is known as associativity.
• Precedence: when an expression includes two different operators, for example x + y * z, the grammar should assign them different levels. An operator that is generated lower in the parse tree is evaluated first, so if the multiplication operator is generated lower in the tree, it has precedence over the addition operator in the expression.
• ambiguous:
  <expr> → <expr> + <expr> | const
• unambiguous:
  <expr> → <expr> + <term> | <term>
  <term> → <term> / const | const

Ambiguous Grammar Example
Prove that the following grammar is ambiguous:
  <S> → <A>
  <A> → <A> + <A> | <id>
  <id> → a | b | c
There are two different parse trees for many expressions. For example, a + b + c can be grouped as (a + b) + c or as a + (b + c).

Unambiguous Grammar for if-then-else
• The direct rule is ambiguous when <stmt> can itself be an <if_stmt>, because an else can then attach to either if:
  <if_stmt> → if <logic_expr> then <stmt>
            | if <logic_expr> then <stmt> else <stmt>
• An unambiguous grammar separates matched from unmatched statements:
  <stmt> → <matched> | <unmatched>
  <matched> → if <logic_expr> then <matched> else <matched>
            | any non-if statement
  <unmatched> → if <logic_expr> then <stmt>
              | if <logic_expr> then <matched> else <unmatched>

Extended BNF
• Optional parts are placed in brackets [ ]
  <proc_call> → ident [(<expr_list>)]
• Alternative parts of RHSs are placed inside parentheses and separated by vertical bars
  <term> → <term> (+|-) const
• Repetitions (0 or more) are placed inside braces { }
  <ident> → letter {letter|digit}

BNF and EBNF
BNF
  <expr> → <expr> + <term> | <expr> - <term> | <term>
  <term> → <term> * <factor> | <term> / <factor> | <factor>
EBNF
  <expr> → <term> {(+ | -) <term>}
  <term> → <factor> {(* | /) <factor>}

Extensions in EBNF
• Three extensions are commonly included in the various versions of EBNF.
• The first extension denotes an optional part of an RHS, which is delimited by brackets.
• The second extension is the use of braces in an RHS to indicate that the enclosed part can be repeated indefinitely or left out altogether.
• The third extension deals with multiple-choice options, which are placed in parentheses and separated by vertical bars.

Some Examples
Consider the grammar given below.
  <pop> ::= [ <bop> , <pop> ] | <bop>
  <bop> ::= <boop> | ( <pop> )
  <boop> ::= x | y | z
(a) What are the nonterminal symbols?
  <pop> <bop> <boop>
(b) What are the terminal symbols?
  [ ] , ( ) x y z
(c) What is the start symbol?
  <pop>
(d) Draw a parse tree for the sentence (x).
(e) Draw a parse tree for the sentence [(x),[y,x]].

Describe, in English, the language defined by the following grammar.
  <S> → <A> <B> <C>
  <A> → a <A> | a
  <B> → b <B> | b
  <C> → c <C> | c
• <A> will generate one or more consecutive a's.
• <B> will generate one or more consecutive b's.
• <C> will generate one or more consecutive c's.
• So <A> <B> <C> will generate one or more a's, followed by one or more b's, followed by one or more c's, e.g. aaaaabbbccccccc.
• <S> is the start symbol for this grammar.

Which of the following sentences are in the language generated by this grammar?
  a. baab
  b. bbbab
  c. bbaaaaa
  d. bbaab
• <A> will generate one or more consecutive b's.
• <B> will generate one or more consecutive a's.
• So <A> a <B> b will generate one or more b's, followed by one a, followed by one or more a's, followed by a b. This is the same as one or more b's, followed by two or more a's, followed by a b.
• This matches a and d.

Consider the following grammar
  <S> → a <S> c <B> | <A> | b
  <A> → c <A> | c
  <B> → d | <A>
Which of the following sentences are in the language generated by this grammar?
  a. abcd
  b. acccbd
  c. acccbcc
  d. acd
  e. accc
• <A> generates one or more c's.
• <B> will generate either one d or a string of one or more c's.
• So <S> will generate a <S> c {d | c's} | c's | b, which matches a and e.

Describing a Programming Language
• Describing Tokens (using regular expressions)
• Describing Syntax (using BNF / CFG)
• Describing Semantics
  • Static Semantics (using attribute grammars)
  • Dynamic Semantics. Possible ways: operational, axiomatic and denotational.
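The last membership exercise above, for the grammar with start rule <S> → a <S> c <B> | <A> | b, can be checked mechanically. The following is a minimal sketch of a backtracking recognizer, assuming one function per non-terminal; the function names and the idea of returning a set of possible end positions are illustrative choices, not something taken from the slides.

```python
def parse_A(s, i):
    """<A> -> c <A> | c : every position reachable after one or more c's."""
    ends = set()
    j = i
    while j < len(s) and s[j] == "c":
        j += 1
        ends.add(j)
    return ends


def parse_B(s, i):
    """<B> -> d | <A> : a single d, or one or more c's."""
    ends = parse_A(s, i)
    if i < len(s) and s[i] == "d":
        ends.add(i + 1)
    return ends


def parse_S(s, i):
    """<S> -> a <S> c <B> | <A> | b : try every alternative."""
    ends = set(parse_A(s, i))
    if i < len(s) and s[i] == "b":
        ends.add(i + 1)
    if i < len(s) and s[i] == "a":
        for j in parse_S(s, i + 1):
            if j < len(s) and s[j] == "c":
                ends |= parse_B(s, j + 1)
    return ends


def in_language(s):
    """A sentence is accepted if <S> can consume the whole string."""
    return len(s) in parse_S(s, 0)


for sentence in ["abcd", "acccbd", "acccbcc", "acd", "accc"]:
    print(sentence, in_language(sentence))
```

Returning all possible end positions, rather than just the first, matters because <A> can stop after any number of c's; a recognizer that greedily consumed every c would wrongly reject accc.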
Static Semantics
• The static semantics of a language covers categories of language rules that are only indirectly related to the meaning of programs during execution.
• Many static semantic rules of a language state its type constraints.
• Static semantics can be described using an attribute grammar, which is an extension of a context-free grammar.

Attributes
• An attribute is a specification that defines a property of an object, element or file. It may also refer to or set the specific value for a given instance. There are two main types of attribute.
• Synthesized attributes: values are computed from those of the children nodes. For a node P with children c1, c2, c3, c4:
  Synthesized of P = f(c1, c2, c3, c4)
• Inherited attributes: values are computed from attributes of the siblings and parent of the node. For a node S4 with parent P and siblings S1, S2, S3:
  Inherited of S4 = f(P, S1, S2, S3)

Example of Attributes
For the production A → D E F:
• b is a synthesized attribute of A if it is computed from the attributes of D, E and F.
• b is an inherited attribute of D if it is computed from the attributes of A, E and F.
• Synthesized attributes are naturally computed bottom-up, and inherited attributes top-down.

The Attributes for the Non-terminals
• Actual_type: a synthesized attribute associated with the non-terminals <var> and <expr>. In the case of an expression, it is determined from the actual types of the child nodes.
• Expected_type: an inherited attribute associated with the non-terminal <expr>, determined by the type of the variable on the left side of the assignment.

Rules for Type Checking
• Let us consider the following attribute grammar.
  <assign> → <var> = <expr>
  <expr> → <var> + <var> | <var>
  <var> → A | B | C
The syntax and static semantics of this assignment statement are as follows:
- The only variable names are A, B and C.
- The right side of the assignment can be either a variable or an expression of a variable added to another variable.
Rules for Type Checking (cont’)
- A variable can be int or real.
- When there are two variables on the right side of an assignment, they need not be the same type.
- When the operand types are not the same, the type of the expression is always real.
- When they are the same, the expression type is that of the operands.
- The type of the left side of the assignment must match the type of the right side.

The look-up function looks up a given variable name in the symbol table and returns the variable’s type.

The flow of Attributes in the Tree

Dynamic Semantics

Operational Semantics
• In operational semantics, certain properties of a program, such as correctness, safety or security, are verified by constructing proofs from logical statements about its execution and procedures. For example, operational semantics was used to describe the semantics of PL/I.
• Operational semantics is classified into two categories.
• Structural operational semantics, or small-step semantics: formally describes how the individual steps of a computation take place in a computer-based system.
• Natural semantics, or big-step semantics: describes how the overall results of the executions are obtained.
• Example of operational semantics.
  C statement:
    for (expr1; expr2; expr3) { ... }
  Meaning:
    expr1;
    loop: if expr2 == 0 goto out
          ...
          expr3;
          goto loop
    out: ...

Axiomatic Semantics
• Axiomatic semantics is an approach based on predicate calculus to proving the correctness of computer programs.
• Axiomatic semantics is expressed through assertions and through axioms or inference rules.
• Assertions (pre- and postconditions): {P} statement {Q}
    {b > 0} a = b + 1 {a > 1}
• Axioms (logical statements that are assumed to be true) or inference rules, for example the rule of consequence:
    {P} S {Q},  P' ⇒ P,  Q ⇒ Q'
    ----------------------------
            {P'} S {Q'}

Denotational Semantics
• Denotational semantics is an approach to formalizing the meaning of programming languages by constructing mathematical objects that describe the meaning of expressions from the languages.
• Denotational semantics is based on recursive function theory and was originally developed by Scott and Strachey in 1970. It is the most abstract semantics description method.
• The state of a program is the values of all its current variables:
    s = {<i1, v1>, <i2, v2>, …, <in, vn>}

Example: We use a very simple language construct, character string representations of binary numbers, to introduce the denotational method. The syntax of such binary numbers can be described by the following grammar rules:
    <bin_num> → '0' | '1' | <bin_num> '0' | <bin_num> '1'

Static vs Dynamic Semantics
• Static semantics is more concerned with the legal forms of programs (syntax rather than semantics) and is only indirectly related to the meaning of programs during execution. Many static semantic rules of a language state its type constraints.
• Dynamic semantics describes the meaning of programs. Programmers need to know precisely what the statements of a language do. Compiler writers typically determine the semantics of a language for which they are writing compilers from English descriptions.
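Returning to the binary-number grammar from the denotational semantics example, the usual semantic function maps each syntactic form to a non-negative integer, with one case per grammar alternative. The sketch below assumes that standard mapping ('0' denotes 0, '1' denotes 1, appending '0' doubles the value, appending '1' doubles it and adds one); the function name m_bin is a hypothetical choice for illustration.

```python
def m_bin(numeral):
    """Map a binary numeral string to the integer it denotes.

    One case per alternative of
    <bin_num> -> '0' | '1' | <bin_num> '0' | <bin_num> '1'
    """
    if numeral == "0":
        return 0
    if numeral == "1":
        return 1
    if len(numeral) > 1 and numeral[-1] == "0":
        # <bin_num> '0': twice the meaning of the prefix
        return 2 * m_bin(numeral[:-1])
    if len(numeral) > 1 and numeral[-1] == "1":
        # <bin_num> '1': twice the meaning of the prefix, plus one
        return 2 * m_bin(numeral[:-1]) + 1
    raise ValueError(f"not a <bin_num>: {numeral!r}")


print(m_bin("110"))
```

The recursion follows the shape of the parse tree: the meaning of a numeral is defined only in terms of the meaning of its immediate sub-numeral, which is the compositional style that denotational definitions aim for.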