Lecture slides for the Compilers course, Chapter 3: Syntax Analysis

Syntax Analysis
Quan Thanh Tho (qttho@)
Nguyen Hua Phung (phung@)
cse.hcmut.edu.vn
Objectives
• Context-free grammars for specifying
programming language syntax
• Parsing methods typically used in
compilers
• Error recovery from commonly occurring
syntax errors
CSE-HCMUT Syntax Analysis I 2
Outline
• The role of syntax analysis (parser)
• Language syntax specification
• Parsing Techniques
• Error Recovery
The role of syntax analysis
• Receive tokens from lexical analyzer
• Verify whether the received tokens conform to the
language grammar
• Generate a parsing representation (usually a
parse tree)
• Handle syntax errors (report and recover)
[Figure: the Syntax Analyzer sends "get next token" requests to the Lexical Analyzer, which returns one token per request]
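The request/response loop between the two phases can be sketched as follows. This is a minimal illustration, not the course's implementation: the class names `Lexer` and `Parser`, the method `get_next_token`, and the whitespace-based tokenization are all assumptions made for the example.

```python
from typing import Iterator, List, Optional


class Lexer:
    """Toy lexical analyzer (hypothetical): splits the input on whitespace."""

    def __init__(self, text: str) -> None:
        self._tokens: Iterator[str] = iter(text.split())

    def get_next_token(self) -> Optional[str]:
        # Return the next token, or None when the input is exhausted.
        return next(self._tokens, None)


class Parser:
    """Toy syntax analyzer (hypothetical): pulls tokens on demand."""

    def __init__(self, lexer: Lexer) -> None:
        self.lexer = lexer

    def collect(self) -> List[str]:
        # A real parser would check the grammar here; this sketch only
        # shows the "get next token" loop between the two phases.
        received: List[str] = []
        while True:
            token = self.lexer.get_next_token()
            if token is None:
                break
            received.append(token)
        return received


tokens = Parser(Lexer("id + id * id")).collect()
print(tokens)
```

The parser drives the loop: the lexer produces a token only when asked, so the two phases never need the whole token sequence in memory at once.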
Outline
• The role of syntax analysis (parser)
• Language syntax specification
  – Syntax and Grammar
  – Context-free Grammar
    • Derivation
    • Parse Tree
  – Grammar Construction for Programming Languages:
    • Language construct definition
    • Operator precedence and associativity
    • Ambiguity
• Parsing Techniques
• Error Recovery
Syntax and Grammar
• Syntax (in the programming language sense):
– Defines the structure of a program
– Does not reflect the meaning (semantics) of the
program
• Grammar:
– Rule-based formalism for specifying a language's
syntax
Why Grammar?
• Capable of specifying language syntax
precisely
• The rule-based representation supported by
a grammar is natural and easy for humans
to understand
• Effectively supports language modification
and extension
• Provides a foundation for developing
parsers systematically
Context-Free Grammar (CFG)
• A kind of grammar
• Not as complex as context-sensitive and
phrase-structure grammars
• More powerful than regular grammars
Formal Definition of CFG
• G = (V_N, V_T, S, P)
• V_N: finite set of nonterminal symbols
  V_T: finite set of tokens (V_T ∩ V_N = ∅)
  S ∈ V_N: start symbol
  P: finite set of rules (or productions) in
  BNF (Backus-Naur Form) of the form A → (a)*,
  where A ∈ V_N and a ∈ (V_T ∪ V_N)
Example 1
• G = ({exp, op}, {+, -, *, /, id}, exp, P) where P is
the following:
exp → exp op exp
exp → id
op → + | - | * | /
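The four components of this grammar can be written down directly as data. The sketch below is one possible encoding, not a standard one: the `Grammar` type, its field names, and the helper `is_well_formed` are hypothetical names chosen for illustration.

```python
from typing import Dict, List, NamedTuple, Set, Tuple


class Grammar(NamedTuple):
    nonterminals: Set[str]                         # V_N
    tokens: Set[str]                               # V_T
    start: str                                     # S
    productions: Dict[str, List[Tuple[str, ...]]]  # P: A -> right-hand sides


# The grammar of Example 1.
G = Grammar(
    nonterminals={"exp", "op"},
    tokens={"+", "-", "*", "/", "id"},
    start="exp",
    productions={
        "exp": [("exp", "op", "exp"), ("id",)],
        "op": [("+",), ("-",), ("*",), ("/",)],
    },
)


def is_well_formed(g: Grammar) -> bool:
    """Check the side conditions stated in the formal definition."""
    symbols = g.nonterminals | g.tokens
    return (
        g.nonterminals.isdisjoint(g.tokens)    # V_T and V_N are disjoint
        and g.start in g.nonterminals          # S is a nonterminal
        and all(
            lhs in g.nonterminals and all(s in symbols for s in rhs)
            for lhs, alternatives in g.productions.items()
            for rhs in alternatives
        )
    )


print(is_well_formed(G))
```

Each alternative separated by `|` in the rule for `op` becomes its own right-hand side in the `productions` table.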
Derivation
• α = uXv derives β = uγv if X → γ is a production
Notation: α ⇒ β (α directly derives β)
α ⇒* β (α ⇒ … ⇒ β, or α = β)
α ⇒+ β (α ⇒* γ and γ ⇒ β)
Derivation: S ⇒+ α where α consists of tokens only.
Sentential form: S ⇒* α ⇔ α is a sentential form
Sentence: S ⇒+ α is a derivation ⇔ α is a sentence
Language: the set of all sentences that can be derived
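A single derivation step can be sketched in code as follows, using the grammar of Example 1. The names `PRODUCTIONS`, `direct_derivations`, and `is_sentence` are hypothetical, and sentential forms are represented as tuples of symbols purely for convenience.

```python
from typing import List, Tuple

Form = Tuple[str, ...]

# Productions of Example 1 as (left-hand side, right-hand side) pairs.
PRODUCTIONS: List[Tuple[str, Form]] = [
    ("exp", ("exp", "op", "exp")),
    ("exp", ("id",)),
    ("op", ("+",)),
    ("op", ("-",)),
    ("op", ("*",)),
    ("op", ("/",)),
]
NONTERMINALS = {"exp", "op"}


def direct_derivations(form: Form) -> List[Form]:
    """All forms reachable in one step: rewrite one X in uXv using X -> gamma."""
    results: List[Form] = []
    for i, symbol in enumerate(form):
        for lhs, rhs in PRODUCTIONS:
            if symbol == lhs:
                results.append(form[:i] + rhs + form[i + 1:])  # u gamma v
    return results


def is_sentence(form: Form) -> bool:
    # A sentence is a sentential form that consists of tokens only.
    return all(symbol not in NONTERMINALS for symbol in form)


successors = direct_derivations(("exp", "op", "exp"))
print(("id", "op", "exp") in successors)
print(is_sentence(("id", "+", "id")))
```

Applying `direct_derivations` repeatedly, starting from the start symbol, enumerates the sentential forms; those for which `is_sentence` holds belong to the language.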
Example 2
• exp ⇒ exp op exp
      ⇒ exp op id
      ⇒ id op id
      ⇒ id + id
Example 2 (cont’d)
• exp ⇒ exp op exp
      ⇒ id op exp
      ⇒ id + exp
      ⇒ id + id
Example 2 (cont’d)
• exp ⇒ exp op exp
      ⇒ exp op exp op exp
      ⇒ id op exp op exp
      ⇒ id + exp op exp
      ⇒ id + exp * exp
      ⇒ id + id * exp
      ⇒ id + id * id

Leftmost/Rightmost Derivation
• There may be many derivations for a
given sentence
• Leftmost derivation: each generated
sentential form is further derived by
replacing its leftmost nonterminal
• Rightmost derivation: each generated
sentential form is further derived by
replacing its rightmost nonterminal
Example 3 – Leftmost Derivation
• exp ⇒ exp op exp
      ⇒ id op exp
      ⇒ id + exp
      ⇒ id + id
Example 3 – Rightmost Derivation
• exp ⇒ exp op exp
      ⇒ exp op id
      ⇒ exp + id
      ⇒ id + id
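The two derivations of Example 3 can be replayed mechanically. In the sketch below, the function `derive` and its `leftmost` flag are hypothetical names; the caller supplies the right-hand side to use at each step, and the function only decides which nonterminal gets replaced.

```python
from typing import Dict, List, Sequence, Tuple

Form = Tuple[str, ...]

# Productions of Example 1, indexed by left-hand side.
PRODUCTIONS: Dict[str, List[Form]] = {
    "exp": [("exp", "op", "exp"), ("id",)],
    "op": [("+",), ("-",), ("*",), ("/",)],
}


def derive(start: str, choices: Sequence[Form], leftmost: bool = True) -> List[Form]:
    """Apply each chosen right-hand side to the leftmost (or rightmost) nonterminal."""
    form: Form = (start,)
    steps = [form]
    for rhs in choices:
        positions = [i for i, s in enumerate(form) if s in PRODUCTIONS]
        if not positions:
            raise ValueError("no nonterminal left to replace")
        i = positions[0] if leftmost else positions[-1]
        if tuple(rhs) not in PRODUCTIONS[form[i]]:
            raise ValueError(f"{form[i]} -> {' '.join(rhs)} is not a production")
        form = form[:i] + tuple(rhs) + form[i + 1:]
        steps.append(form)
    return steps


choices = [("exp", "op", "exp"), ("id",), ("+",), ("id",)]
left = derive("exp", choices, leftmost=True)
right = derive("exp", choices, leftmost=False)
for name, steps in (("leftmost", left), ("rightmost", right)):
    print(name + ":", " => ".join(" ".join(form) for form in steps))
```

For this particular sentence the same list of choices works in both directions; in general the order of choices differs between a leftmost and a rightmost derivation.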
Hands-on Exercise
• Find the leftmost derivation and rightmost
derivation of id+id*id+id
Parsing
• Verify whether the sequence of tokens generated
by the lexical analyzer is grammatically
legal
• Carried out by finding a derivation
corresponding to the sequence
• Represent the derivation as a computer-
understandable structure for further
analysis
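To make "finding a derivation" concrete, here is a brute-force sketch that searches breadth-first for a leftmost derivation of a token sequence in the grammar of Example 1. It is only an illustration of the idea, not one of the parsing techniques used in real compilers; the name `find_derivation` is hypothetical, and the pruning relies on the assumption that no production has an empty right-hand side.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Form = Tuple[str, ...]

PRODUCTIONS: Dict[str, List[Form]] = {
    "exp": [("exp", "op", "exp"), ("id",)],
    "op": [("+",), ("-",), ("*",), ("/",)],
}


def find_derivation(tokens: Form, start: str = "exp") -> Optional[List[Form]]:
    """Breadth-first search for a leftmost derivation of `tokens` from `start`."""
    queue = deque([[(start,)]])
    seen = {(start,)}
    while queue:
        path = queue.popleft()
        form = path[-1]
        if form == tokens:
            return path
        # Position of the leftmost nonterminal, if any.
        i = next((k for k, s in enumerate(form) if s in PRODUCTIONS), None)
        if i is None:
            continue  # tokens only, but not the input
        if form[:i] != tokens[:i]:
            continue  # the token prefix already disagrees with the input
        for rhs in PRODUCTIONS[form[i]]:
            new_form = form[:i] + rhs + form[i + 1:]
            # No production is empty, so forms never shrink: drop long ones.
            if len(new_form) <= len(tokens) and new_form not in seen:
                seen.add(new_form)
                queue.append(path + [new_form])
    return None  # no derivation exists: the input is not a sentence


result = find_derivation(("id", "+", "id"))
print(result)
print(find_derivation(("id", "+")))
```

A returned list is the derivation itself, which is exactly the structure a parser then turns into a tree; `None` corresponds to reporting a syntax error.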
Parse Tree
• Tree-based structure representing a
derivation
– Root node ⇔ start symbol
– Interior node ⇔ nonterminal symbol
– Leaf node ⇔ token or nonterminal
– The children of a node, read from left to right, form the
right-hand side of a production whose left-hand
side is that node.
– The parse tree is constructed by following the
steps of the derivation in order
Example 4
• exp
Parse tree:
      exp
Example 4 (cont’d)
• exp ⇒ exp op exp
Parse tree:
      exp
    /  |  \
 exp   op  exp
Example 4 (cont’d)
• exp ⇒ exp op exp ⇒ id op exp
Parse tree:
      exp
    /  |  \
 exp   op  exp
  |
 id
Example 4 (cont’d)
• exp ⇒ exp op exp ⇒ id op exp ⇒ id + exp
Parse tree:
      exp
    /  |  \
 exp   op  exp
  |    |
 id    +
Example 4 (cont’d)
• exp ⇒ exp op exp ⇒ id op exp ⇒ id + exp ⇒ id + id
Parse tree:
      exp
    /  |  \
 exp   op  exp
  |    |    |
 id    +   id
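The step-by-step construction in Example 4 can be replayed in code. The sketch below grows a tree by expanding the leftmost unexpanded nonterminal at each step and records the leaves after every step, so that each recorded frontier matches one sentential form of the derivation. The names `Node`, `frontier`, `leftmost_open`, and `build_tree` are hypothetical, and the code assumes the grammar has no empty productions.

```python
from typing import List, Optional, Sequence, Tuple

NONTERMINALS = {"exp", "op"}  # nonterminals of Example 1


class Node:
    """One parse tree node: a grammar symbol plus its ordered children."""

    def __init__(self, symbol: str) -> None:
        self.symbol = symbol
        self.children: List["Node"] = []


def frontier(node: Node) -> List[str]:
    """Symbols of the leaves, read from left to right."""
    if not node.children:
        return [node.symbol]
    return [leaf for child in node.children for leaf in frontier(child)]


def leftmost_open(node: Node) -> Optional[Node]:
    """Leftmost leaf that is still a nonterminal, or None if there is none."""
    if not node.children:
        return node if node.symbol in NONTERMINALS else None
    for child in node.children:
        found = leftmost_open(child)
        if found is not None:
            return found
    return None


def build_tree(start: str, choices: Sequence[Tuple[str, ...]]):
    """Replay a leftmost derivation; return the tree and the frontier after each step."""
    root = Node(start)
    frontiers = [frontier(root)]
    for rhs in choices:
        node = leftmost_open(root)
        if node is None:
            raise ValueError("no nonterminal left to expand")
        node.children = [Node(symbol) for symbol in rhs]
        frontiers.append(frontier(root))
    return root, frontiers


tree, steps = build_tree("exp", [("exp", "op", "exp"), ("id",), ("+",), ("id",)])
for leaves in steps:
    print(" ".join(leaves))
```

Each printed line corresponds to one of the partial trees above: the leaves are a mix of tokens and nonterminals until the derivation finishes, at which point they spell out the sentence.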