INTRODUCTION TO COMPUTER SCIENCE
HANDOUT #7. AUTOMATA
K5 & K6, Computer Science Department, Văn Lang University
Second semester, Feb 2002
Instructor: Trần Đức Quang
Major themes:
1. Patterns and Pattern Matching
2. Finite State Machines and Automata
3. Deterministic and Nondeterministic Automata
Reading: Sections 10.2 and 10.3.
7.1 PATTERNS AND PATTERN MATCHING
A pattern is a set of objects with some recognizable property. One type of pattern is a
set of character strings, such as the set of legal C identifiers, each of which is a string
of letters, digits, and underscores, beginning with a letter or underscore.
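As a small illustration, the identifier pattern just described can be checked directly in C. This is only a sketch: the function name is_c_identifier is our own, and it tests the shape of the string only (it does not reject keywords such as if).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of any library.
   Returns 1 if s is a nonempty string of letters, digits, and underscores
   that begins with a letter or underscore; returns 0 otherwise. */
int is_c_identifier(const char *s)
{
    assert(s != NULL);
    if (!(isalpha((unsigned char)*s) || *s == '_'))
        return 0;                 /* empty string or bad first character */
    for (s++; *s != '\0'; s++) {
        if (!(isalnum((unsigned char)*s) || *s == '_'))
            return 0;             /* character outside the pattern */
    }
    return 1;
}
```

With this sketch, a call such as is_c_identifier("_tmp1") yields 1, while is_c_identifier("1abc") yields 0 because the string begins with a digit.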
Given a pattern and an input, the process of determining whether the input matches the
pattern is called pattern matching, a problem also known as pattern recognition. In
compiling, for example, an essential step is to recognize the patterns of program
constructs before translating the program into the desired code. Let's look at the
first phase of this process.
Consider an if-statement in C,
if (a==b)
x = 1;
A C compiler reads the input characters from the left, one at a time, and collects them into
small groups of characters (lexemes or tokens), each matching some lexical pattern. This
phase is called lexical analysis. Our statement, for example, may be grouped into the
following tokens, each with its own pattern:
1. The keyword if
2. The left parenthesis (
3. The identifier a
4. The comparison operator ==
5. The identifier b
6. The right parenthesis )
7. The identifier x
8. The assignment operator =
9. The integer 1
10. The statement-terminator ;
White space characters (blanks, tabs, and newlines) would also be eliminated.
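The grouping above can be sketched as a tiny scanner. This is a simplified illustration, not how a real C compiler is written: the name next_token and the limit MAX_LEXEME are our own assumptions, and only the kinds of token that occur in the example statement are handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define MAX_LEXEME 32   /* assumed upper bound on lexeme length */

/* Copies the next lexeme of src, starting at *pos, into out and advances
   *pos past it. Returns 1 on success and 0 at the end of the input.
   A lexeme longer than MAX_LEXEME - 1 characters is split. */
int next_token(const char *src, size_t *pos, char out[MAX_LEXEME])
{
    size_t i, n = 0;

    assert(src != NULL && pos != NULL && out != NULL);
    i = *pos;
    while (isspace((unsigned char)src[i]))   /* white space is discarded */
        i++;
    if (src[i] == '\0') {
        *pos = i;
        return 0;
    }
    if (isalpha((unsigned char)src[i]) || src[i] == '_') {
        /* keyword or identifier: letters, digits, underscores */
        while ((isalnum((unsigned char)src[i]) || src[i] == '_')
               && n < MAX_LEXEME - 1)
            out[n++] = src[i++];
    } else if (isdigit((unsigned char)src[i])) {
        /* integer: a run of digits */
        while (isdigit((unsigned char)src[i]) && n < MAX_LEXEME - 1)
            out[n++] = src[i++];
    } else if (src[i] == '=' && src[i + 1] == '=') {
        /* the comparison operator == takes two characters */
        out[n++] = src[i++];
        out[n++] = src[i++];
    } else {
        /* any other single character, such as ( ) = ; */
        out[n++] = src[i++];
    }
    out[n] = '\0';
    *pos = i;
    return 1;
}
```

Calling next_token repeatedly on the text of the if-statement, starting with *pos equal to 0, produces the ten lexemes listed above in order and then returns 0.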
7.2 STATE MACHINES AND AUTOMATA
Programs that search for patterns often have a special structure. We can identify cer-
tain positions in the code at which we know something particular about the program’s
progress toward its goal of finding an instance of a pattern. We call these positions
states. The overall behavior of the program can be viewed as moving from state to
state as it reads its input.
To see the behavior of such a program, we can draw a graph with a node for each
state and an arc for each move from one state to another (called a transition). A graph for
a program recognizing English words that contain the five vowels in order is shown below:
There are two important states in this graph, one with an incoming arc labeled
start (state 0), and the other with a double circle (state 5). The former, the start state,
is the state in which we begin to recognize the pattern; the latter, the accepting state,
is the state we reach once we have found our pattern and can accept the input. An automaton
may have several accepting states but only one start state.
Such a graph is called a finite automaton or just automaton.
We can design a pattern-matching program by first designing the automaton, then
mechanically translating it into a program. I will give an example in the next section.
An automaton can be viewed as a state machine consisting of a finite control, an input
tape, and a head that reads a sequence of symbols written on the tape. At each step of
its operation, the machine reads the symbol under the head, changes its state, and moves
the head one symbol to the right. A picture of such a machine appears with the figures of
the next section.
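The machine view can be sketched in a few lines of C for the five-vowel automaton of this section. The sketch assumes that the automaton stays in its current state on any letter other than the vowel it is waiting for; the function name has_vowels_in_order and the folding of upper-case letters to lower case are our own choices.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if word contains the vowels a, e, i, o, u in that order
   (other letters may appear between them), and 0 otherwise. */
int has_vowels_in_order(const char *word)
{
    static const char next_vowel[] = "aeiou";
    int state = 0;            /* finite control, beginning in start state 0 */
    size_t head = 0;          /* position of the head on the input tape */

    assert(word != NULL);
    while (word[head] != '\0' && state < 5) {
        int c = tolower((unsigned char)word[head]);  /* assumption: ignore case */
        if (c == next_vowel[state])
            state++;          /* transition to the next state */
        /* otherwise remain in the same state (the self-loop) */
        head++;               /* move the head one symbol to the right */
    }
    return state == 5;        /* state 5 is the accepting state */
}
```

For instance, the word facetious drives the state from 0 up to 5 and is accepted, whereas sequoia never gets past state 1 because its only a comes last.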
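[Figure: automaton recognizing words with the five vowels in order. States 0 through 5;
state 0 is the start state and state 5 the accepting state. Arcs labeled a, e, i, o, u lead
from state 0 to 1, 1 to 2, 2 to 3, 3 to 4, and 4 to 5; states 0 through 4 carry self-loops
labeled Λ − a, Λ − e, Λ − i, Λ − o, Λ − u, respectively.]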
7.3 DETERMINISTIC AND NONDETERMINISTIC AUTOMATA
The automaton discussed in the previous section has an important property. For any
state s and any input character x, there is at most one transition out of state s whose
label includes x. Such an automaton is said to be deterministic.
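The defining property can be tested mechanically when an automaton is given as a list of arcs. The sketch below is ours: the struct arc layout and the name is_deterministic are assumptions made for illustration, and each arc carries a single symbol rather than a set of symbols.

```c
#include <assert.h>
#include <stddef.h>

/* One labeled arc of an automaton: from --symbol--> to. */
struct arc {
    int from;
    char symbol;
    int to;
};

/* Returns 1 if no state has two arcs labeled with the same symbol,
   and 0 otherwise. A repeated identical arc also counts as a violation. */
int is_deterministic(const struct arc *arcs, size_t n)
{
    size_t i, j;

    assert(arcs != NULL || n == 0);
    for (i = 0; i < n; i++)
        for (j = i + 1; j < n; j++)
            if (arcs[i].from == arcs[j].from &&
                arcs[i].symbol == arcs[j].symbol)
                return 0;     /* two transitions out of one state on one symbol */
    return 1;
}
```

The pairwise comparison is quadratic in the number of arcs, which is acceptable for the small automata drawn by hand in this handout.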
It is straightforward to convert a deterministic finite automaton (DFA) into a program.
We create a piece of code for each state. The code for state s examines its input and
decides which of the transitions out of s, if any, should be followed. If a transition from
state s to state t is selected, then the code for state s must arrange for the code for state
t to be executed next, perhaps by using a goto-statement.
Suppose we have a DFA for a bounce filter.
You need not understand its meaning. Just observe that the DFA has the start state
a and two accepting states, c and d, and that it examines the input characters 1 and 0.
From this DFA, we can mechanically produce a simple program by following the
guidelines above. The resulting program is given below.
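[Figure: an automaton as a machine. A finite control with a head reads an input tape whose
first cells hold the symbols i f ( a = =.]
[Figure: DFA for the bounce filter. States a, b, c, d; start state a; arcs labeled 0 and 1,
corresponding to the transitions coded in the program.]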
#include <stdio.h>

void bounce(void)
{
    int x;  /* int rather than char, so that EOF from getchar() fits */
    /* state a */
a:  putchar('0');
    x = getchar();
    if (x == '0') goto a; /* transition to state a */
    if (x == '1') goto b; /* transition to state b */
    goto finis;           /* any other input ends the program */
    /* state b */
b:  putchar('0');
    x = getchar();
    if (x == '0') goto a; /* transition to state a */
    if (x == '1') goto c; /* transition to state c */
    goto finis;
    /* state c (accepting, so it outputs 1) */
c:  putchar('1');
    x = getchar();
    if (x == '0') goto d; /* transition to state d */
    if (x == '1') goto c; /* transition to state c */
    goto finis;
    /* state d (accepting, so it outputs 1) */
d:  putchar('1');
    x = getchar();
    if (x == '0') goto a; /* transition to state a */
    if (x == '1') goto c; /* transition to state c */
    goto finis;
finis: ;
}
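The same DFA can also be written without goto-statements by storing the transitions in a table. The sketch below is an alternative of ours, not the program the translation rule produces. It assumes that the accepting states c and d output 1 and the other states output 0, and it reads its input from a string instead of getchar() so that it is easy to test. The names bounce_string, next_state, and output are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { A, B, C, D, NSTATES };   /* the four states a, b, c, d */

/* next_state[s][x] is the state reached from state s on input bit x. */
static const int next_state[NSTATES][2] = {
    /* a */ { A, B },
    /* b */ { A, C },
    /* c */ { D, C },
    /* d */ { A, C },
};

/* Character printed in each state (assumed: 1 exactly in c and d). */
static const char output[NSTATES] = { '0', '0', '1', '1' };

/* Runs the DFA over in and writes one output character per state visited,
   the start state included, into out. Stops at the first character that
   is not '0' or '1'. The caller must provide room in out for one
   character per input character, plus two. Returns the final state. */
int bounce_string(const char *in, char *out)
{
    int state = A;
    size_t n = 0;

    assert(in != NULL && out != NULL);
    out[n++] = output[state];
    for (; *in == '0' || *in == '1'; in++) {
        state = next_state[state][*in - '0'];
        out[n++] = output[state];
    }
    out[n] = '\0';
    return state;
}
```

On the input 0101101 this sketch writes 00000111: the isolated 1s are suppressed, and the output switches to 1 only after two 1s in a row. The table form keeps all transitions in one place, at the cost of hiding the one-piece-of-code-per-state structure.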
Although it is easy to convert a DFA into a program, designing the DFA itself is often more
difficult. There is, however, a generalization of DFAs that is conceptually more natural.
This kind of automaton, called a nondeterministic finite automaton (NFA for short), may
have two or more transitions labeled with the same symbol out of one state.
Note that a DFA is technically an NFA as well, one that happens not to have multiple
transitions on one symbol.
NFAs are not directly implementable by programs, but they are useful conceptual
tools for a number of applications. Moreover, by using the "subset construction", it is
possible to convert any NFA to a DFA that accepts the same set of character strings,
but this topic is beyond our discussion.
As an illustration, I show just one NFA in the following figure.
[Figure: an NFA with states 0, 1, 2, 3. Start state 0 has a self-loop labeled Λ; arcs
labeled m, a, n lead from state 0 to 1, 1 to 2, and 2 to 3.]
Note that we use the symbol Λ to indicate any legal symbol.
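Although an NFA cannot be turned into code state by state as a DFA can, a program can still follow it by keeping track of the set of states the NFA might be in, which is the idea underlying the subset construction. The sketch below does this for one particular NFA, assumed here to have states 0 to 3, a self-loop on any symbol at state 0, arcs on m, a, n from 0 to 1, 1 to 2, and 2 to 3, and state 3 as its accepting state. The function name nfa_accepts_man is our own.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the assumed NFA accepts s, that is, if s ends in "man".
   The set of possible current states is kept as a bitmask: bit k is set
   when the NFA could be in state k. */
int nfa_accepts_man(const char *s)
{
    unsigned states = 1u << 0;          /* initially only start state 0 */

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        unsigned next = 0;
        if (states & (1u << 0)) {
            next |= 1u << 0;            /* self-loop on any symbol */
            if (*s == 'm')
                next |= 1u << 1;        /* second transition on m */
        }
        if ((states & (1u << 1)) && *s == 'a')
            next |= 1u << 2;
        if ((states & (1u << 2)) && *s == 'n')
            next |= 1u << 3;
        /* state 3 has no outgoing arcs */
        states = next;
    }
    return (states & (1u << 3)) != 0;   /* accept if state 3 is possible */
}
```

On the input woman the set of possible states goes from {0} to {0, 1} after m, {0, 2} after a, and {0, 3} after n, so the string is accepted; on mango the set falls back to {0} after g and the string is rejected.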
7.4 GLOSSARY
Pattern: Mẫu. See the definition in text.
Pattern Matching: Đối sánh mẫu, so mẫu.
Recognition: Nhận dạng.
Identifier: Định danh. The name of a data object in a program.
Character: Ký tự. Any symbol that we may input from the keyboard, including letters,
digits, special symbols such as +, ^, and some nonprintable symbols.
Letter: Chữ cái.
Digit: Ký số, chữ số.
Underscore: Dấu gạch thấp _.
Input: Nguyên liệu, dữ liệu nhập.
Output: Thành phẩm, dữ liệu xuất.
Code: Mã lệnh, mã chương trình. A full program or program segment in any form, such
as a high-level language or machine language.
Compilation: Quá trình biên dịch. Sometimes also translation.
Compiler: Trình biên dịch.
Interpreter: Trình thông dịch.
Translator: Chương trình dịch (nói chung).
Lexeme: Từ tố.

Token: Thẻ từ.
Assignment operator: Toán tử gán.
Statement-terminator: Dấu kết thúc câu lệnh.
Instance: Thể hiện.
Automaton, automata (pl.): Automat, Ôtômat.
Deterministic finite automata: Automat hữu hạn đơn định (tất định).
Nondeterministic finite automata: Automat hữu hạn đa định (không đơn định, không
tất định).
State: Trạng thái.
Transition: Chuyển vị.
Start state: Khởi trạng.
Accepting state, final state: Trạng thái kiểm nhận, kết trạng.
Finite control: Bộ điều khiển hữu hạn.
Input tape: Băng nguyên liệu.
Head: Đầu đọc.
