DM2 Chapter 4 automata BK TPHCM

Automata

Chapter 4

Nguyen An Khuong,
Huynh Tuong Nguyen,
Bui Hoai Thang

Automata
Discrete Mathematics II
(Materials for this chapter are drawn from:
- Peter Linz. An Introduction to Formal Languages and Automata (5th Ed.), Jones & Bartlett Learning, 2011.
- John E. Hopcroft, Rajeev Motwani and Jeffrey D. Ullman. Introduction to Automata Theory, Languages, and Computation (3rd Ed.), Prentice Hall, 2006.
- Antal Iványi. Algorithms of Informatics, Kempelen Farkas Hallgatói Információs Központ, 2011.)


Nguyen An Khuong, Huynh Tuong Nguyen, Bui Hoai Thang
Faculty of Computer Science and Engineering
University of Technology, VNU-HCM

Contents

1 Motivation
2 Alphabets, words and languages
3 Regular expression or rational expression
4 Non-deterministic finite automata
5 Deterministic finite automata
6 Recognized languages
7 Determinisation


Introduction

Standard states of a process in an operating system
• O with label: states
• →: transitions

[Figure: state diagram with the states Waiting, Blocked and Running; the transitions are labelled Resource and CPU.]


Why study automata theory?

Automata are a useful model for many important kinds of software and hardware:
1 designing and checking the behaviour of digital circuits;
2 the lexical analyser of a typical compiler, i.e. the compiler component that breaks the input text into logical units;
3 scanning large bodies of text, such as collections of Web pages, to find occurrences of words, phrases or other patterns;
4 verifying practical systems of all types that have a finite number of distinct states, such as communications protocols or protocols for the secure exchange of information.


Alphabets, symbols


Definition

An alphabet Σ (Vietnamese: bảng chữ cái) is a finite, non-empty set of
symbols (or characters).
For example:
• Σ = {a, b}
• The binary alphabet: Σ = {0, 1}
• The set of all lower-case letters: Σ = {a, b, . . . , z}
• The set of all ASCII characters.
Remark

In practice, Σ is almost always the set of all available characters
(lowercase letters, capital letters, digits, symbols and special
characters such as space or newline).
But nothing prevents us from considering other sets.



Strings (words)

Definition
• A string/word u (Vietnamese: chuỗi/từ) over Σ is a finite (possibly empty) sequence of symbols (or characters) of Σ.
• The empty string is denoted by ε.
• The length of a string u, denoted by |u|, is the number of characters in u.
• The set of all strings over Σ is denoted by Σ∗.
• A language L over Σ is a subset of Σ∗.

Remark
The aim is to analyse a string of Σ∗ in order to decide whether or not it belongs to L.


Example

Let Σ = {0, 1}.
• ε is a string of length 0.
• 0 and 1 are the strings of length 1.
• 00, 01, 10 and 11 are the strings of length 2.
• ∅ is a language over Σ. It is called the empty language.
• Σ∗ is a language over Σ. It is called the universal language.
• {ε} is a language over Σ.
• {0, 00, 001} is also a language over Σ.
• The set of strings that contain an odd number of 0s is a language over Σ.
• The set of strings that contain as many 1s as 0s is a language over Σ.
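
The examples above can be reproduced with a short Python sketch. The helper names words_up_to, odd_zeros and balanced are illustrative and not part of the course material. Since Σ∗ is infinite, the enumeration is cut off at a chosen maximum length.

```python
from itertools import product

SIGMA = ("0", "1")  # the binary alphabet from the example

def words_up_to(sigma, max_len):
    """Yield every string over sigma of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for letters in product(sigma, repeat=n):
            yield "".join(letters)  # n == 0 yields the empty string ε

def odd_zeros(u):
    """Membership test: u contains an odd number of 0s."""
    return u.count("0") % 2 == 1

def balanced(u):
    """Membership test: u contains as many 1s as 0s."""
    return u.count("0") == u.count("1")

words = list(words_up_to(SIGMA, 2))
print(words)                                # strings of length 0, 1 and 2
print([u for u in words if odd_zeros(u)])   # finite slice of an infinite language
print([u for u in words if balanced(u)])
```

A language is represented here by its membership test rather than by a set, because most interesting languages are infinite.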



String concatenation

Intuitively, the concatenation of two strings 01 and 10 is 0110.
Concatenating the empty string ε and the string 110 is the string
110.
Definition

String concatenation is a mapping from Σ∗ × Σ∗ to Σ∗.
The concatenation of two strings u and v in Σ∗ is the string u.v (also written uv).
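
A few standard properties of concatenation follow directly from the definition. They are stated here for reference and are not taken from the slides; u, v and w are arbitrary strings in Σ∗.

```latex
\begin{align*}
  \varepsilon \cdot u &= u \cdot \varepsilon = u
      && \text{($\varepsilon$ is the neutral element)}\\
  (u \cdot v) \cdot w &= u \cdot (v \cdot w)
      && \text{(associativity)}\\
  |u \cdot v| &= |u| + |v|
      && \text{(lengths add up)}\\
  01 \cdot 10 &= 0110 \neq 1001 = 10 \cdot 01
      && \text{(not commutative in general)}
\end{align*}
```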



Languages

Specifying languages

A language can be specified in several ways:
a) enumeration of its words, for example:
• L1 = {ε, 0, 1},
• L2 = {a, aa, aaa, ab, ba},
• L3 = {ε, ab, aabb, aaabbb, aaaabbbb, . . .},
b) a property that all words of the language have and no other word has, for example:
• L4 = {a^n b^n | n = 0, 1, 2, . . .},
• L5 = {u u^{-1} | u ∈ Σ∗}, where u^{-1} denotes the reversal of u,
• L6 = {u ∈ {a, b}∗ | n_a(u) = n_b(u)}, where n_a(u) denotes the number of occurrences of the letter 'a' in the word u.
c) its grammar, for example:
• Let G = (N, T, P, S) where
N = {S}, T = {a, b}, P = {S → aSb, S → ab},

i.e. L(G) = {a^n b^n | n ≥ 1}, since
S ⇒ aSb ⇒ a^2 S b^2 ⇒ . . . ⇒ a^n S b^n ⇒ a^{n+1} b^{n+1}, where the last step applies S → ab.
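
The derivation above can be traced mechanically. The following Python sketch is only an illustration: derive is a hypothetical helper that applies S → aSb a chosen number of times and then finishes with S → ab, and in_L4 checks the form a^n b^n directly.

```python
def derive(n):
    """Return the derivation steps of a^n b^n (n >= 1) in the grammar
    with productions S -> aSb and S -> ab."""
    if n < 1:
        raise ValueError("L(G) contains a^n b^n only for n >= 1")
    steps = ["S"]
    for _ in range(n - 1):
        steps.append(steps[-1].replace("S", "aSb"))  # apply S -> aSb
    steps.append(steps[-1].replace("S", "ab"))       # finish with S -> ab
    return steps

def in_L4(u):
    """Membership test for L4 = {a^n b^n | n = 0, 1, 2, ...}."""
    n = len(u) // 2
    return u == "a" * n + "b" * n

print(" => ".join(derive(3)))  # S => aSb => aaSbb => aaabbb
```

Note that L(G) and L4 differ only in the empty string: ε belongs to L4 (n = 0) but cannot be derived in G.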



Operations on languages
L, L1, L2 are languages over Σ

• union
L1 ∪ L2 = {u ∈ Σ∗ | u ∈ L1 or u ∈ L2},
• intersection
L1 ∩ L2 = {u ∈ Σ∗ | u ∈ L1 and u ∈ L2},
• difference
L1 \ L2 = {u ∈ Σ∗ | u ∈ L1 and u ∉ L2},
• complement
L̄ = Σ∗ \ L (the overline denotes the complement),
• multiplication
L1 L2 = {uv | u ∈ L1, v ∈ L2},
• power
L^0 = {ε},
L^n = L^{n−1} L, if n ≥ 1,
• iteration or star operation
L∗ = ⋃_{i=0}^{∞} L^i = L^0 ∪ L ∪ L^2 ∪ · · · ∪ L^i ∪ · · · .

We will also use the notation L^+:
L^+ = ⋃_{i=1}^{∞} L^i = L ∪ L^2 ∪ · · · ∪ L^i ∪ · · · .

The union, product and iteration are called regular operations.
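
For finite languages these operations map directly onto Python sets. The sketch below is an illustration under that assumption: the function names and the sample languages L1 and L2 are made up for the example. The complement and the full star are infinite in general, so only a finite approximation of L∗ is computed.

```python
def concat(l1, l2):
    """Product L1 L2 = {uv | u in L1, v in L2}."""
    return {u + v for u in l1 for v in l2}

def power(lang, n):
    """L^0 = {ε}, L^n = L^(n-1) L for n >= 1."""
    result = {""}
    for _ in range(n):
        result = concat(result, lang)
    return result

def star_up_to(lang, k):
    """Finite approximation of L*: L^0 ∪ L^1 ∪ ... ∪ L^k."""
    result = set()
    for i in range(k + 1):
        result |= power(lang, i)
    return result

L1 = {"a", "ab"}
L2 = {"b", "ab"}

print(sorted(L1 | L2))          # union
print(sorted(L1 & L2))          # intersection
print(sorted(L1 - L2))          # difference
print(sorted(concat(L1, L2)))   # product
print(sorted(power(L1, 2)))     # L1^2
print(sorted(star_up_to(L1, 2), key=lambda u: (len(u), u)))
```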


Example

Let Σ = {a, b, c} and L = {ab, aa, b, ca, bac}.

L^2 = {u.v | u, v ∈ L} consists of the following strings:
• abab, abaa, abb, abca, abbac,
• aaab, aaaa, aab, aaca, aabac,
• bab, baa, bb, bca, bbac,
• caab, caaa, cab, caca, cabac,
• bacab, bacaa, bacb, bacca, bacbac.


Exercise

Let Σ = {a, b, c}.
Give at least 3 strings for each of the following languages:
1) all strings with exactly one 'a'.
2) all strings of even length.
3) all strings in which the number of occurrences of 'b' is divisible by 3.
4) all strings ending with 'a'.
5) all strings not ending with 'a'.
6) all non-empty strings not ending with 'a'.
7) all strings with at least one 'a'.
8) all strings with at most one 'a'.
9) all strings without any 'a'.
10) all strings containing at least one 'a' and in which the first occurrence of 'a' is not followed by a 'c'.


Exercise

Let Σ = {a, b, c} and L = {ab, aa, b, ca, bac}.
Which of the following strings are in L∗?
1) aaa = a^3,
2) abaabaaabaa = aba^2ba^3ba^2,
3) bbb,
4) aab,
5) cc,
6) aaaabaaaa = a^4ba^4,
7) cabbbbaaaaaaaaab = cab^4a^9b,
8) baaaaabaaaab = ba^5ba^4b,
9) baaaaabaac = ba^5ba^2c,
10) baca.
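
Answers to this kind of question can be checked mechanically. A string u is in L∗ exactly when it can be cut into consecutive pieces that all belong to L. The sketch below decides this by dynamic programming over the prefixes of u; the name in_star is illustrative, and the two sample strings are deliberately not taken from the exercise.

```python
def in_star(u, lang):
    """Return True if u can be written as a concatenation of words of lang."""
    # reachable[i] is True when the prefix u[:i] belongs to lang*.
    reachable = [False] * (len(u) + 1)
    reachable[0] = True  # the empty prefix is ε, which is always in lang*
    for i in range(len(u)):
        if not reachable[i]:
            continue
        for w in lang:
            # Skip ε as a piece: it never advances the position.
            if w and u.startswith(w, i):
                reachable[i + len(w)] = True
    return reachable[len(u)]

L = {"ab", "aa", "b", "ca", "bac"}
print(in_star("abb", L))   # ab.b, one of the words of L^2 listed earlier
print(in_star("c", L))     # no word of L is the single letter c
```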


Regular expressions

Regular expressions (Vietnamese: biểu thức chính quy)

A regular expression specifies a language by means of a string consisting of letters and ε, parentheses ( ), and the operator symbols +, . and ∗. The expression ∅ denotes the empty language.

Regular operations on languages
• union: ∪ or +
• product, or concatenation
• iteration (closure): ∗

Example

• (a + b)∗ represents all the strings over the alphabet Σ = {a, b}.
• a∗(ba∗)∗ represents the same language.
• (a + b)∗aab represents all strings ending with aab.
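
The three examples can be tried with Python's re module, which writes union as | instead of +. The sketch below is a brute-force check over all strings up to a small length, not a proof, and the helper names words_up_to and language are illustrative.

```python
import re
from itertools import product

def words_up_to(sigma, max_len):
    """All strings over sigma of length 0..max_len."""
    return ["".join(p) for n in range(max_len + 1)
            for p in product(sigma, repeat=n)]

def language(pattern, words):
    """The words matched in full by the regular expression."""
    return {w for w in words if re.fullmatch(pattern, w)}

words = words_up_to("ab", 6)

all_strings = language(r"(a|b)*", words)     # (a + b)*
same = language(r"a*(ba*)*", words)          # a*(ba*)*
ends_aab = language(r"(a|b)*aab", words)     # (a + b)*aab

print(all_strings == set(words))             # every string is matched
print(all_strings == same)                   # both expressions agree
print(all(w.endswith("aab") for w in ends_aab))
```

re.fullmatch is used rather than re.match so that the whole string, not just a prefix, has to belong to the language.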


Regular expressions

• ∅ is a regular expression representing the empty language.
• ε is a regular expression representing the language {ε}.
• If a ∈ Σ, then a is a regular expression representing the language {a}.
• If x, y are regular expressions representing the languages X and Y respectively, then (x + y), (xy) and x∗ are regular expressions representing the languages X ∪ Y, XY and X∗ respectively.

The following identities hold (≡ means that both sides represent the same language):
x + y ≡ y + x
(x + y) + z ≡ x + (y + z)
(xy)z ≡ x(yz)
(x + y)z ≡ xz + yz
x(y + z) ≡ xy + xz
(x + y)∗ ≡ (x∗ + y)∗ ≡ (x + y∗)∗ ≡ (x∗ + y∗)∗
(x + y)∗ ≡ (x∗y∗)∗
(x∗)∗ ≡ x∗
x∗x ≡ xx∗
xx∗ + ε ≡ x∗



Regular expressions

Kleene's theorem

A language L ⊆ Σ∗ is regular if and only if there exists a regular
expression over Σ representing L.



Exercise

Let Σ = {a, b, c}.
Give at least 3 words for each language represented by the following regular expressions:
1) a∗ + b∗,
2) a∗b + b∗a,
3) b(ca + ac)(aa)∗ + a∗(a + b),
4) (a∗b + b∗a)∗.

Example
a∗b represents the language {b, ab, a^2b, a^3b, . . .}.


Exercise

Let Σ = {a, b, c} and L = {ab, aa, b, ca, bac}.
For which of the following regular expressions is the represented language contained in L∗?
1) a∗ + b,
2) b∗,
3) aab + cab∗ac,
4) b(ca + ac)(aa)∗ + a∗(a + b),
5) (aaaabaaa)^{2∗}c,
6) b^+ac (where b^+ = bb∗),
7) (b + c)ab + (ba(c + ab^2 + a^3 + a^4 + b)∗)∗

Define a (simple) regular expression representing the language L∗.


Finite automata

Finite automata (Vietnamese: Ôtômat hữu hạn)
• The aim is to represent a process system.
• An automaton consists of states (including an initial state and one or several final/accepting states) and transitions (events).
• The number of states must be finite.

[Figure: automaton with the states q0 and q1; the transitions are labelled b and a, b.]

Regular expression
b∗(a + b)


Exercise

Let Σ = {a, b}.
Which of the strings
1) a^3b,
2) aba^2b,
3) a^4b^2ab^3a,
4) a^4ba^4,
5) ab^4a^9b,
6) ba^5ba^4b,
7) ba^5b^2,
8) bab^2a
are accepted by the following finite automaton?

[Figure: automaton with the states q0, q1 and q2; the transitions are labelled a and b.]


Exercise

Give a regular expression for the following finite automaton

[Figure: automaton with the states q0, q1 and q2; the transitions are labelled a and b.]

and for this one.

[Figure: automaton with the states q0 and q1; the transitions are labelled b, b, "a, b" and a.]


Nondeterministic finite automata

Definition

A nondeterministic finite automaton (NFA; Vietnamese: Ôtômat hữu hạn phi đơn định) is mathematically represented by a 5-tuple (Q, Σ, q0, δ, F) where
• Q is a finite set of states.
• Σ is the alphabet of the automaton.
• q0 ∈ Q is the initial state.
• δ : Q × Σ → P(Q) is the transition function; it maps a state and a symbol to the set of possible next states.
• F ⊆ Q is the set of final/accepting states.

Remark

On a given event, a state may go to one or more states.
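
The 5-tuple translates almost literally into a data structure. The Python sketch below is one possible encoding, not the course's reference implementation: the class NFA, its field names and the example automaton are assumptions made for illustration. The example is chosen so that it accepts b∗(a + b), the regular expression given on the "Finite automata" slide.

```python
from dataclasses import dataclass

@dataclass
class NFA:
    states: frozenset      # Q
    alphabet: frozenset    # Σ
    start: str             # q0
    delta: dict            # (state, symbol) -> set of next states
    finals: frozenset      # F

    def accepts(self, word):
        """Track every state the automaton could be in after each symbol."""
        current = {self.start}
        for symbol in word:
            current = {nxt
                       for state in current
                       for nxt in self.delta.get((state, symbol), set())}
        return bool(current & self.finals)

# Assumed example: q0 loops on b and moves to q1 on a or b; q1 is accepting.
example = NFA(
    states=frozenset({"q0", "q1"}),
    alphabet=frozenset({"a", "b"}),
    start="q0",
    delta={("q0", "a"): {"q1"}, ("q0", "b"): {"q0", "q1"}},
    finals=frozenset({"q1"}),
)

print(example.accepts("bba"))   # b, b, then a into q1
print(example.accepts("ab"))    # after a the automaton is in q1, which cannot read b
```

A missing dictionary entry stands for the empty set of next states, so a run that reaches such a pair simply dies out.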


NFA with empty symbol ε

Other definition of NFA

A finite automaton whose transitions are labelled either by a character x (in Σ) or by the empty string ε.

[Figure: automaton with the states q0, q1 and q2; the transitions are labelled b, b, ε, "a, b" and a.]
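
An ε-transition can be taken without reading any input, so a simulation has to add all states reachable through ε-transitions after every step (the ε-closure). The Python sketch below illustrates the idea. The function names and the small automaton for a∗b∗ are hypothetical; the automaton is not the one drawn on the slide.

```python
EPS = ""  # the empty string stands for the label ε

def eps_closure(states, delta):
    """All states reachable from `states` using ε-transitions only."""
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for nxt in delta.get((state, EPS), set()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return closure

def accepts(word, start, delta, finals):
    """Simulate an NFA with ε-transitions on `word`."""
    current = eps_closure({start}, delta)
    for symbol in word:
        moved = {nxt
                 for state in current
                 for nxt in delta.get((state, symbol), set())}
        current = eps_closure(moved, delta)
    return bool(current & finals)

# Hypothetical automaton for a*b*: q0 loops on a, q0 --ε--> q1, q1 loops on b.
delta = {("q0", "a"): {"q0"}, ("q0", EPS): {"q1"}, ("q1", "b"): {"q1"}}

print(accepts("aabb", "q0", delta, {"q1"}))
print(accepts("ba", "q0", delta, {"q1"}))
```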


Exercise

Consider the set of strings over {a, b} in which every aa is followed immediately by b.
For example, aab, aaba and aabaabbaab are in the language, but aaab and aabaa are not.
Construct an NFA accepting this language.


Exercise

Let Σ = {a, b, c}.
Construct a finite automaton accepting the language represented by each of the following regular expressions.
• E1 = a∗ + b,
• E2 = b∗,
• E3 = aab + cab∗ac,
• E4 = b(ca + ac)(aa)∗ + a∗(a + b),
• E5 = (aaaabaaa)^{2∗}c,
• E6 = b^+ac (where b^+ = bb∗),
• E7 = (b + c)ab + (ba(c + ab^2 + a^3 + a^4 + b)∗)∗,
• E8 = [a(b + c)∗abc]∗.