Service robot for students based on computer vision and natural language processing

MINISTRY OF EDUCATION AND TRAINING
HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY AND EDUCATION
FACULTY FOR HIGH QUALITY TRAINING

GRADUATION PROJECT
AUTOMATION AND CONTROL ENGINEERING
TECHNOLOGY

SERVICE ROBOT FOR STUDENTS BASED ON
COMPUTER VISION AND NATURAL LANGUAGE
PROCESSING

LECTURER: ASSOC. PROF. PHD. LE MY HA
STUDENT: NGUYEN TUAN THANH

SKL009325

Ho Chi Minh City, August, 2022


HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY AND EDUCATION
FACULTY FOR HIGH QUALITY TRAINING

GRADUATION PROJECT

SERVICE ROBOT FOR STUDENTS BASED ON
COMPUTER VISION AND NATURAL LANGUAGE
PROCESSING

NGUYỄN TUẤN THANH
Student ID: 17151028


Major: AUTOMATION AND CONTROL ENGINEERING
TECHNOLOGY
Advisor: Assoc. Prof. PhD. LE MY HA

Ho Chi Minh City, August 2022


Faculty for High Quality Training – HCMC University of Technology and Education
THE SOCIALIST REPUBLIC OF VIETNAM
Independence – Freedom – Happiness
-------
Ho Chi Minh City, August 6th, 2022


GRADUATION PROJECT ASSIGNMENT

Student name: Nguyen Tuan Thanh

Student ID: 17151028

Major: Automation and Control Engineering Technology

Class: 17151CLA1

Advisor: Assoc. Prof. PhD. Le My Ha

Phone number: 0938811201

Date of assignment: February 21st, 2022

Date of submission: August 6th, 2022

1. Project title: Service robot for students based on computer vision and natural language
processing.
2. Initial materials provided by the advisor: references, reference programs, datasets, and expected
parameters of the robot.
3. Content of the project:
- Design and implement a service robot with two functions: chat and talk.
- Apply computer vision to detect whether the user is wearing a mask and to identify the user.
- Apply natural language processing in a virtual voice assistant to communicate with humans.
- Apply the Natural Language Toolkit (NLTK) to build a chatbot to communicate with humans.
- Build a database and collect more data while communicating with humans.
4. Final product: A completed service robot that can recognize humans with high accuracy
and communicate with them using a given knowledge database.
CHAIR OF THE PROGRAM
(Sign with full name)

ADVISOR
(Sign with full name)



Faculty for High Quality Training – HCMC University of Technology and Education
THE SOCIALIST REPUBLIC OF VIETNAM
Independence – Freedom – Happiness
-------
Ho Chi Minh City, August 6, 2022

ADVISOR’S EVALUATION SHEET
Student name: Nguyen Tuan Thanh

Student ID: 17151028

Major: Automation and Control Engineering Technology
Project title: Service robot for students based on computer vision and natural language
processing
Advisor: Assoc. Prof. PhD. Le My Ha
EVALUATION
1. Content of the project:
- Design and implement a service robot with two functions: chat and talk.
- Apply computer vision to detect whether the user is wearing a mask and to identify the user.
- Apply natural language processing in a virtual voice assistant to communicate with humans.
- Apply the Natural Language Toolkit (NLTK) to build a chatbot to communicate with humans.
- Build a database and collect more data while communicating with humans.
2. Strengths:
...............................................................................................................................................
...............................................................................................................................................
...............................................................................................................................................
...............................................................................................................................................
3. Weaknesses:
...............................................................................................................................................
...............................................................................................................................................
...............................................................................................................................................
4. Approval for oral defense? (Approved or denied)
...............................................................................................................................................
5. Overall evaluation: (Excellent, Good, Fair, Poor)
.............................................................................................................................................
6. Mark: …………. (in words:............................................................................................)
Ho Chi Minh City, August 6th, 2022
ADVISOR
(Sign with full name)


Faculty for High Quality Training – HCMC University of Technology and Education
THE SOCIALIST REPUBLIC OF VIETNAM
Independence – Freedom – Happiness
-------
Ho Chi Minh City, August 6, 2022

PRE-DEFENSE EVALUATION SHEET
Student name: Nguyen Tuan Thanh

Student ID: 17151028

Major: Automation and Control Engineering Technology
Project title: Service robot for students based on computer vision and natural language
processing
Name of Reviewer: ................................................................................................................
EVALUATION
1. Content of the project:
- Design and implement a service robot with two functions: chat and talk.
- Apply computer vision to detect whether the user is wearing a mask and to identify the user.
- Apply natural language processing in a virtual voice assistant to communicate with humans.
- Apply the Natural Language Toolkit (NLTK) to build a chatbot to communicate with humans.
- Build a database and collect more data while communicating with humans.
2. Strengths:
...............................................................................................................................................
...............................................................................................................................................
...............................................................................................................................................
...............................................................................................................................................
3. Weaknesses:
...............................................................................................................................................
...............................................................................................................................................
...............................................................................................................................................
4. Approval for oral defense? (Approved or denied)
...............................................................................................................................................
5. Overall evaluation: (Excellent, Good, Fair, Poor)

..............................................................................................................................................
6. Mark: …………. (in words:............................................................................................)
Ho Chi Minh City, August 6th, 2022
REVIEWER
(Sign with full name)


Faculty for High Quality Training – HCMC University of Technology and Education
THE SOCIALIST REPUBLIC OF VIETNAM
Independence – Freedom – Happiness
--------

EVALUATION SHEET OF
DEFENSE COMMITTEE MEMBER
Student name: Nguyen Tuan Thanh

Student ID: 17151028

Major: Automation and Control Engineering Technology
Project title: Service robot for students based on computer vision and natural language
processing
Name of Defense Committee Member:
...........................................................................................................................................
EVALUATION
1. Content of the project:
- Design and implement a service robot with two functions: chat and talk.
- Apply computer vision to detect whether the user is wearing a mask and to identify the user.
- Apply natural language processing in a virtual voice assistant to communicate with humans.
- Apply the Natural Language Toolkit (NLTK) to build a chatbot to communicate with humans.
- Build a database and collect more data while communicating with humans.
2. Strengths:
...............................................................................................................................................
...............................................................................................................................................
...............................................................................................................................................
3. Weaknesses:
...............................................................................................................................................
...............................................................................................................................................
...............................................................................................................................................
4. Overall evaluation: (Excellent, Good, Fair, Poor)
..............................................................................................................................................
5. Mark: …………. (in words: ............................................................................................)
Ho Chi Minh City, August 6th, 2022
COMMITTEE MEMBER
(Sign with full name)


Graduation Thesis

ACKNOWLEDGEMENT
While completing this graduation project, in addition to my own efforts,
I received a great deal of support and dedicated help.
First, I would like to express my deep gratitude to Associate Professor Dr. Le My
Ha, who has been a teacher, a supporter, and an inspiration to me in completing this thesis.
He guided me toward the right topic and how to approach it, and gave objective feedback to help
me prepare for my defense before the committee. I feel very fortunate to have
worked with him.
Next, I would like to thank the lecturers of the Faculty of Electrical and Electronic
Engineering as well as the Faculty for High Quality Training for imparting useful knowledge
during my four years at the university. This knowledge played a fundamental role in the
implementation of my graduation thesis.
In addition, I would like to thank the Intelligent Systems Laboratory (ISLAB)
of the Faculty of Electrical and Electronic Engineering for providing facilities
and useful knowledge during the completion of the project. I also owe deep
thanks to my friend Tran Thanh Hung, who supported and guided
me in developing this topic.
Finally, I would like to thank my family for always supporting, caring for, and
motivating me to complete the project in the best possible way.
Ho Chi Minh City, August 6th, 2022
Student


Table of Contents
CHAPTER 1: INTRODUCTION
1.1 Define a problem
1.2 Project objectives
1.3 Project task
1.4 Project scopes
1.5 Approach and research
1.6 Project description
CHAPTER 2: LITERATURE REVIEW
2.1 Survey of robots being used in service industry
2.1.1 Mission of robots in the service industry
2.1.2 Pepper robot
2.2 Background of face recognition system
2.2.1 Concept
2.2.2 Structure and procedure for face recognition
2.2.3 Face Detection
2.3 Color spaces in image processing
2.3.1 RGB color space (Red-Green-Blue)
2.3.2 HSV color space (Hue-Saturation-Value)
2.4 Histogram of Oriented Gradients algorithm
2.5 Support Vector Machine algorithm
2.6 Background of speech recognition system
2.6.1 Concept
2.6.2 Speech Recognition
2.6.3 Applications
2.7 Framework and libraries
2.7.1 Framework Pytorch
2.7.2 Pandas
2.7.3 Numpy
2.8 Voice Assistant
2.9 ChatBot
CHAPTER 3: SYSTEM DESIGN AND CONSTRUCTION
3.1 Requirements of the system
3.2 System description
3.2.1 The block diagram of the system
3.2.2 The function of each block
3.3 System design
3.3.1 Face detection
3.3.2 Face recognition and identification
3.3.3 Face mask detection
3.3.4 Speech recognition and voice assistant
3.3.5 Chatbot
CHAPTER 4: EXPERIMENT RESULTS, FINDINGS AND ANALYSIS
4.1 Face detection
4.2 Face recognition and identification
4.2.1 Training image data
4.2.2 Performing the face recognition
4.3 Face mask detection
4.4 Speech recognition and voice assistant
4.5 Chatbot
4.5.1 Create Training Data
4.5.2 NLP Basics
4.5.3 Complete chatbot
4.6 User interface
CHAPTER 5: CONCLUSIONS AND DIRECTIONS OF DEVELOPMENT
5.1 Conclusion
5.2 Direction of development
REFERENCES


ABBREVIATIONS
NLP: Natural Language Processing
OpenCV: Open Source Computer Vision Library
HOG: Histogram of Oriented Gradients
SVM: Support Vector Machine
Q&A: Question and answer



List of figures
Figure 2.1. Pepper robot working in a mobile store
Figure 2.2. A typically procedure for face recognition model
Figure 2.3. Face and eye detection
Figure 2.4. RGB color space (Red-Green-Blue)
Figure 2.5. HSV color space (Hue-Saturation-Value)
Figure 2.6. Applications of HOG
Figure 2.7. An example of support vector in 2-Dimensional data
Figure 2.8. Margins describing in a plane
Figure 2.9. An example of linearly non separable dataset
Figure 2.10. Speech Recognition
Figure 2.11. Implementation of Speech Recognition
Figure 2.12. Interface of Windows Speech Recognition
Figure 2.13. Interface of Voice-To-Text Facebook Messenger
Figure 2.14. Interface of Google Speech to Text
Figure 2.15. Pytorch and TensorFlow Frameworks from 2017 to 2021 [10]
Figure 2.16. Reading CSV files with Pandas [11]
Figure 2.17. Example illustrating some functions in Numpy
Figure 2.18. Market share of voice assistants in the US, May 2018 [12]
Figure 2.19. Illustration for Chatbot
Figure 3.1. Block diagram of service robot designing by student
Figure 3.2. Face detection with 6 landmarks and multi-face support [17]
Figure 3.3. Training Process of face recognition
Figure 3.4. Five features of Haar cascade method [18]. (a) Edge features. (b) Line features. (c) Four-rectangle feature
Figure 3.5. Cascade structure for Haar classifiers [18]
Figure 3.6. Sliding window in grayscale image [19]
Figure 3.7. Image meshing and histogram calculation [19]
Figure 3.8. Face recognition and identification processing
Figure 3.9. Face mask detection process

Figure 3.10. Plotting all the milestone central issues of an individual's face on a white foundation can provide us with a best guess of the shape [20]
Figure 3.11. The structure of training data [22]
Figure 3.12. Example of training data bag of words [22]
Figure 3.13. Example of NLP preprocessing pipeline [22]
Figure 3.14. Structure of Feed Forward Neural Network [23]
Figure 3.15. The simplest form of perceptron [23]
Figure 3.16. Chatbot training structure [22]
Figure 4.1. Six facial features are displayed when human face is detected and frame rate is measured
Figure 4.2. Detecting multiple faces in the same frame
Figure 4.3. The process of training image data
Figure 4.4. Username recognition and display
Figure 4.5. Detect 68 landmarks on user's face
Figure 4.6. Bounding the mouth and warning when the user is not wearing a mask
Figure 4.7. When the user wears a mask, the system will not give an alert
Figure 4.8. Identify and answer questions from users when the question is in the data set
Figure 4.9. Identify and answer questions from users when the question is not in the data set
Figure 4.10. Save unknown questions to unknown question sheet in excel
Figure 4.11. Relative calculation of response speed of gtts library
Figure 4.12. Relative calculation of response speed of pyttsx3 library
Figure 4.13. Training data made by the student
Figure 4.14. Tokenize all questions from data file
Figure 4.15. Lowercase all word tokenized and remove characters
Figure 4.16. All words after remove duplicate word and sorted
Figure 4.17. Example of the bag of words for all patterns
Figure 4.18. Chatbot interface
Figure 4.19. User interface designed by the student


List of Tables
Table 2- 1 Specifications of Pepper robot
Table 2- 2 The speech recognition package in Python
Table 4- 1 Sample collects data from students


ABSTRACT
With the advancement of science and technology, robots are gradually replacing
humans at work and assisting them in daily life. To make it more convenient for students
to get answers to their everyday questions, this project designs a service robot that
combines computer vision and natural language processing. Traditionally, students visit
school staff or post on student forums to ask about the problems they are facing. These
channels are often slow because the number of staff is limited while the number of
students asking questions is large. Therefore, this thesis proposes replacing the
traditional question-answering channels with a robot capable of advising students and
answering their questions through two modes of communication: talking and chatting. In
talking mode, the robot recognizes the user, recognizes the spoken question, and
processes it to give an appropriate answer. In chatting mode, the user types a question
into the chat box, and the robot processes it and gives an appropriate answer. The
proposed system answers students' questions quickly, saves human resources for the
school, and records students' questions objectively.
Keywords: service robot, computer vision, natural language processing.


CHAPTER 1: INTRODUCTION

1.1 Define a problem
In education, in addition to imparting useful knowledge, it is also necessary to
listen to students and answer their questions effectively. Schools usually set up
counseling teams or online forums where students can give their opinions or ask about
unclear issues. For face-to-face Q&A, the school assigns a team of counselors to the
task. The advantage of this form is that students can communicate easily and receive
accurate, focused answers. For online Q&A through forums, the university also has to
hire staff to reply to students' messages. This form is convenient because students can
get answers without having to go to school. However, both forms have disadvantages such
as long waiting times, inflexible counseling hours, and a limited number of consultants
relative to the large number of students. Figure 1.1 shows students queuing to receive
advice from the school.

Figure 1. 1. Students line up to wait for their turn for advice from the school

In addition, the Covid-19 pandemic has made person-to-person communication
increasingly difficult. Given these problems, a robot is a well-suited solution for
reducing the limitations of the two forms above. It can handle students' inquiries
through two modes, talking and chatting, by using computer vision and natural language
processing. Therefore, this thesis is titled “Service robot for students based on
computer vision and natural language processing”.
1.2 Project objectives
To meet students' need for answers to the problems they encounter, this thesis
builds a service robot with two functions: talking and chatting. The robot can detect
when the user is not wearing a mask and issue a warning, store user information, and
communicate by voice or text depending on the user's needs.
1.3 Project task
The project is implemented with the following main tasks:
● Task 1: Collect inquiries from students in the university.
● Task 2: Survey methods for face detection and face recognition.
● Task 3: Survey methods for speech recognition and processing.
● Task 4: Research virtual assistants and chatbots.
● Task 5: Research natural language processing methods.
● Task 6: Write an outline summarizing the requirements of the project, design the
block diagram of the system, and explain the functions of the blocks.
● Task 7: Design software interfaces to interact with users.
● Task 8: Test, evaluate, and calibrate the entire system.
● Task 9: Write the project report.
1.4 Project scopes
This project is limited to answering the questions of on-campus students in
Vietnamese through a software interface. The accuracy of the answers depends on the
variety of the collected data, and the system is suited to low-noise environments.
1.5 Approach and research

● Approach:
– Approach the research subject.
– List the challenges that may be encountered when solving the problem.
– Survey, evaluate, and select algorithms to form a suitable system.
● Research method:
– Theoretical research method: based on the theory of face detection, face
recognition, and natural language processing.
– Experimental method: collect data, build models capable of detecting and
distinguishing faces, and design interactive software that supports user
inquiries.
1.6 Project description
The project is presented in 5 chapters as follows:
● Chapter 1: INTRODUCTION
Introduces the research content of the topic, sets out the objectives and tasks
that the topic needs to achieve, and clearly identifies the specific subject
and scope of the research.
● Chapter 2: THEORETICAL BASIS
Gives a general presentation of the subject of study, the algorithms used, and the
knowledge involved in the system training process.
● Chapter 3: SYSTEM DESIGN AND CONSTRUCTION
Details the functionality of each working block and explains the improvements
used in system development and the functionality of the interface and
software.
● Chapter 4: RESULTS ACHIEVED
Presents the test results that demonstrate the system's ability to complete its
tasks.
● Chapter 5: CONCLUSIONS AND DIRECTIONS OF DEVELOPMENT
Summarizes the problems solved, identifies the remaining problems, and gives
directions for solving them.

CHAPTER 2: LITERATURE REVIEW

This chapter introduces the application of robots in the service industry, the
theory of face recognition and speech recognition, and their applications. It also
introduces PyTorch, a popular framework for machine learning problems, and some
other libraries.
2.1 Survey of robots being used in service industry
Before discussing what robots are used for, it is necessary to describe what a
robot is. In the simplest terms, a robot is a machine designed to carry out difficult
actions or jobs automatically. Some robots are designed to resemble humans and are
called androids, but many robots do not take such a form.
Modern robots may employ artificial intelligence (AI) and speech recognition
technologies, and they may be fully or partially autonomous. Industrial robots used in
factories or on production lines are an example of how most robots are programmed to
carry out specific jobs with remarkable precision.
2.1.1 Mission of robots in the service industry
Robots have been a prominent technology trend in the hospitality sector in part
because self-service and automation concepts are becoming more and more important to
the client experience. The usage of robots can result in advancements in efficiency,
accuracy, and even speed.
For example, chatbots allow a hotel or travel company to provide 24/7 support
through online chat or instant messaging services, even when staff would be unavailable,
delivering extremely swift response times. Meanwhile, a robot used during the check-in
process can speed up the entire process, reducing congestion.
2.1.2 Pepper robot
Pepper is a semi-humanoid robot manufactured by SoftBank Robotics (formerly
Aldebaran Robotics), designed with the ability to read emotions. It was introduced at a
conference on 5 June 2014 and was showcased in SoftBank Mobile phone stores in
Japan beginning the next day. Pepper's ability to recognize emotion is based on the
detection and analysis of facial expressions and voice tones. To do so, Pepper is
equipped with features such as:
● 20 degrees of freedom for normal and expressive movements.
● Speech recognition and voice assistant in 15 languages.
● Perception modules.
● Touch sensors, LEDs and microphones.
● Infrared sensors, bumpers, an inertial unit, 2D and 3D cameras, and sonars.
Figure 2.1 shows a Pepper robot working in a mobile store [1].

Figure 2. 1. Pepper robot working in a mobile store

● Specifications:
The robot's head has four microphones, two HD cameras (in the mouth and
forehead), and a 3-D depth sensor (behind the eyes). There is a gyroscope in the torso and
touch sensors in the head and hands. The mobile base has two sonars, six lasers, three
bumper sensors, and a gyroscope.
It is able to run existing content from the app store designed for SoftBank's Nao
robot. Key specifications of the robot are listed in Table 2.1.
Table 2- 1 Specifications of Pepper robot

Features       Description
Dimensions     Height: 1.20 meters (4 ft); Depth: 425 millimeters (17 in); Width: 485 millimeters (19 in)
Weight         28 kilograms
Battery        Lithium-ion battery; Capacity: 30.0Ah/795Wh
Display        10.1-inch touch display
Head           Mic × 4, RGB camera × 2, 3D sensor × 1, Touch sensor × 3
Chest          Gyro sensor × 1
Hands          Touch sensor × 2
Legs           Sonar sensor × 2, Laser sensor × 6, Bumper sensor × 3, Gyro sensor × 1
Moving parts   Degrees of motion: Head (2°), Shoulder (2° L&R), Elbow (2 rotations L&R), Wrist (1° L&R), Hand with 5 fingers (1° L&R), Hip (2°), Knee (1°), Base (3°); 20 Motors

2.2 Background of face recognition system
2.2.1 Concept
Facial recognition is a way of identifying or confirming an individual's identity
using their face. Facial recognition systems can be used to identify people in photos,
in videos, or in real time. Research on face recognition began in the 1960s, and many
approaches have been proposed since then. However, it was not until the end of the
twentieth century that the technology achieved significant results.
2.2.2 Structure and procedure for face recognition
Generally, a face recognition system is described as a process that involves
four stages, as shown in Figure 2.2: face detection, face alignment, feature extraction,
and finally feature matching.


Figure 2. 2. A typical procedure for a face recognition model

As Figure 2.2 shows, a face recognition model consists of four stages, described
in detail below.
Face detection: The input to face detection is a sequence of images captured from
a video stream. The detected faces may need to be tracked across multiple frames using
a face tracking component. While face detection provides a coarse estimate of the
location and scale of the face, face landmarking localizes facial landmarks (e.g., eyes,
nose, mouth, and facial outline). This may be accomplished by a landmarking module or
a face alignment module. In short, face detection locates one or more faces in the
image and marks them with bounding boxes [2].
Face alignment: This stage is performed to normalize the face geometrically and
photometrically. This is necessary because state-of-the-art recognition methods are
expected to recognize face images with varying pose and illumination. The geometrical
normalization process transforms the face into a standard frame by face cropping.
Warping or morphing may be used for more elaborate geometric normalization. The
photometric normalization process normalizes the face based on properties such as
illumination and gray scale [2].
Feature extraction: This stage is vital for face recognition. Feature extraction is
performed on the normalized face to extract salient information that is useful for
distinguishing the faces of different persons and is robust to geometric and
photometric variations. The extracted face features are used for face matching, which is
described in the next stage [2].
Feature matching: The final stage matches the extracted features of the input face
against one or more known faces in a prepared database. For 1:1 verification, the
matcher outputs ‘yes’ or ‘no’. For 1:N identification, the output is the identity of the
input face when the top match is found with sufficient confidence, or ‘unknown’ when the
top match score is below a threshold. The main challenge in this stage of face
recognition is to find a suitable similarity metric for comparing facial features [2].
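The matching stage can be illustrated with a minimal Python sketch. It assumes that each face has already been reduced to a numeric feature vector and uses cosine similarity as the similarity metric. The metric, the threshold of 0.8, the function names, and the toy gallery are illustrative assumptions and are not taken from the thesis.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def verify(probe, reference, threshold=0.8):
    """1:1 verification: True ('yes') if the two faces are similar enough."""
    return cosine_similarity(probe, reference) >= threshold

def identify(probe, gallery, threshold=0.8):
    """1:N identification: name of the top match, or 'unknown'."""
    best_name, best_score = "unknown", -1.0
    for name, features in gallery.items():
        score = cosine_similarity(probe, features)
        if score > best_score:
            best_name, best_score = name, score
    # Reject the top match when its score is below the threshold.
    return best_name if best_score >= threshold else "unknown"

# Hypothetical 3-dimensional feature vectors; real descriptors are much longer.
gallery = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
print(identify([0.9, 0.1, 0.0], gallery))  # probe close to "alice"
print(identify([0.5, 0.5, 0.7], gallery))  # probe far from both entries
```

Raising the threshold makes the system reject more impostors but also more genuine users, so in practice it would be tuned on validation data.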
2.2.3 Face Detection

Face detection is an artificial intelligence (AI) based computer technology used to
find human faces in digital images. It can be applied in fields such as security,
biometrics, entertainment, and personal safety to provide real-time surveillance and
tracking of people.
Face detection applications use algorithms and machine learning to locate
people's faces in larger images, which frequently include non-facial elements such as
buildings, landscapes, and other body parts like feet or hands. Since human eyes are
among the easiest features to detect, face detection algorithms frequently begin by
searching for them. The algorithm may then try to identify the iris, mouth, nose, and
nostrils. Once the algorithm concludes that it has located a facial region, it runs
additional tests to confirm that it has actually detected a face.

Figure 2. 3. Face and eye detection

The algorithms must be trained on large data sets with hundreds of thousands of
positive and negative images to help ensure accuracy. Training improves the
algorithms' ability to determine whether a picture contains faces and where they are.
The methods used in face detection:
● Knowledge-based, or rule-based, methods describe a face based on rules. The
challenge of this approach is the difficulty of coming up with well-defined rules.
● Feature-invariant methods use features such as a person's eyes or nose to
detect a face.
● Template-matching methods are based on comparing images with standard face
patterns or features that have been stored previously and correlating the two to
detect a face. Unfortunately, these methods do not address variations in pose, scale,
and shape.
● Appearance-based methods employ statistical analysis and machine learning to
find the relevant characteristics of face images. This approach, also used in feature
extraction for face recognition, is divided into sub-methods.
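As an illustration of the template-matching idea from the list above, the following Python sketch slides a small stored pattern over a grayscale image and scores each position with normalized cross-correlation. The tiny image, the template, and the function names are hypothetical. The sketch searches at a single scale and orientation, which is exactly the limitation noted for this family of methods.

```python
import math

def ncc(patch, template):
    """Normalized cross-correlation between two equal-size 2D lists."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mean_p, mean_t = sum(p) / len(p), sum(t) / len(t)
    num = sum((a - mean_p) * (b - mean_t) for a, b in zip(p, t))
    den = math.sqrt(sum((a - mean_p) ** 2 for a in p)
                    * sum((b - mean_t) ** 2 for b in t))
    # A flat patch has zero variance; treat it as "no correlation".
    return num / den if den else 0.0

def match_template(image, template):
    """Slide the template over the image; return (best_score, (row, col))."""
    th, tw = len(template), len(template[0])
    best = (-2.0, (0, 0))
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best[0]:
                best = (score, (r, c))
    return best

# Hypothetical stored pattern and a small image that contains it once.
template = [[0, 9], [9, 0]]
image = [
    [1, 1, 1, 1],
    [1, 1, 0, 9],
    [1, 1, 9, 0],
    [1, 1, 1, 1],
]
score, position = match_template(image, template)
print(score, position)
```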
2.3 Color spaces in image processing
2.3.1 RGB color space (Red-Green-Blue)
The RGB color model is an additive model in which red, green, and blue light are
combined in different proportions to form other colors. Each color is represented by
three integer values, one per channel. The RGB color model is shown in Figure 2.4.

Figure 2. 4. RGB color space (Red-Green-Blue)

If each color channel is encoded with 1 byte (8 bits), so that its value lies in the
range [0, 255], then we have a 24-bit color image, and 2^8 × 2^8 × 2^8 = 16,777,216
colors can be encoded (about 16.7 million colors). For example, some basic colors are
represented in the RGB color space as follows: [0; 0; 0] is Black, [255; 255; 255] is
White, [255; 0; 0] is Red, [0; 255; 0] is Green, and [0; 0; 255] is Blue.
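The 24-bit encoding can be checked with a short Python sketch. It counts the colors that three 8-bit channels can encode and packs one color into a single integer. The helper names pack_rgb and unpack_rgb are illustrative and do not come from any library used in the thesis.

```python
def pack_rgb(r, g, b):
    """Pack three 8-bit channel values into one 24-bit integer."""
    for value in (r, g, b):
        if not 0 <= value <= 255:
            raise ValueError("each channel must be in [0, 255]")
    return (r << 16) | (g << 8) | b

def unpack_rgb(color):
    """Recover the (R, G, B) channels from a 24-bit integer."""
    return (color >> 16) & 0xFF, (color >> 8) & 0xFF, color & 0xFF

total_colors = 2 ** 8 * 2 ** 8 * 2 ** 8
print(total_colors)                     # 16777216
print(hex(pack_rgb(255, 0, 0)))         # 0xff0000 (Red)
print(unpack_rgb(pack_rgb(0, 255, 0)))  # (0, 255, 0) (Green)
```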

2.3.2 HSV color space (Hue-Saturation-Value)
The HSV color space is closely related to the HSI (Hue-Saturation-Intensity) and HSL
(Hue-Saturation-Lightness) color spaces. It is based on visual color properties such as
tint, shade, and tone; in other words, color, purity, and brightness. Figure 2.5 gives a
brief description of the HSV color space.


Figure 2. 5. HSV color space (Hue-Saturation-Value)

● Hue: the color tone, expressed as an angle from 0 to 360 degrees.
● Saturation: the degree of purity of the color, that is, how much white is mixed
into the pure color. The value of S lies in the range [0, 255], where S = 255
is the purest color, with no white added. In other words, the larger the S, the
purer the color.
● Value: the brightness of the color, comparable to intensity or lightness in the
HSI and HSL models. The value of V lies in the range [0, 255], where V = 0 is
completely dark (black) and V = 255 is completely bright. In other words, the
larger the V, the brighter the color.
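The three components can be illustrated with Python's standard colorsys module, which works on values in [0, 1]. The wrapper below rescales the result to the ranges used above (H in degrees, S and V in [0, 255]). The wrapper name rgb_to_hsv_255 is a hypothetical helper for this sketch, and image libraries may use different scales.

```python
import colorsys

def rgb_to_hsv_255(r, g, b):
    """Convert 8-bit RGB to (H in degrees, S in [0, 255], V in [0, 255])."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return round(h * 360), round(s * 255), round(v * 255)

print(rgb_to_hsv_255(255, 0, 0))      # pure red: full saturation and value
print(rgb_to_hsv_255(255, 255, 255))  # white: zero saturation
print(rgb_to_hsv_255(0, 0, 0))        # black: zero value
```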
2.4 Histogram of Oriented Gradients algorithm
HOG (Histogram of Oriented Gradients) [5] is an algorithm that generates a feature
descriptor for detecting objects. From an image, we compute two important matrices
that capture the image information: gradient magnitude and gradient orientation. These
two pieces of information are combined into a histogram in which the gradient
magnitudes are accumulated into bins according to gradient orientation. Finally, we
obtain the HOG feature vector representing the histogram. Some applications of HOG
are shown in Figure 2.6.
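The core step, accumulating gradient magnitudes into orientation bins, can be sketched in plain Python for a single cell. This is a simplified illustration and not a full HOG implementation. The 9 unsigned orientation bins over [0, 180) degrees are a common choice assumed here, each pixel votes for a single bin, and the block normalization used by complete HOG descriptors is omitted.

```python
import math

def cell_histogram(cell, bins=9):
    """Orientation histogram of one grayscale cell (2D list of numbers).

    Uses central differences on interior pixels and unsigned orientations
    in [0, 180) degrees; each pixel votes for one bin with its magnitude.
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, len(cell) - 1):
        for x in range(1, len(cell[0]) - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical gradient
            magnitude = math.hypot(gx, gy)
            orientation = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(orientation // bin_width) % bins] += magnitude
    return hist

# A vertical edge: intensity changes only along x, so gy = 0 everywhere
# and all votes fall into the first bin (orientation 0 degrees).
cell = [[0, 0, 10, 10]] * 4
print(cell_histogram(cell))
```

Concatenating such histograms over all cells of a detection window gives the feature vector that is passed to a classifier.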
