Research proposing a multi-objective evolutionary algorithm based on directional information, and its application. Nguyễn Long.

MINISTRY OF EDUCATION AND TRAINING MINISTRY OF NATIONAL DEFENSE

MILITARY TECHNICAL ACADEMY





NGUYEN LONG








A MULTI-OBJECTIVE EVOLUTIONARY
ALGORITHM USING DIRECTIONS OF
IMPROVEMENT AND APPLICATION







THE THESIS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
IN MATHEMATICS








Hanoi – 2014


Specialized in: Fundamentals of Mathematics for Informatics

Code: 62 46 01 10



THE THESIS IS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY IN MATHEMATICS



SUPERVISORS:
1. ASSOC. PROF. DR BUI THU LAM
2. ASSOC. PROF. DR NGUYEN VAN HAI





Hanoi - 2014


Abstract
A multi-objective optimization problem involves at least two conflicting objectives and it has
a set of Pareto optimal solutions. Multi-objective evolutionary algorithms (MOEAs) use a
population of solutions to approximate the Pareto optimal set in a single run. MOEAs have
attracted a lot of research attention during the past decade. They are still one of the hottest
research areas in the field of Computational Intelligence and they are the main focus of this
thesis.
First, the main concepts of multi-objective optimization are presented; the thesis then
discusses solving multi-objective optimization problems with multi-objective evolutionary
algorithms. The thesis also surveys the use of directional information to guide the search.
The survey indicates that more investigation is needed into how to obtain effective guidance
in two respects:
1. Automatically guiding the evolutionary process so that the MOEA stays balanced between
exploitation and exploration.
2. Combining the decision maker's preferences with directions of improvement to guide the
MOEA during the optimization process toward the most preferred region in the objective space.
To address this, the thesis builds all of its proposals on a direction based multi-objective
evolutionary algorithm (DMEA), the most recent algorithm with a systematic way to maintain
directions of improvement. Several issues with DMEA are raised, analysed and hypothesised as
the primary research problems of this thesis.
In the main chapters, the thesis discusses the issues of using directions of improvement in
DMEA through its contributions:
1. Designing a new direction based multi-objective evolutionary algorithm, version II
(DMEA-II), with the following improvement techniques:
• Using an adaptive ratio between convergence and spread directions.
• Using a Ray based density niching method for the main population.
• Using a new Ray based density selection scheme for selecting dominated solutions.
• Using a new parent selection scheme for offspring perturbation.
In order to validate the proposed algorithm, a series of experiments on a wide range of
test problems was conducted. The algorithm obtained quite good results on the primary
performance metrics, including the generational distance (GD), the inverse generational
distance (IGD), the hypervolume (HYP) and the two set coverage (SC). Analysis of the results
indicates better performance of DMEA-II in comparison with the most popular MOEAs.
2. Proposing an interactive method for DMEA-II as the second aspect of effective guidance.
The interactive method is introduced with three ray based approaches: Rays Replacement,
Rays Redistribution and Value Added Niching. The experiments, carried out as a case study
on several test problems, showed quite good results.
3. Introducing a SpamAssassin based spam email detection system that uses DMEA-II. The
proposed system offers users more good choices when configuring the SpamAssassin system.
Acknowledgements
First of all, I would like to express my respectful thanks to my principal supervisor,
Assoc. Prof. Bui Thu Lam, for his direct guidance of my PhD progress. Assoc. Prof. Bui
has given me the knowledge and passion that motivated this thesis. His valued guidance
has inspired much of the research in the thesis.
I also wish to thank my co-supervisor, Assoc. Prof. Nguyen Van Hai, for his suggestions and
knowledge during my research, especially on the relation between theory and real problems in
practice. I would also like to thank Prof. Hussein Abbass, Assoc. Prof. Tran Quang Anh and
Assoc. Prof. Dao Thanh Tinh for their invaluable support throughout my PhD. I feel lucky
to work with such excellent people.
I would also like to thank all of my fellows in the Department of Software Technology and
the Evolutionary Computation research group for their assistance and support.
Last but not least, I would like to acknowledge the support of my family, especially my
parents, Dr. Nguyen Nghi and Truong Thi Hong, who worked hard and believed strongly in their
children. I would also like to thank my wife, sisters and brothers, who always supported me
during my research.
Originality Statement
I hereby declare that this thesis is my own work and that, to the best of my knowledge and
belief, it contains no material previously published or written by others. Any contribution
made to the research by colleagues, by people in our research team at Le Quy Don Technical
University or elsewhere, during my candidature is clearly acknowledged.
I also declare that the intellectual content of this submission is the result of my own
work, except to the extent that assistance from others in conception or in style, presentation
and linguistic expression is acknowledged.
Contents

Abstract
List of Figures
List of Tables
Abbreviations
1 Introduction
1.1 Overview
1.2 Research Perspectives
1.3 Motivation
1.4 Questions and Hypothesises
1.5 Thesis organization
1.6 Original Contributions
2 Background concepts and Issues
2.1 Common concepts
2.1.1 Multi-objective problems
2.1.2 Notations
2.1.3 General Definitions
2.1.4 Pareto Optimality
2.1.5 Weak Pareto Optimality
2.1.6 Dominance
2.2 Conventional methods
2.2.1 No-preference methods
2.2.2 A priori methods
2.2.3 A posteriori methods
2.2.4 Interactive methods
2.3 An overview of Multi-objective Evolutionary Algorithms
2.3.1 Non-elitist methods
2.3.2 Elitist methods
2.3.3 Performance measures
2.3.4 Test problems
2.4 Statistical testing
2.5 Search's guidance in MOEAs
2.5.1 Technique of using guided directions
2.5.2 Advantages and disadvantages
2.6 Research Issues
2.6.1 Direction based multi-objective evolutionary algorithm (DMEA)
2.6.2 Issue 01: The disadvantages of the fixed ratio between types of directions
2.6.3 Issue 02: Lack of an efficient niching method for the main population
2.6.4 Issue 03: The disadvantages of using the weighted sum scheme
2.6.5 Issue 04: Using a 'hard' niching method
2.6.6 Issue 05: Investigating on how the DM can interact with DMEA
2.7 Summary
3 A guided methodology using directions of improvement
3.1 Using an adaptive ratio between convergence and spread directions
3.2 Using a Ray based density niching for the main population
3.3 Using a ray based density selection schemes
3.4 Direction based Multi-objective Evolutionary Algorithm-II
3.4.1 General structure
3.4.2 Computational complexity
3.4.3 Experimental Studies
3.4.4 Results and Discussion
3.5 Analyzing effects of different selection schemes for the perturbation
3.6 Summary
4 A guided methodology using interaction with decision makers
4.1 Overview
4.2 A multi-point Interactive method for DMEA-II
4.2.1 Rays replacement
4.2.2 Rays Redistribution
4.2.3 Value Added Niching
4.2.4 Experimental Studies
4.2.5 Results and Discussion
4.3 Summary
5 An application of DMEA-II for a spam email detection system
5.1 Overview
5.2 Spam email detection
5.2.1 SpamAssassin
5.2.2 Methodology
5.2.3 An interactive method
5.2.4 Computational complexity
5.2.5 Experimental Studies
5.2.6 Results and Discussion
5.3 Summary
6 Conclusions and Future Work
6.1 Conclusions
6.2 Future directions
Publications
Appendix A Benchmark sets
List of Figures
2.1 An illustration of optimal Pareto
2.2 An illustration of weak optimal Pareto
2.3 An illustration of the weighted-sum approach
2.4 An illustration of the ε-constraint approach
2.5 An illustration of performance metrics
2.6 An illustration of descent directions
2.7 An illustration of Pareto descent directions
2.8 An illustration of determination directions in different cases
2.9 An illustration of differential directions
2.10 An illustration of directional convergence and directional spread
2.11 An illustration of the movement of a centroid
2.12 An illustration of convergence and spread directions
2.13 An illustration of the ray system
2.14 An illustration of the performance of DMEA
3.1 An illustration of the Ray-based Density
3.2 The obtained non-dominated of DMEA and DMEA-II
3.3 Results on DTLZ2, UF1, UF3 and UF8
3.4 Visualization of GD and IGD over time for ZDT1, ZDT4
3.5 The chart for DMEA-II and DMEA comparison on GD, IGD and HYP
3.6 The chart for DMEA-II and other MOEAs comparison on GD
3.7 The chart for DMEA-II and other MOEAs comparison on IGD
3.8 The chart for DMEA-II and other MOEAs comparison on HYP
3.9 The chart for DMEA-II and other MOEAs comparison on SC
3.10 Visualization of GD and IGD over time for ZDT1, ZDT2
3.11 Visualization of GD and IGD over time for ZDT3, DTLZ3
4.1 An illustration of altering the reference point
4.2 An illustration of the use reference direction approach
4.3 An illustration of the rays replacement approach
4.4 An illustration of the rays redistribution approach
4.5 An illustration of the value added niching approach
4.6 A visualization of the interactive method on ZDT1
4.7 A visualization of the interactive method on ZDT2
4.8 A visualization of the interactive method on ZDT3
4.9 A visualization of the interactive method on ZDT4
4.10 A visualization of the interactive method on ZDT6
5.1 An illustration of results with 30 and 100 rules for 272 emails
5.2 An illustration of results with 30 and 100 rules for 426 emails
5.3 An illustration of results with 30 and 100 rules for 286 multilingual emails
5.4 Results for the Rays Replacement approach with 30 rules
5.5 Results for the Rays Replacement approach with 50 rules
5.6 Results for the Rays Replacement approach with 100 rules
5.7 Results for the Rays Redistribution approach with 30 rules
5.8 Results for the Rays Redistribution approach with 50 rules
5.9 Results for the Rays Redistribution approach with 100 rules
5.10 Results for the Value Added Niching approach with 30 rules
5.11 Results for the Value Added Niching approach with 50 rules
5.12 Results for the Value Added Niching approach with 100 rules
List of Tables
3.1 The main features of test problems
3.2 Common parameter settings
3.3 Parameters settings
3.4 The average values of GD, IGD and HYP
3.5 The average value of GD
3.6 The average value of IGD
3.7 The average value of HYP
3.8 The comparison of DMEA-II and others on SC
3.9 The GD, IGD, HYP, SC results for DMEA-II and MOEA/D
3.10 The GD values of DMEA-II and DMEA-II* over the first 200 generations
3.11 The IGD values of DMEA-II and DMEA-II* over the first 200 generations
4.1 The main features of ZDT problems
5.1 Parameter settings
5.2 The result of SOOA with 30 and 100 rules for 272 emails
5.3 The result of SOOA with 30 and 100 rules for 426 emails
5.4 The result of SOOA with 30 and 100 rules for 286 multilingual emails
A.1 ZDT Problems
A.2 DTLZ Problems
A.3 UF Problems
Abbreviations
Abbreviation   Meaning
EA             Evolutionary Algorithm
GA             Genetic Algorithm
ES             Evolution Strategies
EP             Evolutionary Programming
GP             Genetic Programming
MOP            Multi-objective Optimization Problem
MOEA           Multi-objective Evolutionary Algorithm
POF            Pareto Optimal Front
POS            Pareto Optimal Set
RD             Ray based Density
DMEA           Direction based Multi-objective Evolutionary Algorithm
DMEA-II        Direction based Multi-objective Evolutionary Algorithm-II
NSGA-II        Non-Dominated Sorting Genetic Algorithm II
SPEA2          Strength Pareto Evolutionary Algorithm 2
MOEA/D         Multi-objective Evolutionary Algorithm Based on Decomposition
MOGA           Multi-objective Genetic Algorithm
NPGA           Niched Pareto Genetic Algorithm
PAES           Pareto-Archived Evolution Strategy
MOPSO          Multi-objective Particle Swarm Optimization
PDE            Pareto Differential Evolution
DM             Decision Maker
GD             Generational Distance
IGD            Inverse Generational Distance
HYP            Hypervolume
SC             Two Set Coverage
SDR            Spam Detection Rate
FAR            False Alarm Rate
VSDSA          Vietnamese spam detection based on SpamAssassin
CD             Convergence Direction
SD             Spread Direction
DC             Directional Convergence
DS             Directional Spread
TABLE OF TERMS USED IN THE THESIS
English                                   Vietnamese
Evolutionary Algorithm                    Giải thuật tiến hóa
Multi-objective Optimization Problem      Bài toán tối ưu đa mục tiêu
Multi-objective Evolutionary Algorithm    Giải thuật tiến hóa đa mục tiêu
Pareto Optimal Front                      Lớp tối ưu Pareto
Pareto Optimal Set                        Tập tối ưu Pareto
Directions of Improvement                 Hướng cải thiện
Convergence Direction                     Hướng hội tụ
Spread Direction                          Hướng tản mát
Differential Direction                    Hướng vi phân
Gradient Direction                        Hướng Gradient
Generational Distance                     Khoảng cách thế hệ
Inverse Generational Distance             Khoảng cách thế hệ đảo
Hypervolume                               Siêu diện tích
Spam Detection Rate                       Tỷ lệ nhận dạng thư rác
False Alarm Rate                          Tỷ lệ nhận dạng sai
Decision Maker                            Người ra quyết định
Reference point                           Điểm tham chiếu
Reference region                          Vùng tham chiếu
Spam Detection System                     Hệ thống lọc thư rác
Interactive method                        Phương pháp tương tác

Chapter 1
Introduction
1.1 Overview
In many disciplines, optimization problems have two or more objectives, which are normally
in conflict with one another and which we wish to optimize simultaneously. These problems
are called multi-objective optimization problems (MOPs). MOPs normally give rise not to one
solution but to a set of solutions (called a Pareto optimal set (POS)) which, in the absence
of any further information, are all equally good. Evolutionary algorithms have been very
popular for solving MOPs [16, 26], mainly due to their ease of use, their population-based
operation and their wide applicability. Evolutionary algorithms can find an entire set of
Pareto optimal solutions in a single run, instead of requiring a series of separate runs as
in the case of traditional mathematical programming techniques.
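To make the notion of "equally good" solutions concrete, the following minimal sketch filters a list of objective vectors down to its non-dominated members. It assumes that every objective is minimized; the helper names `dominates` and `non_dominated` and the sample data are illustrative and do not come from the thesis.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a Pareto-dominates b (assumption: all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep the points that no other point dominates (simple pairwise check)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objective vectors (f1, f2), both to be minimized.
candidates = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0), (3.0, 4.0), (5.0, 5.0)]
front = non_dominated(candidates)
```

Here `(3.0, 4.0)` is dominated by `(2.0, 3.0)` and `(5.0, 5.0)` by `(1.0, 5.0)`, so three mutually non-dominated points remain; none of them can be preferred without further information.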
Recently, guided techniques have been discussed, conceptualized and used to guide
multi-objective evolutionary algorithms (MOEAs) towards the POS during the search process.
Generally, guiding information is derived from the population, individuals, archives or
decision makers. This information is then used to steer the evolutionary process of MOEAs
quickly towards the POS. Good guidance helps MOEAs obtain a set of solutions near the POS
with good convergence and diversity. This is a difficult task: the evolutionary process
involves randomness, so it is hard to maintain the balance between convergence and diversity
during the search. This thesis discusses the determination and the effective usage of guided
information in MOEAs.
Evolutionary Algorithms. Evolution via natural selection of a randomly chosen population
of individuals can be viewed as a search through the space of possible chromosome values.
In that sense, an evolutionary algorithm (EA) is a stochastic search for an optimal solution
to a given problem. The evolutionary search process is influenced by the following main
components of an EA [17]:
• Population: Since EAs work with a population of individuals, it is important to define
this structure in the first place. A population of individuals is defined to encode a
finite set of possible solutions for a problem.
• Individual: It is a data ensemble encoding a solution for the problem. It might contain
a structure for a solution (a set of problem variables), objective values, and several other
properties such as a fitness value, index, rank, etc. Here, the representation of a solution
is vital for the operations of the algorithm. In general optimization problems, there
exist three major representations for this structure:
– Binary: The solution is represented by a string of bits. Sometimes, this string
is called a genotype. The values at each bit location or locus are called alleles.
This genotype is generally composed of one or several chromosomes, where each
chromosome is a composition of several genes. For real-valued problems, this
genotype is decoded into an array of real values (equivalent to a solution
of the problem) using a mapping function. This array is usually considered as a
phenotype.
– Real-valued: For this type of representation, the solution is represented by an
array of real values. This array is considered as a chromosome and each element of
this array is a gene. Here the genotype and phenotype are identical.
– Graph: In some cases, the problem in question is to find a topology, network,
or program of functions containing a set of symbols. Here, a graph such as a tree
representation is more suitable. The genotype-phenotype mapping is even more
complicated than the one with binary representation.
Although the concept of an individual is larger than the concept of a solution, for the
sake of simplicity, the thesis uses both of them interchangeably.
• Mutation operator: Mutation is for self-changing genes in order to develop different
characteristics. Based on this idea from biology, EAs use mutation to change values in
the genotype to allow individuals to search surrounding areas. For a binary genotype,
mutation can simply be done through a bit-flipping operation along the string with
some probability. For a real-valued genotype, mutation is done by perturbing the
values of genes using some distribution such as uniform, Gaussian, or Cauchy. For a
tree genotype, a segment of the tree can be removed or moved to a different location
in the tree.
• Crossover operator: This reproduction operation allows the combination of genetic
material from two or more parents to create offspring. In evolutionary computation,
crossover is used to create new individuals that have gene values from selected parents.
The effect of this operator is to potentially combine good elements of parents. For a
binary representation, two selected parents may swap parts of their binary strings to
create two offspring. For a real-valued representation, genes from two parents may be
combined mathematically to form a new child. For a tree representation, branches of
two trees are swapped.
• Selection operator: The selection operator in EAs is used to select promising
individuals to contribute to the next generations. It relies on the fitness values
associated with the individuals. Note that fitness and objective values are different
concepts. The objective value is the one obtained directly from the objective function of
the problem, while the fitness value shows how good an individual is in relative comparison
with other individuals in the population. Selection strategies, such as fitness-proportionate
or tournament selection, are usually based on fitness values.
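The components above can be sketched in a few lines each. The block below is only an illustration under stated assumptions: the linear genotype-to-phenotype mapping in `decode`, the one-point form of crossover, and tournament selection with higher fitness being better are common textbook choices, not operators taken from the thesis, and all function names are hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple


def decode(bits: Sequence[int], low: float, high: float) -> float:
    """Map a bit string (genotype) linearly to a real value in [low, high] (phenotype)."""
    as_int = int("".join(str(b) for b in bits), 2)
    return low + (high - low) * as_int / (2 ** len(bits) - 1)


def bit_flip(bits: Sequence[int], rate: float, rng: random.Random) -> List[int]:
    """Binary mutation: flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


def gaussian_mutation(genes: Sequence[float], sigma: float, rng: random.Random) -> List[float]:
    """Real-valued mutation: perturb every gene with zero-mean Gaussian noise."""
    return [g + rng.gauss(0.0, sigma) for g in genes]


def one_point_crossover(p1: Sequence[int], p2: Sequence[int],
                        rng: random.Random) -> Tuple[List[int], List[int]]:
    """Swap the tails of two equal-length parents (length >= 2) at a random cut point."""
    cut = rng.randint(1, len(p1) - 1)
    return list(p1[:cut]) + list(p2[cut:]), list(p2[:cut]) + list(p1[cut:])


def tournament(population: List[List[int]], fitness: Callable[[List[int]], float],
               k: int, rng: random.Random) -> List[int]:
    """Draw k distinct individuals at random and return the one with the highest fitness."""
    return max(rng.sample(population, k), key=fitness)


rng = random.Random(0)
parent_a, parent_b = [0] * 8, [1] * 8
child_a, child_b = one_point_crossover(parent_a, parent_b, rng)
mutant = bit_flip(child_a, rate=0.1, rng=rng)
```

Note that `tournament` only compares fitness values, never raw objective values, which mirrors the distinction between the two concepts drawn in the selection bullet.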
These components are combined to form the generic EA shown in Algorithm 1. There, t is the
generation counter, N is the population size, and C(t) is the main population at the t-th
generation. The steps of an EA are applied iteratively until some stopping conditions are
satisfied. Each iteration of an EA is referred to as a generation.
Algorithm 1: Generic Evolutionary Algorithm
• Let t = 0.
• Create and initialize a population C(0) of size N, i.e. consisting of N individuals.
• While the stopping condition(s) are not true do
– Evaluate the fitness f(x_i(t)) of each individual x_i(t), with i ∈ [1, N], where t is
the current iteration.
– Perform reproduction to create offspring.
– Select the new population C(t + 1).
– Advance to the new generation, i.e. t = t + 1.
• End.
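The loop in Algorithm 1 can be written out as the following minimal sketch. The concrete choices are assumptions made only for illustration: real-valued individuals, a fixed generation budget as the stopping condition, averaging crossover with Gaussian mutation for reproduction, and keeping the N fittest of parents plus offspring as the selection step. The function name `generic_ea` and its parameters are hypothetical.

```python
import random
from typing import Callable, List

Individual = List[float]


def generic_ea(fitness: Callable[[Individual], float], n_genes: int,
               pop_size: int = 20, generations: int = 50,
               sigma: float = 0.1, seed: int = 0) -> Individual:
    """Return the fittest individual found (assumption: higher fitness is better)."""
    rng = random.Random(seed)
    # t = 0: create and initialize the population C(0) with N individuals.
    population = [[rng.uniform(-1.0, 1.0) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _t in range(generations):  # stopping condition: fixed generation budget
        # Evaluate the fitness f(x_i(t)) and rank the individuals.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        # Reproduction: average two parents, then perturb with Gaussian noise.
        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            offspring.append([(a + b) / 2.0 + rng.gauss(0.0, sigma)
                              for a, b in zip(p1, p2)])
        # Select C(t + 1): the N fittest among parents and offspring.
        population = sorted(population + offspring, key=fitness, reverse=True)[:pop_size]
    return max(population, key=fitness)


def sphere(x: Individual) -> float:
    """Toy fitness: negated sum of squares, maximal at the origin."""
    return -sum(v * v for v in x)


best = generic_ea(sphere, n_genes=3)
```

Because the selection step in this sketch keeps the best of parents and offspring, the best fitness in the population never decreases from one generation to the next; other survivor-selection schemes do not give that guarantee.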
Based on different representations, conventional EAs have been categorized as follows [130]:
• Genetic Algorithms (GA): model genetic evolution and use binary representation.
• Evolution Strategies (ES): geared towards modeling the strategy parameters that
control variation in evolution and use real-valued vectors.
• Evolutionary Programming (EP): derived from the simulation of adaptive behavior
in evolution (phenotypic evolution); currently evolutionary programming is a wide
evolutionary computing dialect with no fixed representation.
• Genetic Programming (GP): based on genetic algorithms, but individuals are
programs (represented as trees).
Recently, researchers have extended the EA paradigm to Differential Evolution (DE) [89],
Particle Swarm Optimization (PSO) [24], Ant Colony Optimization (ACO) [111], etc.
In genetic algorithms, problems are encoded in a series of bit strings that are manipulated
by the algorithm. In evolutionary algorithms, the decision variables and objective functions
are used directly. Both genetic and evolutionary algorithms apply the principles of evolution
found in nature to find an optimal solution for an optimization problem.
In EAs, niching methods are used to maintain a diverse population of individuals. EAs that
incorporate niching methods are capable of locating multiple optimal solutions within a
single population. Effective niching methods are critical to the success of EAs in
classification and machine learning, multi-modal optimization, multi-objective optimization,
and simulation of complex and adaptive systems. Niching is also useful for finding a better
single solution to hard problems, since the intermediate formation and maintenance of
diverse sub-solutions is often critical to the solution of hard problems. In [67], Mahfoud
suggests a classification based on the way that multiple niches are found in an EA:
• Spatial or Parallel Niching methods: Niching methods belonging to this category
find and maintain multiple niches simultaneously in a single population. Examples of
parallel niching methods are Sharing, the Crowding function approach and the Clearing method.
• Temporal or Sequential Niching methods: These niching methods find multiple
niches iteratively or temporally. For example, the Sequential Niching method finds
multiple niches iteratively.
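As one concrete instance of a parallel niching method, the sketch below implements the classical fitness sharing scheme, in which the raw fitness of an individual is divided by a niche count that grows with the number of close neighbours. It is an illustration only: it assumes positive raw fitness values that are maximized, Euclidean distance in decision space, and a niche radius `sigma_share` chosen by the user; it is not the ray based niching proposed in the thesis.

```python
import math
from typing import List, Sequence


def sharing(distance: float, sigma_share: float, alpha: float = 1.0) -> float:
    """Sharing function: 1 at distance 0, decreasing to 0 at the niche radius."""
    if distance < sigma_share:
        return 1.0 - (distance / sigma_share) ** alpha
    return 0.0


def shared_fitness(population: List[Sequence[float]], raw_fitness: List[float],
                   sigma_share: float) -> List[float]:
    """Divide each raw fitness by its niche count (sum of sharing values)."""
    result = []
    for xi, fi in zip(population, raw_fitness):
        # The niche count is at least 1 because every individual shares with itself.
        niche_count = sum(sharing(math.dist(xi, xj), sigma_share) for xj in population)
        result.append(fi / niche_count)
    return result


# Hypothetical one-dimensional population: three crowded points and one isolated point.
points = [(0.0,), (0.1,), (0.2,), (5.0,)]
shared = shared_fitness(points, [1.0, 1.0, 1.0, 1.0], sigma_share=1.0)
```

With equal raw fitness, the isolated point keeps its full value while the three crowded points are penalized, so selection based on shared fitness favours under-populated regions.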
The idea of niching is applicable in optimization of constrained problems. In such problems,
maintaining diverse feasible solutions is desirable so as to prevent accumulation of solutions
only in one part of the feasible space, especially in problems containing disconnected patches
of feasible regions.
Every search algorithm needs to address the exploration and exploitation of a search space.
Exploration is the process of visiting entirely new regions of a search space, whilst
exploitation is the process of visiting those regions of a search space within the
neighborhood of previously visited points. To be successful, a search algorithm needs to
establish a good ratio between exploration and exploitation. Balancing the two is
particularly important: an algorithm may have found a good solution in one region, but there
may be an even better one in other regions. Without exploration, the algorithm's search
ability is limited, and the search may be trapped in very low reward areas that the
algorithm would escape with exploration. On the other hand, if the algorithm explores too
much, it cannot stick to a region, which slows down convergence. Thus, it is important to
find a good balance between exploitation and exploration for a good optimization algorithm.
1.2 Research Perspectives
The majority of the theoretical work has been derived from issue analysis, algorithm design
and experimentation. The approach taken in this thesis is likewise based on carefully
designed experiments and the analysis of experimental results.
1.3 Motivation
In optimization, evolutionary algorithms (EAs) are effective tools for solving optimization
problems. Because EAs work on a population and rely on stochastic mechanisms, they can be
used effectively on difficult problems whose optimal sets are complex in objective space.
EAs sample the search space widely at random, so the search is not biased towards local
optima; this is why EAs are suitable for global optimization problems. When solving
multi-objective problems, EAs are used adaptively and effectively to obtain a set of
approximated Pareto optimal solutions. However, EAs also have some difficulties: the
obtained solutions only approximate the Pareto optimal solutions, so they are not exactly
the desired optimal solutions of the problem, and a high number of generations is required
to get a good set of solutions. To avoid these disadvantages, a hybridization model combines
MOEAs with search mechanisms to improve the performance of the algorithms. Search techniques
that are discussed and widely used in multi-objective optimization include particle swarm
optimization (PSO) [96] and ant colony optimization [111]. These techniques are used to
guide the evolutionary process quickly towards Pareto optimal fronts (POFs) in objective
space (or POSs in decision space) and to avoid being trapped in local optima. This guidance
improves the exploitation and exploration characteristics of MOEAs and the quality of the
obtained solutions. Using guided information is a promising technique for obtaining good
approximated solutions; it helps MOEAs to improve in quality and capacity. In fact, there
are many approaches to using guided information in MOEAs. One kind of guided information is
directional information, namely gradient based directions [45, 38, 5, 107], differential
evolution [2, 62, 65, 23],
and directions of improvement [49, 18, 14, 15, 19].
Solving MOPs with gradient based directions was discussed early and has been used in
different approaches. Gradient based multi-objective algorithms have some advantages: they
can be used to solve complex differentiable MOPs; the use of gradient based directions gives
them a good convergence rate; and, when incorporated with an evolution strategy in a hybrid
MOEA, they can achieve a good convergence rate while avoiding local optima during the
search. However, there are some difficulties in using gradient based directions: the
algorithms cannot be used with non-differentiable MOPs, and determining gradient based
directions has a high computational cost. Further difficulties for gradient based
algorithms include determining descent, Pareto descent and directed directions, and keeping
the balance between exploitation and exploration globally.
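As an illustration of what a gradient based direction can look like, the sketch below
computes a common descent direction for two differentiable objectives as the negated
minimum-norm convex combination of their gradients. This is one well-known construction and
is not necessarily the one used in the cited works; the function name
common_descent_direction and the plain-list vector representation are assumptions made for
this example.

```python
def common_descent_direction(g1, g2):
    """Return -(a*g1 + (1-a)*g2), with a in [0, 1] chosen to minimise the norm.

    g1, g2: gradients of the two objectives at the current point (lists of floats).
    If the result is non-zero, a sufficiently small step along it decreases
    both objectives; a zero result signals a Pareto-stationary point.
    """
    diff = [x - y for x, y in zip(g1, g2)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        # Identical gradients: any convex combination gives the same vector.
        a = 0.5
    else:
        # Unconstrained minimiser of ||g2 + a*(g1 - g2)||^2, clipped to [0, 1].
        a = -sum(d * y for d, y in zip(diff, g2)) / denom
        a = min(1.0, max(0.0, a))
    return [-(a * x + (1.0 - a) * y) for x, y in zip(g1, g2)]
```

The need for both gradients at every step is what makes this kind of direction unavailable
for non-differentiable MOPs and costly when gradients must be estimated numerically.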
To date, evolutionary algorithms which use the concept of differential directions are known
as powerful and effective algorithms for solving single-objective optimization problems.
However, in multi-objective optimization, the usage of differential directions has some
difficulties. MOEAs with DE have a high convergence rate, but it is difficult to maintain
the diversity of the population. This disadvantage can be overcome if mechanisms for
maintaining the diversity of the population are incorporated into the algorithms. Another
difficulty is that MOEAs with DE only work on a real-valued decision space, so they cannot
be used when the decision space is binary. This difficulty can be overcome by using an
additional coding technique for a space transformation.
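A minimal sketch of a differential direction, assuming the classic DE/rand/1 mutation in
which a base vector is perturbed by the scaled difference of two other population members.
The names differential_mutation and to_binary, the default scale of 0.5, and the 0.5
threshold used as the coding step are illustrative assumptions; the text above does not
specify which coding technique is meant.

```python
import random


def differential_mutation(population, target_index, scale=0.5, rng=random):
    """DE/rand/1 mutation: base vector plus a scaled difference of two others.

    population: list of real-valued decision vectors (at least four of them).
    target_index: index of the individual being perturbed; it is excluded
    from the three randomly chosen donors.
    """
    candidates = [i for i in range(len(population)) if i != target_index]
    r1, r2, r3 = rng.sample(candidates, 3)
    base, a, b = population[r1], population[r2], population[r3]
    return [x + scale * (y - z) for x, y, z in zip(base, a, b)]


def to_binary(vector, threshold=0.5):
    """Hypothetical coding step: map a real vector to a binary one by thresholding."""
    return [1 if x >= threshold else 0 for x in vector]
```

Because the difference vector is built from real-valued subtraction, the operator is only
defined on a real decision space; a mapping such as to_binary is one simple way to reuse it
on binary problems.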
Using directions of improvement in MOEAs is known as an effective technique. Directions of
improvement are used to guide the evolutionary process so that the population converges
quickly towards, and is uniformly distributed along, the POF. This helps to improve the
convergence rate and diversity of the obtained population. Most of the difficulties in using
gradient based directions and differential directions can be overcome by using directions of
improvement: they are quite simple to determine, since they are derived from the dominance
relationship between solutions (or individuals) in the population (or an external
population); and they are used to move solutions with two aims: being close to, and
uniformly distributed along,
the POF. This helps to ensure the convergence rate and diversity of the population, so it
promises to obtain a good approximated POS, which is the primary aim of improving MOEAs in
multi-objective optimization. However, there are some difficulties in using directions of
improvement: keeping the balance between exploitation and exploration is difficult since the
evolutionary process follows a stochastic mechanism. This difficulty might reduce the
convergence rate and diversity of the population. Most directions of improvement are used in
a local search model for MOEAs, so they might not be effective for global multi-objective
optimization.
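The following sketch shows one plausible way to derive directions from the dominance
relationship, assuming minimisation. A convergence direction points from a dominated
solution to an archive member that dominates it, and a spread direction connects two
mutually non-dominated archive members. The function names, the (decision vector,
objective vector) pair layout and the two kinds of direction are assumptions for
illustration; the precise definitions used by the algorithm are not given in this section.

```python
def dominates(f_a, f_b):
    """True if objective vector f_a Pareto-dominates f_b (minimisation)."""
    return (all(a <= b for a, b in zip(f_a, f_b))
            and any(a < b for a, b in zip(f_a, f_b)))


def direction(x_from, x_to):
    """Difference vector pointing from x_from to x_to in decision space."""
    return [t - s for s, t in zip(x_from, x_to)]


def convergence_directions(x, f_x, archive):
    """Directions from x towards every archive member that dominates it.

    archive: list of (decision_vector, objective_vector) pairs.
    """
    return [direction(x, a) for a, f_a in archive if dominates(f_a, f_x)]


def spread_directions(archive):
    """Directions between pairs of mutually non-dominated archive members."""
    result = []
    for i, (a, f_a) in enumerate(archive):
        for b, f_b in archive[i + 1:]:
            if not dominates(f_a, f_b) and not dominates(f_b, f_a):
                result.append(direction(a, b))
    return result


def move(x, d, step):
    """Perturb solution x along direction d with the given step length."""
    return [xi + step * di for xi, di in zip(x, d)]
```

Only objective comparisons and vector subtraction are needed here, which is why such
directions are cheap to determine compared with gradient based ones.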
In summary, the usage of directions for guiding MOEAs is a promising approach. There is a
need for more investigation into how to obtain effective guidance from two aspects: 1)
automatically guiding the evolutionary process to make MOEAs balanced between exploitation
and exploration; 2) combining the decision maker's preference with directions of
improvement to guide MOEAs during the optimization process towards the most preferred
region in objective space. The previous discussion represents the motivation of this thesis.
1.4 Questions and Hypotheses
In MOEAs, the use of directions of improvement has been addressed in much research; some
techniques for using directional information in MOEAs are proposed in [89, 2, 3, 18, 14,
15, 19, 52]. However, guided information needs to be used in an effective way in the
evolutionary process to help MOEAs perform well in approximating the optima. This is a
hard problem that many researchers have tried to solve. These are the focal points of
the research reported in this thesis. In other words, this thesis aims to address the
following question: How to design an effective guidance for MOEAs to move quickly towards
a suspected optimum or the decision makers' preferred region, and also to avoid being
trapped too easily in a basin surrounding a local optimum? In order to answer this
question, this thesis puts forward some hypotheses:
• When evolutionary techniques are incorporated with directions of improvement, those
techniques in turn affect the balance between exploitation and exploration
of the algorithms. There is a need to have a guidance for the evolutionary
process to make the MOEA balanced between exploitation and exploration
automatically.
• The usage of directions of improvement that are derived from decision makers will make
the MOEAs better able to satisfy decision makers in finding their preferred solutions.
There is a need to combine the decision maker's preference with directions of
improvement to guide MOEAs during the optimization process towards the most
preferred region in objective space.
1.5 Thesis organization
This thesis is organized in six chapters. The remainder of the thesis is arranged as follows:

• Chapter 2. This chapter summarizes common concepts and methods related to
multi-objective optimization (MO). Further, a description of MOEAs is given. Two
generations of MOEAs, elitist and non-elitist, are described, then various aspects of
performance metrics and well-known test problems for MOEAs are presented. A
significant number of MOEAs reported to date are described in depth. At the end of the
chapter, recent general developments and research issues on search guidance are
addressed. In the highlighted part of the chapter, the use of directional information
for search guidance in MOEAs is discussed. In more detail, the chapter also indicates
several issues in using directions of improvement in a selected direction based
multi-objective evolutionary algorithm (DMEA).
• Chapter 3. The primary characteristics of MOEAs that work with elitist solutions
are: maintaining elitist solutions during the evolutionary process; using niching
methods to maintain diversity of the population and archive during the search; and
using selection strategies to select solutions for the next generations. These important
characteristics make MOEAs efficient when solving MOPs. Chapter 3 describes
and analyses these important characteristics of DMEA. All issues of using directions of
improvement related to these characteristics in DMEA, as indicated in Chapter
2, are solved. This leads to a new version of DMEA, namely DMEA-II. In order to
validate the proposed algorithm, a series of experiments on a wide range of test problems
for the proposed DMEA-II is presented and analyzed in the final part of the chapter. The
experimental results indicate that DMEA-II performs better than the original
DMEA. The thesis also compares DMEA-II's performance with that of five
other MOEAs on four metrics: GD, IGD, HYP and SC. DMEA-II with the above
proposed techniques was competitive in comparison with these algorithms with respect
to both convergence and spread. The behaviors of the algorithm were thoroughly
analysed.
• Chapter 4. This chapter proposes a guided methodology using interaction with
decision makers for DMEA-II, with three ray based approaches: Rays Replacement, Rays
Redistribution and Value Added Niching. The chapter suggests a way for decision
makers to take part in the evolutionary process: the decision makers' preference
information is used to guide the MOEA to converge to their preferred region in objective
space. To validate the proposed method, experiments are presented and discussed.
• Chapter 5. An application of DMEA-II to a spam email detection system is introduced.
The proposal is a multi-objective optimization approach for generating sets of feasible
trade-off solutions for an anti-spam email system (using Apache SpamAssassin).
Experiments on Vietnamese language databases and rules are carried out. The results
indicate that solving the problem with DMEA-II not only achieved more efficient
results but also created a set of ready-to-use rule scores.
• Chapter 6. Conclusions and future work are given.
1.6 Original Contributions
Using evolutionary algorithms (EAs) for approximating solutions of multi-objective
optimization problems (MOPs) has been a popular topic in the field of evolutionary
computation, since EAs can offer a set of trade-off solutions simultaneously. To date, there
has been a large set of MOEAs in the literature addressing a wide range of problems with
different properties. Directions of improvement have been discussed, conceptualized and used
to guide MOEAs during the search process